Remove the space after each comma in a Python DataFrame column

Question:

df1

ID                       Col
1       new york, london school of economics, america
2       california & washington,  harvard university, america

Expected output is :

df1

ID                       Col
1       new york,london school of economics,america
2       california & washington,harvard university,america

My try is :

df1[Col].apply(lambda x : x.str.replace(", ","", regex=True))
Asked By: AB14

Answers:

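Call apply on the DataFrame and pass the axis. Each row then arrives as a Series, so .str is available (this assumes every column holds strings):

df1.apply(lambda x: x.str.replace(', ', ',', regex=True), axis=1)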
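
For context, the attempt in the question fails because Series.apply hands the lambda one plain Python string at a time, and a str has no .str accessor. A minimal sketch of that difference, rebuilding the sample frame from the question (the names fixed_apply and fixed_vectorized are only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame(
    {"Col": ["new york, london school of economics, america",
             "california & washington,  harvard university, america"]},
    index=pd.Index([1, 2], name="ID"),
)

# Series.apply passes each cell as a plain str, so use the built-in str.replace.
fixed_apply = df1["Col"].apply(lambda s: s.replace(", ", ","))

# The vectorised .str accessor is called on the Series itself, no apply needed.
fixed_vectorized = df1["Col"].str.replace(", ", ",", regex=False)

print(fixed_apply.tolist())
```

Both variants remove only one space per comma, so the double space in row 2 still leaves one space behind.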
Answered By: Rajib Lochan Sarkar

You can call str.replace(', ', ',') directly on the column instead of using a lambda function. However, this only works if exactly one space follows each ','.

As Алексей Р mentioned, (r',\s+', ',', regex=True) is needed to catch any extra spaces after ','.

Reference: https://pandas.pydata.org/docs/reference/api/pandas.Series.str.replace.html
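
To see why the pattern matters, here is a small sketch using Python's re module on the second sample string. pandas is not needed for the comparison, and the variable names are illustrative:

```python
import re

text = "california & washington,  harvard university, america"

# Literal ", " removes exactly one space per comma, so one space survives
# where the input had two.
one_space = text.replace(", ", ",")

# r",\s+" matches a comma followed by one or more whitespace characters.
any_spaces = re.sub(r",\s+", ",", text)

print(one_space)
print(any_spaces)
```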

Example:

import pandas as pd

data_ = ['new york, london school of economics, america', 'california & washington,  harvard university, america']

df1 = pd.DataFrame(data_)
df1.columns = ['Col']
df1.index.name = 'ID'
df1.index = df1.index + 1

df1['Col'] = df1['Col'].str.replace(r',\s+', ",", regex=True)

print(df1)

Result:

                                                  Col
ID                                                   
1         new york,london school of economics,america
2   california & washington,harvard university,ame...
Answered By: コリン

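You can split each string on ',', strip the surrounding whitespace from each piece, and join the pieces back together with ','. Assign the result back to the column so the DataFrame is not overwritten by a Series.

df1['Col'] = df1['Col'].apply(lambda x: ",".join([w.strip() for w in x.split(',')]))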
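
As a rough illustration of what the lambda does to a single value (plain Python; the helper name tidy_commas is hypothetical):

```python
def tidy_commas(value: str) -> str:
    """Split on commas, strip whitespace around each piece, and rejoin."""
    return ",".join(piece.strip() for piece in value.split(","))


sample = "california & washington,  harvard university, america"
print(tidy_commas(sample))
```

Unlike the replace-based answers, strip() also removes spaces before a comma and at the ends of the string, which may or may not be what you want.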

Hope this helps.

Answered By: Nupur Gopali

It is advisable to use the regular expression ,\s+ which matches one or more consecutive whitespace characters after a comma, as in washington,  harvard

df = pd.DataFrame({'ID': [1, 2], 'Col': ['new york,           london school of economics,  america',
                                         'california & washington,  harvard university, america']}).set_index('ID')
df.Col = df.Col.str.replace(r',\s+', ',', regex=True)
print(df)
                                                  Col
ID                                                   
1         new york,london school of economics,america
2   california & washington,harvard university,ame...
Answered By: Алексей Р