How to do column string concatenation including space separator in Pandas dataframe?

Question:

I am a Pandas DataFrame as follows:

df = pd.DataFrame({
    'id': [1,2 ,3],
    'txt1': ['Hello there1', 'Hello there2', 'Hello there3'],
    'txt2': ['Hello there4', 'Hello there5', 'Hello there6'],
    'txt3': ['Hello there7', 'Hello there8', 'Hello there9']
})
df

id  txt1            txt2            txt3
1   Hello   there1  Hello there4    Hello there7
2   Hello   there2  Hello there5    Hello there8
3   Hello   there3  Hello there6    Hello there9

I want to concatenate column txt1, txt2, and txt3. So far I am able to achieve it as follows:

df['alltext'] = df['txt1']  + df['txt2'] + df['txt3']
df

id  txt1            txt2            txt3            alltext
1   Hello there1    Hello there4    Hello there7    Hello there1Hello there4Hello there7
2   Hello there2    Hello there5    Hello there8    Hello there2Hello there5Hello there8
3   Hello there3    Hello there6    Hello there9    Hello there3Hello there6Hello there9

but how to introduce space character between the two column strings while concatenating in Pandas?

I have just started learning Pandas.

Asked By: user3243499

||

Answers:

You can also add separator between columns:

df['alltext'] = df['txt1']  + ' ' + df['txt2'] + ' ' + df['txt3']

Or filter by DataFrame.filter only columns with txt in column name and use join per rows with apply:

df['alltext'] = df.filter(like='txt').apply(' '.join, axis=1)

Or filter only object columns by DataFrame.select_dtypes – most times a Series with a dtype of object is going to be a string – but it could be any Python object:

df['alltext'] = df.select_dtypes('object').apply(' '.join, axis=1)

Or select columns by positions – all columns without first by DataFrame.iloc:

df['alltext'] = df.iloc[:, 1:].apply(' '.join, axis=1)

Thank you, @Jon Clements for solution for better matching columns names with txt and numeric:

df['alltext'] = df.filter(regex=r'^txtd+$').apply(' '.join, axis=1)
Answered By: jezrael

Simply add space between that,

df['alltext'] = df['txt1']  + ' ' + df['txt2'] + ' ' + df['txt3']
Answered By: shaik moeed
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.