Concatenating columns in pandas dataframe

Question:

I have a dataset like below (see code):

import pandas as pd

data = {'id':  ['001', '002', '003','004'],
        'address': ["William J. Clare\n290 Valley Dr.\nCasper, WY 82604\nUSA",
                    "1180 Shelard Tower\nMinneapolis, MN 55426\nUSA",
                    "William N. Barnard\n145 S. Durbin\nCasper, WY 82601\nUSA",
                    "215 S 11th ST"],
        'locality': [None, None, None,'Laramie'],
        'region': [None, None, None, 'WY'],
        'Zipcode': [None, None, None, '87656'],
        'Country': [None, None, None, 'US']
        }

df = pd.DataFrame(data)

As you can see, the address in the 4th row doesn't include the locality, region, zipcode, or country, but those values are present in separate columns.

I am trying to do this with an if statement. I want a condition on the dataframe that says: if locality, region, Zipcode, and Country are not None, concatenate them into the address column with a '\n' separator.

sample output:

address
290 Valley Dr.\nCasper, WY 82604\nUSA
1180 Shelard Tower\nMinneapolis, MN 55426\nUSA
145 S. Durbin\nCasper, WY 82601\nUSA
215 S 11th ST\nLaramie, WY 87656\nUS

I have been trying this since yesterday. I am not from a coding background, so any help will be greatly appreciated.

Thanks

Asked By: Sushmitha Krishnan

Answers:

The following will do the job:

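# Keep the address as-is where all four extra columns are null; otherwise append them.
# Assigning back to df['address'] (rather than df) keeps the other columns intact.
df['address'] = df['address'].where(
    df[['locality', 'region', 'Zipcode', 'Country']].isnull().all(axis=1),
    df['address'] + '\n' + df['locality'] + ', ' + df['region'] + ' ' + df['Zipcode'] + '\n' + df['Country'])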

[Out]:

0    William J. Clare\n290 Valley Dr.\nCasper, WY 8...
1       1180 Shelard Tower\nMinneapolis, MN 55426\nUSA
2    William N. Barnard\n145 S. Durbin\nCasper, WY ...
3                 215 S 11th ST\nLaramie, WY 87656\nUS
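
To make the behaviour of Series.where more concrete, here is a minimal self-contained sketch on a two-row subset of the question's data. where keeps the original value wherever the condition is True and takes the replacement value elsewhere, so rows whose four extra columns are all null are left untouched. The names extra_cols, all_missing, combined, and result are introduced here for illustration only.

```python
import pandas as pd

# Two rows taken from the question's data: one complete address, one split across columns
df = pd.DataFrame({
    'id': ['002', '004'],
    'address': ["1180 Shelard Tower\nMinneapolis, MN 55426\nUSA", "215 S 11th ST"],
    'locality': [None, 'Laramie'],
    'region': [None, 'WY'],
    'Zipcode': [None, '87656'],
    'Country': [None, 'US'],
})

extra_cols = ['locality', 'region', 'Zipcode', 'Country']

# True for rows where every extra column is null (address is already complete)
all_missing = df[extra_cols].isnull().all(axis=1)

# Candidate replacement; rows containing nulls come out as missing values here,
# which is harmless because where() only uses this where all_missing is False
combined = (df['address'] + '\n' + df['locality'] + ', ' + df['region']
            + ' ' + df['Zipcode'] + '\n' + df['Country'])

# Keep the original address where all_missing is True, otherwise use combined
df['address'] = df['address'].where(all_missing, combined)
result = df['address'].tolist()
```

Note that this assumes a row has either all four extra columns filled or none of them; a row with only some of them filled would end up with a missing address.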

Notes:

  • I’ve adjusted the separators to be closer to the sample output in the OP’s question. If needed, one can replace the ', ' or ' ' with '\n'.
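
As a variation on the note above, if every part should be separated by '\n', one possible sketch is to join the non-null values of each row. This is an alternative technique, not the where-based approach used in the answer; it also tolerates rows where only some of the extra columns are filled. The helper join_parts and the third row ('10 Main St') are hypothetical and added only to illustrate a partially filled row.

```python
import pandas as pd

df = pd.DataFrame({
    'address': ["1180 Shelard Tower\nMinneapolis, MN 55426\nUSA",
                "215 S 11th ST",
                "10 Main St"],          # hypothetical row, not in the question
    'locality': [None, 'Laramie', 'Casper'],
    'region': [None, 'WY', None],
    'Zipcode': [None, '87656', None],
    'Country': [None, 'US', 'US'],
})

cols = ['address', 'locality', 'region', 'Zipcode', 'Country']

def join_parts(row):
    # Hypothetical helper: join only the non-null parts of a row with '\n'
    return '\n'.join(str(v) for v in row if pd.notna(v))

# Apply row by row over the address and the extra columns
df['address'] = df[cols].apply(join_parts, axis=1)
```

A row-wise apply is slower than vectorised string concatenation on large frames, so this is a trade of speed for tolerance of partially filled rows.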
Answered By: Gonçalo Peres