Iterate over pairs of columns and add value from one under condition in new column Python

Question

I have a dataframe with the steps/actions of user behaviour. A sample is provided below. There are many steps, and each step has two columns: subtitle and dimension.

import numpy as np
import pandas as pd

df = pd.DataFrame({'idVisit': [1, 2, 3],
                   'subtitle (step 0)': ['download', 'homepage', 'www.example.com'],
                   'dimension1 (step 0)': ['client', np.nan, 'internal'],
                   'subtitle (step 1)': ['pageview', 'pageview', 'map'],
                   'dimension1 (step 1)': ['client', 'client', np.nan],
                   'subtitle (step 2)': ['download', 'homepage', 'www.example.com'],
                   'dimension1 (step 2)': ['client', np.nan, 'internal'],
                   'subtitle (step 3)': ['pageview', 'pageview', 'map'],
                   'dimension1 (step 3)': ['client', 'client', np.nan]})

I need to merge the subtitle and dimension columns of each step into a new column: if dimension is empty, keep only the subtitle; otherwise keep only the dimension.

So for the new column step0: if df['dimension1 (step 0)'] is not null, use df['dimension1 (step 0)'];
if df['dimension1 (step 0)'] is null, use df['subtitle (step 0)'].
The same is then repeated for step 1, and so on.

I am a complete newbie.

Expected output:

[In]: df['step0'] 

[Out]: ['client', 'homepage', 'internal']

[In]: df['step1'] 

[Out]: ['client', 'client', 'map']

# etc.

Asked By: Beginner in the house


Answer 1

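Assume idVisit is the index. The remaining columns then alternate between subtitle (positions 0, 2, 4, ...) and dimension (positions 1, 3, 5, ...), so you can call the .combine_first() method on each dimension column and pass it the matching subtitle column. Where the dimension is null, the subtitle value is used instead: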

# set the index just in case
df.set_index('idVisit', inplace=True)
# loop over subtitles and dimensions zipped together and enumerated
for n, (subtitle, dimension) in enumerate(zip(df.columns[0::2], df.columns[1::2])):
    df[f'step {n}'] = df[dimension].combine_first(df[subtitle])
# show only added columns
df.iloc[:, 8:]

Output:

# only the added columns are shown 
          step 0    step 1   step 2     step 3
idVisit             
1         client    client   client     client
2         homepage  client   homepage   client
3         internal  map      internal   map
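The loop above relies on the columns strictly alternating between subtitle and dimension. If that ordering is not guaranteed, a minimal sketch that pairs the columns by name could look like the following. It assumes every step follows the 'subtitle (step N)' / 'dimension1 (step N)' naming from the question; add_step_columns is a hypothetical helper name, and fillna is used here in place of combine_first, which gives the same result for this data.

```python
import re

import numpy as np
import pandas as pd

# Shortened copy of the sample data from the question (two steps only)
df = pd.DataFrame({
    'idVisit': [1, 2, 3],
    'subtitle (step 0)': ['download', 'homepage', 'www.example.com'],
    'dimension1 (step 0)': ['client', np.nan, 'internal'],
    'subtitle (step 1)': ['pageview', 'pageview', 'map'],
    'dimension1 (step 1)': ['client', 'client', np.nan],
})


def add_step_columns(frame):
    """Hypothetical helper: add one 'stepN' column per '(step N)' pair."""
    out = frame.copy()
    for col in frame.columns:
        # Only subtitle columns start a pair; everything else is skipped
        match = re.fullmatch(r'subtitle \(step (\d+)\)', col)
        if match is None:
            continue
        n = match.group(1)
        # Assumed naming convention for the partner column
        dimension = f'dimension1 (step {n})'
        # Take the dimension where present, otherwise fall back to the subtitle
        out[f'step{n}'] = frame[dimension].fillna(frame[col])
    return out


result = add_step_columns(df)
print(result[['step0', 'step1']])
```

This version leaves idVisit as a regular column and names the new columns step0, step1, ... as in the expected output of the question.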

Answered By: Nikita Shabankin
