Concatenate values in a dataframe with value in preceding column on same row – Python

Question:

I am trying to concatenate the value in each cell with the value in the preceding cell on the same row (i.e. one column to its left), throughout my dataframe. The values in the first column won't have anything to concatenate with. Also, my df has NaN values, which I have changed to None.


Any help would be appreciated.

Thanks in advance.

Asked By: RikkiS

||

Answers:

# Constructing the dataframe:
import numpy as np
import pandas as pd

df = pd.DataFrame({'l0': list('aaab'),
                   'l1': list('begj'),
                   'l2': list('cfhk'),
                   'l3': ['d', np.nan, 'i', 'l'],
                   'l4': ['e', np.nan, np.nan, 'm']})

I iterate through the columns one by one, using pandas.Series.str.cat to join each column onto the running result, and write that result back into the original dataframe:

prev = df.iloc[:, 0]

for col in df.columns[1:]:
    prev = prev.str.cat(df[col], sep='_')
    df[col] = prev
Answered By: Vladimir Fokow
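
For reference, here is a self-contained sketch of the column-by-column str.cat approach above, wrapped in a function that works on a copy so the input dataframe is left unchanged. The function name cumulative_concat and the sep parameter are hypothetical, chosen only for illustration. With the default na_rep=None, str.cat yields NaN wherever either side is missing, so a missing value carries over to every column to its right.

```python
import numpy as np
import pandas as pd


def cumulative_concat(df, sep="_"):
    """Return a copy in which each column is joined with all columns to its left."""
    out = df.copy()
    prev = out.iloc[:, 0]
    for col in out.columns[1:]:
        # str.cat returns NaN where either operand is missing (na_rep=None),
        # so a gap propagates to the remaining columns of that row.
        prev = prev.str.cat(out[col], sep=sep)
        out[col] = prev
    return out


df = pd.DataFrame({'l0': list('aaab'),
                   'l1': list('begj'),
                   'l2': list('cfhk'),
                   'l3': ['d', np.nan, 'i', 'l'],
                   'l4': ['e', np.nan, np.nan, 'm']})

result = cumulative_concat(df)
print(result)
```
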

Try with add, then a row-wise cumsum:

out = df.add('_').apply(lambda x: x[x.notna()].cumsum().str[:-1], axis=1)

output (from a two-row sample with columns named 1 to 5):
   1    2      3        4          5
0  a  a_b  a_b_c  a_b_c_d  a_b_c_d_e
1  a  a_e  a_e_f      NaN        NaN
Answered By: BENY
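
A sketch of the same row-wise idea, using itertools.accumulate from the standard library in place of cumsum on strings. This is a swapped-in technique, not the one-liner above, and the helper name concat_skipping_missing is hypothetical. As in the one-liner, missing values are dropped from each row before joining, so a NaN in the middle of a row is skipped rather than propagated to the right.

```python
from itertools import accumulate

import numpy as np
import pandas as pd


def concat_skipping_missing(df, sep="_"):
    """Join each row's non-missing values cumulatively, left to right."""
    rows = []
    for _, row in df.iterrows():
        present = row.dropna()  # ignore missing cells in this row
        joined = accumulate(present, lambda left, right: f"{left}{sep}{right}")
        # Map each surviving column label to its running concatenation.
        rows.append(dict(zip(present.index, joined)))
    # Columns absent from a row's dict are filled with NaN.
    return pd.DataFrame(rows, index=df.index, columns=df.columns)


df = pd.DataFrame({'l0': list('aaab'),
                   'l1': list('begj'),
                   'l2': list('cfhk'),
                   'l3': ['d', np.nan, 'i', 'l'],
                   'l4': ['e', np.nan, np.nan, 'm']})
skipped = concat_skipping_missing(df)
print(skipped)

# A row with a gap in the middle: the gap is skipped, not propagated.
gap = pd.DataFrame({'l0': ['a'], 'l1': [np.nan], 'l2': ['c']})
gap_result = concat_skipping_missing(gap)
print(gap_result)
```

On the sample dataframe, where missing values only occur at the end of a row, this gives the same cells as the str.cat loop. The two differ only when a gap sits between two present values: here the row a, NaN, c becomes a, NaN, a_c, whereas str.cat would leave everything after the gap as NaN.
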

Using a simple loop over the columns, so that each step remains a vectorized operation:

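df2 = df.copy()  # work on a copy so df is left unchanged
for i in range(1, df.shape[1]):
    # join each column with the already-concatenated column to its left;
    # a NaN on either side gives NaN
    df2.iloc[:, i] = df2.iloc[:, i-1] + '_' + df2.iloc[:, i]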

output:

  l0   l1     l2       l3         l4
0  a  a_b  a_b_c  a_b_c_d  a_b_c_d_e
1  a  a_e  a_e_f      NaN        NaN
2  a  a_g  a_g_h  a_g_h_i        NaN
3  b  b_j  b_j_k  b_j_k_l  b_j_k_l_m
Answered By: mozway