Python Pandas Merge/Join

Question:

This is a sample script for some basic testing that later should be implemented in PowerQuery/Python.
I’m not a Python developer so please be gentle.

    import pandas as pd

data = [['Sql "syn-eu2-prd-edw-001 database windows net", "syndpeu2prdedw1",  Query="SELECT ta FROM rdv_60_137.Account"', "x1"], ["b1", "x2"]]
df = pd.DataFrame(data, columns=["Expression","Key"])

ser = df['Expression']
serpd = pd.Series(ser)
print (serpd)

x = serpd.str.extractall(r'"(.*?)"')
print (x)

z = pd.merge(df, x, how="inner", left_index=True, right_index=True)
print (z)

The goal would be to have a new DataFrame where I have all records from the first set, but the 1st record would have 3 instances as the Regex function created 3 rows.

Kr, Harry

Asked By: Harry Leboeuf

||

Answers:

Remove second level of MultiIndex by MultiIndex.droplevel and add original rows by DataFrame.join:

z = (df.join(df['Expression'].str.extractall(r'"(.*?)"')[0].droplevel(1).rename('New'))
       .reset_index(drop=True))
print (z)
                                          Expression Key  
0  Sql "syn-eu2-prd-edw-001 database windows net"...  x1   
1  Sql "syn-eu2-prd-edw-001 database windows net"...  x1   
2  Sql "syn-eu2-prd-edw-001 database windows net"...  x1   
3                                                 b1  x2   

                                        New  
0  syn-eu2-prd-edw-001 database windows net  
1                           syndpeu2prdedw1  
2         SELECT ta FROM rdv_60_137.Account  
3                                       NaN  
Answered By: jezrael