How to split some text to another column from pandas dataframe

Question:

Let’s say I have below dataframe

          ID                      EQ Type
0      P-456  BSD500,BSD300,GHF20e,GHF20g
1      P-460         BSD500,BSD300,GHF20e
2      P-462         GHF20e,GHF20g,GHF20h

I want to take some text of EQ Type’s row then insert it to another column, the expected result is as below

          ID             EQ Type               EQ Type_1
0      P-456       BSD500,BSD300           GHF20e,GHF20g
1      P-460       BSD500,BSD300                  GHF20e
2      P-462                None    GHF20e,GHF20g,GHF20h

Is it possible to do that? Because each row has a different amount of EQ

I have tried this

df[["EQ Type", "EQ Type 1"]] = df["EQ Type"].str.split("GHF20",expand=True)

but it doesn’t work.

Thanks.

Asked By: Algo Rithm

||

Answers:

I hope I’ve understood you correctly, you want to extract every GHF20* value into a separate column:

df = df.assign(tmp=df["EQ Type"].str.split(",")).explode("tmp")
df[""] = np.where(df["tmp"].str.startswith("GHF20"), "EQ Type_1", "EQ Type")

print(
    df.pivot_table(
        index=[df.index, "ID"],
        columns="",
        values="tmp",
        aggfunc=",".join,
    )
    .droplevel(0)
    .reset_index()
)

Prints:

      ID        EQ Type             EQ Type_1
0  P-456  BSD500,BSD300         GHF20e,GHF20g
1  P-460  BSD500,BSD300                GHF20e
2  P-462            NaN  GHF20e,GHF20g,GHF20h
Answered By: Andrej Kesely
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.