How to split Python DataFrame rows into a one-to-one mapping?
Question:
I'm trying to split the rows of a pandas DataFrame into a one-to-one mapping: each line inside a multi-line cell should become its own row, paired by position with the corresponding line in the other multi-line columns, and a column that holds a single value should have that value copied into every newly created row. For example:
A | B | C |
---|---|---|
valueA1 | valueB1 valueB2 valueB3 | valueC1 valueC2 valueC3 |
valueA2 | valueB4 | valueC4 |
Should split like:
A | B | C |
---|---|---|
valueA1 | valueB1 | valueC1 |
valueA1 | valueB2 | valueC2 |
valueA1 | valueB3 | valueC3 |
valueA2 | valueB4 | valueC4 |
Here all values are strings.
I tried explode(), but it did not split the rows one-to-one. Any help is really appreciated!
Answers:
You need to turn your columns into lists by splitting each cell on its line breaks, and then explode them:
In [28]: df
Out[28]:
         A                          B                          C
0  valueA1  ValueB1\nValueB2\nValueB3  ValueC1\nValueC2\nValueC3
1  valueA2                    ValueB4                    ValueC4
In [29]: df.apply(lambda col: col.str.split()).explode("A").explode(["B", "C"])
Out[29]:
         A        B        C
0  valueA1  ValueB1  ValueC1
0  valueA1  ValueB2  ValueC2
0  valueA1  ValueB3  ValueC3
1  valueA2  ValueB4  ValueC4
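For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question on the assumption that the multi-line cells are separated by "\n"; the names as_lists, paired and crossed are only illustrative. It also shows what probably went wrong with a plain explode(): exploding B and C one after the other gives every combination instead of a positional pairing. Passing a list of columns to explode() needs pandas 1.3 or newer.

```python
import pandas as pd

# Sample frame rebuilt from the question; multi-line cells use "\n" as separator.
df = pd.DataFrame(
    {
        "A": ["valueA1", "valueA2"],
        "B": ["ValueB1\nValueB2\nValueB3", "ValueB4"],
        "C": ["ValueC1\nValueC2\nValueC3", "ValueC4"],
    }
)

# str.split() with no argument splits on any whitespace, newlines included,
# so every cell becomes a list (single values become one-element lists).
as_lists = df.apply(lambda col: col.str.split())

# Exploding B and C in one call pairs their items by position.
paired = as_lists.explode("A").explode(["B", "C"]).reset_index(drop=True)

# Exploding them one after another multiplies the rows instead.
crossed = as_lists.explode("A").explode("B").explode("C").reset_index(drop=True)

print(paired)
print(len(paired), len(crossed))
```

On this sample, paired has the four rows asked for, while crossed has ten (3 x 3 combinations for the first row plus the single second row). The reset_index(drop=True) is optional; without it the exploded rows keep the repeated original index, as in the output above.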
This solution only focuses on columns B and C:
df1[['B', 'C']] = df1[['B', 'C']].apply(lambda x: x.str.split('\n'))  # split each cell on newlines into a list
df1 = df1.explode(list('BC'))  # explode B and C together so their items pair up by position
df1
         A        B        C
0  valueA1  valueB1  valueC1
0  valueA1  valueB2  valueC2
0  valueA1  valueB3  valueC3
1  valueA2  valueB4  valueC4
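One caveat with exploding several columns at once: pandas raises a ValueError if B and C do not hold the same number of items in a given row. The sketch below adds a check for that before exploding, so the offending rows can be reported. It assumes the same sample data as the question with "\n" as the separator; cols, counts, mismatched and result are hypothetical names, not part of the answer above.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumed "\n"-separated cells).
df1 = pd.DataFrame(
    {
        "A": ["valueA1", "valueA2"],
        "B": ["valueB1\nvalueB2\nvalueB3", "valueB4"],
        "C": ["valueC1\nvalueC2\nvalueC3", "valueC4"],
    }
)

cols = ["B", "C"]

# Work on a copy and turn each multi-line cell into a list of lines.
split = df1.copy()
split[cols] = split[cols].apply(lambda s: s.str.split("\n"))

# .str.len() on a column of lists returns the number of items per cell.
counts = split[cols].apply(lambda s: s.str.len())

# A row can only be paired one-to-one if all its list lengths agree.
mismatched = counts.nunique(axis=1) > 1
if mismatched.any():
    bad_rows = list(split.index[mismatched])
    raise ValueError(f"rows with unequal item counts in {cols}: {bad_rows}")

# Safe to explode both columns together; A is repeated automatically.
result = split.explode(cols).reset_index(drop=True)
print(result)
```

If the counts can legitimately differ, you would have to decide how to pad the shorter lists before exploding; that choice depends on the data and is not covered here.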