Create a new dataframe based on columns name

Question:

I hope you can give me a hand here.
So, I have this dataframe:

|   |    a |   b  |  a.1 |  b.1 |  a.2 |   b.2|
|--:|-----:|-----:|-----:|-----:|-----:|------|
| 0 | a111 | b111 | c222 | d222 | e333 | f333 |

Make a subset is not an option because is too many columns.

The output I expect is like this:

|   | a    | b    |
|---|------|------|
| 0 | a111 | b111 |
| 1 | c222 | d222 |
| 2 | e333 | f333 |

Thanks in advance.

The code to replicate the dataframe:

list_demo = []

a = "a111"
b = "b111"
c = "c222"
d = 'd222'
e = 'e333'
f = "f333"

list_demo.append([a,b,c,d,e,f])
df = pd.DataFrame(list_demo)
df.columns = ['a', 'b', 'a.1', 'b.1', 'a.2', 'b.2']

Answers:

You can use:

out = (df
 .set_axis(df.columns.str.split('.', expand=True), axis=1)
 .stack()
 .droplevel(1)
 )

Output:

      a     b
0  a111  b111
0  c222  d222
0  e333  f333

Or, if you have a single row:

(df.set_axis(df.columns.str.split('.', expand=True), axis=1)
   .stack()
   .reset_index(drop=True)
 )

Output:

      a     b
0  a111  b111
1  c222  d222
2  e333  f333
Answered By: mozway

One option is with pivot_longer from pyjanitor:

# pip install pyjanitor
import pandas as pd
import janitor

df.pivot_longer(names_to = '.value', names_pattern = '(.).*')

      a     b
0  a111  b111
1  c222  d222
2  e333  f333

The .value is a placeholder, which determines what sub parts of the column labels remain as header. The regex group in names_pattern decides which parts are extracted.

Answered By: sammywemmy