Pandas: Multiple columns into one column

Question:

I have the following data (2 columns, 4 rows):

Column 1: A, B, C, D

Column 2: E, F, G, H

I am attempting to combine the columns into one column to look like this (1 column, 8 rows):

Column 3: A, B, C, D, E, F, G, H

I am using pandas DataFrame and have tried using different functions with no success (append, concat, etc.). Any help would be most appreciated!

Asked By: user2929063


Answers:

Update

pandas has a built-in method for this, stack, which does what you want; see the other answer.

This was my first answer, written many years ago before I knew about stack:

In [227]:

df = pd.DataFrame({'Column 1':['A', 'B', 'C', 'D'],'Column 2':['E', 'F', 'G', 'H']})
df
Out[227]:
  Column 1 Column 2
0        A        E
1        B        F
2        C        G
3        D        H

[4 rows x 2 columns]

In [228]:

df['Column 1'].append(df['Column 2']).reset_index(drop=True)
Out[228]:
0    A
1    B
2    C
3    D
4    E
5    F
6    G
7    H
dtype: object
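
Note for current pandas versions: Series.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the call above raises an AttributeError there. A minimal sketch of the same operation with pd.concat, assuming the same example frame (the name combined is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'],
                   'Column 2': ['E', 'F', 'G', 'H']})

# pd.concat places the two Series end to end; ignore_index=True
# renumbers the result 0..7, which replaces reset_index(drop=True).
combined = pd.concat([df['Column 1'], df['Column 2']], ignore_index=True)
print(combined)
```

To keep the result in the frame's terms, it can be wrapped as combined.to_frame('Column 3').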
Answered By: EdChum

What you appear to be asking for is simply another view of your data. If there is no reason for the data to be in two columns in the first place, just create one column. If, however, you need to combine them for presentation in some other tool, you can do something like:

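import itertools as it
import pandas as pd

df = pd.DataFrame({1: ['a', 'b', 'c', 'd'], 2: ['e', 'f', 'g', 'h']})
# chain(*df.values) walks the frame row by row (a, e, b, f, ...);
# sorted() then puts the values in alphabetical order.
sorted(it.chain(*df.values))
# -> ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']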
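
Note that sorted() only gives the requested order because the sample values happen to be alphabetical. If the goal is "all of the first column, then all of the second" regardless of the values, one possible sketch is to chain the columns themselves (the names by_row and by_column are illustrative):

```python
import itertools as it
import pandas as pd

df = pd.DataFrame({1: ['a', 'b', 'c', 'd'], 2: ['e', 'f', 'g', 'h']})

# Chaining the rows of df.values interleaves the two columns.
by_row = list(it.chain(*df.values))

# Chaining the columns one after another preserves column order
# without relying on the values being sortable.
by_column = list(it.chain.from_iterable(df[col] for col in df.columns))

print(by_row)
print(by_column)
```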
Answered By: mechanical_meat

You can flatten the values column by column using ravel('F'), which is much faster.

In [1238]: df
Out[1238]:
  Column 1 Column 2
0        A        E
1        B        F
2        C        G
3        D        H

In [1239]: pd.Series(df.values.ravel('F'))
Out[1239]:
0    A
1    B
2    C
3    D
4    E
5    F
6    G
7    H
dtype: object
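
The 'F' argument is what makes this work: it asks NumPy to read the array in column-major (Fortran) order, while the default 'C' order reads row by row. A small sketch of the difference, assuming the same example frame (to_numpy() is used here in place of .values; both return the underlying array):

```python
import pandas as pd

df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'],
                   'Column 2': ['E', 'F', 'G', 'H']})

values = df.to_numpy()

# Column-major: all of Column 1, then all of Column 2.
column_major = pd.Series(values.ravel(order='F'))

# Row-major (the default): the two columns are interleaved.
row_major = pd.Series(values.ravel(order='C'))

print(column_major.tolist())
print(row_major.tolist())
```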

Details

Medium

In [1245]: df.shape
Out[1245]: (4000, 2)

In [1246]: %timeit pd.Series(df.values.ravel('F'))
10000 loops, best of 3: 86.2 µs per loop

In [1247]: %timeit df['Column 1'].append(df['Column 2']).reset_index(drop=True)
1000 loops, best of 3: 816 µs per loop

Large

In [1249]: df.shape
Out[1249]: (40000, 2)

In [1250]: %timeit pd.Series(df.values.ravel('F'))
10000 loops, best of 3: 87.5 µs per loop

In [1251]: %timeit df['Column 1'].append(df['Column 2']).reset_index(drop=True)
100 loops, best of 3: 1.72 ms per loop
Answered By: Zero

The trick is to use stack():

df.stack().reset_index()
    
   level_0   level_1  0
0        0  Column 1  A
1        0  Column 2  E
2        1  Column 1  B
3        1  Column 2  F
4        2  Column 1  C
5        2  Column 2  G
6        3  Column 1  D
7        3  Column 2  H
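
As the output above shows, stack() goes row by row, so the values come out as A, E, B, F, ... rather than A, B, C, D, E, F, G, H. If the column-by-column order from the question is needed, two possible sketches are melt() and unstack(), assuming the same example frame (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'],
                   'Column 2': ['E', 'F', 'G', 'H']})

# melt() lists every value of Column 1, then every value of Column 2,
# in a column named 'value' (the source column name goes in 'variable').
melted = df.melt()['value']

# unstack() on a frame with a plain index returns a Series keyed by
# (column, row), which is also column-by-column order.
unstacked = df.unstack().reset_index(drop=True)

print(melted.tolist())
print(unstacked.tolist())
```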
Answered By: Nickpick