python-CSV Multiple Columns with the same header into one column

Question:

I have a CSV file with company data with 22 rows and 6500 columns. The columns have the same names and I should get the columns with the same names stacked into individual columns according to their headers.

I have now the data in one df like this:

Y   C   Y   C   Y   C

1.  a   1.  b.  1.  c.

2.  a.  2.  b.  2.  c.  

and I need to get it like this:

Y   C

1.  a. 

2.  a. 

1.  b.

2.  b.

1.  c.

2.  c. 
Asked By: jjordie56

||

Answers:

I would try an attempt where you slice the df in chunks by iteration and concat them back together, since the column names can’t be identified distinctly.

EDIT
Changed answer to new input:

chunksize = 2
stacked_df = (
    pd.concat(
        [
            df.iloc[:, i:i+chunksize] for i in range(0, len(df.columns), chunksize)
        ]
    )
    .reset_index(drop=True))
print(stacked_df)
   Y  C
0  1  a
1  2  a
2  1  b
3  2  b
4  1  c
5  2  c
Answered By: Rabinzel

I couldn’t resist looking for a solution.
The best I found so far accounts for the fact that pd.read_csv addresses repeated column names by appending ‘.N’ to the duplicates.

In [2]: df = pd.read_csv('duplicate_columns.csv')
In [3]: df
Out[3]:
   1  2  3  4 1.1 2.1 3.1 4.1 1.2 2.2 3.2 4.2
0  a  q  j  e   w   e   r   t   y   u   d   s
1  b  w  w  f   c   e   f   g   d   c   s   a
2  d  q  e  h   c   f   b   f   a   w   q   r

To put your data into the same column…

  1. Group the columns by their original names.
  2. Apply a flattener to convert to a series of arrays.
  3. Create a new data frame from the series viewed as a dict.
In [3]: grouper = lambda l: l.split('.')[0]    # peels off added suffix
In [4]: flattener = lambda v: v.stack().values # reshape groups
In [4]: pd.DataFrame(df.groupby(by=grouper, axis='columns')
   ...:                .apply(flattener)
   ...:                .to_dict())
Out[4]:
   1  2  3  4
0  a  q  j  e
1  w  e  r  t
2  y  u  d  s
3  b  w  w  f
4  c  e  f  g
5  d  c  s  a
6  d  q  e  h
7  c  f  b  f
8  a  w  q  r

I’d love to see a cleaner, less obtuse, general solution.

Answered By: Ray Pelletier
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.