assign higher header level to pandas data frame in the read_csv command

Question:

I am reading multiple csv files into pandas and concatenating them into one frame. Each csv file shape is 8×256. I am using the following code:

path = r'may12_fm'
all_files = glob.glob(os.path.join(path, '*.csv'))
df_may= pd.concat((pd.read_csv(f, header=None) for f in all_files ), ignore_index=True, axis=1,)

I want to keep the name of each file as a higher level header, and also the lower header numbering to be from 0 to 255 for all concatenated frames, something like this:

— m_0 —————– m_1 ————– m_3
— 0 1 2 3 4 …. 255 —- 0, 1, 2, …255—— 0, 1, 2, …..255
0
1
2
3
4

I appreciate helping me in this.
Thanks

Asked By: Talal Ghannam

||

Answers:

Since you know the files names used, and (if) you know the number of columns per file, maybe you could just do it after the files are loaded.

Example frame

   0  1  2  0  1  2
0  1  1  1  1  1  1
1  2  2  2  2  2  2
2  3  3  3  3  3  3
3  4  4  4  4  4  4

What this answer assumes you know

all_files = ['m_0','m_1']
cols_per_file = 3

Then you could build a multi-index

idx = pd.MultiIndex.from_arrays([np.repeat(all_files, cols_per_file), df.columns])
print(idx) 

MultiIndex([('m_0', 0),
            ('m_0', 1),
            ('m_0', 2),
            ('m_1', 0),
            ('m_1', 1),
            ('m_1', 2)],
           )

And then assign back to the frame

df.columns = idx
print(df)

Result

  m_0       m_1      
    0  1  2   0  1  2
0   1  1  1   1  1  1
1   2  2  2   2  2  2
2   3  3  3   3  3  3
3   4  4  4   4  4  4
Answered By: jch
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.