Column name repetition in pandas data frame

Question

I am trying to read some data but the column names are repeatedly coming.

Here is a sample code:

for i in range(1, len(df_A2C), 1):
    A2C_TT= df_A2C.loc[(df_A2C['TO_ID'] == i)].sort_values('DURATION_H').head(1)
    if A2C_TT.size > 0:
        print (A2C_TT)

Output:

I do not need column names. What should I do?

Asked By: LearningLogic

||

Source

Answer 1

You may simply call the values method after head:

for i in range(1, len(df_A2C), 1):
    A2C_TT= df_A2C.loc[(df_A2C['TO_ID'] == i)].sort_values('DURATION_H').head(1).values
    if A2C_TT.size > 0:
        print (A2C_TT)

EDIT:

On a side note, I think that you may get away without iterating over your pandas dataframe.

I created an example dataframe to illustrate this:

df = pd.DataFrame()
df['COL_A'] = [1, 2, 3, 4, 5, 1, 6, 6]
df['COL_B'] = [100,200,666,200, 451, 42, 1664, 1665]

Below is your snippet modified for this dataframe:

for i in range(1, len(df), 1):
    A2C_TT= df.loc[(df['COL_A'] == i)].sort_values('COL_B').head(1)
    if A2C_TT.size > 0:
        print (A2C_TT)

I think that you may simply sort by COL_A (your TO_ID) and then by COL_B (your DURATION_H) and then group by COL_A, taking the first value in each group:

df.sort_values(['COL_A', 'COL_B'], ascending=[True,True]).groupby('COL_A').first()

Both snippets output the same dataframe:

Answered By: Sheldon

Column name repetition in pandas data frame

Question:

Answers: