Column name repetition in pandas data frame
Question:
Answers:
You may simply call the values
method after head
:
for i in range(1, len(df_A2C), 1):
A2C_TT= df_A2C.loc[(df_A2C['TO_ID'] == i)].sort_values('DURATION_H').head(1).values
if A2C_TT.size > 0:
print (A2C_TT)
EDIT:
On a side note, I think that you may get away without iterating over your pandas dataframe.
I created an example dataframe to illustrate this:
df = pd.DataFrame()
df['COL_A'] = [1, 2, 3, 4, 5, 1, 6, 6]
df['COL_B'] = [100,200,666,200, 451, 42, 1664, 1665]
Below is your snippet modified for this dataframe:
for i in range(1, len(df), 1):
A2C_TT= df.loc[(df['COL_A'] == i)].sort_values('COL_B').head(1)
if A2C_TT.size > 0:
print (A2C_TT)
I think that you may simply sort by COL_A
(your TO_ID
) and then by COL_B
(your DURATION_H
) and then group by COL_A
, taking the first value in each group:
df.sort_values(['COL_A', 'COL_B'], ascending=[True,True]).groupby('COL_A').first()
Both snippets output the same dataframe:
COL_B
COL_A
1 42
2 200
3 666
4 200
5 451
6 1664
You may simply call the values
method after head
:
for i in range(1, len(df_A2C), 1):
A2C_TT= df_A2C.loc[(df_A2C['TO_ID'] == i)].sort_values('DURATION_H').head(1).values
if A2C_TT.size > 0:
print (A2C_TT)
EDIT:
On a side note, I think that you may get away without iterating over your pandas dataframe.
I created an example dataframe to illustrate this:
df = pd.DataFrame()
df['COL_A'] = [1, 2, 3, 4, 5, 1, 6, 6]
df['COL_B'] = [100,200,666,200, 451, 42, 1664, 1665]
Below is your snippet modified for this dataframe:
for i in range(1, len(df), 1):
A2C_TT= df.loc[(df['COL_A'] == i)].sort_values('COL_B').head(1)
if A2C_TT.size > 0:
print (A2C_TT)
I think that you may simply sort by COL_A
(your TO_ID
) and then by COL_B
(your DURATION_H
) and then group by COL_A
, taking the first value in each group:
df.sort_values(['COL_A', 'COL_B'], ascending=[True,True]).groupby('COL_A').first()
Both snippets output the same dataframe:
COL_B
COL_A
1 42
2 200
3 666
4 200
5 451
6 1664