Why does df.loc not seem to work in a loop (KeyError)?
Question:
Can anyone tell me why df.loc doesn't seem to work in a loop like this:
import pandas as pd

example_data = {
    'ID': [1, 2, 3, 4, 5, 6],
    'score': [10, 20, 30, 40, 50, 60]
}
example_data_df = pd.DataFrame(example_data)
for row in example_data_df:
    print(example_data_df.loc[row, 'ID'])
and is raising the error "KeyError: 'ID'"?
Outside of the loop, this works fine:
row = 1
print(example_data_df.loc[row, 'ID'])
I have tried different versions of this, such as example_data_df['ID'].loc[row], and checked whether the problem is the type of the objects in the columns, but nothing worked.
Thank you in advance!
EDIT: If it plays a role, here is why I think I need the loop: I have two dataframes, A and B, and need to append certain columns from B to A, but only for those rows where A and B have a matching value in a particular column. B is longer than A, and not all rows in A are contained in B. I don't know how this would be possible without looping; that would be another question I might ask separately.
Answers:
If you check 'row' at each step, you'll notice that iterating directly over a DataFrame yields the column names.
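As a quick check of that claim, here is a minimal sketch that rebuilds the DataFrame from the question and collects what the loop variable actually receives. The names labels and caught are only illustrative.

```python
import pandas as pd

example_data_df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'score': [10, 20, 30, 40, 50, 60],
})

# Iterating a DataFrame walks its column labels, not its rows.
labels = [row for row in example_data_df]
print(labels)

# The row labels that .loc expects as its first argument live in the index.
print(list(example_data_df.index))

# So the first pass of the original loop evaluates .loc['ID', 'ID'],
# and 'ID' is not a row label in the default integer index.
caught = None
try:
    example_data_df.loc['ID', 'ID']
except KeyError as err:
    caught = err
    print('KeyError:', err)
```

This also explains why row = 1 works outside the loop: 1 is a valid row label in the default index, while 'ID' is not.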
You want:
for idx, row in example_data_df.iterrows():
    print(example_data_df.loc[idx, 'ID'])
Or, better:
for idx, row in example_data_df.iterrows():
    print(row['ID'])
Now, I don't know why you want to iterate manually over the rows, but know that this should be limited to small datasets, as it's one of the least efficient ways of working with a DataFrame.
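For the use case described in the question's EDIT, a loop is probably not needed at all. Below is a rough sketch using DataFrame.merge, not a definitive solution. The frames A and B, the key column 'ID', and the columns 'group' and 'unused' are invented for illustration, and the sketch assumes the key values in B are unique.

```python
import pandas as pd

# Hypothetical stand-ins for the two dataframes from the EDIT.
A = pd.DataFrame({'ID': [1, 2, 3, 7], 'score': [10, 20, 30, 70]})
B = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'group': ['a', 'b', 'c', 'd', 'e', 'f'],
    'unused': [0, 0, 0, 0, 0, 0],
})

# Keep only the key and the columns to append, then left-join onto A.
# how='left' keeps every row of A; rows with no match in B get NaN.
merged = A.merge(B[['ID', 'group']], on='ID', how='left')
print(merged)
```

If B can contain the same key more than once, a left merge repeats the matching rows of A, so deduplicating B first may be necessary.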