Why does df.loc not seem to work in a loop (key error)

Question:

Can anyone tell me why df.loc doesn't seem to work in a loop like this


import pandas as pd

example_data = {
    'ID': [1, 2, 3, 4, 5, 6],
    'score': [10, 20, 30, 40, 50, 60]
}
example_data_df = pd.DataFrame(example_data)

for row in example_data_df:
    print(example_data_df.loc[row,'ID'])

and instead raises the error "KeyError: 'ID'"?

Outside of the loop, this works fine:

row = 1
print(example_data_df.loc[row, 'ID'])

I have tried different versions of this, such as example_data_df['ID'].loc[row], and checked whether the problem is the type of the objects in the columns, but nothing worked.

Thank you in advance!

EDIT: In case it matters, here is why I think I need the loop: I have two DataFrames, A and B, and need to append certain columns from B to A, but only for those rows where A and B have a matching value in a particular column. B is longer than A, and not all rows in A are contained in B. I don't know how to do this without looping; that may be another question I ask separately.

Asked By: user20501139

Answers:

If you check 'row' at each step, you'll notice that iterating directly over a DataFrame yields the column names, not the row labels.
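A quick way to see this for yourself, as a minimal sketch reusing the data from the question (the names `labels` and `raised` are only for illustration):

```python
import pandas as pd

example_data_df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'score': [10, 20, 30, 40, 50, 60],
})

# Iterating over a DataFrame walks its column labels, not its rows.
labels = [row for row in example_data_df]
print(labels)

# The first pass of the original loop therefore asks for .loc['ID', 'ID'],
# and 'ID' is not a row label in the default integer index.
try:
    example_data_df.loc[labels[0], 'ID']
    raised = False
except KeyError:
    raised = True
print(raised)
```

So the KeyError comes from using a column name as a row label, not from anything specific to loops.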

You want:

for idx, row in example_data_df.iterrows():
    print(example_data_df.loc[idx,'ID'])

Or, better:

for idx, row in example_data_df.iterrows():
    print(row['ID'])

Now, I don't know why you want to iterate manually over the rows, but keep in mind that this should be limited to small datasets, as row-by-row iteration is among the least efficient ways of working with a DataFrame.
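For the use case described in the question's EDIT (pulling columns from B into A where a column matches), a left merge may be enough and avoids the loop entirely. This is only a sketch under assumptions: the frames `A` and `B` and the column names `key` and `extra` are made up here, since the real ones are not shown.

```python
import pandas as pd

# Hypothetical stand-ins for the two DataFrames from the question.
A = pd.DataFrame({'key': [1, 2, 3], 'score': [10, 20, 30]})
B = pd.DataFrame({'key': [2, 3, 4, 5], 'extra': ['b', 'c', 'd', 'e']})

# how='left' keeps every row of A; rows of A with no match in B get NaN
# in the columns taken from B. Selecting B[['key', 'extra']] limits
# which columns of B are appended.
merged = A.merge(B[['key', 'extra']], on='key', how='left')
print(merged)
```

Note that if the matching column contains duplicate values in B, the merge will produce one output row per match, so it may be worth deduplicating B first.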

Answered By: mozway