Pandas – Getting a Key Error when the Key Exists
Question:
I’m trying to join two dataframes in Pandas.
The first frame is called Trades and has these columns:
TRADE DATE
ACCOUNT
COMPANY
COST CENTER
CURRENCY
The second frame is called Company_Mapping and has these columns:
ACTUAL_COMPANY_ID
MAPPED_COMPANY_ID
I’m trying to join them with this code:
trade_df = pd.merge(left=Trades, right=Company_Mapping, how='left', left_on='COMPANY', right_on='ACTUAL_COMPANY_ID')
This returns:
KeyError: 'COMPANY'
I’ve double checked the spelling and COMPANY is clearly in Trades, and I have no clue what would cause this.
Any ideas?
Thanks!
Answers:
Your Trades dataframe has a single column with all the intended column names mashed together into a single string. Check the code that parses your file.
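One quick way to confirm this diagnosis is to print the column labels as a list, which shows each label's exact text. The following is a minimal sketch, assuming a semicolon-delimited file read with the default comma separator; the sample data and the variable names `raw` and `trades` are made up for illustration and are not the asker's real file.

```python
import io
import pandas as pd

# Hypothetical sample data: a semicolon-delimited file (not the asker's real file).
raw = "TRADE DATE;ACCOUNT;COMPANY\n2020-01-02;A1;C1\n"

# Reading with the default comma separator leaves everything in one column.
trades = pd.read_csv(io.StringIO(raw))

# list() shows each label's exact text, including stray delimiters or spaces.
print(list(trades.columns))         # ['TRADE DATE;ACCOUNT;COMPANY']
print("COMPANY" in trades.columns)  # False
```

If the printed list has one long label instead of five separate ones, the merge fails because 'COMPANY' really is not a column, even though it appears in the file.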
Essentially, a KeyError is raised in pandas when there is no column with the given name. For example, you type df.loc[df['one']==10] but the column 'one' does not exist. However, if it does exist and you are still getting the same error, try wrapping the statement in a try and except block. My problem was solved using a try and except statement.
For example:
try:
    df_new = df.loc[df['one']==10]
except KeyError:
    # Reached only when the column 'one' is missing
    print("KeyError: column 'one' not found")
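An alternative to catching the error is to check for the column first and report what pandas actually parsed. This is only a sketch under assumed data: the frame `df` and the name `column` are hypothetical and stand in for whatever you loaded.

```python
import pandas as pd

# Hypothetical data standing in for the frame you loaded.
df = pd.DataFrame({"one": [10, 20], "two": [1, 2]})

column = "one"
if column in df.columns:
    # Safe to filter: the label exists exactly as spelled.
    df_new = df.loc[df[column] == 10]
else:
    # Listing the real labels usually reveals the underlying problem.
    raise KeyError(f"{column!r} not found; available columns: {list(df.columns)}")
```

Unlike a bare try and except, this keeps the failure visible and shows the labels that are really there.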
Make sure you read your file with the right separator.
df = pd.read_csv("file.csv", sep=';')
or
df = pd.read_csv("file.csv", sep=',')
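If you do not know the separator in advance, pandas can try to detect it when you pass sep=None together with engine="python". The sketch below uses made-up semicolon-delimited data in place of a real file; the detection is a heuristic, so it is worth checking the resulting columns.

```python
import io
import pandas as pd

# Hypothetical semicolon-delimited content standing in for file.csv.
raw = "ACTUAL_COMPANY_ID;MAPPED_COMPANY_ID\n1;100\n2;200\n"

# sep=None asks the Python parsing engine to sniff the delimiter.
company_mapping = pd.read_csv(io.StringIO(raw), sep=None, engine="python")

# Verify that the header was split into separate columns.
print(list(company_mapping.columns))
```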
In case someone has the same problem: sometimes you need to transpose your dataframe:
import pandas as pd
df = pd.read_csv('file.csv')
# A B C
# -------
# 1 2 3
# 4 5 6
new_df = pd.DataFrame([df['A'], df['B']])
# A | 1 4
# B | 2 5
new_df['A'] # KeyError
new_df = new_df.T
# A B
# ---
# 1 2
# 4 5
new_df['A'] # works now: 'A' is a column after the transpose
# A
# -
# 1
# 4
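If the goal is simply a frame holding columns A and B, the transpose can be avoided by selecting with a list of column names, which keeps them as columns. A minimal sketch, building the same values as the example above in memory instead of reading file.csv:

```python
import pandas as pd

# Same values as the example above, built in memory for illustration.
df = pd.DataFrame({"A": [1, 4], "B": [2, 5], "C": [3, 6]})

# A list of labels inside the brackets returns a DataFrame with those columns.
new_df = df[["A", "B"]]

print(new_df["A"].tolist())  # [1, 4]
```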