Pandas – Getting a Key Error when the Key Exists

Question

I’m trying to join two dataframes in Pandas.

The first frame is called Trades and has these columns:

TRADE DATE
ACCOUNT
COMPANY
COST CENTER
CURRENCY

The second frame is called Company_Mapping and has these columns:

ACTUAL_COMPANY_ID
MAPPED_COMPANY_ID

I’m trying to join them with this code:

trade_df = pd.merge(left=Trades, right=Company_Mapping, how='left', left_on='COMPANY', right_on='ACTUAL_COMPANY_ID')

This returns:

KeyError: 'COMPANY'

I’ve double checked the spelling and COMPANY is clearly in Trades, and I have no clue what would cause this.

Any ideas?

Thanks!

Asked By: DixieFlatline

Answer 1

Your Trades dataframe has a single column with all the intended column names mashed together into a single string. Check the code that parses your file.
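
One way to confirm this is to print the column labels. The following is a minimal sketch, not the asker's actual data: the semicolon-delimited sample text is an assumption chosen to reproduce the symptom.

```python
import io
import pandas as pd

# Hypothetical sample: semicolon-delimited text read with the default comma separator.
raw = "TRADE DATE;ACCOUNT;COMPANY\n2020-01-02;A1;C1\n"
trades = pd.read_csv(io.StringIO(raw))

# The whole header line becomes a single label, so 'COMPANY' is not a column.
columns = trades.columns.tolist()
has_company = "COMPANY" in trades.columns
print(columns)
print(has_company)
```

If the list contains one long string instead of separate names, the problem is in how the file was parsed, not in the merge call.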

Answered By: piRSquared

Answer 2

In pandas, a KeyError generally means that the requested column name does not exist. For example, df.loc[df['one'] == 10] raises it when there is no column named 'one'. If the column does exist and you still get the same error, try wrapping the access in a try/except statement; that is what solved my problem.

For example:

try:
   df_new = df.loc[df['one']==10]
except KeyError:
   print("KeyError: column 'one' not found")

Answered By: Azum

Answer 3

Make sure you read your file with the right separator.

df = pd.read_csv("file.csv", sep=';')

or

df = pd.read_csv("file.csv", sep=',')
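
To illustrate the effect of the separator, here is a small sketch that reads the same made-up text three ways. The sample data is hypothetical. Passing `sep=None` together with `engine="python"` asks pandas to detect the delimiter itself, which may help when you are not sure which one the file uses.

```python
import io
import pandas as pd

# Hypothetical semicolon-delimited sample.
raw = "ACCOUNT;COMPANY\nA1;C1\nA2;C2\n"

# Wrong separator: the header collapses into a single column name.
wrong = pd.read_csv(io.StringIO(raw), sep=",")

# Right separator: each name becomes its own column.
right = pd.read_csv(io.StringIO(raw), sep=";")

# Let the Python engine sniff the delimiter from the data.
sniffed = pd.read_csv(io.StringIO(raw), sep=None, engine="python")

wrong_cols = wrong.columns.tolist()
right_cols = right.columns.tolist()
sniffed_cols = sniffed.columns.tolist()
print(wrong_cols, right_cols, sniffed_cols)
```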

Answered By: Daniëlle

Answer 4

Just in case someone has the same problem: sometimes you need to transpose your dataframe.

    import pandas as pd

    df = pd.read_csv('file.csv')
    # A  B  C
    # -------
    # 1  2  3
    # 4  5  6

    new_df = pd.DataFrame([df['A'], df['B']])
    # A | 1 4
    # B | 2 5
    
    new_df['A'] # KeyError
    
    new_df = new_df.T
    # A B
    # --- 
    # 1 2
    # 4 5

    new_df['A'] # works now that 'A' is a column
    # A
    # - 
    # 1
    # 4
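
The behaviour can be checked with a short runnable sketch. The inline data is made up to match the comments in this answer. The last step, selecting with `df[["A", "B"]]`, is a swapped-in alternative rather than part of the original answer; it keeps the names as columns without a transpose.

```python
import pandas as pd

# Hypothetical data matching the layout shown in the comments.
df = pd.DataFrame({"A": [1, 4], "B": [2, 5], "C": [3, 6]})

# A list of Series is treated as rows, so 'A' and 'B' end up in the index.
new_df = pd.DataFrame([df["A"], df["B"]])
row_labels = new_df.index.tolist()
col_labels = new_df.columns.tolist()

# Transposing moves the labels back to the columns.
transposed_cols = new_df.T.columns.tolist()

# Alternative: selecting with a list of names keeps them as columns directly.
subset_cols = df[["A", "B"]].columns.tolist()
print(row_labels, col_labels, transposed_cols, subset_cols)
```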

Answered By: RomuloPBenedetti
