Pandas inner Join does not give complete results

Question:

I have two csvs

I am trying to join the two dataframes where the Stocknumber matches, but it is only returning 77 results except the expected 140

here is my code

import pandas as pd
df=pd.read_csv('C:/Users/Starshine/Desktop/vdp.csv')
df = df.iloc[: , :-1]
df['StockNumber']=df['URL'].str[-8:]
df['StockNumber']=df['StockNumber'].str.strip('/')
df['StockNumber']=df['StockNumber'].str.strip('-')
df.to_csv('C:/Users/Starshine/Desktop/panda.csv',index=False)

dfs=pd.read_csv('C:/Users/Starshine/Desktop/a2.csv')
dfs.rename(columns={'Stock #': 'StockNumber'}, inplace=True)
dfs = dfs.iloc[: , :-2]
dfs['Stock']=df['StockNumber']
sf=pd.merge(dfs,df,on='StockNumber')
sf.to_csv('C:/Users/Starshine/Desktop/test21.csv',index=False)
print (sf)

What am I doing wrong here?

Asked By: Cesar

||

Answers:

pandas.merge is case sensitive. You have to lowercase both columns before the merge.

Try this :

import pandas as pd

df=pd.read_csv('C:/Users/Starshine/Desktop/vdp.csv')
df['StockNumber']=df['URL'].str.rsplit('-').str[-1].str.strip('/').str.lower()

dfs=pd.read_csv('C:/Users/Starshine/Desktop/a2.csv')
dfs.rename(columns={'Stock #': 'StockNumber'}, inplace=True)
dfs['StockNumber'] = df['StockNumber'].str.lower()

sf=pd.merge(dfs,df,on='StockNumber')
>>> Result (There is exaclty 139 matches, not 140)
print(sf)
                                      Vehicle StockNumber  
0                          2012 Ford Fusion S      a26131   
1                2020 Chevrolet Malibu LS 1FL      b98795   
2                   2010 Hyundai Santa Fe GLS      571849   
3                       2019 Dodge Charger GT      c32026   
4                        2019 Toyota Camry SE      500754   
..                                        ...         ...   
134          2014 Hyundai Santa Fe Sport 2.4L      656191   
135      2015 Jeep Wrangler Unlimited Rubicon      206164   
136  2012 Mercedes-Benz E-Class E 350 4MATIC?      545815   
137                 2013 Lexus GX 460 Premium      c60862   
138                  2007 Ford F-450SD XL DRW      c42901   

                                                   URL  Images  
0    www.site.com/2007-ford-f450-super-duty-crew-ca...       0  
1    www.site.com/2020-ford-f150-supercrew-cab-lari...       0  
2    www.site.com/2012-mercedes-benz-e-class-e-350-...       0  
3    www.site.com/2014-hyundai-santa-fe-sport-sport...       0  
4    www.site.com/2013-nissan-rogue-sv-sport-utilit...       0  
..                                                 ...     ...  
134  www.site.com/2015-nissan-rogue-select-s-sport-...       0  
135    www.site.com/2016-chevrolet-ss-sedan-4d-206164/       0  
136  www.site.com/2018-volkswagen-atlas-se-sport-ut...      41  
137  www.site.com/2014-lexus-rx-rx-350-sport-utilit...       0  
138  www.site.com/2017-ford-f150-supercrew-cab-xlt-...       0  

[139 rows x 4 columns]
Answered By: L'Artiste
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.