How do I take data from one csv file and find the results in another using python?
Question:
I have two csv files with data in both of them. The first one has columns of data like the following:

And the second csv file looks like this

What I want to do is able to group these together based on the values in column D. I want to create a new column next to D that matches its value to the detail in the second file. The output would look something like:

File number 1 is always changing numbers and there are hundreds of numbers in each file so I don’t want to write them manually in a list for the code.
Answers:
Use Pandas
:
import pandas as pd
df1 = pd.read_csv('first_file.csv')
df2 = pd.read_csv('second_file.csv')
df1['Details'] = df1['D'].map(df2.set_index('Number')['Details'])
# or
df1 = df1.merge(df2, left_on='D', right_on='Number', how='left')
df1.to_excel('merge_file.csv', index=False)
Update: the same version with csv
module:
import csv
with (open('first_file.csv') as fp1,
open('second_file.csv') as fp2,
open('merge_file.csv', 'w') as out):
df1 = csv.DictReader(fp1)
df2 = csv.DictReader(fp2)
mapping = {row['Number']: row['Details'] for row in df2}
dfm = csv.DictWriter(out, df1.fieldnames + ['Details'])
dfm.writeheader()
for row in df1:
dfm.writerow(dict(row, Details=mapping.get(row['D'], '')))
Output:
A
B
C
D
Details
15
22
13
41
comma
52
36
67
87
carrer
91
150
41
12
recording
123
14
76
16
mold
I have two csv files with data in both of them. The first one has columns of data like the following:
And the second csv file looks like this
What I want to do is able to group these together based on the values in column D. I want to create a new column next to D that matches its value to the detail in the second file. The output would look something like:
File number 1 is always changing numbers and there are hundreds of numbers in each file so I don’t want to write them manually in a list for the code.
Use Pandas
:
import pandas as pd
df1 = pd.read_csv('first_file.csv')
df2 = pd.read_csv('second_file.csv')
df1['Details'] = df1['D'].map(df2.set_index('Number')['Details'])
# or
df1 = df1.merge(df2, left_on='D', right_on='Number', how='left')
df1.to_excel('merge_file.csv', index=False)
Update: the same version with csv
module:
import csv
with (open('first_file.csv') as fp1,
open('second_file.csv') as fp2,
open('merge_file.csv', 'w') as out):
df1 = csv.DictReader(fp1)
df2 = csv.DictReader(fp2)
mapping = {row['Number']: row['Details'] for row in df2}
dfm = csv.DictWriter(out, df1.fieldnames + ['Details'])
dfm.writeheader()
for row in df1:
dfm.writerow(dict(row, Details=mapping.get(row['D'], '')))
Output:
A | B | C | D | Details |
---|---|---|---|---|
15 | 22 | 13 | 41 | comma |
52 | 36 | 67 | 87 | carrer |
91 | 150 | 41 | 12 | recording |
123 | 14 | 76 | 16 | mold |