Joining CSV or Tables

Question:

I have two CSV files with different columns.
Table 1

title   stage   jan      time
darn    3.001   0.421    5/23/2016 13:14
darn    2.054   0.1213   5/24/2016 14:14
ok      2.829   1.036    5/23/2016 14:14
five    1.115   1.146    5/23/2016 17:14
three   2       5        5/23/2016 21:14

Table 2

title   mar     apr     may     jun      date
darn    0.631   1.321   0.951   1.751    5/23/2016 12:14
ok      1.001   0.247   2.456   0.3216   5/24/2016 18:41
three   0.285   1.283   0.924   956      5/25/2016 17:41

I need to join them on title (the primary key), with the extra condition that the date field in table 2 equals the time field in table 1 minus one hour. The output should look something like this:

title   stage   jan     mar     apr     may     jun     date
darn    3.001   0.421   0.631   1.321   0.951   1.751   5/23/2016 13:14

I was wondering whether this can be done with pandas, or whether a SQL query is the better way forward. I looked it up and saw that pandas can merge on a key column:
import pandas as pd

a = pd.read_csv("1.csv")
b = pd.read_csv("2.csv")
merged = a.merge(b, on='title')
merged.to_csv("output.csv", index=False)

This is the program so far. I am struggling with how to set the condition on the date field. Both SQL and pandas solutions are welcome.

Asked By: Diganta Bharali


Answers:

Assuming pandas recognizes your time and date columns as datetimes (for example, because you passed parse_dates to read_csv),
just add this after the merge:

merged = merged[merged.date == (merged.time - pd.Timedelta('1 hours'))]
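
For illustration, here is a minimal end-to-end sketch of that filter. It is not part of the original answer: the inline comma-separated data stands in for the question's 1.csv and 2.csv, and the explicit date format string is an assumption based on the sample rows.

```python
import io
import pandas as pd

# Inline stand-ins for the question's two files (assumed comma-separated).
csv1 = io.StringIO(
    "title,stage,jan,time\n"
    "darn,3.001,0.421,5/23/2016 13:14\n"
    "darn,2.054,0.1213,5/24/2016 14:14\n"
    "ok,2.829,1.036,5/23/2016 14:14\n"
    "five,1.115,1.146,5/23/2016 17:14\n"
    "three,2,5,5/23/2016 21:14\n"
)
csv2 = io.StringIO(
    "title,mar,apr,may,jun,date\n"
    "darn,0.631,1.321,0.951,1.751,5/23/2016 12:14\n"
    "ok,1.001,0.247,2.456,0.3216,5/24/2016 18:41\n"
    "three,0.285,1.283,0.924,956,5/25/2016 17:41\n"
)

a = pd.read_csv(csv1)
b = pd.read_csv(csv2)

# read_csv leaves these columns as plain strings by default,
# so convert them before doing any datetime arithmetic.
a["time"] = pd.to_datetime(a["time"], format="%m/%d/%Y %H:%M")
b["date"] = pd.to_datetime(b["date"], format="%m/%d/%Y %H:%M")

# Merge on title first, then keep rows where date is exactly one hour before time.
merged = a.merge(b, on="title")
merged = merged[merged["date"] == merged["time"] - pd.Timedelta(hours=1)]
print(merged)
```

With the sample data, only the first "darn" row should survive the filter. Note that this approach builds every title-only pair before filtering, which may matter for large files.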
Answered By: ℕʘʘḆḽḘ

I would create a helper column in df1 (table 2) that matches “time” in df (table 1):

In [11]: df1["time"] = df1["date"] + pd.offsets.Hour(1)

Now you can merge cleanly:

In [12]: df.merge(df1)
Out[12]:
  title  stage    jan                time    mar    apr    may    jun                date
0  darn  3.001  0.421 2016-05-23 13:14:00  0.631  1.321  0.951  1.751 2016-05-23 12:14:00

In [13]: df.merge(df1, on=["title", "time"])  # potentially less reckless to specify columns
Out[13]:
  title  stage    jan                time    mar    apr    may    jun                date
0  darn  3.001  0.421 2016-05-23 13:14:00  0.631  1.321  0.951  1.751 2016-05-23 12:14:00

Note: This means you don’t have to do the complete merge (on just title), which could be very space inefficient because it pairs up every row that shares a title before any filtering happens.
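
As a self-contained sketch of the same idea, the snippet below rebuilds trimmed versions of the two tables in memory and joins on both keys. The trimmed data, the `shifted` and `result` names, and the use of `assign` rather than in-place assignment are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical, trimmed in-memory versions of the question's two tables.
df = pd.DataFrame({
    "title": ["darn", "darn", "ok"],
    "stage": [3.001, 2.054, 2.829],
    "time": pd.to_datetime(
        ["2016-05-23 13:14", "2016-05-24 14:14", "2016-05-23 14:14"]
    ),
})
df1 = pd.DataFrame({
    "title": ["darn", "ok"],
    "mar": [0.631, 1.001],
    "date": pd.to_datetime(["2016-05-23 12:14", "2016-05-24 18:41"]),
})

# Shift table 2's timestamp forward one hour so it lines up with table 1's "time".
# assign returns a new frame, so df1 itself is left unchanged.
shifted = df1.assign(time=df1["date"] + pd.Timedelta(hours=1))

# Joining on both keys means only exact (title, time) matches are ever paired.
result = df.merge(shifted, on=["title", "time"], how="inner")
print(result)
```

Using `assign` keeps the original frame untouched, which can be convenient if df1 is reused elsewhere; assigning the column in place, as in the answer, works equally well.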

Answered By: Andy Hayden