Create a list of tuples from pandas DataFrame values

Question:

I am looking to generate a list of tuples from my DataFrame. Here is my DataFrame:

data.csv

,Date,Open,High,Low,Close,min,max
2022-10-03 12:00:00+01:00,19268.458333333332,141.95199584960938,141.97999572753906,141.30999755859375,141.42999267578125,141.42999267578125,
2022-10-04 16:00:00+01:00,19269.625,143.83799743652344,144.07699584960938,143.72999572753906,143.99000549316406,,143.99000549316406
2022-10-05 15:00:00+01:00,19270.583333333332,142.83299255371094,142.87100219726562,142.4199981689453,142.66000366210938,142.66000366210938,
2022-10-06 06:00:00+01:00,19271.208333333332,143.36000061035156,143.43600463867188,143.24000549316406,143.4010009765625,,143.4010009765625
2022-10-07 13:00:00+01:00,19272.5,141.85899353027344,142.1219940185547,141.17999267578125,141.45599365234375,141.45599365234375,

I want to extract the date and the Close value of each row as a tuple,
like this: ('2022-10-03', 141.42999267578125), and build a list from those tuples in which each row is paired with the next one.

I manually created the list of tuples to show exactly what I am looking for:

tuples_list = [
    ('2022-10-03', 141.42999267578125), ('2022-10-04', 143.99000549316406),  # row[0-1]
    ('2022-10-04', 143.99000549316406), ('2022-10-05', 142.66000366210938),  # row[1-2]
    ('2022-10-05', 142.66000366210938), ('2022-10-06', 143.4010009765625),   # row[2-3]
    ('2022-10-06', 143.4010009765625), ('2022-10-07', 141.45599365234375),   # row[3-4]
]
Asked By: tiberhockey


Answers:

The line below gives the desired list of tuples assuming that df is your pandas dataframe:

list_tuples = list(df[['Date', 'Close']].to_records(index=True))
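A note on the line above: to_records returns a NumPy record array, so the list elements are NumPy records rather than plain tuples, and with index=True each record also includes the index value. Below is a minimal sketch of one way to get plain (date, close) tuples, one per row. It assumes the timestamps are the DataFrame index and are stored as strings; the small df is a hypothetical stand-in for the question's data.

```python
import pandas as pd

# Hypothetical stand-in for the question's data: timestamp strings as the
# index, plus the numeric Date and Close columns.
df = pd.DataFrame(
    {"Date": [19268.458, 19269.625], "Close": [141.43, 143.99]},
    index=["2022-10-03 12:00:00+01:00", "2022-10-04 16:00:00+01:00"],
)

# Keep the first 10 characters of each index label (the calendar date)
# and pair it with the matching Close value.
tuples_list = list(zip(df.index.str[:10], df["Close"]))
print(tuples_list)
```

This yields one tuple per row; pairing each row with the next one is a separate step.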

Edit: Edited answer so that the result is exactly the tuples you want.

One approach could be as follows:

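import pandas as pd

df = pd.read_csv('data.csv', index_col=0)  # assumption: timestamps are the index
df.index = pd.to_datetime(df.index).date.astype(str)  # keep the date part only

# Duplicate each row, sort the copies together, then trim both ends.
s = pd.concat([df.Close]*2).sort_index()
tuples_list = list(zip(s.index, s))[1:-1]

print(tuples_list)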

[('2022-10-03', 141.42999267578125),('2022-10-04', 143.99000549316406),
 ('2022-10-04', 143.99000549316406),('2022-10-05', 142.66000366210938),
 ('2022-10-05', 142.66000366210938),('2022-10-06', 143.4010009765625),
 ('2022-10-06', 143.4010009765625),('2022-10-07', 141.45599365234375)]
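Why the concat-and-slice trick works may not be obvious, so here is a small sketch that prints the intermediate list. The three-row df and its rounded values are made up for illustration and are not the question's data.

```python
import pandas as pd

# Hypothetical miniature of the question's data, already indexed by date string.
df = pd.DataFrame(
    {"Close": [141.43, 143.99, 142.66]},
    index=["2022-10-03", "2022-10-04", "2022-10-05"],
)

# Two copies of the column, sorted by date, put each row next to its duplicate:
# a, a, b, b, c, c
s = pd.concat([df["Close"]] * 2).sort_index()
doubled = list(zip(s.index, s))
print(doubled)

# Dropping the first and last element shifts the grouping by one:
# a, b, b, c  (that is, the pairs (a, b) and (b, c))
tuples_list = doubled[1:-1]
print(tuples_list)
```

The approach relies on the index sorting into row order, which holds here because there is one row per date and the dates are ascending.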
Answered By: ouroboros1

With such simple data, and a non-pandas desired output, using pandas may be overkill.

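import csv

with open('data.csv', newline='') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    # Column 0 is the timestamp (its first 10 characters are the date); column 5 is Close.
    tuples_list = [(row[0][:10], float(row[5])) for row in reader]

print(tuples_list)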

Output:

[('2022-10-03', 141.42999267578125),
 ('2022-10-04', 143.99000549316406),
 ('2022-10-05', 142.66000366210938),
 ('2022-10-06', 143.4010009765625),
 ('2022-10-07', 141.45599365234375)]

To pair each row with the next one (itertools.pairwise requires Python 3.10 or later):

from itertools import pairwise, chain

tuples_list = list(chain.from_iterable(pairwise(tuples_list)))
print(tuples_list)

Output:

[('2022-10-03', 141.42999267578125), ('2022-10-04', 143.99000549316406),
 ('2022-10-04', 143.99000549316406), ('2022-10-05', 142.66000366210938),
 ('2022-10-05', 142.66000366210938), ('2022-10-06', 143.4010009765625),
 ('2022-10-06', 143.4010009765625), ('2022-10-07', 141.45599365234375)]
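On Python versions before 3.10, where itertools.pairwise is not available, the same pairing can be sketched with zip over the list and a copy of itself shifted by one. The rows list below is made-up sample data in the same shape as the csv output above.

```python
from itertools import chain

# Sample rows in the (date, close) shape produced by the csv step.
rows = [
    ("2022-10-03", 141.43),
    ("2022-10-04", 143.99),
    ("2022-10-05", 142.66),
]

# zip(rows, rows[1:]) yields consecutive pairs: (row0, row1), (row1, row2), ...
pairs = zip(rows, rows[1:])

# Flatten the pairs into one list: row0, row1, row1, row2
tuples_list = list(chain.from_iterable(pairs))
print(tuples_list)
```

Unlike pairwise, this needs a sequence that supports slicing, so it works on a list but not on an arbitrary iterator.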
Answered By: BeRT2me