Use pandas to create dict out of CSV columns

Question:

I have a CSV file (with a single line) like this:

drop1,drop2,key1,value1,key2,value2,key3,value3...keyN,valueN

The output I need is

{
'key1':'value1',
'key2':'value2',
..
'keyN':'valueN',
}

I intend to use DataFrames to do this.
I tried using reshape and pivot, but being new to pandas, I have not been able to figure it out.

Any pointers would be a great help.

Asked By: Soumya


Answers:

IIUC:

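import pandas as pd

df = pd.read_csv('your.csv', header=None)

# with header=None the column labels are integers, so take the cell
# values from the first (only) row rather than from list(df)
lst = df.iloc[0].tolist()

# remove the strings with 'drop' in them
lst = [s for s in lst if 'drop' not in s]

# build the key and value lists from lst
keys = [s for s in lst if 'key' in s]
values = [s for s in lst if 'val' in s]

# create the dictionary using zip
d = dict(zip(keys, values))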

Output:

{'key1': 'value1', 'key2': 'value2', 'key3': 'value3', 'keyN': 'valueN'}
Answered By: Tobias Molenaar

You can reshape the values after the first two columns to shape (-1, 2), where the first column holds the keys and the second column holds the values:

df = pd.read_csv('your.csv', header=None)
out = (pd.DataFrame(df.iloc[:, 2:].values.reshape(-1, 2))
       .set_index(0)[1].to_dict())
print(df)

       0      1     2       3     4       5     6       7     8       9
0  drop1  drop2  key1  value1  key2  value2  key3  value3  keyN  valueN

print(out)

{'key1': 'value1', 'key2': 'value2', 'key3': 'value3', 'keyN': 'valueN'}
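
To make the reshape step easier to inspect, here is a minimal sketch that runs without a file on disk. The `sample` string and the `pairs` and `result` names are assumptions made for illustration; `io.StringIO` simply stands in for 'your.csv'.

```python
import io

import pandas as pd

# Hypothetical one-line CSV standing in for 'your.csv'.
sample = "drop1,drop2,key1,value1,key2,value2,key3,value3\n"

df = pd.read_csv(io.StringIO(sample), header=None)

# Drop the first two columns, then fold the flat row into (key, value) rows.
pairs = df.iloc[:, 2:].to_numpy().reshape(-1, 2)
print(pairs)

# Each row of `pairs` is one key/value pair, so dict() can consume it directly.
result = dict(pairs.tolist())
print(result)
```

Note that reshape(-1, 2) raises a ValueError if an odd number of cells remains after the two dropped columns, so a trailing key without a value will not pass silently.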
Answered By: Ynjxsjmh

Try this method:

  1. Read the CSV without headers and pick only the first row.
  2. Filter this Series to exclude values starting with "drop" (or matching any other condition).
  3. Reshape it into a (key, value) array and convert it to a dict.

import pandas as pd

s = pd.read_csv("test.csv", header=None).iloc[0]  # read the CSV without headers and take the first row as a Series

keep = ~s.str.match("drop")  # True for values that do not start with "drop" (swap in any other condition)
arr = s[keep].to_numpy().reshape(-1, 2)  # reshape the remaining values into (key, value) rows
output = dict(arr)  # convert to dict
print(output)
{'key1': 'value1', 'key2': 'value2', 'key3': 'value3', 'keyN': 'valueN'}
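
If the values in the real file look numeric, the `.str` accessor may not behave as expected on a mixed row. A possible variation, sketched under the assumption that reading every cell as text is acceptable, wraps the three steps in a helper. The names `row_to_dict` and `prefix` are hypothetical and not part of pandas.

```python
import io

import pandas as pd


def row_to_dict(source, prefix="drop"):
    """Turn a one-line CSV of alternating key/value cells into a dict.

    `row_to_dict` and `prefix` are hypothetical names used for this sketch.
    """
    # dtype=str keeps every cell as text, so the .str accessor works even
    # when some values look numeric.
    s = pd.read_csv(source, header=None, dtype=str).iloc[0]
    keep = ~s.str.startswith(prefix)  # True for cells we want to keep
    return dict(s[keep].to_numpy().reshape(-1, 2).tolist())


# Hypothetical sample data with numeric-looking values.
sample = io.StringIO("drop1,drop2,key1,10,key2,20\n")
result = row_to_dict(sample)
print(result)
```

With this approach the values come back as strings, so convert them afterwards if numbers are needed.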
Answered By: Akshay Sehgal

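If the keys and values are not in order:

# df is the DataFrame read with header=None, as in the answers above
cols = df.iloc[0].to_list()

# index keys and values by the suffix after the prefix, so that 'key10'
# pairs with 'value10' wherever the two cells sit in the row
keys = {val[len('key'):]: val for val in cols if val.startswith('key')}
values = {val[len('value'):]: val for val in cols if val.startswith('value')}

my_dict = {keys[sfx]: values[sfx] for sfx in sorted(keys) if sfx in values}
print(my_dict)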
Answered By: Muhammad Hassan

Without numpy and comprehensions:

s = pd.read_csv(r'c:testtest111111.txt', header=None).iloc[0, 2:]  # get the Series without first two elements
print(s[1::2].set_axis(s[::2]).to_dict())  # get odd elements (values) and make index from even elements (keys)

Prints:

{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
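
The same even/odd slicing idea can also be applied to a plain Python list once the row has been read. This is only a sketch with an assumed in-memory `sample` string in place of the file; the names `cells` and `result` are illustrative.

```python
import io

import pandas as pd

# Hypothetical one-line CSV used in place of the file on disk.
sample = "drop1,drop2,key1,value1,key2,value2,key3,value3\n"

# Skip the first two cells and convert the rest to a plain list.
cells = pd.read_csv(io.StringIO(sample), header=None).iloc[0, 2:].tolist()

# Even positions hold keys, odd positions hold values.
result = dict(zip(cells[::2], cells[1::2]))
print(result)
```

Because zip stops at the shorter input, a trailing key without a value is dropped silently here rather than raising an error.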
Answered By: Алексей Р