Pandas: make new column and return list of values from other column
Question:
Suppose I have a DataFrame with two columns:
import pandas as pd
d = {'a_lower': [1, 2], 'a_upper': [3, 4]}
df = pd.DataFrame(data=d)
   a_lower  a_upper
0        1        3
1        2        4
I want to add a third column that holds a list built from the values of the two columns:
   a_lower  a_upper       a
0        1        3  [1, 3]
1        2        4  [2, 4]
I tried this:
df['a'] = [df['a_lower'], df['a_upper']]
but got a different result, where each cell of the new column holds a whole Series:
   a_lower  a_upper  a
0        1        3  0 1 1 2 Name: a_lower, dtype: int64
1        2        4  0 3 1 4 Name: a_upper, dtype: int64
How do I do this correctly? I am trying to return an array of dicts oriented as 'records':
[{'a_lower': 1,
  'a_upper': 3,
  'a': [1, 3]
 },
 ...
]
Answers:
Use a list comprehension:
df['a'] = [[a, b] for a, b in zip(df['a_lower'], df['a_upper'])]
print(df)
   a_lower  a_upper       a
0        1        3  [1, 3]
1        2        4  [2, 4]
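For context on why the original attempt misbehaves: `[df['a_lower'], df['a_upper']]` is a two-element list whose items are whole Series, and the assignment only went through because the frame happens to have exactly two rows. A minimal sketch contrasting that with the row-wise pairing that `zip` produces (the names `attempt` and `pairs` are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'a_lower': [1, 2], 'a_upper': [3, 4]})

# The original attempt: a list of two whole Series (one per column)
attempt = [df['a_lower'], df['a_upper']]

# zip walks both columns in parallel, yielding one (lower, upper) pair per row
pairs = [[a, b] for a, b in zip(df['a_lower'], df['a_upper'])]

df['a'] = pairs
print(type(attempt[0]).__name__)  # Series
print(pairs)                      # [[1, 3], [2, 4]]
```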
Or select only the two columns, convert them to a NumPy array with DataFrame.to_numpy, and then to lists:
df['a'] = df[['a_lower', 'a_upper']].to_numpy().tolist()
# if the DataFrame has only these 2 columns
# df['a'] = df.to_numpy().tolist()
print(df)
   a_lower  a_upper       a
0        1        3  [1, 3]
1        2        4  [2, 4]
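One caveat with the NumPy route: `to_numpy()` on several columns returns a single array with one common dtype, so a mix of integer and float columns comes back as floats. A small sketch, using a made-up float column (not from the question) to show the difference from the `zip` version:

```python
import pandas as pd

# Hypothetical data: one integer column and one float column
df = pd.DataFrame({'a_lower': [1, 2], 'a_upper': [3.5, 4.5]})

# to_numpy() needs one dtype for the whole 2-D array, so the ints become floats
via_numpy = df[['a_lower', 'a_upper']].to_numpy().tolist()

# zip iterates each column separately, so each value keeps its own type
via_zip = [[a, b] for a, b in zip(df['a_lower'], df['a_upper'])]

print(via_numpy)  # [[1.0, 3.5], [2.0, 4.5]]
print(via_zip)    # [[1, 3.5], [2, 4.5]]
```

If preserving per-column types matters, the list comprehension is probably the safer choice.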
Or use DataFrame.to_dict with the split orientation and select the data key:
df['a'] = df[['a_lower', 'a_upper']].to_dict('split')['data']
print(df)
   a_lower  a_upper       a
0        1        3  [1, 3]
1        2        4  [2, 4]
For reference, the full split output looks like this:
print(df[['a_lower', 'a_upper']].to_dict('split'))
{'index': [0, 1], 'columns': ['a_lower', 'a_upper'], 'data': [[1, 3], [2, 4]]}
If the DataFrame contains only these two columns, convert it to a NumPy array and then to a list:
df['a'] = df.to_numpy().tolist()
or, to limit to specific columns:
df['a'] = df[['a_lower', 'a_upper']].to_numpy().tolist()
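Alternatively, use a dictionary with to_dict. Transpose first so that each row becomes a key; without the transpose the lists are built per column ([1, 2] and [3, 4]) rather than per row. This assumes the index has no duplicate labels:
df['a'] = list(df[['a_lower', 'a_upper']].T.to_dict('list').values())
Output:
   a_lower  a_upper       a
0        1        3  [1, 3]
1        2        4  [2, 4]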
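Since the stated end goal in the question is a list of dicts in 'records' orientation, here is a sketch of that final step, assuming the `a` column has been built with one of the approaches above:

```python
import pandas as pd

df = pd.DataFrame({'a_lower': [1, 2], 'a_upper': [3, 4]})
df['a'] = df[['a_lower', 'a_upper']].to_numpy().tolist()

# orient='records' yields one dict per row, keyed by column name
records = df.to_dict(orient='records')
print(records)
# [{'a_lower': 1, 'a_upper': 3, 'a': [1, 3]}, {'a_lower': 2, 'a_upper': 4, 'a': [2, 4]}]
```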