How to create a list from Pandas Series?
Question:
The NAME column of my dataframe looks like this. I am trying to create a list of names for each row, for example ["Mike", "Jean"]:
0. Mike, Jean
1. May, Weather
2. Jack, 100
What I’ve tried:
df["NAME"] = df["NAME"].str.split(",")
for i in range(len(df["NAME"])):
    df["NAME"][i] = df["NAME"][i].split(",")
OUTPUT
0. [Mike, Jean]
1. [May, Weather]
2. [Jack, 100]
OUTPUT I WANT
0. ["Mike", "Jean"]
1. ["May", "Weather"]
2. ["Jack", "100"]
I am new to Python and Pandas. Any help will be appreciated!
Answers:
You can iterate over the values of your dataframe and convert each row of the resulting NumPy array into a list (see the example code below):
import pandas as pd
df = pd.DataFrame()
df['Name'] = ['Name1', 'Name2']
df['FirstName'] = ['FirstName1', 'FirstName2']
L = []
for row in df.values:
    L.append(list(row))
print(L)
Cheers
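As an aside to the answer above, the same nested list can also be produced without an explicit loop by calling tolist() on the underlying array. This is only a minimal sketch that reuses the toy columns from that answer; the variable name rows is made up for illustration.

```python
import pandas as pd

# Same toy data as in the answer above.
df = pd.DataFrame({
    'Name': ['Name1', 'Name2'],
    'FirstName': ['FirstName1', 'FirstName2'],
})

# .values gives a NumPy array; .tolist() turns it into a list of row lists.
rows = df.values.tolist()
print(rows)  # [['Name1', 'FirstName1'], ['Name2', 'FirstName2']]
```

Note that this gives one list per dataframe row across all columns, which is not the same thing as splitting a single string column.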
You don’t really need a for-loop. You can do the split with:
df['Name'] = df['Name'].str.split(', ')
This returns a pandas Series containing a list per row. pandas prints the lists without quotes, but the elements are strings:
0 [Mike, Jean]
1 [May, Weather]
2 [Jack, 100]
If you wish to extract the Series’ values as a plain Python list, call tolist() on the column, which now holds lists:
name_lists = df['Name'].tolist()
Returning:
[['Mike', 'Jean'], ['May', 'Weather'], ['Jack', '100']]
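The separator passed to str.split matters for data shaped like the question's. The sketch below compares three choices, assuming the strings are separated by a comma followed by a single space; the variable names are hypothetical.

```python
import pandas as pd

# Same input format as in the question: names separated by a comma and a space.
names = pd.Series(['Mike, Jean', 'May, Weather', 'Jack, 100'])

by_whitespace = names.str.split().tolist()        # commas stay attached
by_comma = names.str.split(',').tolist()          # leading spaces stay attached
by_comma_space = names.str.split(', ').tolist()   # clean pieces

print(by_whitespace[0])   # ['Mike,', 'Jean']
print(by_comma[0])        # ['Mike', ' Jean']
print(by_comma_space[0])  # ['Mike', 'Jean']
```

If the spacing after the comma is inconsistent in real data, splitting on ',' and stripping each piece afterwards may be the safer route.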
Assuming this input:
df = pd.DataFrame({'Name': ['Mike, Jean', 'May, Weather', 'Jack, 100']})
When you run:
df['Name'].str.split(', ')
and get:
0 [Mike, Jean]
1 [May, Weather]
2 [Jack, 100]
Name: Name, dtype: object
The [Mike, Jean] format is just a representation.
The real data is indeed a Series of lists, as shown by an explicit conversion of the Series to a list:
df['Name'].str.split(', ').to_list()
output:
[['Mike', 'Jean'],
['May', 'Weather'],
['Jack', '100']]
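To check that the quote-free display is only how pandas prints lists, one can inspect a single element directly. A minimal sketch, reusing the input assumed above; split_names and first are names chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Mike, Jean', 'May, Weather', 'Jack, 100']})
split_names = df['Name'].str.split(', ')

# Look at one cell instead of the printed Series.
first = split_names.iloc[0]
print(type(first))                   # <class 'list'>
print(repr(first))                   # ['Mike', 'Jean']
print(type(split_names.iloc[2][1]))  # <class 'str'>, so '100' is a string, not a number
```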
If the column instead holds strings with literal brackets, such as "[Mike, Jean]", the code snippet below should serve your purpose 🙂
import pandas as pd
df = pd.DataFrame(["[Mike, Jean]" , "[May, Weather]", "[Jack, 100]"], columns=['name'])
df.head()
name
0 [Mike, Jean]
1 [May, Weather]
2 [Jack, 100]
df['type_name'] = df.apply(lambda y: type(y['name']), axis=1)
df['name1'] = df.apply(lambda y: y['name'].replace('[', '').replace(']', '').split(", "), axis=1)
df['type_name1'] = df.apply(lambda y: type(y['name1']), axis=1)
df.head()
             name      type_name           name1      type_name1
0    [Mike, Jean]  <class 'str'>    [Mike, Jean]  <class 'list'>
1  [May, Weather]  <class 'str'>  [May, Weather]  <class 'list'>
2     [Jack, 100]  <class 'str'>     [Jack, 100]  <class 'list'>
final_list = df['name1'].values.tolist()
print(final_list)
[['Mike', 'Jean'], ['May', 'Weather'], ['Jack', '100']]
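For the same bracketed-string input, a vectorized variant using the .str accessor might be a little shorter than row-wise apply. This is a sketch under the same assumption as the answer above, namely that every cell is a string wrapped in a single pair of square brackets.

```python
import pandas as pd

df = pd.DataFrame({'name': ['[Mike, Jean]', '[May, Weather]', '[Jack, 100]']})

# Strip '[' and ']' from both ends of each string, then split on comma + space.
final_list = df['name'].str.strip('[]').str.split(', ').tolist()
print(final_list)  # [['Mike', 'Jean'], ['May', 'Weather'], ['Jack', '100']]
```

Unlike the chained replace calls, str.strip('[]') only touches the ends of each string, so brackets inside a name would be left in place.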