Create different dataframes inside a 'for' loop
Question:
I have a dataset that looks something like the following. I would like to create one dataframe per author, each containing only that author's texts. For example, as you can see below, df1 contains only the texts from author0, and so on. Is there a way to do this for many authors?
import pandas as pd
data = {
    'text': ['text0', 'text1', 'text2'],
    'author': ['author0', 'author1', 'author1'],
    'title': ['Comunicación', 'Administración', 'Ventas']
}
df = pd.DataFrame(data)
df1 = df[df["author"]=="author0"]
df2 = df[df["author"]=="author1"]
I have tried this, but it’s not working
list_author = df['author'].unique().tolist()
for i in list_author:
    dt_str(i) = dt[dt["author"]=="i"]
It would be helpful if the dataframes were named df_<author> (e.g. df_George).
Answers:
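If you want separate dataframes per author, use a dictionary with the author names as the keys instead of generating variable names such as df_George. Your attempt fails for three reasons: dt_str(i) = ... assigns to a function call, which is a syntax error; the dataframe is named df, not dt; and "i" in quotes is the literal string, not the loop variable. See the example below: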
import pandas as pd

data = {
    'text': ['text0', 'text1', 'text2'],
    'author': ['author0', 'author1', 'author1'],
    'title': ['Comunicación', 'Administración', 'Ventas']
}
df = pd.DataFrame(data)

# Map each author name to the rows written by that author
df_dict = {}
for author in df['author'].unique():
    df_dict[author] = df[df['author'] == author]
print(df_dict.keys())
# dict_keys(['author0', 'author1'])
print(df_dict['author0'])
#     text   author         title
# 0  text0  author0  Comunicación
print(df_dict['author1'])
#     text   author           title
# 1  text1  author1  Administración
# 2  text2  author1          Ventas
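The same dictionary can also be built without filtering the dataframe once per author. The sketch below is one possible alternative, not part of the answer above: it uses groupby, which yields one (author, sub-dataframe) pair per distinct author. The name frames_by_author is chosen here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'text': ['text0', 'text1', 'text2'],
    'author': ['author0', 'author1', 'author1'],
    'title': ['Comunicación', 'Administración', 'Ventas'],
})

# Iterating over a groupby object yields (key, sub-dataframe) pairs,
# so a dict comprehension collects one dataframe per author.
# frames_by_author is an illustrative name, not from the original answer.
frames_by_author = {author: group for author, group in df.groupby('author')}

print(sorted(frames_by_author))          # ['author0', 'author1']
print(len(frames_by_author['author1']))  # 2
```

A frame is then looked up by author name, for example frames_by_author['author0'], which plays the role that a variable such as df_author0 would have played.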