Python: Loop through tuple and insert into Data Frame for each value
Question:
I'm trying to loop through a tuple that is currently stored in a column of my data frame. For the first ID I want to select the first item in the Group tuple, then for the second ID the second item. For the remaining IDs in the same group I would like to cycle back through the tuple.
If the group changes, I would like to repeat the process with the new group. I'm also fine with splitting it into a new data frame and then unioning the results back in later.
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4, 5, 6],
                   'Group': ["('Cat','Dog')",
                             "('Cat','Dog')",
                             "('Cat','Dog')",
                             "('Cat','Dog')",
                             "('Bird','Dog')",
                             "('Bird','Dog')",
                             ]
                   })
Current data:
ID | Group |
---|---|
1 | ('Cat', 'Dog') |
2 | ('Cat', 'Dog') |
3 | ('Cat', 'Dog') |
4 | ('Cat', 'Dog') |
5 | ('Bird', 'Dog') |
6 | ('Bird', 'Dog') |
Desired output:
ID | Group |
---|---|
1 | Cat |
2 | Dog |
3 | Cat |
4 | Dog |
5 | Bird |
6 | Dog |
Answers:
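Assuming a column of tuples:
# Group rows by (ID - 1) % 2, which is 0 for odd IDs and 1 for even IDs.
# Inside transform, s.name is that group key, so s.str[s.name] picks the matching tuple element.
df['Group'] = (df.groupby(df['ID'].sub(1).mod(2))['Group']
                 .transform(lambda s: s.str[s.name])
              )
If you have strings (as in the example above), parse them into tuples first:
from ast import literal_eval
df['Group'] = (df['Group'].apply(literal_eval)
                 .groupby(df['ID'].sub(1).mod(2))
                 .transform(lambda s: s.str[s.name])
              )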
Output:
   ID Group
0   1   Cat
1   2   Dog
2   3   Cat
3   4   Dog
4   5  Bird
5   6   Dog
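The approach above keys on the parity of ID, so it relies on IDs being consecutive and on every tuple having two elements. A minimal sketch of a variant that instead uses each row's position within its group, and so would also handle longer tuples, is shown below. The `Pick` column name is hypothetical, and the sketch assumes that rows sharing the same Group string belong to one group even if they are not adjacent.

```python
from ast import literal_eval

import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'Group': ["('Cat','Dog')"] * 4 + ["('Bird','Dog')"] * 2,
})

# Parse the strings into real tuples.
tuples = df['Group'].apply(literal_eval)

# Position of each row among the rows that share the same Group value: 0, 1, 2, ...
pos = df.groupby('Group', sort=False).cumcount()

# Cycle through each tuple by taking the position modulo the tuple length.
# 'Pick' is a hypothetical column name used only for this illustration.
df['Pick'] = [t[p % len(t)] for t, p in zip(tuples, pos)]

print(df[['ID', 'Pick']])
```

Because the index into the tuple is `p % len(t)`, the cycle restarts whenever a new group begins and wraps around when a group has more rows than the tuple has items.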
Just do this
from ast import literal_eval
# Parse each string such as "('Cat','Dog')" into a real tuple; literal_eval only accepts Python literals, so it is safer than eval()
df['Group'] = df['Group'].apply(literal_eval)
# Pick the tuple element by ID parity: odd IDs take index 0, even IDs take index 1
df['Group'] = df.apply(lambda x: x['Group'][(x['ID'] - 1) % 2], axis=1)
# The remaining steps do not change the result: after the previous line, 'Group' already holds a single string per row
# Work on a copy so the original dataframe is left untouched
temp_df = df.copy()
# explode() only expands list-like values, so the plain strings pass through unchanged
temp_df = temp_df.explode('Group').reset_index(drop=True)
# Strip any parentheses and split on commas; each value becomes a one-element list such as ['Cat']
temp_df['Group'] = temp_df['Group'].str.strip("()").str.split(",")
# Explode again to turn each one-element list back into a scalar
temp_df = temp_df.explode('Group').reset_index(drop=True)
# Strip any leftover single quotes from the values
temp_df['Group'] = temp_df['Group'].str.strip("'")
# Print temp_df; it holds the same values as df
print(temp_df)
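Since the question mentions being fine with splitting the data and combining the results afterwards, here is a rough sketch of an explicit per-group loop using `itertools.cycle`. It is only one possible way to do this; the names `picked` and `result` are hypothetical, and it assumes the Group column holds string representations of tuples as in the example.

```python
from ast import literal_eval
from itertools import cycle

import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'Group': ["('Cat','Dog')"] * 4 + ["('Bird','Dog')"] * 2,
})

picked = {}  # maps row index -> chosen tuple item
# Iterate group by group; sort=False keeps groups in order of first appearance.
for key, part in df.groupby('Group', sort=False):
    # Start a fresh cycle over the parsed tuple for every group.
    values = cycle(literal_eval(key))
    for idx in part.index:
        picked[idx] = next(values)

# Write the picks back, aligned on the original row index.
result = df.assign(Group=pd.Series(picked))
print(result)
```

This is slower than the vectorised answers on large frames, but it mirrors the "loop through the tuple" wording of the question most directly.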