How can I duplicate the same row but with different information in a column in pandas?
Question:
My pandas dataframe looks like this:
import pandas as pd
table = pd.DataFrame({'Range': ["A", "B", "C", "A"],'First Name': ["W","X","Y", "Z"], 'ID': [1,2,3,4]})
I want to duplicate each row that has the text "A" in the "Range" column (called "Level" in my output), and fill a new "Activity" column with the texts "Monitoring" and "Informant" for the copies, something like this:
I tried to create the duplicates with this code:
columns_new = pd.DataFrame(columns=["NO ID","Level", "Name", "Activity"])
row_modified = []
for index, row in table.iterrows():
    rang = row['Range']
    f_name = row['First Name']
    n_id = row['ID']
    columns_new.loc[index, "NO ID"] = n_id
    columns_new.loc[index, "Level"] = rang
    columns_new.loc[index, "Name"] = f_name
    if rang == "A":
        row_modified.append(row)
        row_modified.append(row)
    else:
        row_modified.append(row)
column_new2 = pd.DataFrame(row_modified)
column_new2
But I don’t know how to add the texts I want.
Answers:
You can use a mapping dict:
d = {'A': ['Monitoring', 'Informant']}
out = (table.assign(Activity=table['Range'].map(d).fillna('Assistant'))
            .explode('Activity'))
print(out)
# Output
  Range First Name  ID    Activity
0     A          W   1  Monitoring
0     A          W   1   Informant
1     B          X   2   Assistant
2     C          Y   3   Assistant
3     A          Z   4  Monitoring
3     A          Z   4   Informant
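To see why this works, the chain can be split into its three steps. This is only a sketch of the same approach: the intermediate names `mapped` and `filled` are introduced here for illustration, and the final `reset_index(drop=True)` is an optional extra (not part of the answer above) for when the repeated index labels are unwanted.

```python
import pandas as pd

table = pd.DataFrame({
    'Range': ["A", "B", "C", "A"],
    'First Name': ["W", "X", "Y", "Z"],
    'ID': [1, 2, 3, 4],
})

d = {'A': ['Monitoring', 'Informant']}

# Step 1: look up each Range value in the dict.
# Rows with "A" get the list of activities; every other row gets NaN.
mapped = table['Range'].map(d)

# Step 2: replace the NaN entries with the default activity.
filled = mapped.fillna('Assistant')

# Step 3: explode turns each list element into its own row,
# repeating the other columns; scalar values are left untouched.
out = table.assign(Activity=filled).explode('Activity')

# Optional (an addition for illustration): renumber the rows 0..n-1
# instead of keeping the repeated original index labels.
out = out.reset_index(drop=True)
print(out)
```

To support more levels, adding further keys to `d` (each mapped to a list of activities) should be enough; the rest of the chain stays the same.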
You can use a merge:
s = pd.Series(['Monitoring', 'Informant'], index=['A', 'A'], name='Activity')
(table.merge(s, left_on='Range', right_index=True, how='left')
      .fillna({'Activity': 'Assistant'})
)
Output:
  Range First Name  ID    Activity
0     A          W   1  Monitoring
0     A          W   1   Informant
1     B          X   2   Assistant
2     C          Y   3   Assistant
3     A          Z   4  Monitoring
3     A          Z   4   Informant
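The rows are duplicated because the merge is one-to-many: "A" appears twice on the right-hand side, so each "A" row on the left matches twice. A variant of the same idea, sketched here with a small lookup DataFrame instead of a Series (the name `lookup` is introduced for illustration), joins on the column directly and returns a fresh 0..n-1 index:

```python
import pandas as pd

table = pd.DataFrame({
    'Range': ["A", "B", "C", "A"],
    'First Name': ["W", "X", "Y", "Z"],
    'ID': [1, 2, 3, 4],
})

# One row per (Range, Activity) pair; "A" is listed twice on purpose.
lookup = pd.DataFrame({
    'Range': ['A', 'A'],
    'Activity': ['Monitoring', 'Informant'],
})

# Left join: every row of `table` is kept. Rows with Range "A" match
# both lookup rows and are therefore duplicated; "B" and "C" have no
# match and get NaN in Activity, which is then filled with the default.
merged = (table.merge(lookup, on='Range', how='left')
               .fillna({'Activity': 'Assistant'}))
print(merged)
```

Keeping the pairs in a table like `lookup` can be convenient when the level-to-activity pairs already live in a file or another DataFrame.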