Is it possible to use DataFrame.replace in a dataframe with tuples?
Question:
I tried the following code:
import pandas as pd
toy_df = pd.DataFrame({'col0': [('Hello','Sarah!'),('Bye', 'my', 'dear', 'Beth')]})
toy_dict = {('Hello','Sarah!') : ('oh,', 'you', 'again'), ('Bye', 'my', 'dear', 'Beth'):('xoxo',)}
print(toy_df.replace(toy_dict))
But I got TypeError: Cannot compare types 'ndarray(dtype=object)' and 'tuple'.
From what I was able to understand, tuples inside dataframes are treated as objects, not tuples. How can I proceed?
Answers:
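You can use the applymap function, which applies a function to every cell of the dataframe (in pandas 2.1 and later it is deprecated in favor of DataFrame.map):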
import pandas as pd
toy_df = pd.DataFrame({'col0': [('Hello', 'Sarah!'), ('Bye', 'my', 'dear', 'Beth')]})
toy_dict = {('Hello', 'Sarah!'): ('oh,', 'you', 'again'), ('Bye', 'my', 'dear', 'Beth'): ('xoxo',)}
# Look each cell up in toy_dict; keep the original value if it is not a key
new_toy_df = toy_df.applymap(lambda x: toy_dict.get(x, x))
print(new_toy_df)
Output:
                col0
0  (oh,, you, again)
1            (xoxo,)
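For a single column, the same lookup can also be written with Series.map and a callable, which avoids applymap entirely. This is only a minimal sketch: the helper name replace_tuples is hypothetical, and it assumes every cell is hashable (as tuples are).

```python
import pandas as pd

toy_df = pd.DataFrame({'col0': [('Hello', 'Sarah!'), ('Bye', 'my', 'dear', 'Beth')]})
toy_dict = {('Hello', 'Sarah!'): ('oh,', 'you', 'again'), ('Bye', 'my', 'dear', 'Beth'): ('xoxo',)}


def replace_tuples(series, mapping):
    # Hypothetical helper: look each cell up in `mapping`
    # and keep the cell unchanged when it is not a key.
    return series.map(lambda x: mapping.get(x, x))


# assign returns a new dataframe; toy_df itself is left untouched
new_toy_df = toy_df.assign(col0=replace_tuples(toy_df['col0'], toy_dict))
print(new_toy_df)
```

Passing a function rather than the dictionary itself to map keeps the lookup as an ordinary Python dict access on each tuple.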
Another possible solution, which replaces any values not found in the dictionary with NaN:
import numpy as np
toy_df.assign(col0=[toy_dict.get(x, np.nan) for x in toy_df['col0']])
Output:
                col0
0  (oh,, you, again)
1            (xoxo,)
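The two answers only differ for values that are missing from the dictionary, which the toy data does not contain. A small sketch of that difference, assuming a made-up row ('not', 'in', 'dict') and illustrative column names kept and filled:

```python
import numpy as np
import pandas as pd

# The second row is deliberately absent from the mapping (made-up example data)
df = pd.DataFrame({'col0': [('Hello', 'Sarah!'), ('not', 'in', 'dict')]})
mapping = {('Hello', 'Sarah!'): ('oh,', 'you', 'again')}

# Fallback 1: keep the original tuple when there is no match
keep_original = [mapping.get(x, x) for x in df['col0']]

# Fallback 2: use NaN when there is no match
fill_nan = [mapping.get(x, np.nan) for x in df['col0']]

result = df.assign(kept=keep_original, filled=fill_nan)
print(result)
```

Which fallback is preferable depends on whether unmatched rows should pass through unchanged or be flagged for later handling, for example with dropna.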