How to replace a column in a DataFrame with a column of tuples

Question:

I have an integer Series and I want to transform it into a Series of tuples, using a dictionary as the mapping from each integer to its tuple.

The data is large, so speed is important.

The relationship between numbers here is not relevant to the problem (1 -> (any tuple))

import pandas as pd

int_series = pd.Series([1, 2, 3, 1, 5])

replacement_dict = {
    1: (1, 11),
    2: (2, 22),
    3: (3, 33),
    5: (5, 55)
}

# Expected output
0    (1, 11)
1    (2, 22)
2    (3, 33)
3    (1, 11)
4    (5, 55)
dtype: object

Using replace gives an output that was unexpected (to me): it appears to index into each tuple, with the index advancing from row to row.

# Using replace
tuple_series = int_series.replace(replacement_dict)

print(tuple_series)
# output
0     1
1    22
2     3
3    11
4     5
dtype: int64

I know I can do this with a list comprehension, but I was wondering whether a better solution exists.

# List comprehension solution
tuple_series = pd.Series([replacement_dict.get(value) for value in int_series.to_numpy()])

It’s not actually that important that a tuple is preserved, only that the information inside it is held in something that can be inserted into an np.ndarray (i.e. if a quicker solution exists with lists or some other object, that is also acceptable).

Asked By: bob marley


Answers:

Try using map:

int_series.map(replacement_dict)

Output:

0    (1, 11)
1    (2, 22)
2    (3, 33)
3    (1, 11)
4    (5, 55)
dtype: object

Answered By: Scott Boston
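
For reference, here is a minimal self-contained sketch of the map approach using the data from the question. The name missing_demo is illustrative only and is not part of the original post. It shows one behaviour worth knowing: according to the pandas documentation for Series.map, values that are not keys of the dictionary come back as NaN rather than raising an error.

```python
import pandas as pd

int_series = pd.Series([1, 2, 3, 1, 5])

replacement_dict = {
    1: (1, 11),
    2: (2, 22),
    3: (3, 33),
    5: (5, 55),
}

# Look up each value of the Series in the dictionary.
tuple_series = int_series.map(replacement_dict)
print(tuple_series)

# Illustrative only: 4 is not a key of replacement_dict,
# so the second element of the result is NaN.
missing_demo = pd.Series([1, 4]).map(replacement_dict)
print(missing_demo)
```

If missing keys are possible in the real data, it may be worth checking the result with isna() before using it.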

Use pd.Series.map

int_series.map(replacement_dict)

Answered By: SomeDude
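
The question also says that tuples need not be preserved as long as the values can go into an np.ndarray. Under that reading, one possible alternative (not taken from the answers above) is to build a 2-D NumPy lookup table and index into it. This is only a sketch under two assumptions: every value in the Series is a key of the dictionary, and all tuples have the same length. The names sorted_keys, keys, values, positions and pairs are illustrative. Whether it is actually faster than map would have to be measured on the real data.

```python
import numpy as np
import pandas as pd

int_series = pd.Series([1, 2, 3, 1, 5])
replacement_dict = {1: (1, 11), 2: (2, 22), 3: (3, 33), 5: (5, 55)}

# Sorted keys and a matching 2-D array with one row per key.
# Assumption: all tuples have the same length.
sorted_keys = sorted(replacement_dict)
keys = np.array(sorted_keys)
values = np.array([replacement_dict[k] for k in sorted_keys])

# Position of each Series value within the sorted keys.
# Assumption: every value in int_series is present in keys;
# searchsorted does not check this, so absent values would
# silently map to the wrong row (or raise an IndexError).
positions = np.searchsorted(keys, int_series.to_numpy())

# One row per element of int_series, one column per tuple element.
pairs = values[positions]
print(pairs)
```

The result is a plain integer array of shape (len(int_series), tuple length) instead of an object Series of tuples.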