Python 'map' function inserting NaN, possible to return original values instead?

Question:

I am passing a dictionary to the map function to recode values in a column of a pandas DataFrame. However, I noticed that if a value in the original series is not a key in the dictionary, it gets recoded to NaN. Here is a simple example:

Typing…

s = pd.Series(['one','two','three','four'])

…creates the series

0      one
1      two
2    three
3     four
dtype: object

But applying the map…

recodes = {'one':'A', 'two':'B', 'three':'C'}
s.map(recodes)

…returns the series

0      A
1      B
2      C
3    NaN
dtype: object

I would prefer that if any element in series s is not in the recodes dictionary, it remains unchanged. That is, I would prefer to return the series below (with the original four instead of NaN).

0       A
1       B
2       C
3    four
dtype: object

Is there an easy way to do this, for example an option to pass to the map function? The challenge I am having is that I can’t always anticipate all possible values that will be in the series I’m recoding – the data will be updated in the future and new values could appear.

Thanks!

Asked By: atkat12


Answers:

Use replace instead of map:

>>> import pandas as pd
>>> s = pd.Series(['one','two','three','four'])
>>> recodes = {'one':'A', 'two':'B', 'three':'C'}
>>> s.map(recodes)
0      A
1      B
2      C
3    NaN
dtype: object
>>> s.replace(recodes)
0       A
1       B
2       C
3    four
dtype: object
Answered By: DSM
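
For comparison, here is a minimal sketch of another way to get the same result while staying with map: fill the NaN values that map produces with the original values. This alternative is not part of the answer above. It assumes the original series contains no NaN and that the dictionary never maps a value to NaN on purpose, since fillna cannot tell those cases apart.

```python
import pandas as pd

s = pd.Series(['one', 'two', 'three', 'four'])
recodes = {'one': 'A', 'two': 'B', 'three': 'C'}

# replace leaves values that are not dictionary keys untouched
replaced = s.replace(recodes)

# map inserts NaN for unmapped values; fillna(s) restores the originals by index
mapped_then_filled = s.map(recodes).fillna(s)

print(replaced.tolist())
print(mapped_then_filled.tolist())
```

Both approaches keep 'four' as it is, so either should cope with new, unanticipated values appearing in the data.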

If you still want to use the map function (it can be faster than replace in some cases), you can define how missing keys are handled with a dict subclass that implements __missing__:

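import pandas as pd

class MyDict(dict):
    # Called by dict lookup when a key is absent; return the key itself
    def __missing__(self, key):
        return key

s = pd.Series(['one', 'two', 'three', 'four'])

recodes = MyDict({
    'one': 'A',
    'two': 'B',
    'three': 'C'
})

# map uses __missing__ for unmapped values instead of inserting NaN
s.map(recodes)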

0       A
1       B
2       C
3    four
dtype: object
Answered By: gio_geh
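
If defining a dict subclass feels heavy, a rough equivalent is to pass map a function that falls back to the key itself via dict.get. This is only a sketch and not part of the answer above. Calling a Python function for every element may be slower than passing a dictionary on large series.

```python
import pandas as pd

s = pd.Series(['one', 'two', 'three', 'four'])
recodes = {'one': 'A', 'two': 'B', 'three': 'C'}

# dict.get(x, x) returns the mapped value, or x itself when x is not a key
result = s.map(lambda x: recodes.get(x, x))
print(result.tolist())
```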