pandas.Series.map depending on order of items in dictionary?
Question:
I tried to map the data types of a pandas DataFrame to different names using the map method. It works for 2 out of 3 orderings of the keys in the dictionary passed to map, but with the 3rd ordering 'int64' is ignored and the result is NaN. Is the order of the dictionary keys supposed to matter, or what am I missing here?
import pandas as pd
df = pd.DataFrame({
    'x': [1, 2, 3],
    'y': [1.0, 2.2, 3.5],
    'z': ['one', 'two', 'three']
})
df.dtypes
df.dtypes.map({'int64': 'integer', 'float64': 'decimal', 'object': 'character'}) # works
df.dtypes.map({'object': 'character', 'float64': 'decimal', 'int64': 'integer'}) # works
df.dtypes.map({'float64': 'decimal', 'int64': 'integer', 'object': 'character'}) # NaN for x
Answers:
The values in the Series returned by df.dtypes are not of type str but rather dtype objects, which seems to cause non-deterministic behavior in map when it is passed a dict with keys of type str.
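To see the mismatch described above, here is a minimal sketch that inspects one element of df.dtypes. It pins the x column to int64 explicitly (an assumption made here so the result does not depend on the platform's default integer type). The hash comparison is printed without a claimed result; it is only a hint at why a dictionary-style lookup with str keys might be unreliable, not a confirmed explanation of what map does internally.

```python
import pandas as pd

# Same frame as in the question, with x pinned to int64 for reproducibility.
df = pd.DataFrame({
    'x': pd.Series([1, 2, 3], dtype='int64'),
    'y': [1.0, 2.2, 3.5],
    'z': ['one', 'two', 'three'],
})

first = df.dtypes.iloc[0]  # the dtype entry for column x

print(type(first))             # a NumPy dtype class, not str
print(isinstance(first, str))  # False
print(first == 'int64')        # True: dtype objects compare equal to their string name
# Equal objects are normally expected to hash equally; inspect whether they do here.
print(hash(first) == hash('int64'))
```

So the dict keys and the Series values can compare equal while still being different kinds of objects, which is why converting the values to str first removes the ambiguity.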
A way to clear this up is to call astype(str) on df.dtypes before mapping:
print(df.dtypes.astype(str).map({'int64': 'integer', 'float64': 'decimal', 'object': 'character'}))
print(df.dtypes.astype(str).map({'object': 'character', 'float64': 'decimal', 'int64': 'integer'}))
print(df.dtypes.astype(str).map({'float64': 'decimal', 'int64': 'integer', 'object': 'character'}))
Output:
x integer
y decimal
z character
dtype: object
x integer
y decimal
z character
dtype: object
x integer
y decimal
z character
dtype: object