Parse pandas (multi)index to datetime
Question:
I have multi-index df as follows
                x  y
id  date
abc 3/1/1994  100  7
    9/1/1994   90  8
    3/1/1995   80  9
where the dates are stored as strings.
I want to parse the date level of the index into datetimes. The following statement
df.index.levels[1] = pd.to_datetime(df.index.levels[1])
returns error:
TypeError: 'FrozenList' does not support mutable operations.
Answers:
A MultiIndex cannot be modified in place, so it has to be recreated. To do so, use get_level_values to obtain the values of each level as an Index, apply pd.to_datetime to the date level, and then rebuild the MultiIndex from the two arrays.
import pandas as pd

index = pd.MultiIndex.from_tuples([('abc', '3/1/1994'), ('abc', '9/1/1994')],
                                  names=('id', 'date'))
df = pd.DataFrame({'x': [1, 2]}, index=index)
print(df.index.get_level_values(level=1).dtype)
# object
# Rebuild the index; from_arrays picks up the level names from the arrays
df.index = pd.MultiIndex.from_arrays([index.get_level_values(level=0),
                                      pd.to_datetime(
                                          index.get_level_values(level=1))])
print(df.index.get_level_values(level=1).dtype)
# datetime64[ns]
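The same rebuild can be wrapped in a small helper that works for any number of levels. This is only a sketch: `parse_level` is a hypothetical name, not a pandas function, and it assumes every level of the index has a name.

```python
import pandas as pd


def parse_level(index: pd.MultiIndex, level: str) -> pd.MultiIndex:
    """Return a new MultiIndex with one named level parsed to datetimes.

    Hypothetical helper for illustration; not part of pandas.
    Assumes all levels are named.
    """
    arrays = [
        pd.to_datetime(index.get_level_values(name)) if name == level
        else index.get_level_values(name)
        for name in index.names
    ]
    return pd.MultiIndex.from_arrays(arrays, names=index.names)


index = pd.MultiIndex.from_tuples(
    [("abc", "3/1/1994"), ("abc", "9/1/1994")], names=("id", "date")
)
df = pd.DataFrame({"x": [1, 2]}, index=index)
df.index = parse_level(df.index, "date")
print(df.index.get_level_values("date").dtype)
```

Because the helper works on the full per-row values rather than on the unique levels, the row order is carried over unchanged.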
You cannot modify it in-place. You can use pandas.MultiIndex.map to create a new index and then assign it:
new_tuples = df.index.map(lambda x: (x[0], pd.to_datetime(x[1])))
df.index = pd.MultiIndex.from_tuples(new_tuples, names=["id", "date"])
As mentioned, you have to recreate the index:
df.index = df.index.set_levels([df.index.levels[0], pd.to_datetime(df.index.levels[1])])
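You can use MultiIndex.set_levels with the level argument. Pass the unique level values from df.index.levels, not get_level_values: the latter returns one value per row in row order, which does not line up with the index codes and can attach dates to the wrong rows.
df.index = df.index.set_levels(pd.to_datetime(df.index.levels[1]), level=1)
# or, if you want to use the level name
date_pos = df.index.names.index('date')
df.index = df.index.set_levels(pd.to_datetime(df.index.levels[date_pos]), level='date')
print(df)
                  x  y
id  date
abc 1994-03-01  100  7
    1994-09-01   90  8
    1995-03-01   80  9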
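To check that a set_levels conversion keeps every row attached to its own date, here is a minimal sketch that rebuilds the frame from the question and compares the parsed level with the original strings row by row. Building the frame with set_index is an assumption; the question does not show how df was created.

```python
import pandas as pd

# Rebuild the frame from the question (construction method is assumed).
df = pd.DataFrame(
    {
        "id": ["abc", "abc", "abc"],
        "date": ["3/1/1994", "9/1/1994", "3/1/1995"],
        "x": [100, 90, 80],
        "y": [7, 8, 9],
    }
).set_index(["id", "date"])

original_dates = df.index.get_level_values("date")

# Convert only the unique level values. The integer codes that map rows to
# levels are left untouched, so each row keeps its own date.
converted = df.copy()
converted.index = df.index.set_levels(
    pd.to_datetime(df.index.levels[1]), level=1
)

# Compare row by row against a direct parse of the original strings.
parsed_dates = converted.index.get_level_values("date")
expected = pd.to_datetime(original_dates)
rows_match = bool((parsed_dates == expected).all())
print(rows_match)
```

If the per-row values from get_level_values were passed to set_levels instead, the same check would report a mismatch for this data, because the levels are stored in sorted string order rather than in row order.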