Pandas: Extracting values from a DatetimeIndex
Question:
I have a Pandas DataFrame whose rows and columns are both labelled by a DatetimeIndex:
import pandas as pd
data = pd.DataFrame(
    {
        "PERIOD_END_DATE": pd.date_range(start="2018-01", end="2018-04", freq="M"),
        "first": list("abc"),
        "second": list("efg"),
    }
).set_index("PERIOD_END_DATE")
data.columns = pd.date_range(start="2018-01", end="2018-03", freq="M")
data
Unfortunately, I am getting a variety of errors when I try to pull out a value:
data['2018-01', '2018-02'] # InvalidIndexError: ('2018-01', '2018-02')
data['2018-01', ['2018-02']] # InvalidIndexError: ('2018-01', ['2018-02'])
data.loc['2018-01', '2018-02'] # TypeError: only integer scalar arrays can be converted to a scalar index
data.loc['2018-01', ['2018-02']] # KeyError: "None of [Index(['2018-02'], dtype='object')] are in the [columns]"
How do I extract a value from a DataFrame that uses a DatetimeIndex?
Answers:
There are two issues:
- Because both axes are a DatetimeIndex, select by row label and column label with .loc, using one of these forms:
a) data.loc[row_index_name, [column_index_name]]
or
b) data.loc[row_index_name, column_index_name]
depending on the type of output you want.
Form (a) returns a Series, while form (b) returns the scalar value itself (a string in this case).
- To get a single value, the labels cannot be truncated: a partial string such as '2018-01' is not an exact match for the label 2018-01-31, so you must specify the full date.
As such, your issue will be resolved with:
data.loc['2018-01-31', ['2018-01-31']] or data.loc['2018-01-31', '2018-01-31']
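A minimal sketch of the two lookups above, assuming a frame with the same shape as the one in the question. The index and columns are built with pd.to_datetime instead of date_range, and the labels are passed as pd.Timestamp objects instead of strings; both are choices made for this illustration, not part of the original answer.

```python
import pandas as pd

# Same layout as the question: month-end dates on both axes.
rows = pd.to_datetime(["2018-01-31", "2018-02-28", "2018-03-31"])
cols = pd.to_datetime(["2018-01-31", "2018-02-28"])
data = pd.DataFrame(
    [["a", "e"], ["b", "f"], ["c", "g"]],
    index=rows,
    columns=cols,
)

row = pd.Timestamp("2018-01-31")
col = pd.Timestamp("2018-02-28")

# Form (b): scalar row label and scalar column label give the value itself.
scalar = data.loc[row, col]

# Form (a): wrapping the column label in a list gives a one-element Series.
series = data.loc[row, [col]]

# .at is an alternative accessor for a single cell.
cell = data.at[row, col]

# A partial date string on the row axis selects every row in that month,
# so the result is a DataFrame rather than a single value.
january = data.loc["2018-01"]
```

Using Timestamp objects makes it explicit that the lookup is an exact label match rather than a month-level selection.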
In my experience, once the date was set as the index I could not slice it or extract parts of it. Extracting the month and day worked while the date was still a regular column, but not once it was the index. I ran into this before, and this was my solution:
I kept the date as a regular column, extracted the month, day and year into a separate column each, and only then set the date column as the index.
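The workflow described here can be sketched as follows. The column names year, month and day, and the single data column, are made up for the example. The parts are read through the .dt accessor while the date is still a regular column.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "PERIOD_END_DATE": pd.to_datetime(
            ["2018-01-31", "2018-02-28", "2018-03-31"]
        ),
        "first": list("abc"),
    }
)

# Extract the date parts while the date is still a regular column.
df["year"] = df["PERIOD_END_DATE"].dt.year
df["month"] = df["PERIOD_END_DATE"].dt.month
df["day"] = df["PERIOD_END_DATE"].dt.day

# Only then promote the date column to the index.
df = df.set_index("PERIOD_END_DATE")

# The helper columns can now be used to filter rows, e.g. February only.
feb = df.loc[df["month"] == 2, "first"]

# For comparison, the month can also be read from the index itself.
index_months = list(df.index.month)
```

The last line uses the DatetimeIndex.month attribute, so the helper columns are a convenience rather than a requirement.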
You are using period-style keys (YYYY-MM) against columns that hold full dates.
Converting the columns to a PeriodIndex would help in this case:
data.columns = pd.period_range(start="2018-01", end="2018-02", freq='M')
data[['2018-01']]
                2018-01
PERIOD_END_DATE
2018-01-31            a
2018-02-28            b
2018-03-31            c
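If month-level keys are what you want, one option is to convert the existing date columns rather than re-creating them. The sketch below assumes a frame shaped like the one in the question and uses DatetimeIndex.to_period together with pd.Period keys; the variable names are illustrative only.

```python
import pandas as pd

data = pd.DataFrame(
    [["a", "e"], ["b", "f"], ["c", "g"]],
    index=pd.to_datetime(["2018-01-31", "2018-02-28", "2018-03-31"]),
    columns=pd.to_datetime(["2018-01-31", "2018-02-28"]),
)

# Turn the month-end dates on the column axis into monthly periods.
data.columns = data.columns.to_period("M")

feb = pd.Period("2018-02", freq="M")

# Select a whole column by its monthly period.
column = data[feb]

# Select one cell: exact date on the rows, monthly period on the columns.
value = data.loc[pd.Timestamp("2018-01-31"), feb]
```

Passing a pd.Period object keeps the column lookup an exact match on the month, without relying on how string keys are parsed.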