Timeseries pandas changing name columns IndexError: too many indices for array
Question:
I had dates in one column, set them as a time series, and then transposed them so that each year has its own column. Some columns have the correct year but the wrong month and day. I would like to set them all to the same month and day, but when I try to rename them I get the error "IndexError: too many indices for array".
I tried these 2 methods:
df_top100.rename(columns={'2012-12-31 00:00:00': '2012-01-01 00:00:00'}, inplace=True)
#df_top100.rename(columns={'2013-12-31': '2013-01-01'}, inplace=True)
#df_top100.rename(columns={'2016-12-31': '2016-01-01'}, inplace=True)
Dataset, first line:
year 2005-01-01 2006-01-01 2007-01-01 2008-01-01 2009-01-01 2010-01-01 2011-01-01 2012-12-31 2013-12-31 2014-01-01 2015-01-01 2016-12-31
country
Australia 2.0 3.0 3.0 3.0 3.0 3.0 9.0 6.0 8.0 11.0 11.0 6.0
Last column:
print(df_top100.columns[-1])
2016-12-31 00:00:00
Answers:
Solution for a DatetimeIndex in the columns: extract the years and convert them back to datetimes:
import pandas as pd

d = pd.to_datetime(['2012-12-31 00:00:00','2013-12-31','2016-12-31'])
df_top100 = pd.DataFrame(columns=d)
df_top100.columns = pd.to_datetime(df_top100.columns.year, format='%Y')
print (df_top100)
Empty DataFrame
Columns: [2012-01-01 00:00:00, 2013-01-01 00:00:00, 2016-01-01 00:00:00]
Index: []
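To check that the values stay attached to the right columns, here is a minimal sketch of the same idea on a frame that actually holds data. The frame `df` and its four columns are made up for illustration from the Australia row in the question. The years are cast to strings before parsing, which is an extra precaution of mine rather than part of the answer above.

```python
import pandas as pd

# Hypothetical sample: four of the question's columns with the Australia values.
cols = pd.to_datetime(['2011-01-01', '2012-12-31', '2013-12-31', '2014-01-01'])
df = pd.DataFrame([[9.0, 6.0, 8.0, 11.0]], index=['Australia'], columns=cols)

# Rebuild every column label as January 1 of its own year.
# Parsing only the year with format='%Y' defaults month and day to 1.
df.columns = pd.to_datetime(df.columns.year.astype(str), format='%Y')

print(df.columns)
print(df)
```

Only the labels change; the data under each label is untouched.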
For strings, use DatetimeIndex.strftime:
df_top100.columns = df_top100.columns.strftime('%Y-01-01')
print (df_top100)
Empty DataFrame
Columns: [2012-01-01, 2013-01-01, 2016-01-01]
Index: []
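Note that after strftime the labels are plain strings, so columns are then selected by string rather than by date. A small sketch, again with a made-up frame `df` built from values in the question:

```python
import pandas as pd

# Hypothetical sample: the three end-of-year columns from the question.
cols = pd.to_datetime(['2012-12-31', '2013-12-31', '2016-12-31'])
df = pd.DataFrame([[6.0, 8.0, 6.0]], index=['Australia'], columns=cols)

# strftime returns an Index of strings, not a DatetimeIndex.
df.columns = df.columns.strftime('%Y-01-01')

print(df.columns.tolist())
print(df.loc['Australia', '2013-01-01'])
```

If you still need date-based operations on the columns later, keeping a DatetimeIndex is probably the better choice.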
If you provide a small example that we can copy, it is much easier to help.
Looking at your code, I guess it does not work because you are trying to reference a datetime object with a string. Try this:
from datetime import datetime
df_top100.rename(columns={datetime(2012, 12, 31): '2012-01-01 00:00:00'}, inplace=True)
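One thing to be aware of: renaming to a string leaves that one label as a string among datetime labels. A possible variation, sketched under the assumption that you want all labels to remain timestamps, is to build the mapping from the columns themselves. The frame `df` and the name `mapping` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: three of the question's columns with the Australia values.
cols = pd.to_datetime(['2011-01-01', '2012-12-31', '2013-12-31'])
df = pd.DataFrame([[9.0, 6.0, 8.0]], index=['Australia'], columns=cols)

# Map every label that is not January 1 to January 1 of the same year.
# Keys and values are both Timestamps, so no label turns into a string.
mapping = {c: pd.Timestamp(year=c.year, month=1, day=1)
           for c in df.columns if (c.month, c.day) != (1, 1)}
df = df.rename(columns=mapping)

print(df.columns.tolist())
```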