Pandas resample function reports TypeError: Only valid with DatetimeIndex
1. Phenomenon
After reading data from a MySQL database into a DataFrame, calling the resample function raises an error:
week_df = df.resample("W").first()
Resampling by week fails with:
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'
Checking the type of the index values shows that they are datetime.date objects:
print(type(df.index[0]))
<class 'datetime.date'>
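The failure can be reproduced without a database. The following is a minimal sketch, assuming the database driver hands back the date column as datetime.date objects (as the output above suggests); the frame df_dates and its close column are made up for illustration.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for the frame loaded from MySQL: the index holds
# plain datetime.date objects, so pandas keeps it as an object-dtype Index.
dates = [datetime.date(2022, 1, 1) + datetime.timedelta(days=i) for i in range(14)]
df_dates = pd.DataFrame({"close": range(14)}, index=dates)

print(type(df_dates.index))   # a generic Index, not a DatetimeIndex
print(df_dates.index.dtype)   # object

try:
    df_dates.resample("W").first()
    raised = False
except TypeError as exc:
    # Same message as in the report above
    raised = True
    print(exc)
```

Looking at type(df.index) and df.index.dtype, rather than only at type(df.index[0]), shows directly whether resample will accept the index.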
In a small test program whose index is built with pd.date_range, the resample function works normally:
import pandas as pd
import numpy as np
dayseries = pd.date_range('1/1/2022',periods=30,freq='D')
ts1 = pd.Series(np.random.randn(len(dayseries)),index=dayseries)
ts2 = pd.Series(np.random.randn(len(dayseries)),index=dayseries)
ts3 = pd.Series(np.random.randn(len(dayseries)),index=dayseries)
ts4 = pd.Series(np.random.randn(len(dayseries)),index=dayseries)
ts5 = pd.Series(np.random.randn(len(dayseries)),index=dayseries)
df = pd.DataFrame({'open': ts1, 'high': ts2, 'low': ts3, 'close': ts4, 'volume': ts5}, index=dayseries)
df.index.name='dayseries'
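# Aggregation applied to each bin: a function name or array function such as 'mean', 'ohlc' or np.max.
# (Older pandas versions took this as how=..., defaulting to 'mean'; current versions call it as a method on the resampler.)
# Other commonly used aggregations: 'first', 'last', 'median', 'max', 'min'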
df_week = pd.DataFrame({'open':df['open'].resample('W').first(),
'close':df['close'].resample('W').last(),
'high':df['high'].resample('W').max(),
'low':df['low'].resample('W').min(),
'volume':df['volume'].resample('W').sum()})
#df['open'].resample('W').first()
print(type(df_week.index[0])) # <class 'pandas._libs.tslibs.timestamps.Timestamp'>
df_week
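The five per-column resample calls above can also be expressed as one agg call on a single resampler. This is a sketch of that alternative, not the original test program: the seeded random data and the names df_demo, ohlcv_rules and df_week_demo are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data: 30 daily rows on a DatetimeIndex, as in the test program
rng = np.random.default_rng(0)
days = pd.date_range("2022-01-01", periods=30, freq="D")
df_demo = pd.DataFrame(
    rng.standard_normal((30, 5)),
    index=days,
    columns=["open", "high", "low", "close", "volume"],
)

# One aggregation rule per column, applied to each weekly bin
ohlcv_rules = {
    "open": "first",
    "high": "max",
    "low": "min",
    "close": "last",
    "volume": "sum",
}
df_week_demo = df_demo.resample("W").agg(ohlcv_rules)

print(df_week_demo)
print(type(df_week_demo.index[0]))
```

Resampling once and aggregating with a dict keeps the column-to-rule mapping in one place, which is easier to adjust than five separate calls.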
Resampling works here because the index is a DatetimeIndex, whose elements are timestamps:
print(type(df_week.index[0]))
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
2. Reason and Solution
Reason: The test shows that resample requires a DatetimeIndex (or a TimedeltaIndex or PeriodIndex). An index made of datetime.date objects is a plain object-dtype Index, so pandas raises the TypeError.
At first I tried to change the data type of the database field so that it would match the DataFrame's timestamp type, which turned out to be a detour.
Solution: Convert the data type of the index directly with pd.to_datetime:
df.index = pd.to_datetime(df.index)
print(type(df.index[0]))
The elements of the index are now of type <class 'pandas._libs.tslibs.timestamps.Timestamp'>, so resample accepts it.
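Putting the fix together end to end, here is a minimal sketch that starts from a datetime.date index, converts it, and then resamples by week. The frame df_fix and its values are invented for illustration; only pd.to_datetime and resample come from the text above.

```python
import datetime

import pandas as pd

# Hypothetical frame with a datetime.date index (2022-01-03 is a Monday)
dates = [datetime.date(2022, 1, 3) + datetime.timedelta(days=i) for i in range(14)]
df_fix = pd.DataFrame({"open": range(14), "volume": [1] * 14}, index=dates)

# Convert the object-dtype Index into a DatetimeIndex
df_fix.index = pd.to_datetime(df_fix.index)
print(type(df_fix.index))

# Weekly bins now work; "W" closes each bin on Sunday
weekly = df_fix.resample("W").agg({"open": "first", "volume": "sum"})
print(weekly)
```

With two full Monday-to-Sunday weeks of data, the result has two rows, labelled with the Sundays 2022-01-09 and 2022-01-16.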