Pandas resample function reports TypeError: Only valid with DatetimeIndex

1. Phenomenon

After reading data from a MySQL database into a DataFrame, calling the resample function raises an error:

week_df = df.resample("W").first()

The error raised when resampling by week:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

Checking the type of the index elements shows that they are datetime.date objects:

print(type(df.index[0])) 
<class 'datetime.date'>
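
The error can probably be reproduced without a database. The following is a minimal sketch, assuming the rows come back with datetime.date values in the index; the column name close and the 14-day range are made up for illustration.

```python
import datetime as dt

import pandas as pd

# Hypothetical stand-in for the rows read from MySQL:
# the index holds plain datetime.date objects.
dates = [dt.date(2022, 1, 1) + dt.timedelta(days=i) for i in range(14)]
df = pd.DataFrame({"close": range(14)}, index=dates)

print(type(df.index))     # a generic Index, not a DatetimeIndex
print(type(df.index[0]))  # datetime.date

# resample refuses an index that is not datetime-like.
try:
    df.resample("W").first()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Checking type(df.index) rather than only type(df.index[0]) makes the problem easier to spot, because the error message names the index class.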

I then wrote a test program, in which the resample function works normally:

import pandas as pd
import numpy as np
dayseries = pd.date_range('1/1/2022',periods=30,freq='D')
ts1 = pd.Series(np.random.randn(len(dayseries)),index=dayseries)
ts2 = pd.Series(np.random.randn(len(dayseries)),index=dayseries)
ts3 = pd.Series(np.random.randn(len(dayseries)),index=dayseries)
ts4 = pd.Series(np.random.randn(len(dayseries)),index=dayseries)
ts5 = pd.Series(np.random.randn(len(dayseries)),index=dayseries)


df = pd.DataFrame({'open':ts1,'high':ts2,'low':ts3,'close':ts4, 'volume':ts5} ,index = dayseries)
df.index.name='dayseries'

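# Function name or array function used to produce the aggregated value, e.g. 'mean', 'ohlc', np.max
# (the default was 'mean' in old pandas versions that took a how argument).
# Other commonly used values are: 'first', 'last', 'median', 'max', 'min'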

df_week = pd.DataFrame({'open':df['open'].resample('W').first(),
                        'close':df['close'].resample('W').last(),
                        'high':df['high'].resample('W').max(),
                        'low':df['low'].resample('W').min(),
                        'volume':df['volume'].resample('W').sum()})


#df['open'].resample('W').first()
print(type(df_week.index[0])) # <class 'pandas._libs.tslibs.timestamps.Timestamp'>
df_week

resample works on the index, and in the test program the index elements are timestamps:

print(type(df_week.index[0])) 
 <class 'pandas._libs.tslibs.timestamps.Timestamp'>
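
The weekly bars in the test program could also be built with a single agg call instead of one resample per column. This is only a sketch of that alternative, not the original approach; the seeded random generator and the variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# 30 daily rows with a proper DatetimeIndex, as in the test program.
days = pd.date_range("2022-01-01", periods=30, freq="D")
rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
columns = ["open", "high", "low", "close", "volume"]
df = pd.DataFrame(rng.standard_normal((30, 5)), index=days, columns=columns)

# One aggregation rule per column, applied to each weekly bin.
df_week = df.resample("W").agg(
    {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}
)

print(type(df_week.index))
print(df_week)
```

The dictionary maps each column to its aggregation, so the resampling rule "W" only has to be stated once.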

2. Reason and solution

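Reason: The test shows that resample needs a datetime-like index (DatetimeIndex, TimedeltaIndex or PeriodIndex), whose elements in the test program are timestamps. The index read from MySQL holds datetime.date objects, so pandas treats it as a plain Index and resample rejects it with the TypeError above.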

Solution: Convert the data type of the index directly with pd.to_datetime.
At first I tried to change the data type of the database field so that it would match the DataFrame's timestamp type, which turned out to be a detour.

df.index = pd.to_datetime(df.index)
print(type(df.index[0]))

After the conversion, the index elements are of type <class 'pandas._libs.tslibs.timestamps.Timestamp'>.
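
Putting the fix together, a minimal sketch might look like the following. It assumes an index of datetime.date objects like the one returned from the database; the volume column and its constant values are invented so that the weekly totals are easy to check.

```python
import datetime as dt

import pandas as pd

# Hypothetical data: 14 daily rows indexed by datetime.date, starting on a Saturday.
dates = [dt.date(2022, 1, 1) + dt.timedelta(days=i) for i in range(14)]
df = pd.DataFrame({"volume": [1] * 14}, index=dates)

# Convert the index in place; it becomes a DatetimeIndex.
df.index = pd.to_datetime(df.index)
print(type(df.index))
print(type(df.index[0]))

# Weekly resampling now works; each bin ends on a Sunday.
week_df = df.resample("W").sum()
print(week_df)
```

Because the conversion happens on the DataFrame side, the database schema can stay as it is.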

Origin blog.csdn.net/qq_39065491/article/details/130870196