Happy Learning Starts with Translation: How to Convert a Time Series to a Supervised Learning Problem in Python, Part 1

Pandas shift() Function

A key function for transforming time series data into a supervised learning problem is the Pandas shift() function.

Given a DataFrame, the shift() function can be used to create copies of columns that are pushed forward (rows of NaN values added to the front) or pulled back (rows of NaN values added to the end).

This is the behavior required to create columns of lag observations, as well as columns of forecast observations, for a time series dataset in a supervised learning format.

Let's look at some examples of the shift() function in action.

We can define a mock time series dataset as a sequence of 10 numbers, in this case a single column in a DataFrame, as follows:
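A minimal sketch of such a dataset, assuming the integers 0 through 9 as the observations and `t` (the current time step) as the column name; both choices are illustrative assumptions:

```python
import pandas as pd

# Mock univariate series: the integers 0..9 in a single column.
# The column name 't' (current time step) is an assumed label.
df = pd.DataFrame({'t': list(range(10))})
print(df)
```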

Running the example prints the time series data along with the row index of each observation.

We can shift all the observations down by one time step by inserting one new row at the top. Because the new row has no data, we can use NaN to represent "no data".

The shift() function can do this for us, and we can insert the shifted column next to our original series.
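One way this could look, assuming the same mock series of 0 through 9 and the illustrative column names `t` and `t-1`:

```python
import pandas as pd

# Same mock series as before; the column names are assumed labels.
df = pd.DataFrame({'t': list(range(10))})

# shift(1) pushes every observation down one row.
# The vacated first row is filled with NaN.
df['t-1'] = df['t'].shift(1)
print(df)
```

Because NaN is a floating-point value, the shifted column is stored as floats even though the original column holds integers.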

Running the example gives us two columns in the dataset: the first holds the original observations and the second is the new shifted column.

We can see that shifting the series forward one time step gives us a primitive supervised learning problem, although with X and y in the wrong order. Ignore the column of row labels. The first row would have to be discarded because of the NaN value. The second row shows the input value of 0.0 in the second column (input or X) and the value of 1 in the first column (output or y).

We can see that by repeating this process with shifts of 2, 3, and more, we could create long input sequences (X) that can be used to forecast an output value (y).
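As a rough sketch of that idea, the snippet below builds three lag columns for the same mock series. The variable `n_lags`, the column labels, and the decision to drop incomplete rows with dropna() are assumptions made for illustration:

```python
import pandas as pd

series = pd.DataFrame({'t': list(range(10))})
n_lags = 3  # hypothetical number of lag steps to use as input

# One shifted copy per lag, oldest first, then the current value as y.
cols = {f't-{i}': series['t'].shift(i) for i in range(n_lags, 0, -1)}
cols['t'] = series['t']

# Rows that contain NaN have an incomplete input window, so drop them.
framed = pd.DataFrame(cols).dropna()
print(framed)
```

Each remaining row pairs a three-step input window (t-3, t-2, t-1) with the value to predict (t).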

The shift() function can also accept a negative integer value. This has the effect of pulling the observations up by inserting new rows at the end. Below is an example:
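A minimal sketch of a negative shift, again assuming the mock series of 0 through 9 and the illustrative column names `t` and `t+1`:

```python
import pandas as pd

df = pd.DataFrame({'t': list(range(10))})

# shift(-1) pulls every observation up one row.
# The vacated last row is filled with NaN.
df['t+1'] = df['t'].shift(-1)
print(df)
```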

Running the example shows a new column with a NaN value as its last value.

We can see that the first column can be taken as an input (X) and the second as an output value (y). That is, the input value of 0 can be used to forecast the output value of 1.

Technically, in time series forecasting terminology, the current time (t) and future times (t+1, ..., t+n) are forecast times, and past observations (t-1, ..., t-n) are used to make forecasts.

We can see how positive and negative shifts can be used to create a new DataFrame from a time series, with sequences of input and output patterns for a supervised learning problem.

This permits not only classical X -> y prediction, but also X -> Y, where both input and output can be sequences.
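To illustrate the X -> Y case, the sketch below combines positive and negative shifts on the mock series so that two past values are used to predict the current and the next value. The window sizes and column labels are assumptions chosen for the example:

```python
import pandas as pd

series = pd.DataFrame({'t': list(range(10))})

# Positive shifts create the input lags; the negative shift creates
# the extra forecast step. Rows with any NaN are dropped.
framed = pd.DataFrame({
    't-2': series['t'].shift(2),
    't-1': series['t'].shift(1),
    't': series['t'],
    't+1': series['t'].shift(-1),
}).dropna()

X = framed[['t-2', 't-1']]  # input sequence
Y = framed[['t', 't+1']]    # output sequence
print(framed)
```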

Further, the shift() function also works on so-called multivariate time series problems. That is, instead of having one set of observations for a time series, we have several (e.g. temperature and pressure). All variates in the time series can be shifted forward or backward to create multivariate input and output sequences. We will explore this more later in the tutorial.
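As a small preview, calling shift() on a whole DataFrame shifts every column at once. The readings below are made-up values for illustration only, and the column suffixes are assumed labels:

```python
import pandas as pd

# Hypothetical temperature and pressure readings, invented for this sketch.
data = pd.DataFrame({
    'temperature': [20.0, 21.0, 19.5, 22.0, 23.5],
    'pressure': [1012.0, 1011.0, 1013.5, 1010.0, 1009.0],
})

# DataFrame.shift moves all columns by the same number of rows.
lagged = data.shift(1).add_suffix('(t-1)')
current = data.add_suffix('(t)')

# Place the lagged inputs next to the current values and drop the NaN row.
framed = pd.concat([lagged, current], axis=1).dropna()
print(framed)
```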

Reposted from blog.csdn.net/dreamscape9999/article/details/80862255