Prachi :
I have a pandas DataFrame sorted on column 'DT', like this:
S DT
100 2000-12-12 05:00:00
100 2000-12-12 05:00:50
89 2000-12-12 05:01:20
89 2000-12-12 05:02:00
89 2000-12-12 05:02:35
98 2000-12-12 05:03:15
98 2000-12-12 05:03:50
98 2000-12-12 05:04:28
98 2000-12-12 05:05:05
112 2000-12-12 05:05:47
112 2000-12-12 05:06:15
112 2000-12-12 05:07:00
How can I find the previous value for any given value in column 'S' from this data?
For example, for S = 112 the previous value should be 98, for 98 it should be 89, and so on. I would like to store the previous value for any given 'S' in a separate variable that I can access later in my code. Any help will be deeply appreciated, as I am new to the world of coding.
jezrael :
The idea is to use Series.shift, then use Series.where to replace every row except the first of each consecutive group with a missing value, and finally forward fill the missing values.
The solution also works if the same value appears in more than one group, for example if 89 were changed to 112.
df['prev'] = df['S'].shift().where(df['S'].ne(df['S'].shift())).ffill()
print(df)
      S                   DT   prev
0   100  2000-12-12 05:00:00    NaN
1   100  2000-12-12 05:00:50    NaN
2    89  2000-12-12 05:01:20  100.0
3    89  2000-12-12 05:02:00  100.0
4    89  2000-12-12 05:02:35  100.0
5    98  2000-12-12 05:03:15   89.0
6    98  2000-12-12 05:03:50   89.0
7    98  2000-12-12 05:04:28   89.0
8    98  2000-12-12 05:05:05   89.0
9   112  2000-12-12 05:05:47   98.0
10  112  2000-12-12 05:06:15   98.0
11  112  2000-12-12 05:07:00   98.0
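To see why the one-liner works, it may help to split it into its intermediate steps. The sketch below rebuilds only the 'S' column from the question (the 'DT' column is omitted because it plays no part in the calculation); the variable names shifted, is_group_start, at_starts and prev are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data: only the 'S' column from the question, already in 'DT' order.
df = pd.DataFrame({"S": [100, 100, 89, 89, 89, 98, 98, 98, 98, 112, 112, 112]})

# Step 1: the value from the row directly above (NaN for the first row).
shifted = df["S"].shift()

# Step 2: True on the first row of each consecutive run of equal values.
is_group_start = df["S"].ne(shifted)

# Step 3: keep the shifted value only at run starts, NaN everywhere else.
at_starts = shifted.where(is_group_start)

# Step 4: forward fill so every row of a run carries the previous run's value.
prev = at_starts.ffill()

print(pd.DataFrame({"S": df["S"], "shifted": shifted,
                    "is_group_start": is_group_start, "prev": prev}))
```

The first run has nothing before it, so its rows stay NaN, which is also why the prev column becomes float.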
If you need only the previous values in a new DataFrame:
df1 = df.assign(prev=df['S'].shift()).loc[df['S'].ne(df['S'].shift()), ['S','prev']]
print(df1)
     S   prev
0  100    NaN
2   89  100.0
5   98   89.0
9  112   98.0
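Since the question asks for a variable that can be queried later, one possible follow-up is to turn df1 into a plain dictionary. This is a minimal sketch, not part of the original answer: the name prev_of is hypothetical, and it assumes each value of 'S' forms only one consecutive run. If a value appears in several separate runs, the dictionary keeps only the last one.

```python
import pandas as pd

# Sample data: only the 'S' column from the question, already in 'DT' order.
df = pd.DataFrame({"S": [100, 100, 89, 89, 89, 98, 98, 98, 98, 112, 112, 112]})

# One row per consecutive run, with the value of the run before it.
changed = df["S"].ne(df["S"].shift())
df1 = df.assign(prev=df["S"].shift()).loc[changed, ["S", "prev"]]

# Hypothetical lookup table: S value -> previous S value.
# The first run has no predecessor, so it is dropped before converting to int.
prev_of = df1.dropna().set_index("S")["prev"].astype(int).to_dict()

print(prev_of[112])      # 98
print(prev_of[98])       # 89
print(prev_of.get(100))  # None, because 100 has no previous value
```

Using .get for the lookup avoids a KeyError for the first value, which has nothing before it.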