Pandas Dataframes: comparing values of two adjacent rows and adding a column

Question:

I have a pandas DataFrame in which I need to compare the values of two adjacent rows in a particular column. In a new column, the first row of each pair should get 0 if the two values are equal, 1 if the value in the second row is greater than the first, or -1 if it is smaller. For example, applying this operation to the following DataFrame
dataframe before the operation

   column1
0        2
1        2
2        4
3        4
4        5
5        3
6        2
7        1
8       55
9        3

should give the following output

dataframe after the operation

   column1  column2 
0        2        0
1        2        1
2        4        0
3        4        1
4        5       -1
5        3       -1
6        2       -1
7        1        1
8       55       -1
9        3        0
Asked By: user1851006

Answers:

What we are looking for is the sign of the change. We break this up into 3 steps:

  1. diff takes the difference between each row and the prior row. This captures the change.
  2. x / abs(x) is a common way to capture the sign of something. We use it here when we divide d by d.abs().
  3. Finally, we are left with NaN values: one in the first position due to diff, and one wherever the difference is zero, because there we divide zero by zero. We can fill them in with zero.

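import pandas as pd

df = pd.DataFrame(dict(column1=[2, 2, 4, 4, 5, 3, 2, 1, 55, 3]))
d = df.column1.diff()        # change relative to the previous row
d.div(d.abs()).fillna(0)     # sign of the change; NaN (first row, 0/0) becomes 0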

0    0.0
1    0.0
2    1.0
3    0.0
4    1.0
5   -1.0
6   -1.0
7   -1.0
8    1.0
9   -1.0
Name: column1, dtype: float64
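
To see where the NaN values in step 3 come from, the three steps can be run one at a time. This is only a sketch on the same sample data; the names ratio, nan_rows and result are introduced here for illustration and are not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({"column1": [2, 2, 4, 4, 5, 3, 2, 1, 55, 3]})

d = df["column1"].diff()      # step 1: NaN in row 0, then the row-to-row change
ratio = d.div(d.abs())        # step 2: +1.0 or -1.0, but 0/0 yields NaN
nan_rows = ratio[ratio.isna()].index.tolist()  # rows that step 3 has to fill
result = ratio.fillna(0)      # step 3: replace those NaN values with 0

print(nan_rows)
print(result.tolist())
```

Row 0 is NaN because diff has no prior row to compare against; the remaining NaN rows are the ones where the value did not change.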
Answered By: piRSquared

You can use the Series.diff() method and the np.sign() function:

In [27]: df['column2'] = np.sign(df.column1.diff().fillna(0))

In [28]: df
Out[28]:
   column1  column2
0        2      0.0
1        2      0.0
2        4      1.0
3        4      0.0
4        5      1.0
5        3     -1.0
6        2     -1.0
7        1     -1.0
8       55      1.0
9        3     -1.0

To get your desired DataFrame instead (which contradicts your description), shift the result up by one row:

In [30]: df['column3'] = np.sign(df.column1.diff().fillna(0)).shift(-1).fillna(0)

In [31]: df
Out[31]:
   column1  column2  column3
0        2      0.0      0.0
1        2      0.0      1.0
2        4      1.0      0.0
3        4      0.0      1.0
4        5      1.0     -1.0
5        3     -1.0     -1.0
6        2     -1.0     -1.0
7        1     -1.0      1.0
8       55      1.0     -1.0
9        3     -1.0      0.0
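
If you want the column as integers, exactly as in the question's expected output, one possible variant is to compare each row with the next one directly and cast at the end. This is a sketch under the assumption that column1 contains no missing values; the name step is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"column1": [2, 2, 4, 4, 5, 3, 2, 1, 55, 3]})

# Change from each row to the NEXT row; the last row has no successor (NaN).
step = df["column1"].shift(-1) - df["column1"]

# Sign of that change, with the trailing NaN set to 0, cast to integers.
df["column2"] = np.sign(step).fillna(0).astype(int)

print(df["column2"].tolist())
```

The cast is safe here only because fillna(0) has already removed the single NaN; astype(int) would raise on a column that still contains NaN.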