How to create a label column based on index number (odd/even) in PySpark

Question:

Here’s my input:

    index   date_id  year  month  day  hour  minute
0  156454  20200801  2021     12   31    12      38
1  156454  20200801  2021     12   31    12      39

I want to add a label column that is ‘poi1’ for odd rows and ‘poi2’ for even rows, counting rows from 1 (so index 0 gets ‘poi1’).

Here’s my desired output:

    index   date_id  year  month  day  hour  minute  label
0  156454  20200801  2021     12   31    12      38   poi1
1  156454  20200801  2021     12   31    12      39   poi2

The pandas code looks like this:

    df_movmnt_2["label"] = np.where(((df_movmnt_2.index)+1)%2 != 0, "poi1", "poi2")

Asked By: Nabih Bawazir


Answers:

Use when().otherwise(). To match the pandas rule (index + 1) % 2 != 0, index 0 must get ‘poi1’:

    from pyspark.sql.functions import col, when

    df.withColumn('label', when((col('index') + 1) % 2 != 0, 'poi1').otherwise('poi2')).show()

+-----+-------+--------+-----+---+----+------+---+-----+
|index|date_id|    year|month|day|hour|minute| _8|label|
+-----+-------+--------+-----+---+----+------+---+-----+
|    0| 156454|20200801| 2021| 12|  31|    12| 38| poi1|
|    1| 156454|20200801| 2021| 12|  31|    12| 39| poi2|
+-----+-------+--------+-----+---+----+------+---+-----+
Answered By: wwnde
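
A note for DataFrames that have no index column: unlike pandas, a Spark DataFrame has no implicit row index, so the when().otherwise() approach above only works if index exists as a real column. Here is a minimal sketch of one possible workaround, assuming the rows can be ordered by some column (minute is used here purely for illustration). It numbers the rows with row_number() over a window and labels them by parity. The helper names parity_label and add_label are hypothetical and not part of PySpark.

```python
def parity_label(row_number: int) -> str:
    """Plain-Python reference of the rule: 1-based odd rows -> 'poi1', even -> 'poi2'."""
    return "poi1" if row_number % 2 != 0 else "poi2"


def add_label(df, order_col="minute"):
    """Return df with a 'label' column derived from the 1-based row position.

    order_col is an assumption: Spark rows have no inherent order, so some
    column has to define which row counts as first.
    """
    # Imported inside the function so parity_label can be used without Spark installed.
    from pyspark.sql import Window, functions as F

    # No partitionBy here, so Spark moves all rows into a single partition to number them.
    w = Window.orderBy(F.col(order_col))
    return (
        df.withColumn("row_num", F.row_number().over(w))  # row_number() starts at 1
          .withColumn(
              "label",
              F.when(F.col("row_num") % 2 != 0, "poi1").otherwise("poi2"),
          )
          .drop("row_num")
    )
```

Usage would look like add_label(df, order_col="minute").show(). Because row_number() starts at 1, odd row numbers map directly to ‘poi1’ without the + 1 needed for a 0-based index. The unpartitioned window may be slow on large data, so treat this as a starting point rather than a tuned solution.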