When clause in pyspark gives an error "name 'when' is not defined"
Question:
The code below fails with the error message: name 'when' is not defined.
voter_df = voter_df.withColumn('random_val',
                               when(voter_df.TITLE == 'Councilmember', F.rand())
                               .when(voter_df.TITLE == 'Mayor', 2)
                               .otherwise(0))
The task: add a column to voter_df named random_val holding the result of F.rand() for any voter with the title Councilmember. Set random_val to 2 for the Mayor, and to 0 for any other title.
Answers:
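The first when is a standalone function from pyspark.sql.functions, so the bare name must be imported (or qualified as F.when) before it can be called. The second .when is different: it is a method called on the Column object that the first when returns, which is why it needs no import.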
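Solution:
Either qualify the first call with the same module alias already used for F.rand() (assuming F refers to pyspark.sql.functions): ....'random_val', F.when(....
Or import the function directly so that the bare name resolves:
from pyspark.sql.functions import when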