AssertionError: col should be Column

Question:

How do I create a new column in PySpark and fill it with today's date?

This is what I tried:

import datetime
now = datetime.datetime.now()
df = df.withColumn("date", str(now)[:10])

I get this error:

AssertionError: col should be Column

Asked By: Markus

Answers:

How do I create a new column in PySpark and fill it with today's date?

There is already a function for that:

from pyspark.sql.functions import current_date

df.withColumn("date", current_date().cast("string"))

AssertionError: col should be Column

Use a literal. withColumn expects a Column, but str(now)[:10] is a plain Python string, so wrap it with lit():

from pyspark.sql.functions import lit

df.withColumn("date", lit(str(now)[:10]))
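
If you would rather not slice the string, here is a minimal sketch of building the same value more explicitly. The helper name today_string is hypothetical (it is not part of PySpark or the original snippet), and the sketch assumes the 'YYYY-MM-DD' format is what you want in the column:

```python
import datetime

def today_string(now=None):
    """Return the calendar date as 'YYYY-MM-DD' (hypothetical helper)."""
    if now is None:
        now = datetime.datetime.now()
    # date() drops the time of day; isoformat() renders YYYY-MM-DD
    return now.date().isoformat()

# For a fixed datetime, this matches slicing the string form as in the question
sample = datetime.datetime(2024, 3, 5, 14, 30, 0)
assert today_string(sample) == str(sample)[:10] == "2024-03-05"

# It would then be wrapped in lit() exactly like the sliced string:
#   df.withColumn("date", lit(today_string()))
```

The result is still a plain Python string, so it needs lit() just as before.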
Answered By: Alper t. Turker
import datetime
from pyspark.sql.functions import lit

# Assumes an active SparkSession named `spark`, as in the pyspark shell.
users_list = [(1, 'Scott'), (2, 'Donald'), (3, 'Mickey'), (4, 'Elvis')]
df = spark.createDataFrame(users_list, 'user_id int, user_first_name string')

# lit() wraps the Python datetime in a Column; the new column is a timestamp, not a string.
now = datetime.datetime.now()
df.select("user_id", "user_first_name").withColumn("date", lit(now)).show()
Answered By: Azmat Siddique