Convert columns of a PySpark DataFrame to lowercase
Question:
I have a DataFrame in PySpark which has columns in uppercase, like ID, COMPANY, and so on.
I want to rename these columns to id, company, and so on. Basically, convert all the column names to lowercase or uppercase depending on the requirement.
I want to do it in such a way that the data types of the columns remain the same.
How can we do that?
Answers:
Use the columns field of the DataFrame:
df = ...  # load your DataFrame here
for col in df.columns:
    df = df.withColumnRenamed(col, col.lower())
Or, as @zero323 suggested:
df.toDF(*[c.lower() for c in df.columns])
You could also use select with alias (make sure pyspark.sql.functions is imported as f):
from pyspark.sql import functions as f
df.select([f.col(c).alias(c.upper()) for c in df.columns])
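All of these approaches only rename the columns, so the data types are left untouched. As a minimal self-contained sketch (assuming a local SparkSession and some dummy data), you can confirm this with printSchema:
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").getOrCreate()

# dummy data with uppercase column names
df = spark.createDataFrame([(1, "Acme"), (2, "Initech")], ["ID", "COMPANY"])
df.printSchema()   # ID: long, COMPANY: string

# rename every column to lowercase
df_lower = df.toDF(*[c.lower() for c in df.columns])
df_lower.printSchema()   # id: long, company: string -- same types, new names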
Please try the following code, where df is your PySpark DataFrame (in this case I created my DataFrame by reading from a table):
df = spark.sql("select * from <your table name>")
new_column_name_list = list(map(lambda x: x.lower(), df.columns))
df = df.toDF(*new_column_name_list)
display(df)  # display() works in Databricks notebooks; use df.show() in plain PySpark
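Since the question mentions converting to lowercase or uppercase depending on the requirement, one way to wrap this up is a small helper; the function name rename_columns below is just illustrative, not part of any PySpark API:
def rename_columns(df, case="lower"):
    # pick the string method based on the requested case
    convert = str.lower if case == "lower" else str.upper
    return df.toDF(*[convert(c) for c in df.columns])

df_lower = rename_columns(df, case="lower")   # id, company, ...
df_upper = rename_columns(df, case="upper")   # ID, COMPANY, ...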