Convert columns of a PySpark DataFrame to lowercase
Question:
I have a DataFrame in PySpark which has columns in uppercase, like ID, COMPANY, and so on.
I want to rename these columns to id, company, and so on. Basically, convert all the column names to lowercase or uppercase depending on the requirement.
I want to do it in such a way that the data types of the columns remain the same.
How can we do that?
Answers:
Use the columns field of the DataFrame:
df = ...  # load your DataFrame here
for col in df.columns:
    df = df.withColumnRenamed(col, col.lower())
Or, as @zero323 suggested:
df.toDF(*[c.lower() for c in df.columns])
You could also use select with alias (make sure pyspark.sql.functions is imported as f):
from pyspark.sql import functions as f
df.select([f.col(c).alias(c.upper()) for c in df.columns])
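All of these approaches only rename the columns, so the data types are left untouched. As a minimal self-contained sketch (assuming a local SparkSession and some dummy data), you can confirm this with printSchema:
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").getOrCreate()

# dummy data with uppercase column names
df = spark.createDataFrame([(1, "Acme"), (2, "Initech")], ["ID", "COMPANY"])
df.printSchema()   # ID: long, COMPANY: string

# rename every column to lowercase
df_lower = df.toDF(*[c.lower() for c in df.columns])
df_lower.printSchema()   # id: long, company: string -- same types, new names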
Please try the following code, where df is your PySpark DataFrame (in this case I created my DataFrame by reading from a table):
df = spark.sql("select * from <your table name>")
new_column_name_list = list(map(lambda x: x.lower(), df.columns))
df = df.toDF(*new_column_name_list)
display(df)  # display() works in Databricks notebooks; use df.show() in plain PySpark
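Since the question mentions converting to lowercase or uppercase depending on the requirement, one way to wrap this up is a small helper; the function name rename_columns below is just illustrative, not part of any PySpark API:
def rename_columns(df, case="lower"):
    # pick the string method based on the requested case
    convert = str.lower if case == "lower" else str.upper
    return df.toDF(*[convert(c) for c in df.columns])

df_lower = rename_columns(df, case="lower")   # id, company, ...
df_upper = rename_columns(df, case="upper")   # ID, COMPANY, ...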