Change Column Names using Dictionary (key value pair) in Databricks

Question:

I am new to Databricks and Python, and I want to know the best way to change column names in Databricks. For example, if the column name is 'ID', I want to change it to 'Patient_ID', and 'Name' to 'Patient_Name'. I thought I would use a dictionary, but I don't know how to apply it to the column names.
Please help, thanks in advance.

Note: the position of the columns can change, so I thought of using a dictionary.

Dictionary = {<ID> : <Patient_ID>, <Name> : <Patient_Name>,<Age> : <Patient_age>}

Example of what I am trying to achieve (picture attached)

I tried using a JSON file to do this but I ended up no wr

Asked By: Darkmaster


Answers:

Given the following dataset:

columns=["ID","Name","Age","Country"]
data = [(1,"John","42","Spain"),(2,"Jane","24","Norway"),(3,"Nohj","38","Iceland"),(4,"Fabrice","65","France")]
df=spark.createDataFrame(data,columns)
df.show()

+---+-------+---+-------+
| ID|   Name|Age|Country|
+---+-------+---+-------+
|  1|   John| 42|  Spain|
|  2|   Jane| 24| Norway|
|  3|   Nohj| 38|Iceland|
|  4|Fabrice| 65| France|
+---+-------+---+-------+

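You could loop over your dictionary as follows. withColumnRenamed matches columns by name, so the position of the columns does not matter, and it is a no-op for any key that is not a column of the DataFrame: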


dictionary = {"ID": "Patient_ID", "Name": "Patient_Name", "Age": "Patient_Age"}
# Rename every column listed in the dictionary; other columns are left untouched.
for old_name, new_name in dictionary.items():
    df = df.withColumnRenamed(old_name, new_name)

df.show()

+----------+------------+-----------+-------+
|Patient_ID|Patient_Name|Patient_Age|Country|
+----------+------------+-----------+-------+
|         1|        John|         42|  Spain|
|         2|        Jane|         24| Norway|
|         3|        Nohj|         38|Iceland|
|         4|     Fabrice|         65| France|
+----------+------------+-----------+-------+
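
As a possible alternative to the loop, the whole rename can be done in one step by building the new list of names from the dictionary and passing it to toDF. This is only a sketch: rename_columns is a hypothetical helper name, not part of PySpark, and it assumes df is a Spark DataFrame like the one created above.

```python
def rename_columns(df, mapping):
    """Return a new DataFrame with columns renamed according to mapping.

    rename_columns is a hypothetical helper, not a PySpark function.
    Columns missing from mapping keep their current name, and keys in
    mapping that are not columns of df are ignored.
    """
    # Build the full list of new names in the DataFrame's current column order.
    new_names = [mapping.get(name, name) for name in df.columns]
    # toDF(*names) returns a new DataFrame whose columns carry the given names.
    return df.toDF(*new_names)


# Usage with the df and dictionary defined in the answer above:
# df = rename_columns(df, dictionary)
# df.show()
```

Because the lookup is done by name, the result does not depend on where each column sits in the DataFrame. On Spark 3.4 or later, df.withColumnsRenamed(dictionary) should give the same result in a single built-in call.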
Answered By: Axel R.