How to turn pandas df into 2D form
Question:
I have the following dataset:
import pandas as pd
input = {"Product" : ["Car", "", "", "House", "", "", ""], "Name" : ["Wheel", "Glass", "Seat", "Glass", "Roof", "Door", "Kitchen"],
"Price" : [5, 3, 4, 2, 6, 4, 12]}
df_input = pd.DataFrame(input)
I would like to turn this df into 2D form. How can I do this please?
Desired output is:
output = {"Product" : [ "Car", "House"] , "Wheel" : [ 5, 0], "Glass" : [ 3, 2],"Seat" : [ 4, 0],"Roof" : [ 0, 6], "Door" : [ 0, 4], "Kitchen" : [ 0, 12]}
df_output = pd.DataFrame(output)
Answers:
This is a variation on a pivot; you first need to pre-process the "Product" column by replacing the empty strings with NaN and forward-filling them:
(df_input
.assign(Product=df_input['Product'].where(df_input['Product'].ne('')).ffill())
.pivot_table(index='Product', columns='Name', values='Price', fill_value=0)
.reset_index().rename_axis(columns=None)
)
Output:
Product Door Glass Kitchen Roof Seat Wheel
0 Car 0 3 0 0 4 5
1 House 4 2 12 6 0 0
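The pivot sorts the new columns alphabetically, whereas the desired output lists them in order of first appearance. A minimal sketch of restoring that order with `reindex`, assuming each (product, name) pair occurs at most once so that summing is harmless; the names `data`, `product` and `name_order` are illustrative, not from the original answer:

```python
import pandas as pd

data = {
    "Product": ["Car", "", "", "House", "", "", ""],
    "Name": ["Wheel", "Glass", "Seat", "Glass", "Roof", "Door", "Kitchen"],
    "Price": [5, 3, 4, 2, 6, 4, 12],
}
df_input = pd.DataFrame(data)

# Blank strings become NaN, then each NaN inherits the last product seen above it.
product = df_input["Product"].where(df_input["Product"].ne("")).ffill()

# Column order as the names first appear in the input.
name_order = df_input["Name"].unique()

df_output = (
    df_input.assign(Product=product)
    .pivot_table(index="Product", columns="Name", values="Price",
                 aggfunc="sum", fill_value=0)
    .reindex(columns=name_order)   # undo the alphabetical column sort
    .astype(int)                   # keep whole-number prices as integers
    .reset_index()
    .rename_axis(columns=None)
)
print(df_output)
```

The `astype(int)` call is only there because the exact dtype returned by `pivot_table` after filling can differ between pandas versions; drop it if the prices are not whole numbers.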
To turn the input dataframe `df_input` into the desired 2D output format, you can use pandas' `pivot_table` method. Here's how you can do it:
import pandas as pd
input = {"Product" : ["Car", "", "", "House", "", "", ""], "Name" : ["Wheel", "Glass", "Seat", "Glass", "Roof", "Door", "Kitchen"],
"Price" : [5, 3, 4, 2, 6, 4, 12]}
df_input = pd.DataFrame(input)
# Blank "Product" cells belong to the product named above them, so forward-fill them first
df_filled = df_input.assign(Product=df_input["Product"].where(df_input["Product"].ne("")).ffill())
# Pivot the dataframe
df_output = df_filled.pivot_table(index="Name", columns="Product", values="Price", fill_value=0)
# Reset the index to make "Name" a regular column and drop the leftover "Product" axis label
df_output = df_output.reset_index().rename_axis(columns=None)
# Convert the dataframe to a dictionary
output = df_output.to_dict(orient="list")
# Print the output
print(output)
This will output the following dictionary:
{'Name': ['Door', 'Glass', 'Kitchen', 'Roof', 'Seat', 'Wheel'], 'Car': [0, 3, 0, 0, 4, 5], 'House': [4, 2, 12, 6, 0, 0]}
You can then convert this dictionary to a dataframe if necessary using `pd.DataFrame(output)`.
Note that this result is transposed relative to the desired output: it has one row per name and one column per product. Swapping the `index` and `columns` arguments of `pivot_table` gives one row per product instead.
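A table that is already laid out with one row per name, like the dictionary printed above, can also be flipped after the fact with a transpose. A minimal sketch starting from that dictionary (the variable names `by_name` and `by_product` are illustrative only):

```python
import pandas as pd

# The name-by-product table from the dictionary shown above.
by_name = pd.DataFrame({
    "Name": ["Door", "Glass", "Kitchen", "Roof", "Seat", "Wheel"],
    "Car": [0, 3, 0, 0, 4, 5],
    "House": [4, 2, 12, 6, 0, 0],
})

by_product = (
    by_name.set_index("Name")
    .T                                            # products become rows, names become columns
    .rename_axis(index="Product", columns=None)   # label the new index, clear the column label
    .reset_index()
)
print(by_product)
```

This yields one row per product, with the name columns still in alphabetical order.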