Creating a dataframe from different lists

Question:

I am new to Python, so the question could be trivial.

I have a pair of lists containing solid names and their associated counts. Here is a sample:

volumes1 = ['Shield', 'Side', 'expHall', 'Funnel', 'gridpiece']

counts1= [3911, 1479, 553, 368, 342]

and a second pair of lists

volumes2 = ['Shield', 'leg', 'Funnel', 'gridpiece','wafer']

counts2= [291, 469, 73, 28, 32]

Notice that not every volume is present in both lists, and that their positions can differ.

What I would like to obtain is a dataframe where the first column contains all the volumes in volumes1 and volumes2, the second column holds the corresponding values from counts1, and the third column holds the corresponding values from counts2.

If a volume in the first column is not present in volumes1, the corresponding value in the second column is set to 0. In the same way, if a volume in the first column is not present in volumes2, the corresponding value in the third column is set to 0. The final output for the values I provided would be:

| volumes | counts1 | counts2 |
|---|---|---|
| Shield | 3911 | 291 |
| Side | 1479 | 0 |
| expHall | 553 | 0 |
| Funnel | 368 | 73 |
| gridpiece | 342 | 28 |
| leg | 0 | 469 |
| wafer | 0 | 32 |

I am not very experienced in Python and have been struggling with this without success. Is there a quick and elegant way to obtain what I want?

Thanks

Asked By: saimon


Answers:

I guess this is not optimal, but here is one solution:

import pandas as pd

volumes1 = ['Shield', 'Side', 'expHall', 'Funnel', 'gridpiece']
counts1 = [3911, 1479, 553, 368, 342]
volumes2 = ['Shield', 'leg', 'Funnel', 'gridpiece', 'wafer']
counts2 = [291, 469, 73, 28, 32]

# All distinct volumes, in order of first appearance.
# (set() would also remove duplicates, but it does not preserve order.)
volumes12 = list(dict.fromkeys(volumes1 + volumes2))
counts1R = [0] * len(volumes12)
counts2R = [0] * len(volumes12)

# Copy each count into the slot of its volume; missing volumes stay 0.
for volume, count in zip(volumes1, counts1):
    counts1R[volumes12.index(volume)] = count
for volume, count in zip(volumes2, counts2):
    counts2R[volumes12.index(volume)] = count

d = {
    'volumes': volumes12,
    'counts1': counts1R,
    'counts2': counts2R,
}
df = pd.DataFrame(data=d)
print(df)


See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html for the DataFrame constructor.
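The same result can probably be reached with less manual bookkeeping by letting pandas align the values by label. The following is only a sketch, not necessarily the best approach: it assumes each volume name appears at most once within its own list (reindexing fails on duplicate labels), and the names s1, s2 and all_volumes are introduced here purely for illustration.

```python
import pandas as pd

volumes1 = ['Shield', 'Side', 'expHall', 'Funnel', 'gridpiece']
counts1 = [3911, 1479, 553, 368, 342]
volumes2 = ['Shield', 'leg', 'Funnel', 'gridpiece', 'wafer']
counts2 = [291, 469, 73, 28, 32]

# One Series per pair of lists, indexed by volume name.
# Assumption: names are unique within each list.
s1 = pd.Series(counts1, index=volumes1)
s2 = pd.Series(counts2, index=volumes2)

# Union of the volume names, in order of first appearance.
all_volumes = list(dict.fromkeys(volumes1 + volumes2))

# reindex aligns each Series to the full list of names;
# names missing from a Series get the fill value 0.
df = (
    pd.DataFrame({
        'counts1': s1.reindex(all_volumes, fill_value=0),
        'counts2': s2.reindex(all_volumes, fill_value=0),
    })
    .rename_axis('volumes')   # name the index so it becomes a column
    .reset_index()
)
print(df)
```

Passing fill_value=0 to reindex, rather than calling fillna(0) afterwards, avoids introducing NaN, so the count columns should stay integers.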

Answered By: user13322060