How to get the sum of elements in two different lists in a DataFrame instead of concatenation in Python?

Question:

I have a DataFrame that contains two columns, ‘A_List’ and ‘B_List’, which are of the string dtype. I have converted these to lists, and I would now like to perform element-wise addition of the elements in the lists at specific indices. I have attached an example of the CSV file I’m using. When I do the following, the output joins the elements at the specified indices instead of summing them. What can I try differently to get the sum?

For example, when I do row["A_List"][0] + row["B_List"][3], the desired output would be 0.16 (since 0.1+0.06 = 0.16). Instead, I am getting 0.10.06 as my answer.

import pandas as pd

df = pd.read_csv('Example.csv')

# Get rid of the brackets []
df["A_List"] = df["A_List"].apply(lambda x: x.strip("[]"))
df["B_List"] = df["B_List"].apply(lambda x: x.strip("[]"))

# Convert the string dtype of values into a list
df["A_List"] = df["A_List"].apply(lambda x: x.split())
df["B_List"] = df["B_List"].apply(lambda x: x.split())

for i, row in df.iterrows():
    print(row["A_List"][0] + row["B_List"][3])

Asked By: curd_C

||

Answers:

The problem is that split() returns a list of strings, so the elements are still strings when you add them. For two strings, the + operator means concatenation, which is why '0.1' + '0.06' gives '0.10.06' rather than 0.16. To add the numeric values, convert the elements of the lists from strings to floats before performing the addition. You can do this with the map() function and the float() constructor. Here’s an updated version of your code:

import pandas as pd

df = pd.read_csv('Example.csv')

# Get rid of the brackets []
df["A_List"] = df["A_List"].apply(lambda x: x.strip("[]"))
df["B_List"] = df["B_List"].apply(lambda x: x.strip("[]"))

# Convert the string dtype of values into a list
df["A_List"] = df["A_List"].apply(lambda x: x.split())
df["B_List"] = df["B_List"].apply(lambda x: x.split())

# Convert the elements of the lists to floats
df["A_List"] = df["A_List"].apply(lambda x: list(map(float, x)))
df["B_List"] = df["B_List"].apply(lambda x: list(map(float, x)))

for i, row in df.iterrows():
    print(row["A_List"][0] + row["B_List"][3])

This will convert the elements of the lists from strings to floats before performing the addition, giving you the desired output.
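
For a version that runs without Example.csv, here is a minimal sketch of the same fix. The two rows of data and the helper name parse_floats are made up for illustration; the sketch assumes the cells hold bracketed, space-separated numbers, as the 0.10.06 output suggests.

```python
import pandas as pd

# Hypothetical stand-in for Example.csv (values invented for illustration).
df = pd.DataFrame({
    "A_List": ["[0.1 0.2 0.3 0.4]", "[1.0 2.0 3.0 4.0]"],
    "B_List": ["[0.01 0.02 0.03 0.06]", "[0.5 0.5 0.5 0.5]"],
})

# Before conversion: split() yields strings, so + concatenates them.
a_tokens = df["A_List"].iloc[0].strip("[]").split()
b_tokens = df["B_List"].iloc[0].strip("[]").split()
concatenated = a_tokens[0] + b_tokens[3]  # the string "0.10.06"

def parse_floats(cell):
    """Hypothetical helper: turn a string like '[0.1 0.2]' into a list of floats."""
    return [float(token) for token in cell.strip("[]").split()]

for col in ["A_List", "B_List"]:
    df[col] = df[col].apply(parse_floats)

# After conversion: + is numeric addition (first row gives roughly 0.16).
sums = [a[0] + b[3] for a, b in zip(df["A_List"], df["B_List"])]
print(concatenated)
print(sums)
```

Doing strip, split and float conversion in one helper avoids the three separate apply passes per column, and zip over the two columns replaces iterrows for this simple lookup.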

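Alternatively, you can use the pd.to_numeric(s, downcast='float') function to convert the string values to floats more directly. When given a list, it returns a NumPy array rather than a Python list. Note that downcast='float' may shrink the values to float32, which can introduce small rounding differences in the sums; leave it out if you want to keep float64.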

import pandas as pd

df = pd.read_csv('Example.csv')
df[['A_List', 'B_List']] = df[['A_List', 'B_List']].applymap(lambda x: pd.to_numeric(x.strip("[]").split(), downcast='float'))

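This applies the conversion to both columns, A_List and B_List, in one line. Note that DataFrame.applymap is deprecated as of pandas 2.1 in favor of DataFrame.map, so on recent versions you may see a warning.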
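
As a rough sketch of the pd.to_numeric route with made-up data in place of Example.csv, the following uses Series.apply per column (to sidestep the applymap/map naming difference between pandas versions) and omits downcast so the values stay float64. Because each cell then holds a NumPy array, adding two cells adds them position by position.

```python
import pandas as pd

# Hypothetical stand-in for Example.csv (values invented for illustration).
df = pd.DataFrame({
    "A_List": ["[0.1 0.2 0.3 0.4]", "[1.0 2.0 3.0 4.0]"],
    "B_List": ["[0.01 0.02 0.03 0.06]", "[0.5 0.5 0.5 0.5]"],
})

# pd.to_numeric accepts a list of numeric strings and returns a NumPy array.
for col in ["A_List", "B_List"]:
    df[col] = df[col].apply(lambda cell: pd.to_numeric(cell.strip("[]").split()))

# Sum of specific positions, as in the question (index 0 of A, index 3 of B).
picked = [float(a[0] + b[3]) for a, b in zip(df["A_List"], df["B_List"])]

# Whole-array addition: each position of A is added to the same position of B.
elementwise = df["A_List"].iloc[0] + df["B_List"].iloc[0]

print(picked)
print(elementwise)
```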

Answered By: user3583127