How to use map to process a column?

Question:

I have a pandas dataframe like this,

import pandas as pd

data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}

#load data into a DataFrame object:
df = pd.DataFrame(data)


   calories  duration
0       420        50
1       380        40
2       390        45

And I have a function to alter the value,

def alter_val(val):
    return val + 1

Now, as the documentation says, map() takes a function and an iterable, and it returns an iterator. As I understand it, it should then work like this:

df["new_value"] = map(alter_val, df["calories"])

But it doesn't work. It raises:

TypeError: object of type 'map' has no len()

However, it works if I use the following code,

df["new"] = df["calories"].map(alter_val)

But this does not follow the documented approach, map(function, series).

Can someone please explain the correct way, and why it works this way?

Asked By: Droid-Bird


Answers:

Convert the map output to a list:

df["new_value"] = list(map(alter_val, df["calories"]))
Answered By: Wasi Haider

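The built-in map returns an iterator that yields its results one at a time instead of returning them all at once, which means its results are not actually calculated until you explicitly "ask" for them. That is also why the assignment fails: pandas needs to know the length of the values, and a map object has none. Try this: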

list(map(alter_val, df["calories"]))

When you convert an iterator to a list, it has to calculate all of the results and store them in memory.
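
To make the laziness visible, here is a minimal sketch that reuses the DataFrame and alter_val from the question. The calls list and the calls_before / calls_after names are hypothetical additions for illustration only; they record when the function actually runs.

```python
import pandas as pd

df = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})

calls = []  # hypothetical helper: records every value alter_val receives

def alter_val(val):
    calls.append(val)
    return val + 1

# Building the map object does not call alter_val yet.
lazy = map(alter_val, df["calories"])
calls_before = len(calls)

# Consuming the iterator is what runs alter_val on each element.
values = list(lazy)
calls_after = len(calls)

# A list has a length, so pandas can line it up with the rows.
df["new_value"] = values
print(calls_before, calls_after)
print(df)
```

Note that a map object can only be consumed once, so calling list() on it a second time would give an empty list.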

Despite that, I would stick to the pandas .map() method, as it is cleaner in my opinion.
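
For comparison, a short sketch of the pandas route, again assuming the DataFrame and alter_val from the question. Series.map is a method on the column and is a different function from Python's built-in map: it returns a new Series right away, so it can be assigned directly. The new_vectorized column name is a hypothetical addition, included to show that plain column arithmetic gives the same values for a simple function like this one.

```python
import pandas as pd

df = pd.DataFrame({"calories": [420, 380, 390], "duration": [50, 40, 45]})

def alter_val(val):
    return val + 1

# Series.map applies alter_val to each element and returns a Series
# with the same index as df["calories"], so no list() call is needed.
df["new"] = df["calories"].map(alter_val)

# For simple arithmetic, operating on the whole column works as well.
df["new_vectorized"] = df["calories"] + 1

print(df)
```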

Answered By: sevs