pandas DataFrame: how to apply a function to a specific column?

Question:

I have read the docs for DataFrame.apply:

DataFrame.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)
Applies function along input axis of DataFrame.

So, how can I apply a function to a specific column?

In [1]: import pandas as pd
In [2]: data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
In [3]: df = pd.DataFrame(data)
In [4]: df
Out[4]: 
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
In [5]: def addOne(v):
...:        v += 1
...:        return v
...: 
In [6]: df.apply(addOne, axis=1)
Out[6]: 
   A  B   C
0  2  5   8
1  3  6   9
2  4  7  10

I want to apply addOne to every value in df['A'] only, not to all columns. How can I do that with DataFrame.apply?

Thanks for the help!

Asked By: GoingMyWay


Answers:

The answer is:

df['A'] = df['A'].map(addOne)

It is also worth learning the difference between map, applymap, and apply.

If you insist on using apply, you could try something like this:

def addOne(v):
    v['A'] += 1
    return v

df.apply(addOne, axis=1)
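
To make the difference concrete, here is a minimal sketch on the question's data. The names add_one and add_one_to_a are illustrative stand-ins, not part of the answer. Series.map and Series.apply both call the function once per value of a single column, while DataFrame.apply with axis=1 calls it once per row. DataFrame.applymap (not shown) calls it once per value across every column; it was deprecated in favour of DataFrame.map in pandas 2.1.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def add_one(v):
    # Works on a single scalar value.
    return v + 1

# Series.map: element-wise over one column.
via_map = df['A'].map(add_one)

# Series.apply: also element-wise over one column for a plain Python function.
via_apply = df['A'].apply(add_one)

def add_one_to_a(row):
    # DataFrame.apply(axis=1) passes each row as a Series; work on a copy
    # so the function does not depend on mutating its input.
    row = row.copy()
    row['A'] = add_one(row['A'])
    return row

# Row-wise apply returns a new DataFrame; it has to be assigned to be kept.
via_rows = df.apply(add_one_to_a, axis=1)

print(via_map.tolist())
print(via_apply.tolist())
print(via_rows)
```

All three approaches should add one to column A only. The first two are the more direct choice when a single column is involved.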
Answered By: su79eu7k

One simple way would be:

df['A'] = df['A'].apply(lambda x: x+1)
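
For plain arithmetic like this, apply is not strictly needed. As a hedged aside, the sketch below shows two alternatives that should give the same values on the question's data: a vectorized expression, and DataFrame.assign, which returns a new DataFrame instead of modifying df in place. The variable names vectorized and df_new are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Vectorized arithmetic on the column; no Python-level call per value.
vectorized = df['A'] + 1

# assign returns a copy with column A replaced; the original df is untouched.
df_new = df.assign(A=df['A'].apply(lambda x: x + 1))

print(vectorized.tolist())
print(df_new)
```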
Answered By: felix_as

You can use .apply() with a lambda function to solve this kind of problem.

Suppose your DataFrame looks like this:

A | B | C
----------
1 | 4 | 7
2 | 5 | 8
3 | 6 | 9

The function which you want to apply:

def addOne(v):
    v += 1
    return v

If you write your code like this:

df['A'] = df.apply(lambda x: addOne(x.A), axis=1)

You will get:

A | B | C
----------
2 | 4 | 7
3 | 5 | 8
4 | 6 | 9
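
The row-wise form is generally slower than working on df['A'] directly, but it becomes useful when the function needs more than one column. A small sketch, where add_columns and the new column name 'A_plus_B' are hypothetical and not part of this answer:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def add_columns(a, b):
    # Hypothetical helper that needs values from two columns of the same row.
    return a + b

# With axis=1 the lambda receives one row at a time, so it can read
# several columns; the result is a Series aligned with df's index.
df['A_plus_B'] = df.apply(lambda row: add_columns(row.A, row.B), axis=1)

print(df)
```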
Answered By: Tejas Shah

For anyone else looking for a solution that allows for piping:

identity = lambda x: x

def transform_columns(df, mapper):
    return df.transform(
        {
            **{
                column: identity
                for column in df.columns
            },
            **mapper
        }
    )

# you can monkey-patch it on the pandas DataFrame (but don't have to, see below)
pd.DataFrame.transform_columns = transform_columns

(
    pd.DataFrame(data)
    .rename(columns={'A': 'A1'})   # just to demonstrate the motivation
    # add_one: any one-argument function, e.g. addOne from the question
    .transform_columns({'A1': add_one})
)

This also lets you transform several columns at once:

pd.DataFrame(data).transform_columns({
    'A': add_one,
    'B': add_two,
})

And if you do not want to monkey-patch DataFrame, you can always use it with pipe:

pd.DataFrame(data).pipe(transform_columns, {'A': add_one})

It would be great if this were natively supported by pandas though.

The snippets above are CC0.
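
Since add_one and add_two are not defined in the snippets above, here is a self-contained sketch of the pipe variant. The two helpers are assumed to be simple one-argument functions; their bodies here are illustrative only, and the passthrough dictionary plays the role of the identity mapping.

```python
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}

def add_one(v):
    # Assumed helper: the answer does not define it.
    return v + 1

def add_two(v):
    # Assumed helper: the answer does not define it.
    return v + 2

def transform_columns(df, mapper):
    # Pass every column through unchanged unless mapper overrides it.
    passthrough = {column: (lambda x: x) for column in df.columns}
    return df.transform({**passthrough, **mapper})

# pipe keeps the call inside a method chain without monkey-patching.
result = pd.DataFrame(data).pipe(transform_columns, {'A': add_one, 'B': add_two})

print(result)
```

Columns that are not named in the mapper, here C, should come back unchanged and in their original position.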

Answered By: krassowski