pandas DataFrame, how to apply function to a specific column?
Question:
I have read the docs for DataFrame.apply:
DataFrame.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)
Applies function along input axis of DataFrame.
So, how can I apply a function to a specific column?
In [1]: import pandas as pd
In [2]: data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
In [3]: df = pd.DataFrame(data)
In [4]: df
Out[4]:
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
In [5]: def addOne(v):
   ...:     v += 1
   ...:     return v
   ...:
In [6]: df.apply(addOne, axis=1)
Out[6]:
   A  B   C
0  2  5   8
1  3  6   9
2  4  7  10
I want to apply addOne to every value in df['A'], not to all columns. How can I do that with DataFrame.apply?
Thanks for the help!
Answers:
The answer is,
df['A'] = df['A'].map(addOne)
It is also worth knowing the difference between map, applymap, and apply.
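To make that difference concrete, here is a minimal sketch using the DataFrame from the question. The variable names are only for illustration. It assumes that newer pandas releases expose the element-wise DataFrame method as DataFrame.map (with applymap kept as the older spelling), so the code picks whichever one is available.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def addOne(v):
    return v + 1

# Series.map: the function is called once per element of one column.
col_a = df['A'].map(addOne)

# DataFrame.apply: the function is called once per column (axis=0)
# or once per row (axis=1) and receives a whole Series each time.
col_sums = df.apply(lambda col: col.sum(), axis=0)
row_sums = df.apply(lambda row: row.sum(), axis=1)

# Element-wise over the whole frame: applymap, or DataFrame.map where
# it exists (assumption: which one is present depends on the pandas version).
elementwise = df.map if hasattr(df, 'map') else df.applymap
all_plus_one = elementwise(addOne)

print(col_a.tolist())
print(col_sums.tolist())
print(row_sums.tolist())
print(all_plus_one)
```

None of these calls modify df; each returns a new object, which is why the answer above assigns the result back to df['A'].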
But if you insist on using apply, you could try something like the following.
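def addOne(v):
    # v is one row (a Series) because axis=1 is used below
    v['A'] += 1
    return v
# apply returns a new DataFrame, so assign the result back to keep it
df = df.apply(addOne, axis=1)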
One simple way would be:
df['A'] = df['A'].apply(lambda x: x+1)
You can use .apply() with a lambda function to solve this kind of problem.
Suppose your DataFrame looks like this:
A | B | C
----------
1 | 4 | 7
2 | 5 | 8
3 | 6 | 9
The function you want to apply:
def addOne(v):
    v += 1
    return v
So if you write your code like this,
df['A'] = df.apply(lambda x: addOne(x.A), axis=1)
You will get:
A | B | C
----------
2 | 4 | 7
3 | 5 | 8
4 | 6 | 9
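As a quick sanity check, the spellings suggested so far can be compared side by side. This is only a sketch: the via_* names are made up for illustration, and the plain arithmetic line is an extra that none of the answers above use. For a function this simple all four give the same values; the row-wise version calls the Python function once per row, so it is usually the slowest of them.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def addOne(v):
    v += 1
    return v

# One call per element of column A.
via_map = df['A'].map(addOne)
via_series_apply = df['A'].apply(lambda x: x + 1)

# One call per row; only the A value of each row is used.
via_row_apply = df.apply(lambda x: addOne(x.A), axis=1)

# Vectorized arithmetic on the whole column (not from the answers above).
via_arithmetic = df['A'] + 1

print(via_map.tolist(), via_series_apply.tolist(),
      via_row_apply.tolist(), via_arithmetic.tolist())
```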
For anyone else looking for a solution that allows for piping:
# add_one and add_two are not defined in the original snippet;
# the definitions below are assumed for illustration
def add_one(x):
    return x + 1
def add_two(x):
    return x + 2
identity = lambda x: x
def transform_columns(df, mapper):
    # Leave every column unchanged unless mapper overrides it
    return df.transform(
        {
            **{
                column: identity
                for column in df.columns
            },
            **mapper
        }
    )
# you can monkey-patch it on the pandas DataFrame (but don't have to, see below)
pd.DataFrame.transform_columns = transform_columns
(
    pd.DataFrame(data)
    .rename(columns={'A': 'A1'})  # just to demonstrate the motivation
    .transform_columns({'A1': add_one})
)
This also lets you transform several columns at once:
pd.DataFrame(data).transform_columns({
    'A': add_one,
    'B': add_two,
})
And if you do not want to monkey-patch DataFrame, you can always use it with pipe:
pd.DataFrame(data).pipe(transform_columns, {'A': add_one})
It would be great if this was natively supported by pandas though.
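For comparison, a chain-friendly version can also be sketched with the built-in DataFrame.assign, which accepts a callable that receives the intermediate DataFrame. This is an alternative, not the method of the answer above, and add_one here is an assumed definition. One limitation: with keyword arguments the column name has to be a valid Python identifier, otherwise a dict has to be unpacked into assign.

```python
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}

def add_one(x):
    # assumed helper, mirroring addOne from the question
    return x + 1

chained = (
    pd.DataFrame(data)
    .rename(columns={'A': 'A1'})
    # the lambda receives the renamed frame, so 'A1' already exists here
    .assign(A1=lambda d: d['A1'].map(add_one))
)

print(chained)
```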
The snippets above are CC0.