How to apply LabelEncoder for a specific column in Pandas dataframe
Question:
I have a dataset loaded into a pandas DataFrame, and the class label needs to be encoded using LabelEncoder
from scikit-learn. The column label
is the class label column, which has the following classes:
['Standing', 'Walking', 'Running', 'null']
To perform label encoding, I tried the following but it does not work. How can I fix it?
from sklearn import preprocessing
import pandas as pd
df = pd.read_csv('dataset.csv', sep=',')
df.apply(preprocessing.LabelEncoder().fit_transform(df['label']))
Answers:
You can try the following:
le = preprocessing.LabelEncoder()
df['label'] = le.fit_transform(df.label.values)
Or the following would work too:
df['label'] = le.fit_transform(df['label'])
This replaces the original label
values in the DataFrame with the encoded integer labels.
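To make the effect concrete, here is a minimal sketch of the same approach on a small DataFrame. The DataFrame is a hypothetical stand-in for dataset.csv (built inline so the snippet runs on its own), and the variable name decoded is only for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for dataset.csv, using the class names from the question.
df = pd.DataFrame({"label": ["Standing", "Walking", "Running", "null", "Walking"]})

le = LabelEncoder()
# Fit on the column and overwrite it with the integer codes.
df["label"] = le.fit_transform(df["label"])

# classes_ holds the original labels in sorted order; each code is an index into it.
print(list(le.classes_))     # ['Running', 'Standing', 'Walking', 'null']
print(df["label"].tolist())  # [1, 2, 0, 3, 2]

# inverse_transform maps the integer codes back to the original strings.
decoded = le.inverse_transform(df["label"])
print(list(decoded))
```

Keeping the fitted le object around is what makes it possible to decode predictions later. One thing to watch for: pd.read_csv treats the literal string null as a missing value by default, so with a real CSV you may need keep_default_na=False (or to fill the missing values first) before encoding.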
You can also do:
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
df['col_name'] = le.fit_transform(df['col_name'])
where col_name is the name of the column that you want to label encode.
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
X[:, 2] = le.fit_transform(X[:, 2])
This can be helpful if you want to encode one particular column of your CSV data after it has been converted to a NumPy array X; here the column at index 2 is encoded.
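The answer above does not say what X is. The sketch below assumes it is a 2-D NumPy array with dtype=object (for example, what df.to_numpy() returns for mixed columns); the sample values are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical feature matrix; dtype=object lets one array hold numbers and strings.
X = np.array(
    [
        [0.1, 1.2, "Standing"],
        [0.4, 0.9, "Walking"],
        [0.7, 2.5, "Running"],
    ],
    dtype=object,
)

le = LabelEncoder()
# Encode only the third column (index 2) and write the codes back in place.
X[:, 2] = le.fit_transform(X[:, 2])

print(list(le.classes_))  # ['Running', 'Standing', 'Walking']
print(list(X[:, 2]))      # [1, 2, 0]
```

The other columns are left untouched, so this works when only one column of the array holds string categories.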