Change Pandas DataFrame column values from array to first value in array
Question:
I have a DataFrame with a labels column. The labels column is currently an array, and I would like to map the entire column to the first item of the array for each row without naively iterating.
For example, if I have 2 rows with labels values ['1','m'], ['0'], I would like to map the current values to the new values '1','0'.
Answers:
You can use .str for this:
df['labels'] = df['labels'].str[0]
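As a quick check, here is a minimal sketch that rebuilds the two example rows from the question and applies .str[0]. The DataFrame contents are an assumption taken from that example, not the asker's real data.

```python
import pandas as pd

# Sample data assumed from the question: each cell holds a list of labels.
df = pd.DataFrame({"labels": [["1", "m"], ["0"]]})

# .str[0] indexes into each cell, so it works on lists as well as strings.
df["labels"] = df["labels"].str[0]

print(df["labels"].tolist())
```

The .str accessor indexes element-wise into whatever each cell holds, which is why it also applies to list-valued columns.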
However, this (like the other possible approaches) is essentially just a loop under the hood, roughly equivalent to:
df['labels'] = [x[0] for x in df['labels']]
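I’d recommend the explicit loop, since it gives you better control over error handling, e.g.:
import numpy as np
# the type check handles empty arrays as well as NaN values (a bare `if x` fails on NaN, which is truthy)
df['labels'] = [x[0] if isinstance(x, (list, tuple, np.ndarray)) and len(x) else np.nan
                for x in df['labels']]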
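If the error handling grows beyond a one-liner, it may be easier to read as a small helper. The sketch below is one possible version: the name first_or_nan and the sample rows (including an empty list and a NaN) are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd


def first_or_nan(x):
    """Return the first element of a list-like cell, or NaN if there is none."""
    # Checking the type first avoids indexing into NaN (a float) or None.
    if isinstance(x, (list, tuple, np.ndarray)) and len(x) > 0:
        return x[0]
    return np.nan


# Hypothetical sample data: two normal rows, one empty list, one missing value.
df = pd.DataFrame({"labels": [["1", "m"], ["0"], [], np.nan]})
df["labels"] = [first_or_nan(x) for x in df["labels"]]

print(df["labels"].tolist())
```

Keeping the checks in a named function also makes it easy to extend later, for example to log or count the rows that had no usable value.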