Dataframe string slicing
Question:
I have a DataFrame with a column whose values look like this:
- abc_0.1
- aabbcc_-0.010
- qwerty_0.555
How can I use a lambda function to transform the column values into plain numeric values:
- 0.1
- -0.010
- 0.555
Answers:
Does this answer your question?
import pandas as pd
df = pd.DataFrame({'col': [
'abc_0.1',
'aabbcc_-0.010',
'qwerty_0.555',
]})
df['col'] = df['col'].str.extract(r'[a-zA-Z]+_(.*)').astype(float)
df
col
0 0.100
1 -0.010
2 0.555
You can use str.extract with the regex (-?\d+(?:\.\d+)?)$ and optionally convert the result with pd.to_numeric:
df['num'] = pd.to_numeric(df['col'].str.extract(r'(-?\d+(?:\.\d+)?)$', expand=False))
output:
col num
0 abc_0.1 0.100
1 aabbcc_-0.010 -0.010
2 qwerty_0.555 0.555
Regex:
-? # optionally match a - sign
\d+ # match one or more digits
(?:\.\d+)? # optionally match a dot and digit(s)
$ # match end of string
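The regex breakdown above can be checked outside pandas. The following is a minimal sketch using only the standard `re` module on the three sample values from the question; the names `NUMBER_AT_END` and `samples` are made up for this illustration.

```python
import re

# Same pattern as in the answer, with the backslashes written out explicitly.
NUMBER_AT_END = re.compile(r'(-?\d+(?:\.\d+)?)$')

# Sample values taken from the question.
samples = ['abc_0.1', 'aabbcc_-0.010', 'qwerty_0.555']

# re.search scans the string; the trailing $ pins the match to the end.
extracted = [NUMBER_AT_END.search(s).group(1) for s in samples]
print(extracted)  # ['0.1', '-0.010', '0.555']

# The captured groups are still strings, so convert them explicitly.
numbers = [float(x) for x in extracted]
print(numbers)  # [0.1, -0.01, 0.555]
```

Note that the captured values are strings until they are converted, which is why the answer wraps the extraction in a numeric conversion.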
# Extract a group consisting of digits, periods, or minus signs, occurring one or more times
df['text'].str.extract(r'([\d.-]+)')
0
0 0.1
1 -0.010
2 0.555
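Since the question asks specifically about a lambda, here is one possible sketch that avoids regex altogether. It assumes every value contains an underscore followed by a valid number; the new column name `num` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col': ['abc_0.1', 'aabbcc_-0.010', 'qwerty_0.555']})

# Split once from the right on '_' and convert the last piece to float.
# Assumption: every value has the form <text>_<number>; a value without
# a numeric suffix would raise ValueError here.
df['num'] = df['col'].apply(lambda s: float(s.rsplit('_', 1)[-1]))

print(df['num'].tolist())  # [0.1, -0.01, 0.555]
```

The vectorized str.extract answers are usually preferable on large columns, and they yield NaN rather than raising when a row does not match.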