Polars add column based on calculation throws TypeError: 'Expr' object is not subscriptable
Question:
I'm trying to calculate the distance between two coordinates stored in a Polars DataFrame.
import polars as pl
pl.Config.set_fmt_str_lengths(2000)
data = {"a": ["782.83 7363.51 6293 40 PD", "850.68 7513.1 6262.17 40 PD"], "b": ["795.88 7462.65 6293 40 PD", "1061.64 7486.08 6124.85 40 PD"]}
df = pl.DataFrame(data)
# Splitting on its own works
df.with_columns((pl.col("a").str.replace_all(r" +", " ").str.split(' ', 2)).alias('c'))
df
try:
    # Slicing the expression with [:2] is what raises the TypeError
    dfNew = df.with_columns((pl.col("a").str.replace_all(r" +", " ").str.split(' ', 2)[:2]).alias('c'))
except Exception as e:
    print("It's not working - ", e)
To calculate the distance, I need only the first 3 values from the list produced by splitting each value on spaces. When I try to slice the result, I get the error message "'Expr' object is not subscriptable".
How can I get around this? I also need to apply the same transformation to column b and then compute the distance with numpy.
I tried a list comprehension and a lambda, but neither worked.
Thanks in advance.
Artur
Answers:
To take the first three elements of the list, use the .arr.slice() method (the namespace is called .list in newer Polars releases), then cast the list[str] to a list of floats, list[f64], so it can be used in further calculations:
df.with_columns([
    pl.col("a").str.replace_all(r" +", " ")
    .str.split(" ").arr.slice(0,3)
    .cast(pl.List(pl.Float64)).alias("c")
])