Polars add column based on calculation throws TypeError: 'Expr' object is not subscriptable

Question:

I’m trying to calculate the distance between two coordinates stored in a Polars data frame.

import polars as pl
pl.Config.set_fmt_str_lengths(2000)
data={"a": ["782.83    7363.51    6293    40   PD","850.68    7513.1    6262.17    40   PD"], "b": ["795.88    7462.65    6293    40   PD","1061.64    7486.08    6124.85    40   PD"]}
df=pl.DataFrame(data)
df.with_columns((pl.col("a").str.replace_all(r" +"," ").str.split(' ',2)).alias('c'))
df
try:
    dfNew=df.with_columns((pl.col("a").str.replace_all(r" +"," ").str.split(' ',2)[:2]).alias('c'))
except Exception as e:
    print("It's not working - ", e)

To calculate the distance, I need only the first 3 values from the list created by splitting the column's value on spaces. When I try to slice the expression, I get the error message "'Expr' object is not subscriptable".
How can I overcome it? To calculate the distance, I need to apply the same transformation to column b and then compute the distance with numpy.

I tried using a list comprehension and a lambda, but neither works.
How to overcome this? Thanks in advance.

Artur

Asked By: Artup

||

Answers:

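To take the first three elements from the list, use the .arr.slice() method (newer Polars releases rename this namespace to .list, so the call becomes .list.slice()), then cast the list[str] to a list of floats, list[f64], so the values can be used in further calculations: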

df.with_columns([
    pl.col("a").str.replace_all(r" +", " ")
        .str.split(" ").arr.slice(0,3)
        .cast(pl.List(pl.Float64)).alias("c")
])
Answered By: glebcom