How to list, concatenate, and evaluate polars expressions?
Question:
I would like to store many different filters in an object (a list, a dictionary, or whatever), and then be able to select the ones I want and evaluate them in the .filter() method. Below is an example:
import polars as pl

# Sample DataFrame
df = pl.DataFrame(
{"col_a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], "col_b": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
)
# Set a couple of filters
filter_1 = pl.col("col_a") > 5
filter_2 = pl.col("col_b") > 8
# Apply filters: this works fine!
df_filtered = df.filter(filter_1 & filter_2)
# Concatenate filters
filters = [filter_1, filter_2]
# This won't work: str.join() expects strings, not polars expressions
df.filter((" & ").join(filters))
df.filter((" | ").join(filters))
What would be a working equivalent of (" & ").join(filters)?
Answers:
You can use pl.all() to combine the filters with AND, or pl.any() to combine them with OR:
>>> df.filter(pl.all(filters))
shape: (2, 2)
┌───────┬───────┐
│ col_a ┆ col_b │
│ ---   ┆ ---   │
│ i64   ┆ i64   │
╞═══════╪═══════╡
│ 9     ┆ 9     │
│ 10    ┆ 10    │
└───────┴───────┘
>>> df.filter(pl.any(filters))
shape: (5, 2)
┌───────┬───────┐
│ col_a ┆ col_b │
│ ---   ┆ ---   │
│ i64   ┆ i64   │
╞═══════╪═══════╡
│ 6     ┆ 6     │
│ 7     ┆ 7     │
│ 8     ┆ 8     │
│ 9     ┆ 9     │
│ 10    ┆ 10    │
└───────┴───────┘