Polars: how to add a column with numerical values?

Question:

In pandas, we can write:

df['new'] = a

where a is a numerical Series or just a number.
In polars, we can add a string column like this:

df.with_columns(
    [
        pl.all(),
        pl.lit('str').alias('new'),
    ]
)

But how do I add a numerical Series or a number as a new column in polars?
Note that the new numerical Series is not in the original df; it is the result of some computation.

Asked By: lemmingxuan

||

Answers:

Let’s start with this DataFrame:

import polars as pl
df = pl.DataFrame(
    {
        "col1": [1, 2, 3, 4, 5],
    }
)
print(df)
shape: (5, 1)
┌──────┐
│ col1 │
│ ---  │
│ i64  │
╞══════╡
│ 1    │
├╌╌╌╌╌╌┤
│ 2    │
├╌╌╌╌╌╌┤
│ 3    │
├╌╌╌╌╌╌┤
│ 4    │
├╌╌╌╌╌╌┤
│ 5    │
└──────┘

To add a scalar (single value)

Use polars.lit.

my_scalar = -1
df.with_columns(pl.lit(my_scalar).alias("col_scalar"))
shape: (5, 2)
┌──────┬────────────┐
│ col1 ┆ col_scalar │
│ ---  ┆ ---        │
│ i64  ┆ i32        │
╞══════╪════════════╡
│ 1    ┆ -1         │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2    ┆ -1         │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3    ┆ -1         │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 4    ┆ -1         │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 5    ┆ -1         │
└──────┴────────────┘

You can also choose the datatype of the new column using the dtype keyword.

df.with_columns(pl.lit(my_scalar, dtype=pl.Float64).alias("col_scalar_float"))
shape: (5, 2)
┌──────┬──────────────────┐
│ col1 ┆ col_scalar_float │
│ ---  ┆ ---              │
│ i64  ┆ f64              │
╞══════╪══════════════════╡
│ 1    ┆ -1.0             │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2    ┆ -1.0             │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3    ┆ -1.0             │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 4    ┆ -1.0             │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 5    ┆ -1.0             │
└──────┴──────────────────┘

To add a list

To add a list of values (perhaps from some external computation), use the polars.Series constructor and give the Series a name.

my_list = [10, 20, 30, 40, 50]
df.with_columns(pl.Series(name="col_list", values=my_list))
shape: (5, 2)
┌──────┬──────────┐
│ col1 ┆ col_list │
│ ---  ┆ ---      │
│ i64  ┆ i64      │
╞══════╪══════════╡
│ 1    ┆ 10       │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 2    ┆ 20       │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 3    ┆ 30       │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 4    ┆ 40       │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 5    ┆ 50       │
└──────┴──────────┘

You can use the dtype keyword to control the datatype of the new series, if needed.

df.with_columns(pl.Series(name="col_list", values=my_list, dtype=pl.Float64))
shape: (5, 2)
┌──────┬──────────┐
│ col1 ┆ col_list │
│ ---  ┆ ---      │
│ i64  ┆ f64      │
╞══════╪══════════╡
│ 1    ┆ 10.0     │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 2    ┆ 20.0     │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 3    ┆ 30.0     │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 4    ┆ 40.0     │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 5    ┆ 50.0     │
└──────┴──────────┘

To add a Series

If you already have a named Series, you can pass it in directly.

my_series = pl.Series(name="my_series_name", values=[10, 20, 30, 40, 50])
df.with_columns(my_series)
shape: (5, 2)
┌──────┬────────────────┐
│ col1 ┆ my_series_name │
│ ---  ┆ ---            │
│ i64  ┆ i64            │
╞══════╪════════════════╡
│ 1    ┆ 10             │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2    ┆ 20             │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3    ┆ 30             │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 4    ┆ 40             │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 5    ┆ 50             │
└──────┴────────────────┘

If your Series does not already have a name, you can give it one with the Series' alias method.

my_series_no_name = pl.Series(values=[10, 20, 30, 40, 50])
df.with_columns(my_series_no_name.alias('col_no_name'))
shape: (5, 2)
┌──────┬─────────────┐
│ col1 ┆ col_no_name │
│ ---  ┆ ---         │
│ i64  ┆ i64         │
╞══════╪═════════════╡
│ 1    ┆ 10          │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2    ┆ 20          │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 3    ┆ 30          │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 4    ┆ 40          │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 5    ┆ 50          │
└──────┴─────────────┘
Answered By: user18559875