is_monotonic

checks.is_monotonic(data, column, decreasing=False, strict=True, interval=None, group_by=None)

Verify that values in a column are consecutively increasing or decreasing.

Parameters

data: PolarsLazyOrDataFrame

Polars DataFrame or LazyFrame containing data to check.

column: str

Name of the column that should be monotonic.

decreasing: bool = False

Should the column be decreasing, by default False

strict: bool = True

The series must be stricly increasing or decreasing, no consecutive equal values are allowed, by default True

interval: Optional[Union[int, float, str, pl.Duration]] = None

For time-based column, the interval can be specified as a string as in the function dt.offset_by or pl.DataFrame().rolling. It can also be specified with the pl.duration() function directly in a more explicit manner.

When using a string, the interval is dictated by the following string language:

- 1ns (1 nanosecond)
- 1us (1 microsecond)
- 1ms (1 millisecond)
- 1s (1 second)
- 1m (1 minute)
- 1h (1 hour)
- 1d (1 calendar day)
- 1w (1 calendar week)
- 1mo (1 calendar month)
- 1q (1 calendar quarter)
- 1y (1 calendar year)
- 1i (1 index count)

By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”.

By default None

group_by: Optional[PolarsOverClauseInput] = None

When specified, the monotonic characteristics and intervals are estimated for each group independently.

by default None

Returns

Type	Description
PolarsLazyOrDataFrame	The original polars DataFrame or LazyFrame when the check passes

Examples

>>> import polars as pl
>>> import pelage as plg
>>> df =     given = pl.DataFrame({"int": [1, 2, 1]})
>>> df = pl.DataFrame({"int": [1, 2, 3], "str": ["x", "y", "z"]})
>>> df.pipe(plg.is_monotonic, "int")
shape: (3, 2)
┌─────┬─────┐
│ int ┆ str │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪═════╡
│ 1   ┆ x   │
│ 2   ┆ y   │
│ 3   ┆ z   │
└─────┴─────┘
>>> bad = pl.DataFrame({"data": [1, 2, 3, 1]})
>>> bad.pipe(plg.is_monotonic, "data")
Traceback (most recent call last):
...
pelage.checks.PolarsAssertError: Details
shape: (2, 1)
┌──────┐
│ data │
│ ---  │
│ i64  │
╞══════╡
│ 3    │
│ 1    │
└──────┘
Error with the DataFrame passed to the check function:
--> Column "data" expected to be monotonic but is not, try .sort("data")

The folloing example details how to perform this checks for groups:

>>> given = pl.DataFrame(
...     [
...         ("2020-01-01 01:42:00", "A"),
...         ("2020-01-01 01:43:00", "A"),
...         ("2020-01-01 01:44:00", "A"),
...         ("2021-12-12 01:43:00", "B"),
...         ("2021-12-12 01:44:00", "B"),
...     ],
...     schema=["dates", "group"],
...     orient="row",
... ).with_columns(pl.col("dates").str.to_datetime())
>>> given.pipe(plg.is_monotonic, "dates", interval="1m", group_by="group")
shape: (5, 2)
┌─────────────────────┬───────┐
│ dates               ┆ group │
│ ---                 ┆ ---   │
│ datetime[μs]        ┆ str   │
╞═════════════════════╪═══════╡
│ 2020-01-01 01:42:00 ┆ A     │
│ 2020-01-01 01:43:00 ┆ A     │
│ 2020-01-01 01:44:00 ┆ A     │
│ 2021-12-12 01:43:00 ┆ B     │
│ 2021-12-12 01:44:00 ┆ B     │
└─────────────────────┴───────┘
>>> given.pipe(plg.is_monotonic, "dates", interval="3m", group_by="group")
Traceback (most recent call last):
...
pelage.checks.PolarsAssertError: Details
shape: (3, 3)
┌─────────────────────┬───────┬────────────────────────────────┐
│ dates               ┆ group ┆ _previous_entry_with_3m_offset │
│ ---                 ┆ ---   ┆ ---                            │
│ datetime[μs]        ┆ str   ┆ datetime[μs]                   │
╞═════════════════════╪═══════╪════════════════════════════════╡
│ 2020-01-01 01:43:00 ┆ A     ┆ 2020-01-01 01:45:00            │
│ 2020-01-01 01:44:00 ┆ A     ┆ 2020-01-01 01:46:00            │
│ 2021-12-12 01:44:00 ┆ B     ┆ 2021-12-12 01:46:00            │
└─────────────────────┴───────┴────────────────────────────────┘
Error with the DataFrame passed to the check function:
--> Intervals differ from the specified one: 3m.