Pelage: Defensive analysis for Polars

is_monotonic

checks.is_monotonic(data, column, decreasing=False, strict=True, interval=None, group_by=None)

Verify that values in a column are consecutively increasing or decreasing.

Parameters

data: PolarsLazyOrDataFrame

Polars DataFrame or LazyFrame containing data to check.

column: str

Name of the column that should be monotonic.

decreasing: bool = False

Whether the column should be decreasing instead of increasing, by default False

strict: bool = True

The series must be strictly increasing or decreasing: no consecutive equal values are allowed. By default True

interval: Optional[Union[int, float, str, pl.Duration]] = None

For time-based columns, the interval can be specified as a string, as in dt.offset_by or pl.DataFrame().rolling. It can also be specified more explicitly with the pl.duration() function.

When using a string, the interval is dictated by the following string language:

- 1ns (1 nanosecond)
- 1us (1 microsecond)
- 1ms (1 millisecond)
- 1s (1 second)
- 1m (1 minute)
- 1h (1 hour)
- 1d (1 calendar day)
- 1w (1 calendar week)
- 1mo (1 calendar month)
- 1q (1 calendar quarter)
- 1y (1 calendar year)
- 1i (1 index count)

By “calendar day”, we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for “calendar week”, “calendar month”, “calendar quarter”, and “calendar year”.

By default None

group_by: Optional[PolarsOverClauseInput] = None

When specified, the monotonic characteristics and intervals are estimated for each group independently.

By default None
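
The decreasing and strict parameters together select one of four comparisons applied to each pair of consecutive values. As an illustration of those semantics only, here is a minimal plain-Python sketch. The helper name is_monotonic_sketch is hypothetical, and this is not how pelage implements the check.

```python
import operator


def is_monotonic_sketch(values, decreasing=False, strict=True):
    """Hypothetical helper mirroring the documented flag semantics only."""
    # Pick the comparison that each (previous, current) pair must satisfy.
    if decreasing:
        compare = operator.gt if strict else operator.ge
    else:
        compare = operator.lt if strict else operator.le
    return all(compare(prev, cur) for prev, cur in zip(values, values[1:]))


print(is_monotonic_sketch([1, 2, 3]))                    # True
print(is_monotonic_sketch([1, 2, 2]))                    # False: equal neighbours
print(is_monotonic_sketch([1, 2, 2], strict=False))      # True
print(is_monotonic_sketch([3, 2, 1], decreasing=True))   # True
```

With strict=False, repeated values pass. With the default strict=True, a single pair of equal neighbours is enough to fail.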

Returns

PolarsLazyOrDataFrame

The original Polars DataFrame or LazyFrame when the check passes.

Examples

>>> import polars as pl
>>> import pelage as plg
>>> df = pl.DataFrame({"int": [1, 2, 3], "str": ["x", "y", "z"]})
>>> df.pipe(plg.is_monotonic, "int")
shape: (3, 2)
┌─────┬─────┐
│ int ┆ str │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪═════╡
│ 1   ┆ x   │
│ 2   ┆ y   │
│ 3   ┆ z   │
└─────┴─────┘
>>> bad = pl.DataFrame({"data": [1, 2, 3, 1]})
>>> bad.pipe(plg.is_monotonic, "data")
Traceback (most recent call last):
...
pelage.checks.PolarsAssertError: Details
shape: (2, 1)
┌──────┐
│ data │
│ ---  │
│ i64  │
╞══════╡
│ 3    │
│ 1    │
└──────┘
Error with the DataFrame passed to the check function:
--> Column "data" expected to be monotonic but is not, try .sort("data")

The following example shows how to perform this check per group:

>>> given = pl.DataFrame(
...     [
...         ("2020-01-01 01:42:00", "A"),
...         ("2020-01-01 01:43:00", "A"),
...         ("2020-01-01 01:44:00", "A"),
...         ("2021-12-12 01:43:00", "B"),
...         ("2021-12-12 01:44:00", "B"),
...     ],
...     schema=["dates", "group"],
...     orient="row",
... ).with_columns(pl.col("dates").str.to_datetime())
>>> given.pipe(plg.is_monotonic, "dates", interval="1m", group_by="group")
shape: (5, 2)
┌─────────────────────┬───────┐
│ dates               ┆ group │
│ ---                 ┆ ---   │
│ datetime[μs]        ┆ str   │
╞═════════════════════╪═══════╡
│ 2020-01-01 01:42:00 ┆ A     │
│ 2020-01-01 01:43:00 ┆ A     │
│ 2020-01-01 01:44:00 ┆ A     │
│ 2021-12-12 01:43:00 ┆ B     │
│ 2021-12-12 01:44:00 ┆ B     │
└─────────────────────┴───────┘
>>> given.pipe(plg.is_monotonic, "dates", interval="3m", group_by="group")
Traceback (most recent call last):
...
pelage.checks.PolarsAssertError: Details
shape: (3, 3)
┌─────────────────────┬───────┬────────────────────────────────┐
│ dates               ┆ group ┆ _previous_entry_with_3m_offset │
│ ---                 ┆ ---   ┆ ---                            │
│ datetime[μs]        ┆ str   ┆ datetime[μs]                   │
╞═════════════════════╪═══════╪════════════════════════════════╡
│ 2020-01-01 01:43:00 ┆ A     ┆ 2020-01-01 01:45:00            │
│ 2020-01-01 01:44:00 ┆ A     ┆ 2020-01-01 01:46:00            │
│ 2021-12-12 01:44:00 ┆ B     ┆ 2021-12-12 01:46:00            │
└─────────────────────┴───────┴────────────────────────────────┘
Error with the DataFrame passed to the check function:
--> Intervals differ from the specified one: 3m.
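
The grouped interval check shown above can be pictured as: within each group, every difference between consecutive timestamps must equal the interval. The plain-Python sketch below assumes fixed-length intervals expressed as datetime.timedelta, so calendar-aware units such as 1mo are out of scope. The helper name has_constant_interval is hypothetical, and this is not pelage's implementation.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def has_constant_interval(rows, interval):
    """Hypothetical helper: rows are (timestamp, group) pairs in original order."""
    by_group = defaultdict(list)
    for timestamp, group in rows:
        by_group[group].append(timestamp)
    # Each group is checked on its own, so the jump from group A to B is ignored.
    return all(
        cur - prev == interval
        for stamps in by_group.values()
        for prev, cur in zip(stamps, stamps[1:])
    )


rows = [
    (datetime(2020, 1, 1, 1, 42), "A"),
    (datetime(2020, 1, 1, 1, 43), "A"),
    (datetime(2020, 1, 1, 1, 44), "A"),
    (datetime(2021, 12, 12, 1, 43), "B"),
    (datetime(2021, 12, 12, 1, 44), "B"),
]
print(has_constant_interval(rows, timedelta(minutes=1)))  # True
print(has_constant_interval(rows, timedelta(minutes=3)))  # False
```

Without the grouping step, the gap of almost two years between the last A row and the first B row would fail the one-minute check.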