Pelage: Defensive analysis for Polars

has_shape

checks.has_shape(data, shape, group_by=None)

Check if a DataFrame has the specified shape.

When the group_by option is set, the check is applied to the row count of each group instead of the whole DataFrame.

Parameters

data: PolarsLazyOrDataFrame

Input data

shape: Tuple[IntOrNone, IntOrNone]

Tuple with the expected DataFrame shape, in the same (rows, columns) form as the .shape attribute. Use None for either element of the tuple to skip the check on that dimension.

Ex: (5, None) will ensure that the dataframe has 5 rows regardless of the number of columns.
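
The None wildcard can be read as a per-dimension comparison. The snippet below is a minimal sketch in plain Python, not pelage's actual implementation; `shape_matches` is a hypothetical helper introduced only to illustrate the rule.

```python
from typing import Optional, Tuple

# Expected shape: (rows, columns), where None means "do not check this dimension".
Shape = Tuple[Optional[int], Optional[int]]


def shape_matches(actual: Tuple[int, int], expected: Shape) -> bool:
    """Return True when every non-None expected dimension equals the actual one.

    Hypothetical helper for illustration; not part of pelage.
    """
    return all(exp is None or exp == act for act, exp in zip(actual, expected))


print(shape_matches((5, 3), (5, None)))     # True: only the row count is checked
print(shape_matches((5, 3), (None, 2)))     # False: 3 columns, 2 expected
print(shape_matches((5, 3), (None, None)))  # True: nothing is checked
```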

group_by: Optional[PolarsOverClauseInput] = None

When specified, compares the number of rows in each group with the expected row count. Defaults to None.
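
Conceptually, the grouped check counts rows per group and flags the groups whose count differs from the expected value. The snippet below sketches that idea with the standard library only; it is an assumption about the general logic, not pelage's code, and `groups_with_unexpected_row_count` is a hypothetical name.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def groups_with_unexpected_row_count(
    group_keys: Iterable[Hashable], expected_rows: int
) -> Dict[Hashable, int]:
    """Map each offending group key to its actual row count.

    Hypothetical helper for illustration; not part of pelage.
    """
    counts = Counter(group_keys)  # one entry per group, value = number of rows
    return {key: n for key, n in counts.items() if n != expected_rows}


# Values of a grouping column "b" with three rows: group "b" appears twice.
print(groups_with_unexpected_row_count(["a", "b", "b"], expected_rows=1))  # {'b': 2}
```

An empty result corresponds to a passing check; a non-empty one lists the groups that would be reported in the error.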

Returns

PolarsLazyOrDataFrame

The original Polars DataFrame or LazyFrame when the check passes

Examples

>>> import polars as pl
>>> import pelage as plg
>>> df = pl.DataFrame({"a": [1, 2], "b": ["a", "b"]})
>>> df.pipe(plg.has_shape, (2, 2))
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪═════╡
│ 1   ┆ a   │
│ 2   ┆ b   │
└─────┴─────┘
>>> df.pipe(plg.has_shape, (2, None))
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ str │
╞═════╪═════╡
│ 1   ┆ a   │
│ 2   ┆ b   │
└─────┴─────┘
>>> df.pipe(plg.has_shape, (1, 2))
Traceback (most recent call last):
...
pelage.checks.PolarsAssertError: Details
Error with the DataFrame passed to the check function:
--> The data has not the expected shape: (1, 2)
>>> group_example_df = pl.DataFrame(
...     {
...         "a": [1, 2, 3],
...         "b": ["a", "b", "b"],
...     }
... )
>>> group_example_df.pipe(plg.has_shape, (1, None), group_by="b")
Traceback (most recent call last):
...
pelage.checks.PolarsAssertError: Details
shape: (1, 2)
┌─────┬─────┐
│ b   ┆ len │
│ --- ┆ --- │
│ str ┆ u32 │
╞═════╪═════╡
│ b   ┆ 2   │
└─────┴─────┘
Error with the DataFrame passed to the check function:
--> The number of rows per group does not match the specified value: 1