has_mandatory_values
checks.has_mandatory_values(data, items, group_by=None)
Ensure that all specified values are present in their respective column.
Parameters
data: PolarsLazyOrDataFrame
-
Polars DataFrame or LazyFrame containing data to check.
items: Dict[str, list]
-
A dictionary whose keys are column names and whose values are lists containing all the required values for that column.
group_by: Optional[PolarsOverClauseInput] = None
-
When specified, perform the check per group instead of over the whole column. Defaults to None.
Returns
Type | Description |
---|---|
PolarsLazyOrDataFrame | The original polars DataFrame or LazyFrame when the check passes |
Examples
>>> import polars as pl
>>> import pelage as plg
>>> df = pl.DataFrame({"a": [1, 2]})
>>> df.pipe(plg.has_mandatory_values, {"a": [1, 2]})
shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 1   │
│ 2   │
└─────┘
>>> df.pipe(plg.has_mandatory_values, {"a": [3, 4]})
Traceback (most recent call last):
...
pelage.checks.PolarsAssertError: Details
Error with the DataFrame passed to the check function:
-->Missing mandatory values in the following columns: {'a': [3, 4]}
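To make the whole-column semantics concrete, here is a minimal pure-Python sketch of the same idea. It is an illustration only, not pelage's implementation: the helper name `missing_mandatory_values` and the plain-dict representation of the data are assumptions made for this example.

```python
from typing import Dict


def missing_mandatory_values(
    columns: Dict[str, list], items: Dict[str, list]
) -> Dict[str, list]:
    """Return, per column, the required values that never appear in it.

    Hypothetical helper: `columns` maps column names to their values,
    `items` maps column names to the values that must be present.
    """
    missing = {}
    for name, required in items.items():
        present = set(columns[name])
        # Keep the required values in their original order.
        absent = [value for value in required if value not in present]
        if absent:
            missing[name] = absent
    return missing


data = {"a": [1, 2]}
print(missing_mandatory_values(data, {"a": [1, 2]}))  # {}
print(missing_mandatory_values(data, {"a": [3, 4]}))  # {'a': [3, 4]}
```

An empty result corresponds to a passing check. A non-empty result mirrors the dictionary reported in the error message above.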
The following example shows how to perform this check per group:
>>> group_df_example = pl.DataFrame(
...     {
...         "a": [1, 1, 1, 2],
...         "group": ["G1", "G1", "G2", "G2"],
...     }
... )
>>> group_df_example.pipe(plg.has_mandatory_values, {"a": [1, 2]})
shape: (4, 2)
┌─────┬───────┐
│ a   ┆ group │
│ --- ┆ ---   │
│ i64 ┆ str   │
╞═════╪═══════╡
│ 1   ┆ G1    │
│ 1   ┆ G1    │
│ 1   ┆ G2    │
│ 2   ┆ G2    │
└─────┴───────┘
>>> group_df_example.pipe(plg.has_mandatory_values, {"a": [1, 2]}, group_by="group")
Traceback (most recent call last):
...
pelage.checks.PolarsAssertError: Details
shape: (1, 3)
┌───────┬───────────┬────────────────┐
│ group ┆ a         ┆ a_expected_set │
│ ---   ┆ ---       ┆ ---            │
│ str   ┆ list[i64] ┆ list[i64]      │
╞═══════╪═══════════╪════════════════╡
│ G1    ┆ [1]       ┆ [1, 2]         │
└───────┴───────────┴────────────────┘
Error with the DataFrame passed to the check function:
-->Some groups are missing mandatory values
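The grouped variant can be pictured the same way: collect the values seen in each group, then compare each group's set against the required values. The sketch below is a plain-Python illustration under that assumption. The helper `groups_missing_values` and the list-of-dicts row format are hypothetical and are not part of pelage.

```python
from collections import defaultdict
from typing import Dict, List


def groups_missing_values(
    rows: List[dict], column: str, required: list, group_by: str
) -> Dict[str, list]:
    """Map each group to the required values it lacks (hypothetical helper)."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[group_by]].add(row[column])

    result = {}
    for group, values in seen.items():
        absent = [value for value in required if value not in values]
        if absent:
            result[group] = absent
    return result


rows = [
    {"a": 1, "group": "G1"},
    {"a": 1, "group": "G1"},
    {"a": 1, "group": "G2"},
    {"a": 2, "group": "G2"},
]
print(groups_missing_values(rows, "a", [1, 2], "group"))  # {'G1': [2]}
```

Group G1 only ever contains the value 1, so it is the group flagged in the error above, while G2 contains both 1 and 2 and passes.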