maintains_relationships
checks.maintains_relationships(data, other_df, column)
Check that the set of values in the selected column is the same in both DataFrames. This helps maintain referential integrity across a transformation.
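The idea behind the check can be illustrated with plain Python sets. This is only a conceptual sketch under the assumption that the comparison is made on distinct values in both directions; it is not pelage's implementation, and `key_differences` is a hypothetical helper name introduced for illustration.

```python
def key_differences(before, after):
    """Return (removed, added) keys between two iterables of key values.

    Hypothetical helper, for illustration only (not part of pelage).
    """
    before_keys = set(before)
    after_keys = set(after)
    removed = before_keys - after_keys  # keys lost by the transformation
    added = after_keys - before_keys    # keys that appeared unexpectedly
    return removed, added


# Identical key sets: nothing removed, nothing added.
same_removed, same_added = key_differences(["a", "b"], ["a", "b"])

# The key "b" was dropped by the transformation.
removed, added = key_differences(["a", "b"], ["a"])
```

Under this reading, the check passes only when both returned sets are empty; duplicates and row order do not matter, since only distinct values are compared.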
Parameters
data: PolarsLazyOrDataFrame
-
Dataframe after transformation
other_df: Union[pl.DataFrame, pl.LazyFrame]
-
Distant dataframe, usually the one before transformation
column: str
-
Column to check for keys/ids
Returns
Type | Description |
---|---|
PolarsLazyOrDataFrame | The original polars DataFrame or LazyFrame when the check passes |
Examples
>>> import polars as pl
>>> import pelage as plg
>>> initial_df = pl.DataFrame({"a": ["a", "b"]})
>>> final_df = pl.DataFrame({"a": ["a", "b"]})
>>> final_df.pipe(plg.maintains_relationships, initial_df, "a")
shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ str │
╞═════╡
│ a   │
│ b   │
└─────┘
>>> final_df = pl.DataFrame({"a": ["a"]})
>>> final_df.pipe(plg.maintains_relationships, initial_df, "a")
Traceback (most recent call last):
...
pelage.checks.PolarsAssertError: Details
Error with the DataFrame passed to the check function:
-->Some values were removed from col 'a', for ex: ('b',)