Verification by condition #957
Hello! Please tell me (I'm a beginner) how to implement this check: suppose a column (col_1) contains values such as 1.1, 3.6, 5.2, ..., NaN. Whenever NaN occurs in that column, the value in another column (col_2) must be less than 100. Thank you in advance.
Replies: 2 comments 1 reply
Hi @RuslanGadzhiev, you'll have to create a dataframe-level check: https://pandera.readthedocs.io/en/stable/checks.html#wide-checks

For your use case, that would look something like:

    import pandas as pd
    import pandera as pa

    def custom_check(df):
        # create an index-aligned series of True values
        check_output = pd.Series(True, index=df.index)
        # where col1 is NA, replace True with the corresponding result of col2 < 100
        return check_output.mask(df["col1"].isna(), df["col2"] < 100)

    pa.DataFrameSchema(
        columns={...},
        checks=pa.Check(custom_check)
    )

If you're using the pa.SchemaModel API, this would look like:

    class Schema(pa.SchemaModel):
        # col1 may contain NaN, so it has to be declared nullable
        col1: pa.typing.Series[float] = pa.Field(nullable=True)
        col2: pa.typing.Series[float]

        @pa.dataframe_check
        def custom_check(cls, df):
            # create an index-aligned series of True values
            check_output = pd.Series(True, index=df.index)
            # where col1 is NA, replace True with the corresponding result of col2 < 100
            return check_output.mask(df["col1"].isna(), df["col2"] < 100)
It's amazing! Thank you for the answer! I'm sorry, I was worried and didn't finish writing my question. Namely: if NaN occurs in col_1, then the value in the other column (col_2) must be less than 100; otherwise the ratio col_1/col_2 should be between 0.2 and 1.5. Is it possible to add such a condition? Is this expression correct?

schema = pa.DataFrameSchema(
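The extended rule asked about here could be sketched roughly as follows. This is an illustration only, not an answer confirmed in the thread: the function name `conditional_check`, the sample data, and the reading of "from 0.2 to 1.5" as an inclusive range are assumptions. The column names follow the `col1`/`col2` spelling used in the reply above.

```python
import numpy as np
import pandas as pd


def conditional_check(df: pd.DataFrame) -> pd.Series:
    """Row-wise rule: col2 < 100 where col1 is NaN, else 0.2 <= col1/col2 <= 1.5."""
    col1_missing = df["col1"].isna()
    # Rule applied to rows where col1 is NaN.
    rule_when_missing = df["col2"] < 100
    # Rule applied to all other rows; between() is inclusive on both ends
    # by default (an assumption about what the question intends).
    rule_otherwise = (df["col1"] / df["col2"]).between(0.2, 1.5)
    # Keep the first rule where col1 is NaN, use the second rule elsewhere.
    return rule_when_missing.where(col1_missing, rule_otherwise)


# Hypothetical sample data covering all four cases.
sample = pd.DataFrame(
    {
        "col1": [np.nan, np.nan, 50.0, 5.0],
        "col2": [40.0, 140.0, 100.0, 100.0],
    }
)
result = conditional_check(sample)
```

On this sample the first and third rows pass and the second and fourth fail. The function can be handed to `pa.Check(...)` in the `checks` argument the same way `custom_check` is in the earlier reply. Rows where col2 is 0 produce an infinite or NaN ratio and therefore fail the range test.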