Parser is executed after validation, not before #1865
Hello, I've checked, and it seems that if the inner Config class is provided, those checks (the core parsers) run before the custom parsers. When it tries to add column 'C' to the dataframe due to [...]
If the order of execution in /pandera/backends/pandas/container.py:
# Collect status of columns against schema
column_info = self.collect_column_info(check_obj, schema)
core_parsers: List[Tuple[Callable[..., Any], Tuple[Any, ...]]] = [
    (self.add_missing_columns, (schema, column_info)),
    (self.strict_filter_columns, (schema, column_info)),
    (self.coerce_dtype, (schema,)),
]
for parser, args in core_parsers:
    try:
        check_obj = parser(check_obj, *args)
    except SchemaError as exc:
        error_handler.collect_error(
            validation_type(exc.reason_code), exc.reason_code, exc
        )
    except SchemaErrors as exc:
        error_handler.collect_errors(exc.schema_errors)
# run custom parsers
check_obj = self.run_parsers(schema, check_obj)
was changed to this, then it would work:
# run custom parsers
check_obj = self.run_parsers(schema, check_obj)
# Collect status of columns against schema
column_info = self.collect_column_info(check_obj, schema)
core_parsers: List[Tuple[Callable[..., Any], Tuple[Any, ...]]] = [
    (self.add_missing_columns, (schema, column_info)),
    (self.strict_filter_columns, (schema, column_info)),
    (self.coerce_dtype, (schema,)),
]
for parser, args in core_parsers:
    try:
        check_obj = parser(check_obj, *args)
    except SchemaError as exc:
        error_handler.collect_error(
            validation_type(exc.reason_code), exc.reason_code, exc
        )
    except SchemaErrors as exc:
        error_handler.collect_errors(exc.schema_errors)
I have no idea whether this is intended behavior, and I haven't looked into it any deeper. If this is indeed a bug, I'd be happy to pick this issue up.
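To make the ordering argument concrete, here is a toy model of the two orderings in plain Python. It is not pandera code: `add_c`, `require_columns`, `run`, `try_run`, and `REQUIRED_COLUMNS` are hypothetical stand-ins, a dict stands in for the dataframe, and the failure is raised immediately instead of being collected by an error handler as in the snippet above.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Step = Callable[[Row], Row]

# Hypothetical schema: the data must end up with these three columns.
REQUIRED_COLUMNS = ("a", "b", "c")


def add_c(row: Row) -> Row:
    """Stand-in for a custom parser that derives column 'c' from 'a' and 'b'."""
    return {**row, "c": row["a"] + row["b"]}


def require_columns(row: Row) -> Row:
    """Stand-in for a core step that fails when a schema column is missing."""
    missing = [name for name in REQUIRED_COLUMNS if name not in row]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    return row


def run(steps: List[Step], row: Row) -> Row:
    """Apply each step in order, feeding the result into the next step."""
    for step in steps:
        row = step(row)
    return row


def try_run(steps: List[Step], row: Row) -> str:
    """Report whether the pipeline succeeded or which error stopped it."""
    try:
        run(steps, row)
    except KeyError as exc:
        return f"failed: {exc}"
    return "ok"


data = {"a": 1, "b": 2}

# Core step first: 'c' does not exist yet, so the column check fails
# before the custom parser ever runs.
core_first = try_run([require_columns, add_c], data)

# Custom parser first: 'c' is added before the column check sees the data.
parsers_first = try_run([add_c, require_columns], data)

print(core_first)
print(parsers_first)
```

In this simplified model, only the second ordering lets a parser-created column satisfy the schema, which mirrors the reordering suggested above.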
Thanks a lot for the clarification.
Indeed, but for my use case, where the column [...]
In my opinion, this cannot be intended behavior, especially if it can be fixed (perhaps with your suggestion) without breaking anything else. Or, something even cooler would be to let users provide a [...]
I am using this class:
[...]
and it complains that c is not in the dataframe. It should not, since I add c in the parser. Plus, if I uncomment that raise Exception() line, the exception is never raised, because the error about the missing column happens first.
pandera == 0.19.3