Feature: Customizable column names and extra config placeholder #127
@hrfmartins thank you for the contribution. Really appreciated. I have a few requests. Can you please sign your commits? We require all commits to be signed with a GPG key. Please also run the linter and formatter. We currently have an issue with running integration tests triggered from forks, so your PR may be blocked at the moment.
dbc3dd2 to a335d2a
@mwojtyczka GPG signing is done, and I ran lint and fmt and fixed the issues :) Sorry for the inconvenience. Is there anything I can or need to do about the fork issue? Thank you
Thank you! We are working on fixing the fork issue. Will keep you posted.
…r other future configurations
ed300d9 to a09b0fe
@hrfmartins
Can you please extend the existing guide to show how to customize the reporting columns? Perhaps add a new section, something like "Additional configuration", before the custom checks:
https://github.com/databrickslabs/dqx/blob/main/docs/dqx/docs/guide.mdx#quality-rules-and-creation-of-custom-checks
Can you please also extend the demos, probably with a new cell here:
https://github.com/databrickslabs/dqx/blob/main/demos/dqx_demo_library.py#L286
Co-authored-by: Marcin Wojtyczka <[email protected]>
…custom column mappings against apply_checks_by_metadata_and_split
…s' into feature/customizable-column-names
@mwojtyczka I just added documentation and a new entry in the demo notebook.
dq_engine = DQEngine(ws, extra_params=ExtraParams(column_names={'errors': 'ERROR', 'warnings': 'WARN'}))
test_df = spark.createDataFrame([[1, 3, 3], [2, None, 4], [None, 4, None], [None, None, None]], SCHEMA)

checks = [{"criticality": "warn", "check": {"function": "col_test_check_func", "arguments": {"col_name": "a"}}}]
Let's make this test more focused. Please use a regular check, not a custom one. Currently col_test_check_func is used, which is a custom check.
checks = [
{"criticality": "warn", "check": {"function": "col_test_check_func", "arguments": {"col_name": "a"}}},
Please use one of the standard checks, not a custom check.
Thank you!
This PR adds a placeholder for extra DQEngine configuration (ExtraParams). It also makes the reporting column names customizable, so the default names can be replaced.
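To illustrate the idea of an extra-config placeholder with column-name overrides, here is a minimal, self-contained sketch in plain Python. It is not the code from this PR: `ExtraParamsSketch`, `resolve_column_names`, and the default names `_errors` / `_warnings` are assumptions made for illustration only. The `errors` and `warnings` keys mirror the `column_names` mapping used in the test snippet in this conversation.

```python
from dataclasses import dataclass, field

# Assumed defaults for illustration; the names the library actually uses may differ.
DEFAULT_COLUMN_NAMES = {"errors": "_errors", "warnings": "_warnings"}


@dataclass(frozen=True)
class ExtraParamsSketch:
    """Hypothetical stand-in for an extra-config holder such as ExtraParams."""

    # Maps a logical reporting column ("errors", "warnings") to a custom name.
    column_names: dict = field(default_factory=dict)


def resolve_column_names(extra_params=None):
    """Merge user overrides on top of the defaults and reject unknown keys."""
    overrides = extra_params.column_names if extra_params else {}
    unknown = set(overrides) - set(DEFAULT_COLUMN_NAMES)
    if unknown:
        raise ValueError(f"Unknown reporting column keys: {sorted(unknown)}")
    # Later keys win, so any override replaces the matching default.
    return {**DEFAULT_COLUMN_NAMES, **overrides}


custom = resolve_column_names(
    ExtraParamsSketch(column_names={"errors": "ERROR", "warnings": "WARN"})
)
print(custom)
```

Keeping the overrides in a single optional object means callers who pass nothing get the defaults unchanged, and new settings can later be added to the same holder without changing the engine's constructor signature.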
Changes
Linked issues
Resolves #46
Tests