Data Preprocessing #39

Open
amc-corey-cox opened this issue Feb 25, 2025 · 0 comments

Comments

@amc-corey-cox
Owner

This is the tracking issue for all development needed to preprocess the data before handing it off to the LinkML tools. It covers two kinds of work: simple preprocessing, which fixes basic data anomalies (missing values encoded as literal values, enum inconsistencies, etc.), and complex preprocessing, which handles transformations that may be non-atomic (splitting a full name into first name and last name, joining multiple columns to produce multiple output columns).
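To make the distinction concrete, here is a minimal sketch of what each kind of step could look like on a single row. This is not the project's implementation: the sentinel list, the enum alias table, the column names (`full_name`, `sex`), and every function name below are assumptions made up for illustration, and the name split is deliberately naive (it splits on the first space only).

```python
from typing import Dict, Optional

# Hypothetical sentinel strings that stand in for a missing value.
MISSING_SENTINELS = {"", "na", "n/a", "null", "none", "-999"}

# Hypothetical mapping from observed spellings to one canonical enum value.
ENUM_ALIASES = {"m": "male", "male": "male", "f": "female", "female": "female"}


def clean_missing(value: Optional[str]) -> Optional[str]:
    """Simple step: turn sentinel strings into a real missing value (None)."""
    if value is None:
        return None
    stripped = value.strip()
    return None if stripped.lower() in MISSING_SENTINELS else stripped


def normalize_enum(value: Optional[str], aliases: Dict[str, str]) -> Optional[str]:
    """Simple step: map inconsistent spellings onto one canonical value.

    Unrecognized values are passed through unchanged so they can be reviewed.
    """
    cleaned = clean_missing(value)
    if cleaned is None:
        return None
    return aliases.get(cleaned.lower(), cleaned)


def split_full_name(full_name: Optional[str]) -> Dict[str, Optional[str]]:
    """Complex step: one input column produces two output columns."""
    cleaned = clean_missing(full_name)
    if cleaned is None:
        return {"first_name": None, "last_name": None}
    # Naive split on the first space; real names need more careful handling.
    first, _, last = cleaned.partition(" ")
    return {"first_name": first, "last_name": last.strip() or None}


def preprocess_row(row: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Apply the simple and complex steps to one row of raw data."""
    out = {"sex": normalize_enum(row.get("sex"), ENUM_ALIASES)}
    out.update(split_full_name(row.get("full_name")))
    return out
```

For example, `preprocess_row({"full_name": "Ada Lovelace", "sex": "F"})` would yield a row with `first_name`, `last_name`, and a normalized `sex` value, ready to hand to the downstream tools.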

The two main sub-issues will be the simple preprocess tool and the complex preprocess tool. Sub-issues specific to either tool should be child issues of that tool's sub-issue, while sub-issues about integrating the tools into the pipeline should be filed directly under this issue.
