[DSIP-79][Task] Add Datavines task to better support data quality #16113

Open
2 tasks done
Tracked by #14102
xxzuo opened this issue Jun 3, 2024 · 9 comments · May be fixed by #16863
Labels: DSIP, help wanted

Comments

@xxzuo
Contributor

xxzuo commented Jun 3, 2024

Search before asking

  • I have searched the existing DSIPs and found no similar DSIP.

Motivation

Datavines is an easy-to-use data quality service platform that supports multiple metrics.
https://github.com/datavane/datavines

  • Datavines supports executing multiple metrics in one job.
  • Datavines provides an execution status dashboard and data quality reports.
  • Datavines supports plug-in extensions for components such as metrics, data sources, error data storage, and execution engines.
  • JDBC engines can be used to execute data quality tasks, rather than relying solely on Spark engines.

Design Detail

Script Mode

  1. Configure a data quality job in Datavines.
    image

  2. Get the job's config script file.

  3. Add a Datavines job node to the workflow and configure the script.
    image
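
The script-mode steps above could be handled on the task side roughly as follows. This is a minimal sketch, not the plugin's actual design: it assumes the job config script is JSON, and the launcher path `LAUNCHER` and the file name `datavines_job.json` are hypothetical placeholders that the issue does not specify.

```python
import json
import tempfile
from pathlib import Path
from typing import List

# Hypothetical launcher path; the issue does not name the real command.
LAUNCHER = "/opt/datavines/bin/launcher.sh"


def prepare_script(script_text: str, work_dir: Path) -> Path:
    """Validate the pasted job config and write it into the task's working directory."""
    if not script_text.strip():
        raise ValueError("script mode requires a non-empty job config script")
    json.loads(script_text)  # assumption: the config script is JSON
    path = work_dir / "datavines_job.json"  # hypothetical file name
    path.write_text(script_text, encoding="utf-8")
    return path


def build_command(script_path: Path) -> List[str]:
    """Build the argument list a scheduler worker would execute for this task."""
    return ["sh", LAUNCHER, str(script_path)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = prepare_script('{"name": "demo"}', Path(tmp))
        print(build_command(script))
```

Validating the script before writing it means a malformed config fails when the node starts, rather than later inside the external process.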

API Mode

  1. Configure a data quality job in Datavines.
    image

  2. Get the jobId.

  3. Add a Datavines job node to the workflow and configure the Datavines API address and jobId.
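
In API mode the node only needs two parameters, the Datavines API address and the jobId. A minimal sketch of how they might be modeled and validated is below. The endpoint template `EXECUTE_PATH`, the class name `DatavinesApiParams`, and the assumption that jobId is a positive integer are all hypothetical; the real Datavines API path should be taken from its documentation.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Hypothetical endpoint template; check the Datavines API for the real path.
EXECUTE_PATH = "/api/v1/job/execute/{job_id}"


@dataclass(frozen=True)
class DatavinesApiParams:
    address: str  # e.g. "http://datavines.example.com:5600"
    job_id: int   # assumption: jobId is a positive integer

    def validate(self) -> None:
        """Reject addresses without an http(s) scheme and host, and non-positive ids."""
        parts = urlsplit(self.address)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"invalid Datavines API address: {self.address!r}")
        if self.job_id <= 0:
            raise ValueError("jobId must be a positive integer")

    def execute_url(self) -> str:
        """Return the URL the task would call to trigger the configured job."""
        self.validate()
        return self.address.rstrip("/") + EXECUTE_PATH.format(job_id=self.job_id)
```

For example, `DatavinesApiParams("http://dv.example.com:5600/", 42).execute_url()` yields a URL under the hypothetical path for job 42, with the trailing slash on the address normalized away.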

Compatibility, Deprecation, and Migration Plan

No response

Test Plan

No response

Code of Conduct

@xxzuo added the DSIP and Waiting for reply labels on Jun 3, 2024
@MYiYang

MYiYang commented Jun 4, 2024

It would be nice to be able to submit a task here, see its status in DS, and stop it via Datavines.
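
One way such a submit, watch, and stop lifecycle might look on the scheduler side is a simple polling loop. This is only a sketch: the state names in `TERMINAL` and the callables `get_status`, `kill`, and `should_stop` are hypothetical stand-ins for whatever the Datavines API and the DS task runtime actually provide.

```python
import time
from typing import Callable

# Hypothetical terminal state names; the real ones depend on Datavines.
TERMINAL = {"SUCCESS", "FAILURE", "KILLED"}


def wait_for_job(
    get_status: Callable[[], str],
    kill: Callable[[], None],
    should_stop: Callable[[], bool],
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the remote job until it finishes, killing it if the scheduler asks to stop."""
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        if should_stop():
            kill()  # propagate the stop request to the remote job
            return "KILLED"
        sleep(poll_seconds)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.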

@zhangp8721

Very useful for data pipelines.

@xiaoshiqiai

If Datavines is incorporated into DS, it will be easier to integrate project management and data inspection.

@zixi0825
Member

zixi0825 commented Jun 7, 2024

+1

@SbloodyS added the help wanted label and removed the Waiting for reply label on Jun 7, 2024
@ruanwenjun
Member

You should provide a detailed design covering how to use the new task and how the task works in DS, rather than just some UI screenshots.

@xxzuo
Contributor Author

xxzuo commented Jul 11, 2024

You should provide a detailed design covering how to use the new task and how the task works in DS, rather than just some UI screenshots.

OK, I will supplement the detailed design.

@SbloodyS
Member

OK, I will supplement the detailed design.

Hi, are you still working on this?

@SbloodyS changed the title from "[DSIP-][Task] Add Datavines task to better support data quality" to "[DSIP-79][Task] Add Datavines task to better support data quality" on Oct 24, 2024
@SbloodyS mentioned this issue on Oct 24, 2024
84 tasks
@zixi0825
Member

OK, I will supplement the detailed design.

Hi, are you still working on this?

I will take this on.

@zixi0825
Member

zixi0825 commented Oct 27, 2024

Until the new task plugin is completed, shell tasks can be used to integrate Datavines. Refer to the following guide:
https://datavane.github.io/datavines-website/docs/integration/dolphin-scheduler

7 participants