Description
This ticket combines the two components because I believe the setup will be similar; at the very least, some mechanism is needed to set up different numbers of datasets first, so it makes sense to bundle the two together.
The main goal is to benchmark the performance of individual components; this will tell us whether refactoring work has a positive or negative impact on performance. Currently we only check that tests pass, so we have no idea whether a change slows things down. We have done this in the past, but only on an ad-hoc basis; we should run these benchmarks regularly (or at least once per release).
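For running benchmarks regularly rather than ad hoc, one option (my suggestion, not something this ticket prescribes) is airspeed velocity (asv), which records timings per commit so regressions show up across releases. A minimal sketch of a parametrised DataCatalog benchmark, assuming kedro>=0.19 naming; the file path and entry counts are hypothetical:

```python
# benchmarks/benchmark_datacatalog.py -- hypothetical module; asv discovers
# classes here and runs each `time_*` method once per parameter value.
from kedro.io import DataCatalog


class TimeDataCatalog:
    params = [10, 100, 1_000]          # number of catalog entries to test
    param_names = ["n_datasets"]

    def setup(self, n_datasets):
        # Catalog config with n explicit in-memory entries.
        self.config = {
            f"dataset_{i}": {"type": "kedro.io.MemoryDataset"}
            for i in range(n_datasets)
        }

    def time_from_config(self, n_datasets):
        # The measured operation: materialising the catalog from config.
        DataCatalog.from_config(self.config)
```

`asv run` and `asv compare` would then give us per-release numbers instead of one-off measurements.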
The direction is simple: we want to measure how execution time changes with the number of entries. We would start with Datasets and the Catalog, as this fits into the DataCatalog 2.0 work and will be immediately useful. The stress tests for individual components would cover (see the sketch after the list):
- DataCatalog (test scaling with the number of datasets, covering both explicit catalog.yml entries and dataset factory patterns)
- ConfigLoader (scaling with the number of parameters)
- Optional: pipelines generated in loops (dynamic pipelines)
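A minimal sketch of the kind of stress test meant here, using only the public DataCatalog.from_config API; the entry counts, the factory pattern, and the timing harness are illustrative, not part of the ticket:

```python
import timeit

from kedro.io import DataCatalog


def build_config(n_datasets: int) -> dict:
    """Catalog config with n explicit entries plus one factory pattern."""
    config = {
        f"dataset_{i}": {"type": "kedro.io.MemoryDataset"}
        for i in range(n_datasets)
    }
    # One dataset factory pattern (requires kedro-datasets for CSVDataset).
    config["{name}_csv"] = {
        "type": "pandas.CSVDataset",
        "filepath": "data/{name}.csv",
    }
    return config


# Measure how catalog creation time scales with the number of entries.
for n in (10, 100, 1_000, 10_000):
    config = build_config(n)
    seconds = timeit.timeit(lambda: DataCatalog.from_config(config), number=5)
    print(f"{n:>6} entries: {seconds / 5:.4f}s per DataCatalog.from_config")
```

Plotting these timings against n, ideally before and after a refactoring, is exactly the comparison we are missing today.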
Context
#3957 (comment): "As for the DataCatalog, the most important thing is to test it within the pipeline, via the CLI, and separately, by simulating scenarios that call specific methods (such as add_feed_dict). The tests themselves should cover different sets and combinations of parameters, datasets and patterns."
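For the add_feed_dict scenario mentioned above, a test could look roughly like this; the pytest parametrisation and the specific combinations are illustrative (add_feed_dict accepts both raw values and pre-built datasets):

```python
import pytest

from kedro.io import DataCatalog, MemoryDataset


@pytest.mark.parametrize("n_entries", [1, 100, 10_000])
@pytest.mark.parametrize("wrap_in_dataset", [True, False])
def test_add_feed_dict(n_entries, wrap_in_dataset):
    # Feed dicts may hold raw values or pre-built datasets; cover both.
    value = MemoryDataset(data=42) if wrap_in_dataset else 42
    feed_dict = {f"params:p{i}": value for i in range(n_entries)}

    catalog = DataCatalog()
    catalog.add_feed_dict(feed_dict)

    # Raw values get wrapped in a MemoryDataset internally, so both
    # branches should load back the original value.
    assert catalog.load("params:p0") == 42
```

Timing the same call across n_entries would fold this scenario into the benchmarks above.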