Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SDK] Train API #1962

Merged
merged 33 commits into from
Jan 10, 2024
Merged

[SDK] Train API #1962

merged 33 commits into from
Jan 10, 2024

Conversation

deepanker13
Copy link
Contributor

What this PR does / why we need it:

  1. This pr contains the train api function which will be called by the user to run the training job.
  2. Constants have been added to access them at multiple places.

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Partially Fixes #1945

Checklist:

  • Docs included if any changes are user facing

@coveralls
Copy link

coveralls commented Dec 11, 2023

Pull Request Test Coverage Report for Build 7477774579

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.01%) to 42.896%

Totals Coverage Status
Change from base Build 7477136161: 0.01%
Covered Lines: 3756
Relevant Lines: 8756

💛 - Coveralls

@deepanker13
Copy link
Contributor Author

/hold depends on #1959 and #1958

@deepanker13
Copy link
Contributor Author

/hold cancel

@andreyvelich andreyvelich changed the title Train api [SDK] Train API Dec 14, 2023
@andreyvelich
Copy link
Member

/assign @andreyvelich
@deepanker13 Please can you rebase this PR ?

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this @deepanker13 !

sdk/python/kubeflow/training/constants/constants.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/constants/constants.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
@deepanker13
Copy link
Contributor Author

/assign @andreyvelich
@deepanker13 Please can you rebase this PR ?
done

sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/api/training_client.py Outdated Show resolved Hide resolved
sdk/python/kubeflow/training/constants/constants.py Outdated Show resolved Hide resolved
@deepanker13 deepanker13 force-pushed the train_api branch 2 times, most recently from a64b243 to e4df199 Compare December 21, 2023 18:59
@deepanker13 deepanker13 force-pushed the train_api branch 2 times, most recently from 8493b38 to cd811cd Compare January 4, 2024 21:35
@deepanker13
Copy link
Contributor Author

@andreyvelich I have a reason to keep the download dir field, as there will be a single place where we define the default value and that same value will be passed through the code flow, else we will have to verify the values are same or not through the entire code flow.

@@ -1,40 +1,51 @@
from abstract_model_provider import modelProvider
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's name the directory storage_initiailizer rather than storage_init_container to be consistent with init container name ?
WDYT @deepanker13 ?

@andreyvelich
Copy link
Member

@andreyvelich I have a reason to keep the download dir field, as there will be a single place where we define the default value and that same value will be passed through the code flow, else we will have to verify the values are same or not through the entire code flow.

@deepanker13 Can you just have 3 constant variables in storage_initializer/constants.py:

INIT_CONTAINER_MOUNT_PATH = "/workspace"
VOLUME_PATH_DATASET = INIT_CONTAINER_MOUNT_PATH + "/dataset"
VOLUME_PATH_MODEL = INIT_CONTAINER_MOUNT_PATH + "/model"

And then just re-use these constants in SDK and storage_initializer.
Does it work for you @deepanker13 ?

@google-oss-prow google-oss-prow bot added size/XL and removed size/L labels Jan 10, 2024
Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, we are ready to merge this PR.
Thanks again @deepanker13 for all of this work!
/lgtm
/assign @johnugeorge @tenzen-y for the final review.

@google-oss-prow google-oss-prow bot added lgtm and removed lgtm labels Jan 10, 2024
@deepanker13
Copy link
Contributor Author

I think, we are ready to merge this PR.
Thanks again @deepanker13 for all of this work!
/lgtm
/assign @johnugeorge @tenzen-y for the final review.

@andreyvelich @johnugeorge @tenzen-y thanks for all the help

@johnugeorge
Copy link
Member

Awesome work @deepanker13
/lgtm
/approve

Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deepanker13, johnugeorge

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 006dda4 into kubeflow:master Jan 10, 2024
31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants