Enable federated XGBoost using bootstrap aggregation in Task Runner #1151

Open
wants to merge 22 commits into
base: develop
from

Conversation

kta-intel
Collaborator

@kta-intel kta-intel commented Nov 15, 2024

This PR enables TaskRunner-based federated XGBoost using bootstrap aggregation (bagging).

Specifically, this PR:

  • creates an xgb_higgs task runner workspace to train on the Higgs dataset [ref], with all required code (src/taskrunner.py, src/dataloader.py, plan/*.yaml, etc.)
  • adds a tasks_xgb.yaml to enable new FedBaggingXGBoost aggregation when running xgb training workloads
  • adds a delta_updates parameter to Aggregator so that delta updating can be bypassed. For deep learning models, sending weight deltas makes sense because the model size should stay relatively consistent; for tree-based algorithms it makes less sense because more trees are added over time.
    • delta_updates is set to true by default to preserve normal behavior. The xgboost task runner explicitly sets it to false to bypass delta updating.
  • introduces new loader_xgb.py as the backend / superclass to src/dataloader.py
  • introduces new runner_xgb.py as the backend / superclass to src/taskrunner.py
  • introduces a new federated bootstrap aggregation algorithm for xgboost in aggregation_function.fed_bagging, which bags the latest trees into a global model, consistent with currently accepted federated xgboost algorithms in the industry
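
To illustrate the bagging idea from the last bullet, here is a minimal sketch in plain Python. It is not the code in aggregation_function.fed_bagging: the `Tree` dict, the `fed_bagging_sketch` function, and the `trees_per_round` parameter are hypothetical stand-ins, and the sketch assumes that "bagging the latest trees" means appending each collaborator's newest trees to the global ensemble.

```python
from copy import deepcopy
from typing import Dict, List

# Hypothetical stand-in for one serialized boosted tree (not the xgboost format).
Tree = Dict[str, object]


def fed_bagging_sketch(global_trees: List[Tree],
                       local_models: List[List[Tree]],
                       trees_per_round: int = 1) -> List[Tree]:
    """Append each collaborator's newest trees to the global ensemble."""
    if trees_per_round < 1:
        raise ValueError("trees_per_round must be at least 1")
    new_global = deepcopy(global_trees)
    for local_trees in local_models:
        # Only the trees trained this round are new; the earlier ones are
        # assumed to already be part of the global model.
        latest = local_trees[-trees_per_round:]
        for tree in latest:
            bagged = deepcopy(tree)
            bagged["id"] = len(new_global)  # renumber inside the global ensemble
            new_global.append(bagged)
    return new_global


# Toy run: two collaborators each add one tree per round on top of the global model.
global_model: List[Tree] = []
for round_num in range(3):
    local_models = []
    for collaborator in ("col1", "col2"):
        local = deepcopy(global_model)
        local.append({"id": len(local), "owner": collaborator, "round": round_num})
        local_models.append(local)
    global_model = fed_bagging_sketch(global_model, local_models)
    print(f"round {round_num}: {len(global_model)} trees")
```

In this toy run the global model grows by two trees every round, so there is no fixed-size parameter set to subtract between rounds, which matches the reasoning given above for setting delta_updates to false for tree-based models.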

Signed-off-by: kta-intel <[email protected]>
This reverts commit d3937ef.

Signed-off-by: kta-intel <[email protected]>
@kta-intel kta-intel changed the title from "[WIP] Enable federated XGBoost using bootstrap aggregation in Task Runner" to "Enable federated XGBoost using bootstrap aggregation in Task Runner" Nov 15, 2024
@kta-intel kta-intel marked this pull request as ready for review November 15, 2024 22:14
Labels
None yet
Projects
None yet

1 participant