Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Avoid small batches in Exchange #12010

Closed
wants to merge 1 commit into from

Conversation

arhimondr
Copy link
Contributor

Summary:
Prevent exchange client from unblocking to early. Unblocking to early impedes
effectiveness of page merging. When the cost of creating a vector is high (for
example for data sets with high number of columns) creating small pages can
make queries significantly less efficient.

For example it was observed that when network is congested and Exchange buffers
are not filled up as fast query may experience CPU efficiency drop up to 4x: T211034421

Differential Revision: D67615570

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 3, 2025
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

Copy link

netlify bot commented Jan 3, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit d263477
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/67926852cb2a6f0008267cdf

arhimondr added a commit to arhimondr/presto that referenced this pull request Jan 3, 2025
Summary:
X-link: facebookincubator/velox#12010

Prevent exchange client from unblocking to early. Unblocking to early impedes
effectiveness of page merging. When the cost of creating a vector is high (for
example for data sets with high number of columns) creating small pages can
make queries significantly less efficient.

For example it was observed that when network is congested and Exchange buffers
are not filled up as fast query may experience CPU efficiency drop up to 4x: T211034421

Differential Revision: D67615570
@arhimondr arhimondr changed the title Avoid small batches in Exchange feat: Avoid small batches in Exchange Jan 10, 2025
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

arhimondr added a commit to arhimondr/velox that referenced this pull request Jan 10, 2025
Summary:

Prevent exchange client from unblocking to early. Unblocking to early impedes
effectiveness of page merging. When the cost of creating a vector is high (for
example for data sets with high number of columns) creating small pages can
make queries significantly less efficient.

For example it was observed that when network is congested and Exchange buffers
are not filled up as fast query may experience CPU efficiency drop up to 4x: T211034421

Differential Revision: D67615570
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arhimondr thank for the change % comments.

velox/core/QueryConfig.h Show resolved Hide resolved
velox/exec/ExchangeClient.h Outdated Show resolved Hide resolved
velox/exec/ExchangeQueue.h Outdated Show resolved Hide resolved
velox/exec/ExchangeQueue.h Outdated Show resolved Hide resolved
velox/exec/MergeSource.cpp Show resolved Hide resolved
velox/exec/ExchangeQueue.cpp Outdated Show resolved Hide resolved
velox/exec/ExchangeQueue.cpp Outdated Show resolved Hide resolved
velox/exec/ExchangeQueue.h Outdated Show resolved Hide resolved
velox/exec/tests/ExchangeClientTest.cpp Show resolved Hide resolved
velox/core/QueryConfig.h Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arhimondr LGTM. Thanks for the update!

velox/exec/ExchangeQueue.cpp Outdated Show resolved Hide resolved
velox/exec/ExchangeQueue.cpp Outdated Show resolved Hide resolved
velox/exec/MergeSource.cpp Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arhimondr thanks for the iterations % nits.

@@ -52,6 +52,7 @@ Exchange::Exchange(
operatorCtx_->driverCtx()->queryConfig(),
serdeKind_)},
processSplits_{operatorCtx_->driverCtx()->driverId == 0},
driverId_{driverCtx->driverId},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can just fetch from operatorCtx_->driverCtx()->driverId and not necessary save a copy of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to safe a couple of extra dereferences :-)

velox/exec/ExchangeQueue.cpp Outdated Show resolved Hide resolved
velox/exec/ExchangeClient.cpp Outdated Show resolved Hide resolved
uint32_t maxBytes,
bool* atEnd,
ContinueFuture* future,
std::vector<ContinuePromise>& promises);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/promises/staledPromises/

Maybe put a comment? And at the caller, we expect it at most has one promise?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually yeah, let me use a pointer instead of a list. It should never be more than one

velox/exec/ExchangeQueue.h Outdated Show resolved Hide resolved
velox/exec/ExchangeQueue.h Show resolved Hide resolved
Summary:
Prevent exchange client from unblocking to early. Unblocking to early impedes
effectiveness of page merging. When the cost of creating a vector is high (for
example for data sets with high number of columns) creating small pages can
make queries significantly less efficient.

For example it was observed that when network is congested and Exchange buffers
are not filled up as fast query may experience CPU efficiency drop up to 4x: T211034421

Reviewed By: xiaoxmeng

Differential Revision: D67615570
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D67615570

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 121b230.

Copy link

Conbench analyzed the 0 benchmark runs that triggered this notification.

None of the specified runs were found on the Conbench server.

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants