external: fix decoding the last key in split keys #59613

Merged
merged 8 commits into pingcap:master from the date0218 branch on Feb 25, 2025

Conversation

CbcWestwolf
Member

@CbcWestwolf CbcWestwolf commented Feb 18, 2025

What problem does this PR solve?

Issue Number: close #59725

Problem Summary:

When the KV size or KV count of a unique key is large, we call GetRegionSplitKeys, which did not handle the case where the last key is the result of key.Next(). Decoding that key then fails.

What changed and how does it work?

Introduce the method tryDecodeEndKey, which decodes the last key in the split keys.
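
For reviewers unfamiliar with the failure mode, here is a minimal, self-contained sketch of the idea. It is not the code in this PR: the toy codec (`encode`, `decode`, `rowIDLen`) and the helper `next` are hypothetical stand-ins for the real key adapter and `key.Next()`, and the fallback shown (strip the trailing 0x00 byte, decode, then re-apply `next`) is an assumption about how such a helper can work.

```go
package main

import "errors"

// rowIDLen is the fixed suffix length of this toy codec (an assumption,
// not the layout used by the real key adapter).
const rowIDLen = 8

// next mimics key.Next(): the smallest key greater than k,
// built by appending a single 0x00 byte.
func next(k []byte) []byte {
	out := make([]byte, len(k)+1)
	copy(out, k)
	return out
}

// encode appends an 8-byte row ID and a trailing length byte to the key.
func encode(key []byte, rowID [rowIDLen]byte) []byte {
	out := make([]byte, 0, len(key)+rowIDLen+1)
	out = append(out, key...)
	out = append(out, rowID[:]...)
	return append(out, rowIDLen)
}

// decode reverses encode and rejects input whose trailing length byte
// is wrong, which is what happens after next() appends 0x00.
func decode(enc []byte) ([]byte, error) {
	if len(enc) < rowIDLen+1 || enc[len(enc)-1] != rowIDLen {
		return nil, errors.New("malformed encoded key")
	}
	return append([]byte(nil), enc[:len(enc)-rowIDLen-1]...), nil
}

// tryDecodeEndKey decodes enc directly when possible. If that fails and
// enc ends with 0x00, it assumes enc is next(realKey): it decodes the
// key without the trailing byte and returns next() of the decoded key.
func tryDecodeEndKey(enc []byte) ([]byte, error) {
	if dec, err := decode(enc); err == nil {
		return dec, nil
	}
	if n := len(enc); n > 0 && enc[n-1] == 0 {
		if dec, err := decode(enc[:n-1]); err == nil {
			return next(dec), nil
		}
	}
	return nil, errors.New("end key is not decodable")
}
```

With this toy codec, `decode(next(encode(k, id)))` returns an error, while `tryDecodeEndKey` on the same input yields `k` followed by a 0x00 byte, so the decoded end key is still strictly greater than `k`.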

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Fix a bug where adding a unique key using global sort fails on large data

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue do-not-merge/needs-tests-checked release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 18, 2025

codecov bot commented Feb 18, 2025

Codecov Report

Attention: Patch coverage is 66.66667% with 11 lines in your changes missing coverage. Please review.

Project coverage is 73.5585%. Comparing base (5a166e1) to head (baa5e7e).
Report is 40 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #59613        +/-   ##
================================================
+ Coverage   73.0087%   73.5585%   +0.5498%     
================================================
  Files          1694       1729        +35     
  Lines        468366     482591     +14225     
================================================
+ Hits         341948     354987     +13039     
- Misses       105384     105809       +425     
- Partials      21034      21795       +761     
| Flag | Coverage Δ |
| integration | 45.6590% <0.0000%> (?) |
| unit | 72.2772% <66.6666%> (+0.0690%) ⬆️ |


| Components | Coverage Δ |
| dumpling | 52.6910% <ø> (ø) |
| parser | ∅ <ø> (∅) |
| br | 44.6709% <ø> (-0.5263%) ⬇️ |

@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 18, 2025
@ti-chi-bot ti-chi-bot bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 19, 2025
@CbcWestwolf CbcWestwolf force-pushed the date0218 branch 3 times, most recently from 5e7842a to ba52d94 Compare February 20, 2025 09:05
@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 20, 2025
@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 21, 2025
@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 24, 2025
@@ -98,7 +99,6 @@ type Engine struct {
endKey []byte
jobKeys [][]byte
splitKeys [][]byte
regionSplitSize int64
Member Author

Just removing an unused field.

@@ -110,8 +110,7 @@ type Engine struct {
// this flag also affects the strategy of loading data, either:
// less load routine + check and read hotspot file concurrently (add-index uses this one)
// more load routine + read each file using 1 reader (import-into uses this one)
checkHotspot bool
mergerIterConcurrency int
Member Author

Ditto.

@CbcWestwolf
Member Author

/retest

@ti-chi-bot ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Feb 25, 2025
@CbcWestwolf
Member Author

/retest

ti-chi-bot bot commented Feb 25, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lance6716, tangenta

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Feb 25, 2025

ti-chi-bot bot commented Feb 25, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-02-25 02:05:30.436069909 +0000 UTC m=+321478.389228171: ☑️ agreed by lance6716.
  • 2025-02-25 06:04:21.007862159 +0000 UTC m=+335808.961020411: ☑️ agreed by tangenta.

@CbcWestwolf
Member Author

/retest

ti-chi-bot bot commented Feb 25, 2025

@CbcWestwolf: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| pull-unit-test-ddlv1 | 8ff59d8 | link | true | /test pull-unit-test-ddlv1 |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@CbcWestwolf
Member Author

/retest

@ti-chi-bot ti-chi-bot bot merged commit 03f4a2e into pingcap:master Feb 25, 2025
25 checks passed
@CbcWestwolf CbcWestwolf deleted the date0218 branch February 25, 2025 08:14
@ti-chi-bot ti-chi-bot bot added needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. labels Feb 25, 2025
ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Feb 25, 2025
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-8.5: #59746.
But this PR has conflicts, please resolve them!

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-8.1: #59747.
But this PR has conflicts, please resolve them!

ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Feb 25, 2025
CbcWestwolf added a commit to ti-chi-bot/tidb that referenced this pull request Feb 25, 2025
Labels
approved lgtm needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fail to add unique key using global sort in large data
4 participants