
PS-9391 Fixes replication break error because of HASH_SCAN #5528

Merged
1 commit merged into percona:8.0 on Jan 20, 2025

Conversation

VarunNagaraju (Contributor)

https://perconadev.atlassian.net/browse/PS-9391

Problem

When a replica's slave_rows_search_algorithms is set to HASH_SCAN,
replication may break with HA_ERR_KEY_NOT_FOUND.

Analysis

When a replica's slave_rows_search_algorithms is set to HASH_SCAN,
the replica prepares a unique key list for all the rows in a particular Row_event.
The same unique key list is later used to retrieve from the storage engine all
tuples associated with each key in the list. In the case of multiple updates
targeting the same row, as shown in the testcase, this unique key list may end
up filled with entries that do not yet exist in the table. This becomes a
problem when an intermediate update changes the value of the indexed column to
a smaller value than the original entry and that changed value is then used in
another update, as shown in the second part of the testcase. The issue arises
because the unique key list is a std::set, which keeps its entries sorted. Once
the entries are sorted, the first entry of the list can be a value that does
not exist in the table yet; when that value is searched for in the
next_record_scan() method, the lookup fails with an HA_ERR_KEY_NOT_FOUND error.
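
To make the ordering effect concrete, here is a minimal standalone sketch (not
taken from the server code; the key values are hypothetical) showing that a
std::set iterates keys in sorted order rather than in the order the updates
produced them:

```cpp
#include <iostream>
#include <set>
#include <vector>

int main() {
  // Hypothetical key values produced by successive updates in one event:
  // the intermediate update changes the indexed column to a smaller value
  // (5) than the original entry (10), and a later update then uses 5.
  std::vector<int> insertion_order = {10, 5};

  // A distinct-key list built with std::set is iterated in sorted order,
  // so 5 comes first even though no row with key 5 exists in the table yet;
  // probing for it is what fails with HA_ERR_KEY_NOT_FOUND.
  std::set<int> sorted_keys(insertion_order.begin(), insertion_order.end());
  for (int k : sorted_keys) std::cout << k << ' ';  // prints: 5 10
  std::cout << '\n';
  return 0;
}
```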

Solution

Instead of using a std::set to store the distinct keys, a combination of an
unordered_set and a list is used, which preserves the original order of the
updates while still avoiding duplicates, thereby preventing the side effects
of sorting.
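
As a rough illustration only (the class and member names below are
hypothetical, not the ones used in the actual patch to sql/log_event.cc), the
combination can be sketched as an insertion-ordered set:

```cpp
#include <list>
#include <string>
#include <unordered_set>

// Insertion-ordered set of distinct keys: the list preserves the order in
// which keys were added, while the unordered_set rejects duplicates without
// imposing any sorting. Names are illustrative, not the patch's actual ones.
class DistinctKeyList {
 public:
  // Returns true if the key was newly added, false if it was a duplicate.
  bool add(const std::string &key) {
    if (!m_seen.insert(key).second) return false;  // already present
    m_keys.push_back(key);
    return true;
  }

  // Keys are iterated in insertion order, never in sorted order.
  const std::list<std::string> &keys() const { return m_keys; }

 private:
  std::unordered_set<std::string> m_seen;
  std::list<std::string> m_keys;
};
```

With such a structure the keys are probed in the order the updates generated
them, so in the scenario above the original key is looked up before the newly
introduced smaller key.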

@github-actions bot left a comment:
⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

Review thread on sql/log_event.cc (outdated, resolved)
Review threads on sql/log_event.h and sql/log_event.cc (resolved; some marked outdated)
@percona-ysorokin (Collaborator) left a comment:
LGTM with one m_distinct_key_id = 0 comment addressed

Review thread on sql/log_event.cc (resolved)
@VarunNagaraju marked this pull request as ready for review January 20, 2025 08:42
@VarunNagaraju force-pushed the PS-9391 branch 3 times, most recently from 4ab9bcc to 6ab96de on January 20, 2025 09:36
@VarunNagaraju merged commit 45a1bae into percona:8.0 on Jan 20, 2025
27 of 28 checks passed