source-mysql: Fix primary-key update logic when applied to only-changes bindings #2201
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description:
We recently (in #2185) added some logic to
source-mysql
which detects row update events which modify the collection key of the row. Or more pedantically, it detects row updates which modify the value of "whatever columns we consider the row key to be made of", which is almost always the collection key.(We play a little fast-and-loose with the concepts at times and it works out because in practice nobody ever sets the collection key to something other than the source DB primary key or tries to tells us to backfill using some other key distinct from the collection key. In fact we should probably deprecate the ability to have a backfill key distinct from the collection key since it's an undocumented backwards-compatibility feature from years ago and I bet nobody is actually using it).
This distinction matters because in the past bindings with the
"Only Changes"
backfill mode haven't needed to care what the backfill key / scan key / collection key / whatever-you-want-to-call-it is set to, and so we haven't even initialized that state property for such bindings. They go straight from pending to active and don't setKeyColumns
along the way.But now, we need to detect update events which modify the row's key, even if they're on an
"Only Changes"
binding. This means that either we need to add entirely new plumbing into the replication logic for "what properties correspond to the collection key", or we need to just accept that this is already whatKeyColumns
holds and therefore it's a bug that we're not setting it on only-changes bindings.I have opted to go with the latter solution because it's simpler and feels more elegant anyway.
Workflow steps:
No user action is required, and the new primary-key update logic in MySQL should now work reliably for only-changes bindings going forward.
Notes for reviewers:
This PR is pretty small but if you find it at all confusing it might be clearer when viewed commit-by-commit. It has three parts:
activatePendingStreams
so that any future only-changes bindings will have a correctly initializedKeyColumns
property.reconcileStateWithBindings
which identifies preexisting only-changes bindings which are already active with a nilKeyColumns
and fills in the appropriate value. This can be removed in the near future after it's had a chance to run everywhere that matters.This change is