-
Notifications
You must be signed in to change notification settings - Fork 465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage/adapter: Opt-in migration of sources to the new table model #30483
base: main
Are you sure you want to change the base?
Conversation
MitigationsCompleting required mitigations increases Resilience Coverage.
Risk Summary:The pull request has a high risk score of 81, driven by predictors such as the "Sum Bug Reports Of Files" and the "Delta of Executable Lines". Historically, PRs with these predictors are 115% more likely to cause a bug than the repo baseline. While the observed and predicted bug trends for the repository are decreasing, the presence of 4 file hotspots suggests a heightened risk. Note: The risk score is not based on semantic analysis but on historical predictors of bug occurrence in the repository. The attributes above were deemed the strongest predictors based on that history. Predictors and the score may change as the PR evolves in code, time, and review activity. Bug Hotspots:
|
42c6af0
to
c505cfc
Compare
I've re-opened #30168 with my own fork to avoid CI issues. |
1972e06
to
db98078
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly focused on migrate.rs
but overall the PR LGTM!
02ad145
to
098739e
Compare
0f50e5f
to
cf8f451
Compare
e8930e8
to
619116d
Compare
There seems to be an issue with the migration with at least Kafka sources. If I have the following user objects:
Then I enable the migrations via:
Then I restart Materialize, the following user objects will exist:
That looks correct and is what we would expect to happen. However, if I restart Materialize one more time, then the following user objects will exist:
As you can see, the already migrated objects were migrated again. If I restart Materialize once again, then I see the following:
I'm assuming that each time Materialize is reset the migration is re-run. |
I was able to fix the migration idempotency issue in the most recent batch of commits. Now there's another issue that is triggered in the materialize=> EXPLAIN TIMESTAMP FOR SELECT * FROM kafka_proto_source;
Timestamp
---------------------------------------------------------------------------
query timestamp: 1734549613466 (2024-12-18 19:20:13.466) +
oracle read timestamp: 1734549613466 (2024-12-18 19:20:13.466) +
largest not in advance of upper: 0 (1970-01-01 00:00:00.000) +
upper:[ 0 (1970-01-01 00:00:00.000)]+
since:[ 0 (1970-01-01 00:00:00.000)]+
can respond immediately: false +
timeline: Some(EpochMilliseconds) +
session wall time: 1734549614451 (2024-12-18 19:20:14.451) +
+
source materialize.public.kafka_proto_source (u5, storage): +
read frontier:[ 0 (1970-01-01 00:00:00.000)]+
write frontier:[ 0 (1970-01-01 00:00:00.000)]+
(1 row)
|
5bb01ec
to
cb45af0
Compare
cb45af0
to
254b6ed
Compare
The issue also repros with |
The load gen output looks like this, which is slightly different that the kafka source above:
EDIT: Actually taking a closer look at this, I don't know how to explain it at all. Why would the since be different than the read frontier? |
Co-authored-by: Dennis Felsing <[email protected]>
…rce-table-migration # Conflicts: # test/legacy-upgrade/mzcompose.py
There's still some test failures in Nightly that I'm struggling with: https://buildkite.com/materialize/nightly/builds/10994#0194ae73-a255-49da-923c-47ff93d4216c I'm convinced that they're test issue because both failures happen before the migration and both failures are expecting some query to fail but the query succeeds. |
@jkosh44 Should I try to look into the nightly failures? I assume it's the ones in https://buildkite.com/materialize/nightly/builds/10996 |
If you have the bandwidth, it would be helpful and yes it is. |
New run, rebased and with my fixes for the tests: https://buildkite.com/materialize/nightly/builds/11002 |
Looks like "MySQL CDC tests (before source versioning)" is still failing with this error:
The error usually means that the CREATE was successful but we were expecting it to fail. It already exists because we created it on the first try. It's possible that it was created somewhere else, but that's usually what that error means. |
The curious thing is that this is running (or at least meant to be running) before the source migration happens. I haven't been able to understand why the test is expecting that CREATE to fail though. It looks fine to me. |
a26c7aa
to
26d1ea6
Compare
I think just a test problem, pushed another attempt: https://buildkite.com/materialize/nightly/builds/11003 |
Looks like the same error happened:
|
If I delete all test files except for |
@def- Looking at the testdrive reset logic, it looks like we never actually reset postgres or mysql, only Materialize and Kafka: materialize/src/testdrive/src/lib.rs Lines 109 to 124 in 26b555f
We probably want to reset mysql and postgres too. In the meantime hopefully there's some manual cleanup I can add to the test. |
Alternatively, it's possible that this test was implicitly relying on another test to create the EDIT: That doesn't really make sense that it would pass when all other tests are deleted. |
…rce-table-migration
DROP DATABASE IF EXISTS other; | ||
CREATE DATABASE IF NOT EXISTS mysql; | ||
USE mysql; | ||
|
||
# Insert data pre-snapshot | ||
CREATE TABLE t_in_mysql (f1 INT); | ||
INSERT INTO t_in_mysql VALUES (1), (2); | ||
|
||
> DROP SOURCE IF EXISTS mz_source; | ||
CREATE TABLE mysql.t_in_mysql (f1 INT); | ||
INSERT INTO mysql.t_in_mysql VALUES (1), (2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know which one of these changes caused the test to start passing, but one of them did.
I'm a bit worried about the benchmark failures in the last run, reran just in case: https://buildkite.com/materialize/nightly/builds/11013 |
It looks like the test is hung for 2 hrs on a specific file (
I'm a bit surprised by this. Everything is supposed to be behind a feature flag, so how did it manage to slow down a benchmark. |
Motivation
The subsequent PR will implement https://github.com/MaterializeInc/database-issues/issues/8678, which will also disable use of the 'old style' source statements using the same feature-flag introduced here. Once this PR and that PR land, then enabling the
force_source_table_syntax
flag will completely switch over users to the new syntax.Tips for reviewer
To test this I've added a new scenario to
platform-checks
calledActivateSourceVersioningMigration
, that runs materialize on an existing version for each check'sinitialize()
method, and then restarts materialize on the latest version with theforce_source_table_syntax
, activating the migration of any sources created using the 'old style' syntax. Then thevalidate()
step is run on this new version, confirming that all the queries continue to work.There are already existing
platform-checks
Checks
that use the 'old style' source syntax:TableFromPgSource
,TableFromMySqlSource
,LoadGeneratorAsOfUpTo
, and one I added calledUpsertLegacy
, that cover the 4 source types we need to test. There are also many other checks that use the old syntax when running on 'old' versions before 0.119, but I wasn't sure how to make theActivateSourceVersioningMigration
scenario target a specific version rather than just the 'previous' version for the base run. @def- @nrainer-materialize let me know if you have ideas on doing that.I've also updated the
legacy upgrade tests
to activate this migration after the upgrade which should provide additional coverage too.Nightly
https://buildkite.com/materialize/nightly/builds?branch=rjobanp%3Asource-table-migration
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.