Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

source-postgres: Support CDC without watermarks #2202

Merged
merged 7 commits into from
Dec 16, 2024

Conversation

willdonnelly
Copy link
Member

@willdonnelly willdonnelly commented Dec 12, 2024

Description:

This PR replaces the PostgreSQL CDC fence logic which was previously based on detection of our own watermark writes with some new (actually heavily copied from MySQL) fence logic based entirely on WAL LSN positions.

However by default we still write a watermark as part of establishing a fence position, in order to guarantee that there will be at least one WAL event whose LSN is greater than the minimum fence LSN. So the other half of this PR was adding a new advanced setting ("Read-Only Capture") which disables that behavior.

Disabling watermark writes is not without its tradeoffs. Put simply, due to the way Postgres reorders and cleans up the underlying WAL events when doing logical replication, we can't always guarantee that the latest LSN value obtained from pg_current_wal_flush_lsn() (or any of its sibling functions) actually corresponds to an event which we'll be told about. This can most trivially be proven when you consider the situation of an idle database on a server which hosts other active databases, but it's actually possible to run into similar trouble even when there's just the one database on the server. So when enabling read-only mode one must be sure that the server will always have a modest rate of changes on the database which is being captured from.

But the upshot is that we can now run captures without any "writes" to the database (so long as you don't consider replication slot management a write, because that's unavoidable) and it's even possible to capture from read-only standby replicas (with a big big big asterisk that says "except sometimes those setups can be very prone to invalidating replication slots if there's even a tiny bit of downtime"). Yay.

Workflow steps:

To continue using the connector as before, no action is required. The change from watermark value detection to LSN position fencing should be invisible to users, and we still by default write watermarks at the same cadence as before.

To set up a new capture which doesn't write watermarks, enable the "Read-Only Capture" toggle in the advanced endpoint config.

Documentation links affected:

We will need to update PostgreSQL documentation to discuss the new option and the user's responsibility to ensure that the database won't be fully idle when using it.


This change is Reviewable

This commit replaces the core StreamToFence implementation
with a new one (largely the same as MySQL's) based on LSN
positions rather than detection of our own watermark writes.

However the watermark write is still in there as well, which
guarantees consistent behavior. Removing that will be the next
obstacle to read-only Postgres CDC.
There are three "get current LSN" functions in PostgreSQL:

  - pg_current_wal_flush_lsn() is the current flush location. This
    is the last location known to be written out to durable storage.
  - pg_current_wal_lsn() is the current WAL write location. This is
    the end of what's actually been written out from the server's
    internal buffers.
  - pg_current_wal_insert_lsn() is the current WAL insert location.
    This is the "logical" end of the WAL at any instant.

The ordering goes 'flush LSN <= write LSN <= insert LSN'. All of
these values are usable for our purposes: we need our queried LSN
fence position to be >= the observed state as of the previous
backfill query. That query runs with the default "Read Committed"
isolation level, and barring certain rare settings this implies
that we the backfill query can only observe changes which are
prior to the current flush LSN value at the moment the backfill
query executes, which is necessarily <= the value we will query
later on.

So since any of them work, we pick the smallest of the three to
use. This _might_ be slightly more reliable in certain edge cases,
but in all honestly it probably makes no difference.
After the positional-fences refactor we no longer need to receive
actual change events for watermark writes. When they're issued,
it's just to ensure WAL liveness and the KeepaliveEvent that we
receive for inactive tables is sufficient to drive the rest of
the fence logic.
This commit makes three related changes:

- Introduces a new advanced config toggle 'Read Only Capture'
  which disables watermark writing and all watermark-related
  prerequisites.

- Modifies the "get latest LSN" query so that it checks whether
  it's being executed on a standby replica (via `pg_is_in_recovery()`)
  and if so it consults `pg_last_wal_replay_lsn()` rather than
  `pg_current_wal_flush_lsn()` which is still used the rest of
  the time.

- Modifies test logic to support two separate addresses for the
  test setup/control connection and for the captures to connect
  to, plus a test flag for running captures in read-only mode.

Taken together, these changes make it possible to run a capture
in a fully read-only mode of operation (so long as you don't
consider replication slot management to be a form of write).
Most notably it's now possible to run a capture from a read-only
standby replica with the 'Read Only' toggle active, however that
does still come with some caveats around slot invalidation.
It didn't work for LSN values >4GB, which doesn't matter for
local testing most of the time but resulted in spurious test
failures when run against managed cloud Postgres instances.
@willdonnelly willdonnelly requested a review from a team December 12, 2024 23:53
Copy link
Member

@williamhbaker williamhbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Seems like the tests are broken in CI

Just a missing snapshot update and the flag changes around the
separate control/capture DB address needed to be copied over to
the Docker image build.
Previously we've logged the first row of each batch of inserts
during tests, but it's rarely useful to see the value and it's
annoying how it logs megabytes of spam when running the overly
large field values test.

So let's just not do that any more.
@willdonnelly
Copy link
Member Author

Seems like the tests are broken in CI

Whoops, fixed now.

@willdonnelly willdonnelly merged commit 1d1cc22 into main Dec 16, 2024
52 of 53 checks passed
@willdonnelly willdonnelly deleted the wgd/postgres-without-watermarks-20241125 branch December 16, 2024 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants