data-plane-gateway deprecation #1627

Open
8 of 13 tasks
jgraettinger opened this issue Sep 13, 2024 · 0 comments
Labels
change:planned This is a planned change

jgraettinger commented Sep 13, 2024

Federated data-planes are mostly out, with a couple lingering issues related to data-plane-gateway:

  • The UI is unable to fetch shard status or logs from tasks in new data-planes.
  • The connector networking feature is inoperative in new data-planes.

Both issues revolve around data-plane-gateway, which is a legacy component that pre-dates the federated data-planes work (which introduced "TLS everywhere", a full-fledged authorization system for brokers and reactors, and an authorization API within the control plane for brokered access).

High level, we seek to remove data-plane-gateway altogether and enable a) direct UI access to data-planes brokered through an authorization API and b) direct support for connector networking within data-planes.

Getting shard status, listing, and collection data preview working:

  • New control-plane /authorize/user/task API
    • This API enables shard listings / status and retrieval of task logs, as well as access to private connector networks.
    • Request:
      • Control-plane access token
      • Task catalog name
    • Response:
      • broker service address & access token for LIST+READ of ops journals
      • reactor service address & access token for LIST+READ+PROXY_CONNECTOR of shards
      • concrete names of ops log & stat journal partitions for the task
      • shard_id_prefix for task shards
  • New control-plane /authorize/user/collection API
    • This API enables listing and reads of collection partitions.
    • Request:
      • Control-plane access token
      • Collection name
    • Response:
      • broker service address & access token for LIST+READ
      • journal_name_prefix for collection partitions
  • Align authorization APIs to work with data-plane-gateway
    • Tokens include the legacy prefix claims used by old DPG auth checks.
    • Dynamically re-write cluster-internal addresses to use DPG
  • Gazette directly serves grpc-web adapters for Journals and Shards gRPC services.
    • Given a $token, the UI is able to directly query a data-plane gazette or reactor service address for these APIs.
  • UI: call new authorization APIs for status / listing / preview
    • Do not fetch legacy authorization from PostgREST
    • Instead, call dynamic broker / reactor address with dynamic token response.
  • Update flowctl to use new authorization APIs for interacting with tasks and collections.
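
As a rough sketch of the two response shapes listed above: the field names below are assumptions chosen to mirror the bullets, not the actual wire format of the control-plane API, and the `Bearer` header scheme is likewise assumed.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TaskAuthorization:
    """Hypothetical shape of a /authorize/user/task response."""
    broker_address: str      # broker service address
    broker_token: str        # token for LIST+READ of ops journals
    reactor_address: str     # reactor service address
    reactor_token: str       # token for LIST+READ+PROXY_CONNECTOR of shards
    ops_logs_journal: str    # concrete ops log journal partition
    ops_stats_journal: str   # concrete ops stats journal partition
    shard_id_prefix: str     # prefix shared by the task's shard IDs


@dataclass(frozen=True)
class CollectionAuthorization:
    """Hypothetical shape of a /authorize/user/collection response."""
    broker_address: str       # broker service address
    broker_token: str         # token for LIST+READ
    journal_name_prefix: str  # prefix shared by the collection's partitions


def bearer_headers(token: str) -> Dict[str, str]:
    """Headers a caller might attach when querying the returned address
    (assumes a Bearer scheme, which the issue does not specify)."""
    return {"Authorization": f"Bearer {token}"}


def owns_shard(auth: TaskAuthorization, shard_id: str) -> bool:
    """True if shard_id falls under the task's authorized shard prefix."""
    return shard_id.startswith(auth.shard_id_prefix)
```

A caller would request an authorization, then query `broker_address` or `reactor_address` directly with the matching token, without needing to know whether that address is the DPG or a new data-plane.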

Discussion

For this stage, we're leaving DPG in place and unmodified, but are routing "around" it via new authorization APIs. These APIs will include DPG addresses for tasks / collections in the legacy data-plane, but the caller is able to use these addresses without being aware that it's the DPG versus a new data-plane. New data-planes speak the same grpc-web API which DPG provides.

The legacy data-plane cannot directly be reached by the UI, which means DPG must still be in the loop. The authorization APIs have compatibility measures in place which mean the unmodified DPG is able to use access tokens minted by these new APIs.
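
The address rewrite could look something like the sketch below. It is a minimal illustration under assumptions: the table contents, hostnames, and function name are all hypothetical, and the real mechanism may key on data-plane rows rather than a static map.

```python
from typing import Dict

# Hypothetical table mapping cluster-internal legacy service addresses to the
# externally reachable data-plane-gateway address. All values are placeholders.
LEGACY_REWRITES: Dict[str, str] = {
    "http://gazette.legacy.internal:8080": "https://dpg.example.com",
    "http://reactor.legacy.internal:9000": "https://dpg.example.com",
}


def external_address(internal: str, rewrites: Dict[str, str] = LEGACY_REWRITES) -> str:
    """Return the address an external caller should use.

    Legacy internal addresses are swapped for the DPG address; addresses of
    new data-planes are already externally reachable and pass through as-is.
    """
    return rewrites.get(internal, internal)
```

Because the rewrite happens inside the authorization API, callers only ever see the returned address and token.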

Getting connector networking working:

  • Reactors directly implement data-plane-gateway-esque ALPN negotiation
    • On handshake, determine if the subdomain is destined for a connector (like 37b065a8796c8d1-8080), the reactor.the-data-plane.dp.estuary-data.com service address, or a specific reactor-XYZ-003.reactor.the-data-plane.dp.estuary-data.com host underneath.
    • Connections directed to the reactor are handled by the current gRPC + HTTP stack.
    • Connector connections are resolved against the shard KeySpace (this saves a network round-trip compared to DPG).
    • Public ports proxy through the ProxyConnector gRPC service using a self-signed PROXY_CONNECTOR claim.
    • Private ports call out to /authorize/user/task to obtain an authorization token, which is then used with ProxyConnector.
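
The handshake-time routing decision can be sketched as a classification of the TLS server name. This is only an illustration: the `<hex>-<port>` label pattern is inferred from the single example above, and the `Target` type and `classify_sni` function are hypothetical names. A real reactor would resolve the connector label against the shard KeySpace rather than rely on a pattern alone.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed connector label form: lowercase hex hostname, a dash, a decimal port.
CONNECTOR_LABEL = re.compile(r"([0-9a-f]+)-([0-9]+)")


@dataclass(frozen=True)
class Target:
    kind: str                    # "service", "reactor-host", or "connector"
    label: Optional[str] = None  # reactor host label or connector hostname
    port: Optional[int] = None   # connector port, if any


def classify_sni(server_name: str, service: str) -> Optional[Target]:
    """Classify a TLS server name relative to the reactor service address."""
    server_name = server_name.lower().rstrip(".")
    service = service.lower()
    if server_name == service:
        return Target("service")
    suffix = "." + service
    if not server_name.endswith(suffix):
        return None  # not addressed to this data-plane's reactors
    label = server_name[: -len(suffix)]
    if not label or "." in label:
        return None  # a single-label wildcard would not cover nested names
    match = CONNECTOR_LABEL.fullmatch(label)
    if match:
        return Target("connector", match.group(1), int(match.group(2)))
    return Target("reactor-host", label)
```

Connections classified as `service` or `reactor-host` would go to the existing gRPC + HTTP stack, while `connector` targets would be proxied through ProxyConnector.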

Discussion

This works because reactors already use a wildcard TLS cert covering both *.reactor.the-data-plane.estuary-data.com and reactor.the-data-plane.estuary-data.com. We can add a wildcard DNS entry for the service address so that TLS connections to 37b065a8796c8d1-8080.reactor.the-data-plane.dp.estuary-data.com route to a reactor, which can then examine the subdomain to determine what kind of connection it is.

DPG will continue to serve connector connections for the legacy data-plane but will never support new data-planes.

Allowing cross data-plane access into the legacy data-plane:

  • data-plane gateway is updated to be a simple proxy which passes through an Authorization header but doesn't inspect it.
    • It proxies gRPC journal Append and journal + shard Apply APIs
    • Must be updated for streaming List.
    • Must load balance across brokers!
  • /authorize/task API transparently re-writes legacy data-plane broker address to use DPG in cross-data-plane contexts

Discussion

It's already the case that the legacy data-plane can read or write to new data-planes, but we can also achieve the inverse and allow tasks in new data-planes to read or write collections in the legacy data-plane.

Tasks in new data-planes will transparently interact with the DPG instead of brokers, where the DPG proxies Read/List/Append on their behalf. A gotcha to watch out for: instances of DPG must spread RPCs around the cluster and not, for example, send all List watches to a single broker.
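
One simple way to spread RPCs is round-robin selection over the known brokers. The sketch below is an assumption for illustration only; the issue does not say which balancing strategy DPG should use, and the `RoundRobin` class is a hypothetical name.

```python
import itertools
import threading
from typing import Iterable


class RoundRobin:
    """Thread-safe round-robin picker over a fixed set of broker addresses."""

    def __init__(self, brokers: Iterable[str]) -> None:
        brokers = list(brokers)
        if not brokers:
            raise ValueError("at least one broker is required")
        self._cycle = itertools.cycle(brokers)
        self._lock = threading.Lock()

    def pick(self) -> str:
        """Return the next broker, so successive RPCs land on different brokers."""
        with self._lock:
            return next(self._cycle)
```

With this, each new List watch or Append would call `pick()` to choose its broker instead of reusing a single connection.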

Cleanups

  • data-plane-gateway re-uses native grpc-web shims which are now part of Gazette proper
  • data-plane-gateway re-uses connector networking which is now part of Flow proper
  • once legacy data-plane is migrated, addresses in data_planes row are updated and DPG grpc-web APIs immediately stop being used
  • Work with customers to transition all uses of DPG to directly use the reactor service, and turn off DPG when done.
jgraettinger self-assigned this Sep 13, 2024
jgraettinger added the change:planned This is a planned change label Sep 16, 2024
jgraettinger added a commit that referenced this issue Sep 16, 2024
`authorize/user/task` enables UI shard listings/status and retrieval
of task logs, as well as access to private connector networking.

`authorize/user/collection` enables UI journal listing and data preview.

Both offer temporary support for the current data-plane-gateway,
which implements legacy authorization checks using claimed prefixes.

Also introduce an address rewrite mechanism for mapping an internal
data-plane legacy service address into the data-plane-gateway address in
external call contexts.

Issue #1627
jgraettinger added a commit that referenced this issue Sep 16, 2024
…utes

`/authorize/user/task` enables UI shard listings/status and retrieval
of task logs, as well as access to private connector networking.

`/authorize/user/collection` enables UI journal listing and data preview.

Both offer temporary support for the current data-plane-gateway,
which implements legacy authorization checks using claimed prefixes.

Also introduce an address rewrite mechanism for mapping an internal
data-plane legacy service address into the data-plane-gateway address in
external call contexts.

Issue #1627
jgraettinger added a commit that referenced this issue Sep 17, 2024
…utes

`/authorize/user/task` enables UI shard listings/status and retrieval
of task logs, as well as access to private connector networking.

`/authorize/user/collection` enables UI journal listing and data preview.

Both offer temporary support for the current data-plane-gateway,
which implements legacy authorization checks using claimed prefixes.

Also introduce an address rewrite mechanism for mapping an internal
data-plane legacy service address into the data-plane-gateway address in
external call contexts.

Issue #1627
jgraettinger added a commit that referenced this issue Sep 17, 2024
…ions

This change introduces the agent API to `flowctl`, which is the
proverbial straw which motivated a deeper refactor of flowctl
configuration.

As a headline feature, `flowctl` supports the new task and collection
authorization APIs and uses them in support of serving existing
subcommands for reading collections, previews, and read ops logs or
stats.

Clean up management of access and refresh tokens by obtaining access
tokens or generating refresh tokens prior to calling into a particular
sub-command. Preserve the ability to run `flowctl` in an unauthenticated
mode.

Make it easier to use `flowctl` against a local stack by introducing
alternative defaults when running under a "local" profile.

Also fix handling of single-use refresh tokens, where we must retain the
updated secret after using it to generate a new access token. We could
now consider having `flowctl` create single-use refresh tokens rather
than multi-use ones, but I didn't want to take that step just yet.

Also fix mis-ordering of output when reading journals.

Also fix OffsetNotYetAvailable error when reading a journal in non-blocking mode.

Issue #1627
jgraettinger added a commit that referenced this issue Sep 20, 2024
…utes

`/authorize/user/task` enables UI shard listings/status and retrieval
of task logs, as well as access to private connector networking.

`/authorize/user/collection` enables UI journal listing and data preview.

Both offer temporary support for the current data-plane-gateway,
which implements legacy authorization checks using claimed prefixes.

Also introduce an address rewrite mechanism for mapping an internal
data-plane legacy service address into the data-plane-gateway address in
external call contexts.

Issue #1627
jgraettinger added a commit that referenced this issue Sep 20, 2024
…ions

This change introduces the agent API to `flowctl`, which is the
proverbial straw which motivated a deeper refactor of flowctl
configuration.

As a headline feature, `flowctl` supports the new task and collection
authorization APIs and uses them in support of serving existing
subcommands for reading collections, previews, and read ops logs or
stats.

Clean up management of access and refresh tokens by obtaining access
tokens or generating refresh tokens prior to calling into a particular
sub-command. Preserve the ability to run `flowctl` in an unauthenticated
mode.

Make it easier to use `flowctl` against a local stack by introducing
alternative defaults when running under a "local" profile.

Also fix handling of single-use refresh tokens, where we must retain the
updated secret after using it to generate a new access token. We could
now consider having `flowctl` create single-use refresh tokens rather
than multi-use ones, but I didn't want to take that step just yet.

Also fix mis-ordering of output when reading journals.

Also fix OffsetNotYetAvailable error when reading a journal in non-blocking mode.

Issue #1627
jgraettinger removed their assignment Oct 1, 2024