-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
first steps towards data-plane-gateway deprecation #1628
Conversation
Calling $ curl -v -H "origin: http://localhost:3000" -H "authorization: Bearer $CONTROL_TOKEN" -H "content-type: application/json" --data-binary '{"task":"johnnyCo/hello/source-hello-world"}' http://localhost:8675/authorize/user/task | jq
* Host localhost:8675 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying [::1]:8675...
* Connected to localhost (::1) port 8675
> POST /authorize/user/task HTTP/1.1
> Host: localhost:8675
> User-Agent: curl/8.5.0
> Accept: */*
> origin: http://localhost:3000
> authorization: Bearer (trim)
> content-type: application/json
> Content-Length: 44
>
} [44 bytes data]
< HTTP/1.1 200 OK
< content-type: application/json
< content-length: 2007
< vary: origin, access-control-request-method, access-control-request-headers
< access-control-allow-origin: http://localhost:3000
< date: Mon, 16 Sep 2024 02:29:42 GMT
<
{ [2007 bytes data]
100 2051 100 2007 100 44 446k 10029 --:--:-- --:--:-- --:--:-- 500k
* Connection #0 to host localhost left intact
{
"brokerAddress": "http://localhost:8080",
"brokerToken": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJjYXAiOjEwLCJleHAiOjE3MjY0NTY0MjMsImlhdCI6MTcyNjQ1Mzc4MiwiaXNzIjoibG9jYWwtY2x1c3Rlci5kcC5lc3R1YXJ5LWRhdGEuY29tIiwic2VsIjp7ImluY2x1ZGUiOnsibGFiZWxzIjpbeyJuYW1lIjoibmFtZSIsInZhbHVlIjoib3BzL3Rhc2tzL3B1YmxpYy9sb2NhbC1jbHVzdGVyL2xvZ3MvMGViODMwYTI3OTgwMDE5YS9raW5kPWNhcHR1cmUvbmFtZT1qb2hubnlDbyUyRmhlbGxvJTJGc291cmNlLWhlbGxvLXdvcmxkL3Bpdm90PTAwIn0seyJuYW1lIjoibmFtZSIsInZhbHVlIjoib3BzL3Rhc2tzL3B1YmxpYy9sb2NhbC1jbHVzdGVyL3N0YXRzLzBlYjgzMGEyNzk4MDAxOWEva2luZD1jYXB0dXJlL25hbWU9am9obm55Q28lMkZoZWxsbyUyRnNvdXJjZS1oZWxsby13b3JsZC9waXZvdD0wMCJ9XX19LCJzdWIiOiI2ZjMxYmUxYS1kODc1LTQ1M2QtOTA4ZS03OGQ0ZjJlZjFlMWYiLCJwcmVmaXhlcyI6WyJvcHMvdGFza3MvcHVibGljL2xvY2FsLWNsdXN0ZXIvbG9ncy8wZWI4MzBhMjc5ODAwMTlhL2tpbmQ9Y2FwdHVyZS9uYW1lPWpvaG5ueUNvJTJGaGVsbG8lMkZzb3VyY2UtaGVsbG8td29ybGQvcGl2b3Q9MDAiLCJvcHMvdGFza3MvcHVibGljL2xvY2FsLWNsdXN0ZXIvc3RhdHMvMGViODMwYTI3OTgwMDE5YS9raW5kPWNhcHR1cmUvbmFtZT1qb2hubnlDbyUyRmhlbGxvJTJGc291cmNlLWhlbGxvLXdvcmxkL3Bpdm90PTAwIl19.a4ALlxPQGcZzqLGLl9o8y_0M-_6yv4Fpn8MmFoHXO2w",
"opsLogsJournal": "ops/tasks/public/local-cluster/logs/0eb830a27980019a/kind=capture/name=johnnyCo%2Fhello%2Fsource-hello-world/pivot=00",
"opsStatsJournal": "ops/tasks/public/local-cluster/stats/0eb830a27980019a/kind=capture/name=johnnyCo%2Fhello%2Fsource-hello-world/pivot=00",
"reactorAddress": "http://localhost:9000",
"reactorToken": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJjYXAiOjI2MjE1NCwiZXhwIjoxNzI2NDU2NDIzLCJpYXQiOjE3MjY0NTM3ODIsImlzcyI6ImxvY2FsLWNsdXN0ZXIuZHAuZXN0dWFyeS1kYXRhLmNvbSIsInNlbCI6eyJpbmNsdWRlIjp7ImxhYmVscyI6W3sibmFtZSI6ImlkIiwidmFsdWUiOiJjYXB0dXJlL2pvaG5ueUNvL2hlbGxvL3NvdXJjZS1oZWxsby13b3JsZC8wZWI4MzE1YTkyMDBlYzAwLyIsInByZWZpeCI6dHJ1ZX1dfX0sInN1YiI6IjZmMzFiZTFhLWQ4NzUtNDUzZC05MDhlLTc4ZDRmMmVmMWUxZiIsInByZWZpeGVzIjpbImpvaG5ueUNvL2hlbGxvL3NvdXJjZS1oZWxsby13b3JsZCJdfQ.cD_LMSd_ycu-eUmAiOAyI2dK41ym6jaBly5XtvyHpjE",
"retryMillis": 0,
"shardIdPrefix": "capture/johnnyCo/hello/source-hello-world/0eb8315a9200ec00/"
} Calling $ curl -v -H "origin: http://localhost:3000" -H "authorization: Bearer $CONTROL_TOKEN" -H "content-type: application/json" --data-binary '{"collection":"johnnyCo/hello/events"}' http://localhost:8675/authorize/user/collection | jq
* Host localhost:8675 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying [::1]:8675...
* Connected to localhost (::1) port 8675
> POST /authorize/user/collection HTTP/1.1
> Host: localhost:8675
> User-Agent: curl/8.5.0
> Accept: */*
> origin: http://localhost:3000
> authorization: Bearer (trim)
> content-type: application/json
> Content-Length: 38
>
} [38 bytes data]
< HTTP/1.1 200 OK
< content-type: application/json
< content-length: 695
< vary: origin, access-control-request-method, access-control-request-headers
< access-control-allow-origin: http://localhost:3000
< date: Mon, 16 Sep 2024 02:31:22 GMT
<
{ [695 bytes data]
100 733 100 695 100 38 539k 30206 --:--:-- --:--:-- --:--:-- 715k
* Connection #0 to host localhost left intact
{
"brokerAddress": "http://localhost:8080",
"brokerToken": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJjYXAiOjEwLCJleHAiOjE3MjY0NTczMDIsImlhdCI6MTcyNjQ1Mzg4MiwiaXNzIjoibG9jYWwtY2x1c3Rlci5kcC5lc3R1YXJ5LWRhdGEuY29tIiwic2VsIjp7ImluY2x1ZGUiOnsibGFiZWxzIjpbeyJuYW1lIjoiZXN0dWFyeS5kZXYvY29sbGVjdGlvbiIsInZhbHVlIjoiam9obm55Q28vaGVsbG8vZXZlbnRzIn0seyJuYW1lIjoibmFtZSIsInZhbHVlIjoiam9obm55Q28vaGVsbG8vZXZlbnRzLzBlYjgzMTVhOTIwMGVjMDAvIiwicHJlZml4Ijp0cnVlfV19fSwic3ViIjoiNmYzMWJlMWEtZDg3NS00NTNkLTkwOGUtNzhkNGYyZWYxZTFmIiwicHJlZml4ZXMiOlsiam9obm55Q28vaGVsbG8vZXZlbnRzLzBlYjgzMTVhOTIwMGVjMDAvIl19.3Tl3VKOxGGq-W6-ODFPJd4ICIfg-j2DJrBACRe1g4LY",
"journalNamePrefix": "johnnyCo/hello/events/0eb8315a9200ec00/",
"retryMillis": 0
} Using $ curl -vv --insecure -H "Origin: http://localhost:3000" -H "authorization: Bearer $BROKER_TOKEN" --data-binary '{"selector":{"include":{"labels":[]},"exclude":{"labels": []}}}' http://localhost:8080/v1/journals/list
* Host localhost:8080 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:8080...
* Connected to localhost (::1) port 8080
> POST /v1/journals/list HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/8.5.0
> Accept: */*
> Origin: http://localhost:3000
> authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJjYXAiOjEwLCJleHAiOjE3MjY0NTczMDIsImlhdCI6MTcyNjQ1Mzg4MiwiaXNzIjoibG9jYWwtY2x1c3Rlci5kcC5lc3R1YXJ5LWRhdGEuY29tIiwic2VsIjp7ImluY2x1ZGUiOnsibGFiZWxzIjpbeyJuYW1lIjoiZXN0dWFyeS5kZXYvY29sbGVjdGlvbiIsInZhbHVlIjoiam9obm55Q28vaGVsbG8vZXZlbnRzIn0seyJuYW1lIjoibmFtZSIsInZhbHVlIjoiam9obm55Q28vaGVsbG8vZXZlbnRzLzBlYjgzMTVhOTIwMGVjMDAvIiwicHJlZml4Ijp0cnVlfV19fSwic3ViIjoiNmYzMWJlMWEtZDg3NS00NTNkLTkwOGUtNzhkNGYyZWYxZTFmIiwicHJlZml4ZXMiOlsiam9obm55Q28vaGVsbG8vZXZlbnRzLzBlYjgzMTVhOTIwMGVjMDAvIl19.3Tl3VKOxGGq-W6-ODFPJd4ICIfg-j2DJrBACRe1g4LY
> Content-Length: 63
> Content-Type: application/x-www-form-urlencoded
>
< HTTP/1.1 200 OK
< Access-Control-Allow-Headers: Cache-Control, Content-Language, Content-Length, Content-Type, Expires, Last-Modified, Pragma, Authorization
< Access-Control-Allow-Methods: GET, POST, OPTIONS
< Access-Control-Allow-Origin: http://localhost:3000
< Content-Type: application/json
< Grpc-Metadata-Content-Type: application/grpc
< Date: Mon, 16 Sep 2024 02:35:03 GMT
< Transfer-Encoding: chunked
<
{"result":{"header":{"processId":{"zone":"local","suffix":"cool-terrapin"},"route":{"primary":-1},"etcd":{"clusterId":"14841639068965178418","memberId":"10276657743932975437","revision":"85","raftTerm":"2"}},"journals":[{"spec":{"name":"johnnyCo/hello/events/0eb8315a9200ec00/pivot=00","replication":3,"labels":{"labels":[{"name":"app.gazette.dev/managed-by","value":"estuary.dev/flow"},{"name":"content-type","value":"application/x-ndjson"},{"name":"estuary.dev/build","value":"0eb8315a9700019a"},{"name":"estuary.dev/collection","value":"johnnyCo/hello/events"},{"name":"estuary.dev/key-begin","value":"00000000"},{"name":"estuary.dev/key-end","value":"ffffffff"}]},"fragment":{"length":"536870912","compressionCodec":"GZIP","stores":["gs://estuary-trial/collection-data/"],"refreshInterval":"300s","pathPostfixTemplate":"utc_date={{.Spool.FirstAppendTime.Format \"2006-01-02\"}}/utc_hour={{.Spool.FirstAppendTime.Format \"15\"}}"},"flags":4,"maxAppendRate":"4194304"},"modRevision":"83","route":{"members":[{"zone":"local","suffix":"cool-terrapin"}],"endpoints":["http://localhost:8080"]},"createRevision":"83"}]}}
* Connection #0 to host localhost left intact |
03f0e76
to
ca30917
Compare
Track managed data-plane metadata that users care about. Also add an `enable_l2` toggle to toggle whether data-planes are included in L2 roll-ups.
Add `evaluate()` for evaluation of a generic policy, while encapsulating Snapshot refresh and retry semantics. We'll use this shortly for additional authorization APIs.
Add UserGrants table. Refactor RBAC search into a joint search that's generalized over both user and role grants.
…utes `/authorize/user/task` enables UI shard listings/status and retrieval of task logs, as well as access to private connector networking. `/authorize/user/collection` enables UI journal listing and data preview. Both offer temporary support for the current data-plane-gateway, which implements legacy authorization checks using claimed prefixes. Also introduce an address rewrite mechanism for mapping an internal data-plane legacy service address into the data-plane-gateway address in external call contexts. Issue #1627
Don't configure Router with a default service address. Instead, journal and shard Client instances are configured with a default service address and metadata which is _used_ by Router when picking a route. This makes it possible to cheaply clone Client instances and give each a different service address and authorization header, while still using the same underlying pool of gRPC connections.
…ions This change introduces the agent API to `flowctl`, which is the proverbial straw which motivated a deeper refactor of flowctl configuration. As a headline feature, `flowctl` supports the new task and collection authorization APIs and uses them in support of serving existing subcommands for reading collections, previews, and read ops logs or stats. Clean up management of access and refresh tokens by obtaining access tokens or generating refresh tokens prior to calling into a particular sub-command. Preserve the ability to run `flowctl` in an unauthenticated mode. Make it easier to use `flowctl` against a local stack by introducing alternative defaults when running under a "local" profile. Also fix handling of single-use refresh tokens, where we must retain the updated secret after using it to generate a new access token. We could now consider having `flowctl` create single-use refresh tokens rather than multi-use ones, but I didn't want to take that step just yet. Also fix mis-ordering of output when reading journals. Also fix OffsetNotYetAvailable error when reading a journal in non-blocking mode. Issue #1627
Move connector networking entirely into this repo, from the legacy data-plane-gatweay repo, and significantly retool it along the way to: * Improve latency and throughput of HTTP reverse-proxy cases, by allowing the reverse proxy to use multiple pooled connections built atop network proxy RPCs with reasonable idle timeouts. This improves concurrency as many HTTP/2 requests can be in flight at once, and improves latency to the end user by ammortizing connections to reduce aggregate TCP and TLS startup time. * Improve user-facing error experience around misconfigurations, by often assuming an HTTP protocol and yielding a more informative error. * Overhauling metrics that we collect. * Updating the authorization flow, laying groundwork for the UI to use the /authorize/user/task API (but not requiring it just yet).
ca30917
to
129c5d4
Compare
Bring in Gazette updates for grpc-web gateways and net.Listener customization. Deeply rework Tiltfile to remove data-plane-gateway, by creating a self-signed TLS CA and Certificate that are used by the broker and reactor (and may be used by other services if desired). Use a naming strategy of `thing.flow.localhost`, because many libraries like rustls don't accept wildcard certs of a toplevel `*.localhost` but will happily accept `*.flow.localhost`. Update for changes to --broker.allow-origin, --consumer.allow-origin, and --flow.dashboard, as well as extracting explicit arguments into environment variables.
129c5d4
to
b1bbdde
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, and good riddance to DPG! 🪦
alter table data_planes add column cidr_blocks cidr[] not null default '{}'; | ||
alter table data_planes add column enable_l2 boolean not null default false; | ||
alter table data_planes add column gcp_service_account_email text; | ||
alter table data_planes add column ssh_private_key text; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: These could use some comment on column
s
|
||
alter table data_planes add column aws_iam_user_arn text; | ||
alter table data_planes add column cidr_blocks cidr[] not null default '{}'; | ||
alter table data_planes add column enable_l2 boolean not null default false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for excluding a data plane from the L2 rollups? Is this for increased data privacy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed in VC, but this is to be able to temporarily turn off data planes without it breaking L2 reporting.
axum::Extension(claims): axum::Extension<super::ControlClaims>, | ||
super::Request(request): super::Request<Request>, | ||
) -> axum::response::Response { | ||
super::wrap(async move { do_authorize_user_task(&app, claims, request).await }).await |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
totally fine to merge as is, but I wanted to mention this because I'd seen a comment you'd left about it: Axum lets you impl IntoResponse
for an error type, which allows your handler to just return a Result<T, E>
where T
and E
both implement IntoResponse
. They have an example of this here
pub user_refresh_token: Option<RefreshToken>, | ||
|
||
#[serde(skip)] | ||
is_local: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional, just tossing out the idea: Instead of all the matches in the get_*
functions, it might be a bit cleaner to have a struct like
struct ProfileDefaults {
agent_url: Url,
dashboard_url: Url,
...
}
and
#[serde(skip)]
profile_defaults: ProfileDefaults
} | ||
cancel() // We don't use the returned context. | ||
|
||
/* TODO(johnny): Inspect claims once UI is updated to use /authorize/user/task API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to confirm my uderstanding is that we won't actually verify authZ at this stage, but it won't actually allow unauthorized access because this endpoint can't be reached in our current public data plane. So we'll need to update the UI to use the new authorize api, and then enforce authZ here before setting up any of the new private data planes. That sound right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep that's right
userHandledCounter.WithLabelValues(task, port, proto, "OK").Inc() | ||
} | ||
|
||
func (f *Frontend) serveConnErr(conn net.Conn, status int, body string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😎
} else if len(shards) == 0 { | ||
conn.sniErr = errors.New("the requested subdomain does not match a known task and port combination") | ||
} else { | ||
conn.resolved = newResolvedSNI(conn.parsed, &shards[0].Spec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine for now, but it might not be ideal to always pick the first shard in cases where there's multiple shards and the hostname doesn't specify a key+rclock. It seems better to pick on randomly in order to distribute the load. I don't feel super confident that random selection would be ideal either, but wanted to raise the idea since splitting source-http-ingest
shards had been on my mind lately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not used to dial a shard, it's only used to fetch task metadata, and any shard will do -- we're just plucking out the shard ID , port protocol, and whether it's public.
We list shards again and randomize them when picking a primary to dial.
Description:
/authorize/user/task
and/authorize/user/collection
APIs for authorizing and retrieving addresses and access tokens for interacting with the hosting data-plane of a task or collection.user_grants
into authorization snapshots and refactor RBAC evaluationflowctl
to make it aware of the control-plane API and to use the new authorization APIs.Issue #1627
Workflow steps:
(How does one use this feature, and how has it changed)
Documentation links affected:
(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)
Notes for reviewers:
(anything that might help someone review this PR)
This change is