Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implementation of AWS OIDC #232

Closed
wants to merge 15 commits into from
Closed

feat: implementation of AWS OIDC #232

wants to merge 15 commits into from

Conversation

aabchoo
Copy link
Contributor

@aabchoo aabchoo commented Jan 30, 2025

Commit Message

This implements the API for AWS OIDC within the extproc.

The main changes breaks down into:

  1. OIDC Handler which can be used for any providers
  2. AWS handles calling STS for temporary credentials
  3. Passes Proxy + configmap CERTs

Related Issues/PRs (if applicable)

Related PR: #43

Special notes for reviewers (if applicable)

Need to add tests

Signed-off-by: Aaron Choo <[email protected]>
Signed-off-by: Aaron Choo <[email protected]>
Signed-off-by: Aaron Choo <[email protected]>
Comment on lines 107 to 108
if a.oidcHandler != nil {
err := a.oidcHandler.updateOIDCTokenIfExpired(context.TODO())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so you have to have some sync mechanism as a handler is hared across multiple goroutines/requests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one idea is to do this upgrades on the background groutine and on the request path do not make any blocking calls like this if block right now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm fair, I wanted to put it within the request path to avoid unnecessary queries to the SSO/STS. But the race condition is also unideal, so I'll add a goroutine to handle updating creds

@mathetake
Copy link
Member

fyi i renamed filterconfig to filterapi so the conflict is because of that 🙇

@aabchoo aabchoo changed the title Implementation of AWS OIDC feat: Implementation of AWS OIDC Jan 30, 2025
Signed-off-by: Aaron Choo <[email protected]>
Copy link
Member

@mathetake mathetake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High level question: it seems better to me that we should do the credentials update inside the controller and store the exchanged credentials in a controller-managed secret. Then I guess no change would be needed to extproc? Not only that, then the controller pod (one pod) will be the only one making requests to sts vs with the current impl, it will be tens/hundreds of pods making requests to sts?

@missBerg
Copy link
Contributor

I agree with @mathetake; the code to rotate and refresh should happen in the controller, and the ext proc should use the credentials.

@mathetake
Copy link
Member

Another great advantages is that you won't need these certificates API then - you have full control over the deployment of controller.

@aabchoo
Copy link
Contributor Author

aabchoo commented Jan 30, 2025

I believe the reason we didn't put the OIDC code within the controller is because we need our certs mounted to the pod calling our SSO and we thought it would be better not share the cert with the entire controller.

cc @yuzisun could you confirm

@aabchoo aabchoo changed the title feat: Implementation of AWS OIDC feat: implementation of AWS OIDC Jan 30, 2025
@mathetake
Copy link
Member

mathetake commented Jan 30, 2025

we need our certs mounted to the pod calling our SSO and we thought it would be better not share the cert with the entire controller.

ok so what you are saying is

  • Mounting the certificate with the "less privileged" exptorc (many containers, tens/hundreds, almost zero privilege) is OK
  • Mounting the certificate with the "highly privileged" controller (only exists one container at a time usually) is Not OK

Controller is already watching all secrets (by default though), so I don't think that makes sense to have this complex API/configurations. It can already have the enough privilege to read/write/delete the certificate.

@aabchoo
Copy link
Contributor Author

aabchoo commented Jan 30, 2025

Discussed with @yuzisun offline, and it doesn't seem to be an issue. I will rework the OIDC changes to be apart of the controller. Will aim to get it in by tonight 🙇

@mathetake
Copy link
Member

great! thank you for the work anyways!

Copy link
Contributor

@missBerg missBerg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take a bit more of a look, but there are definitely things to address :)

type IdentityTokenValue string

// GetIdentityToken retrieves the JWT and returns the contents as a []byte
func (j IdentityTokenValue) GetIdentityToken() ([]byte, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider renaming function to something like IdentityTokenAsByteArray to make it more accurate for what the function does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetIdentityToken is the official function for IdentityTokenRetriever's interface

// IdentityTokenRetriever is an interface for retrieving a JWT
type IdentityTokenRetriever interface {
	GetIdentityToken() ([]byte, error)
}

But I do agree it's not very clear

expTime time.Time
}

type Handler struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since Handler has meaning in Go and becomes vague in this context. Consider renaming to be more specific, consider something descriptive like TokenRefresher

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TokenRefresher makes sense if the "handler" only managed refreshing OIDC tokens. What do you think about CredentialRefresher since it updates the OIDC token and updates/creates a secret file with new logins

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does both token exchange and refresh, we discussed to have a generic interface to support both aws and gcp which we are going to add very soon. The steps are the same to get the oidc token to exchange the cloud credential, then update the credential file. The token refresher go routine updates the credential before it expires.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect several of these functions do exist in the JWT or OIDC libraries - please take a look

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which specific functions do you suspect? I might be missing something

return t, nil
}

func (o *Handler) oauth2TokenExpireTime(accessToken *oauth2.Token) (*time.Time, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a lot of casting in this function, which seems a bit excessive and unnecessary. Can we get an explanation of why this is needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So for the two castings:

  1. We set the "implementation" of Claim to be type MapClaims when parsing the access token via ParseUnverified. We later cast it because we want to use its properties of being a map (we essentially are checking the type this way).
  2. The value of claims is an interface, so we want to type cast to an integer for time.Unix to understand it. This is essentially checking the types + casting all at once.

return handler, nil
}

func (o *Handler) UpdateCredentials(ctx context.Context) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an infite loop, could we spawn a process instead? If we spawn it as a process it can be terminated and should make the code cleaner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming you're referring to this process?

Correct me if I'm misunderstanding you, but would this mean we create an executable for the OIDC credential update and have the process handle updating?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simple go routine based timer should work, we do not need a process.

"time"

"github.com/coreos/go-oidc/v3/oidc"
egv1a1 "github.com/envoyproxy/gateway/api/v1alpha1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming this seems to lose the readability of the code. is it necessary?
@mathetake is this a pattern used across the code base?

Copy link
Member

@mathetake mathetake Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes and this is the standard pattern and naming consistency is ensured by the linter in CI. Btw, this is not only here but also across the whole bunch of k8s related projects. you see the API package name v1alpha1 exists across tons of projects so not renaming is terrible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not rename to egv1alpha1?

o.logger.Error(err, "Failed to get backend security policies")
}

for _, backendSecurityPolicy := range backendSecurityPolicies.Items {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider making this a separate function that is called if the backendSecurityPolicies is set. putting this for loop a callable function instead will make it more testable and self contained.


cacheKey := fmt.Sprintf("%s.%s", backendSecurityPolicy.Name, backendSecurityPolicy.Namespace)
awsCredentials := backendSecurityPolicy.Spec.AWSCredentials
if o.oidcCredentialCache[cacheKey].token == nil || time.Now().After(o.oidcCredentialCache[cacheKey].expTime.Add(-5*time.Minute)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider making this logic check a function, also consider making values like -5*time.Minute a constant somewhere, as the 5 minutes before expiry should be easily to semantically understand somewhere in the code.

return credentialsCache.Retrieve(context.TODO())
}

func updateOrCreateAWSSecret(k8sClient client.Client, credentials aws.Credentials, namespace, bspKey string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make the functions small and do one thing.
Having a function name with Or in it is a sign something isn't right ;)

api/v1alpha1/api.go Outdated Show resolved Hide resolved
mathetake added a commit that referenced this pull request Jan 31, 2025
**Commit Message**:

This adds the goroutine leak detectors in
controller related tests. Other packages are 
not spawning goroutines so this only enables 
it for the controller testing.

**Related Issues/PRs (if applicable)**:

This will help ensure the quality of #232

---------

Signed-off-by: Takeshi Yoneda <[email protected]>

const OidcAwsPrefix = "oidc-aws-"

type AWSSpec struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is a bit unclear of the purpose, maybe we should name the other struct AWSCredentialHandler and this one AWSCredentialExchange

Copy link
Member

@mathetake mathetake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't look into the detail but can't this be a part of existing backendSecurityPolicyController? Even though I don't have a concrete answer, I think there must be a clear way to do that. the current structure of your code seems totally different than the existing style and this doesn't feel right (but again i am just skimming the code)

Comment on lines +120 to +127
// Have a token exchange handler per provider type.
handler, err := oidc.NewOIDCTokenExchange(&logger, c, aigv1a1.BackendSecurityPolicyTypeAWSCredentials)
if err != nil {
return fmt.Errorf("failed to create OIDC handler: %w", err)
}

go handler.RefreshCredentials(ctx)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these code are not called since mgr.Start is blocking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we write a reconciler for BackendSecurityPolicyTypeAWSCredentials. The reconcile loop can be trigger can time tick event. In the reconcile loop you can fresh the tokens.

  • write a event ticket soruce
  • only watch generic event
, err := ctrl.NewControllerManagedBy(mgr).
		For(&igv1a1.BackendSecurityPolicyTypeAWSCredentials{}}).
		//we filter out create/delete/update event
		//reconciliation is only triggered by a GenericEvent generated by a time ticker
		WithEventFilter(predicate.Funcs
			GenericFunc: func(event.GenericEvent) bool { return true },
  • refresh the token in the reconcile loop

If we move the auth logic in the controller, maybe let us follow controller-runtime convention ?

@aabchoo aabchoo closed this Feb 6, 2025
@mathetake mathetake deleted the aaron/oidc-embedded branch February 6, 2025 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants