IAM authorizer for the API endpoint #377

Closed
brainstorm opened this issue Dec 20, 2021 · 10 comments · Fixed by #378

@brainstorm
Member

Currently the API is consumed via temporary API tokens, and this works great for one-off explorations.

When using notebooks and other internal systems on AWS (e.g. SageMaker), it'd be great if we could make IAM-authenticated requests to the same endpoint. That way analyses (and other data explorations within AWS), as well as future microservices, could leverage AWS IAM directly instead of needing extra logic around bearer tokens (managing auto-renewal and so on).

@brainstorm
Member Author

Previous discussion alternatives/points to research and implement:

(...) it is between Cognito and API Gateway. Last time I checked, AWS API Gateway did not allow stacking multiple authorizers on the same endpoint. I will have to double-check whether that limit is still imposed. Another workaround is to use another endpoint base URL (API Gateway deploy stage) with the built-in IAM_Authorizer; that should do it!

@andrewpatto
Member

They can't be stacked on the same route - so we need an alternative endpoint - either route based, or endpoint based.

So, assuming we leave the existing endpoints/gateway in place unchanged

https://api.data.prod.umccr.org/lims/2866

what is preferable to add in order to support IAM?

https://api-native.data.prod.umccr.org/lims/2866

or

https://api.data.prod.umccr.org/**iam**/lims/2866

The first allows common Swagger docs and shared code (i.e. the API paths don't change) - the only difference between scripts in IAM mode vs JWT mode would be the base URL. But it results in two entirely separate API gateways (cost? possibility of config drift?).

The second is a smaller change in that we just add the lambdas a second time on a new route - with people having to know the magic prefix in their routes for IAM usage.
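
To make the difference concrete from a client script's point of view, here is a minimal sketch of how an existing JWT-mode URL would map onto each option. The host `api-native.data.prod.umccr.org` and the `iam` prefix are just the candidates named above, and both helper names are made up for illustration; nothing here reflects a deployed endpoint.

```python
from urllib.parse import urlsplit, urlunsplit


def iam_url_separate_gateway(url: str, iam_host: str = "api-native.data.prod.umccr.org") -> str:
    """Option 1 (hypothetical helper): same path, different base URL / gateway."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=iam_host))


def iam_url_route_prefix(url: str, prefix: str = "iam") -> str:
    """Option 2 (hypothetical helper): same gateway, extra route prefix."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=f"/{prefix}{parts.path}"))


jwt_url = "https://api.data.prod.umccr.org/lims/2866"
print(iam_url_separate_gateway(jwt_url))
print(iam_url_route_prefix(jwt_url))
```

With option 1 only the host changes, so API paths in scripts and docs stay identical; with option 2 every path gains the prefix but there is still one gateway to maintain.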

@victorskl

@andrewpatto andrewpatto self-assigned this Jan 11, 2022
@brainstorm
Member Author

The magic IAM path option seems like the most reasonable option (to me), but I totally defer to Victor's opinion on this.

@reisingerf
Member

Is there no option to use Cognito? I thought it should support SSO and the translation between IAM and tokens.
That may result in a custom authorizer, but perhaps worth a bit of investigation?
(haven't been involved much in this, so may be way off here)

@victorskl
Member

They can't be stacked on the same route

Ah well (sigh). Thanks for checking it out. They are still not supporting it...

Happy to adopt either option, TBH, as long as it supports the use case (i.e. SageMaker and those analytics scenarios, etc.). I still wonder: will SageMaker be able to query an endpoint with the AWS_IAM authorizer? That is, SageMaker (or any other caller client) needs to do Signature Version 4 signing. If SageMaker does it internally, then that would work...

@victorskl victorskl added the feature New feature label Jan 11, 2022
@victorskl victorskl added this to the Release 0.8.0 milestone Jan 11, 2022
@andrewpatto
Member

Yes @victorskl - SageMaker code would need to do v4 signing on any API calls (we can provide an example Requests client for the SageMaker notebook that does that), or people would need to use awscurl etc. from within an EC2 instance. But it would then work long term (with no need for the client to manage token fetch/refresh). I think this was the use case that @brainstorm was describing.

@reisingerf It's possible Cognito can play a role here - but I'm not sure how that matches up with the use case from Roman. IAM creds are the only thing that services within AWS are going to have, unless we can somehow do Cognito login processes inside SageMaker? Would that be preferable - make no changes to the API endpoint but write some scripts for SageMaker that simulate Cognito/OAuth flows inside SageMaker? We'd still end up with a bearer token that needs to be passed in and refreshed - and that was what Roman was trying to avoid?
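
For reference, a rough sketch of what client-side v4 signing involves, using only the Python standard library. This is not the portal's actual client: the function name `sigv4_headers` is hypothetical, the `execute-api` service name is what API Gateway endpoints are normally signed for, and the `ap-southeast-2` region in the usage note is an assumption. It only handles requests without a query string; in practice an existing signer (e.g. awscurl, or botocore's `SigV4Auth`) is likely the safer choice.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote, urlsplit


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sigv4_headers(method, url, region, access_key, secret_key,
                  session_token=None, service="execute-api",
                  body=b"", now=None):
    """Return headers carrying a SigV4 signature (sketch: no query strings)."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    parts = urlsplit(url)
    if parts.query:
        raise ValueError("query strings are not handled in this sketch")

    # Headers that take part in the signature (lowercase names, sorted).
    headers = {"host": parts.netloc, "x-amz-date": amz_date}
    if session_token:  # present when using temporary (role) credentials
        headers["x-amz-security-token"] = session_token
    signed_headers = ";".join(sorted(headers))
    canonical_headers = "".join(f"{k}:{headers[k]}\n" for k in sorted(headers))

    # Step 1: canonical request.
    canonical_request = "\n".join([
        method.upper(),
        quote(parts.path or "/", safe="/-_.~"),
        "",  # empty canonical query string
        canonical_headers,
        signed_headers,
        hashlib.sha256(body).hexdigest(),
    ])

    # Step 2: string to sign.
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Step 3: derive the signing key, then sign.
    key = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, service, "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

    headers["Authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return headers
```

In a notebook this could be used along the lines of `requests.get(url, headers=sigv4_headers("GET", url, "ap-southeast-2", creds.access_key, creds.secret_key, creds.token))`, with `creds` taken from whatever IAM role or CLI profile the environment provides (for example via `boto3.Session().get_credentials()`), so there is no token to fetch or refresh.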

@andrewpatto
Member

Also, the concept of user identity available in the API lambdas will surely be different between a JWT-authed request and an IAM-authed request. Is there anything in particular I should look out for regarding caller identity? Are there internal API authorisations/roles etc.? Or is the caller identity really just logged and ignored?
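
To illustrate the difference, a small sketch of how a lambda might normalise the caller identity across the two modes. It assumes a REST API with Lambda proxy integration, where a Cognito user pool authorizer puts claims under `requestContext.authorizer.claims` and an AWS_IAM-authorized request exposes `requestContext.identity.userArn`; the helper name and the returned dict shape are hypothetical, and an HTTP API would use a different event layout.

```python
def caller_identity(event: dict) -> dict:
    """Return a uniform view of who called, for logging or authorisation.

    Assumes the REST API Lambda proxy event shape (an assumption, see above).
    """
    ctx = event.get("requestContext") or {}

    # JWT mode: claims from the Cognito user pool authorizer.
    claims = (ctx.get("authorizer") or {}).get("claims")
    if claims:
        return {"auth": "jwt", "principal": claims.get("email") or claims.get("sub")}

    # IAM mode: the ARN of the IAM user or assumed role that signed the request.
    user_arn = (ctx.get("identity") or {}).get("userArn")
    if user_arn:
        return {"auth": "iam", "principal": user_arn}

    return {"auth": "unknown", "principal": None}
```

In JWT mode the principal is a real person, whereas in IAM mode it may be a role shared by many notebooks or services, which matters if the API ever does per-user authorisation rather than just logging.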

@reisingerf
Member

Hm, I see.
Initially I thought Roman was referring to our AWS SSO sign-in and interactive/exploratory use of notebooks/SageMaker (via the AWS console), which would still be linked to individual real users and so would fit the token approach.

However, if automated services are the clients, then that's another story, yes. There I agree with an additional route/path.

@andrewpatto
Member

I was with you - I initially thought he meant to somehow hook into the SSO sign-in/interactive use of notebooks. But the comments at the start of this ticket made me think he meant the other - @brainstorm can you clarify?

@brainstorm
Member Author

Actually, I was thinking of having both:

  1. Some way to reuse the SSO creds of individual users on SageMaker/JupyterLab so that folks don't have to copy and paste perishable tokens from the portal.

  2. Having a service user that can query stuff against Athena and the portal, e.g. for dashboarding.

Makes sense?

@victorskl victorskl linked a pull request Jan 19, 2022 that will close this issue
victorskl added a commit that referenced this issue Mar 9, 2022
* This effectively supports use without needing to set up the PORTAL_TOKEN
  environment variable, using AWS CLI credentials or an IAM role instead.
* Updated README and a couple of examples with code snippets for possible
  backend and end-user ad hoc use cases.
* R example still uses Python for the v4 signing facility and the http
  `requests` package, through the `reticulate` R library. This can be
  improved to pure R with `httr` and `cloudyr`.
* Related to #415 #377