Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to utilize new multi-tenancy approach #118

Closed
jordanpadams opened this issue Mar 30, 2023 · 11 comments · Fixed by #146 or #152
Closed

Update to utilize new multi-tenancy approach #118

jordanpadams opened this issue Mar 30, 2023 · 11 comments · Fixed by #146 or #152

Comments

@jordanpadams
Copy link
Member

💡 Description

  • handshake auth to only access data owned by the node
  • update documentation as needed
@sjoshi-jpl
Copy link

Modify Harvest to:

  1. Get Cognito JWT tokens (for write access to OpenSearch)
  2. Get signed URLs (call API Gateway path for GET /signed-url)
  3. For write access to OpenSearch with Signed-URLs (directly calling OpenSearch with signed-url) for read-only access to OpenSearch (call API Gateway path for GET /registry)

@al-niessner
Copy link
Contributor

@jordanpadams @sjoshi-jpl

Is there a cognito tutorial for us somewhere? Just the readme of some repo maybe?

@al-niessner
Copy link
Contributor

@alexdunnjpl @jordanpadams @tloubrieu-jpl

Looking for opinions. Currently, harvest splits up server information. In the harvest config, provide registry URL, index, and auth file. In auth file provide trust self signed and username/password. Trusting self signed is a server problem like the URL not auth information. Also, harvest uses a sax parser to read the config file making it difficult to understand what is legal and illegal in the file.

There are several theories of approach here:

  1. do least change
  2. minimize change while maximize consistency
  3. do it the right way

least change

So, least change would mean shoving clientID and urlIsCognito into auth file. Already had some discontent at last breakout over that. Not a huge fan of it myself because now you have repeated information for Nick and Nora who just care about their username/password and all the rest is meaningless. The auth file would look like:

trust.self-signed = true
clientID = a123bajkhfdkjalkjalj212lsdlkjsgjkl
urlIsCognito = false
# TODO Warning: Use the default username and password only for testing purposes in local setup
user = nickandnora
password = asta

minimal change but consistent

Here, group the server information in harvest config and make auth just username/password:

user = nickandnora
password = asta

However this pushes three new items into the harvest config. Since it is a sax parser it will unduly complicate the code and provide no explanation of what is possible. In turn, all the examples would have to be expanded and expounded to cover the new options. It would looks something like:

<registry url="https://es:9200" 
               index="registry"
               clientID=akjdsajfdkjaklfoiwi"
               trust.self_signed="false"
               urlIsCognito="true"
               auth="auth.cfg" />

right way

First, write a schema for harvest config. It would have docs to explain the options. It would allow for xmllint to determine if the config is valid and tell the user where the error is. Since the user base is PDS, they should be familiar enough with XML and schema to use it effectively.

Second, use JAXB and get out of the XML parsing business. Is much more secure with respect to injection and it also validates the XML as reading.

Lastly, auth file would be username/password only. The line would look more like:

<registry url="https://es:9200" 
               index="registry"
               clientID=akjdsajfdkjaklfoiwi"
               trust.self_signed="false"
               style="direct" --> schema would limit to direct or cognito for now
               auth="auth.cfg" />

Obviously I prefer the last. When I had to write my own harvest from scratch I had to read java code and was really annoyed that there was no schema. It will take a little bit of time (days) to do schema and stuff but will pay dividends over time.

Thoughts?

@alexdunnjpl
Copy link
Contributor

Firm default-vote for "right way".

@jordanpadams
Copy link
Member Author

@al-niessner +1 to "right way". We have gotten complaints about not having a schema file in the past for this. They have some schema generators online you can use to get a first pass at what that could be as well.

@tloubrieu-jpl
Copy link
Member

@al-niessner is looking at the xsd schema that requires update for the cognito configuration in the harvest configuration file.

@tloubrieu-jpl
Copy link
Member

The schema is now updated and is backward compatible. @tloubrieu-jpl can review the new schema available in the draft PR.

@al-niessner
Copy link
Contributor

#146 was only start to solving this ticket.

@al-niessner al-niessner reopened this Jan 25, 2024
@jordanpadams
Copy link
Member Author

@al-niessner status: Got a response from Cognito team. Waiting for @sjoshi-jpl to make the happenings happen.

@tloubrieu-jpl
Copy link
Member

Task for @tloubrieu-jpl : write an updated documentation for Gary.

Task plan for @gxtchen :

  • build harvest and registry-manager with registry-common from branches:
    • harvest issue_118.1
    • registry-manager issue_66
    • registry-common issue_36
  • test plan:
    • find out tests which need to be done, from the documentation https://nasa-pds.github.io/registry/admin/tasks.html, user tasks and admin tasks), make it a test plan in test rail for non regression
    • otherwise nominal tests:
      - create registry (registry-mgr)
      - update data dictionaries (registry-mgr)
      - load data (harvest), multiple job configurations, load bundle, directory...
      - update archive status (registry-mgr)
  • test venues:
    • AWS MCP Dev account
    • Docker compose local deployment.

@tloubrieu-jpl
Copy link
Member

@gxtchen as a note, after we met I also thought, we also need to think of how to automate the tests that you are going to write for harvest and registry-manager. That can be done in the docker compose set-up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment