
Investigate using Amazon Aurora MySQLdb #59

Open
2 tasks
inodb opened this issue Mar 10, 2020 · 1 comment

Comments

@inodb
Contributor

inodb commented Mar 10, 2020

While trying to increase the database resources for a workshop without downtime, I gave AWS Aurora MySQL a try. It worked pretty well.

One can set up Aurora replicas that read from the already existing cBioPortal public DB instance. It takes about an hour or so to start. Make sure to allow connections from the Kubernetes VPC. Then you can connect by running a MySQL client on the k8s cluster:

kubectl run --rm -i --tty mysql-client --image=mysql:5.7 --restart=Never -- sh

Then connect to the read endpoint that Amazon gives you. All DB settings are copied, so you can log in with the same credentials as usual.
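For reference, from inside that pod the connection looks roughly like this; the endpoint, user, and database name below are placeholders, not the real values:

mysql -h <aurora-read-endpoint>.rds.amazonaws.com -u <cbio-db-user> -p <cbio-db-name>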

To connect the cBioPortal pods, one can simply change DB_HOST to point to the Aurora instance.
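As a minimal sketch, assuming the pods read a plain DB_HOST environment variable and the deployment is called cbioportal (both assumptions), the switch could be as simple as:

kubectl set env deployment/cbioportal DB_HOST=<aurora-read-endpoint>.rds.amazonaws.com

Changing the env var on the deployment triggers a rolling restart of the pods with the new host.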

To really use this in production, there were a few issues:

  • I got a 500 error when using the default MySQL configuration and doing a TP53 query on all curated studies. We might have to copy some values from here to support big queries: https://github.com/knowledgesystems/knowledgesystems-k8s-deployment/blob/master/cbioportal/cbioportal_mysql_db_values.yml#L12-L25
  • One can only connect to Aurora from an EC2 machine in the same VPC, so it's not possible to connect directly with some MySQL connection string from outside. You can spin up an EC2 instance and forward through it, though. It should also work through Kubernetes cluster forwarding with a one-liner (see the sketch after this list). Either way, we'd have to hash out some issues around importing. Another option would be to leave the existing RDS instance running for now and use it only for writing by cbio_importer, while reading is done through the Aurora cluster. Then at some point we might switch over to the Aurora write endpoint instead.
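A rough sketch of the Kubernetes forwarding one-liner mentioned above, using a throwaway socat relay pod; the pod name, image, and Aurora endpoint here are assumptions for illustration, not something we currently run:

kubectl run aurora-relay --image=alpine/socat --restart=Never -- tcp-listen:3306,fork,reuseaddr tcp-connect:<aurora-endpoint>.rds.amazonaws.com:3306
kubectl port-forward pod/aurora-relay 3306:3306

With that in place, a local mysql client (or the importer) can point at 127.0.0.1:3306 and the traffic is relayed into the VPC via the cluster.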
@inodb
Contributor Author

inodb commented Apr 13, 2020

I did find a better solution now for deploying a workshop DB with more resources: one can just restore a snapshot from the day before to a new database instance. (A rough CLI equivalent is sketched after the steps below.)

  • Go to the RDS page, click on Snapshots, click on System, then click Actions > Restore.
  • Select a big database instance class; I usually double it: db.r5.4xlarge.
  • Set the security group to allow connections from anywhere (or from the k8s nodes). rds-launchwizard-5 seems to work; probably good to rename it. Anything that allows access on port 3306 should do.
  • The MySQL conf is the same as the production DB (that is, it allows bigger packets etc.). It's called cbioportal-mysql-conf or something similar.
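For the record, the same restore can also be done from the AWS CLI. This is only a sketch; the instance identifier, snapshot name, security group id, and parameter group name are placeholders:

aws rds restore-db-instance-from-db-snapshot \
  --db-instance-identifier cbioportal-workshop-db \
  --db-snapshot-identifier <system-snapshot-id> \
  --db-instance-class db.r5.4xlarge \
  --vpc-security-group-ids <sg-id-allowing-3306> \
  --db-parameter-group-name <cbioportal-mysql-parameter-group>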
