Refactor Docker setup #186

Open: ChlodAlejandro wants to merge 1 commit into master from chlod/wmcs-docker

ChlodAlejandro
Contributor

Re-add ability to use Docker for development post-switch to Cloud VPS. Support for Trove has been added, along with a few tweaks (like moving execution to a non-root user, as required by Symfony). Also moved off of Toolforge images and onto global php images, since CopyPatrol isn't expected to run on Toolforge anymore.

This runs the same way as it did before: the main CopyPatrol Symfony server executes as the main program in a Docker Compose set, and the SSH tunnel is established from within that container with a separate terminal. See new README for more info.

I've done the prep work to allow the image to also run standalone, should that ever be needed. Some more optimization is in order; currently, the "production" image stands at 643 MB because it's built on top of the heavyweight php:8.2 image which uses Debian. That can be done in the future, if CopyPatrol is ever put onto Kubernetes or something.

@ChlodAlejandro force-pushed the chlod/wmcs-docker branch 2 times, most recently from baf1cc0 to a5c866e (August 15, 2024 20:27)
@ChlodAlejandro force-pushed the chlod/wmcs-docker branch 2 times, most recently from 8ed0452 to a7bb6cc (August 22, 2024 06:08)
@MusikAnimal
Member

Sorry for taking so long to get back to this. It's not working for me :( The docker compose up command hangs with Waiting for TROVE SQL port (4721) to open...

It looks like docker compose exec copypatrol start ssh is running:

sudo -u copypatrol-ssh symfony console toolforge:ssh --trove=127.0.0.1 -b 127.0.0.1 copypatrol-ssh

Is that correct? I would have expected the remote host name. I'm guessing the Docker container is supposed to make that remote connection, but I can't verify that as I don't have ps, top etc. in the container.

It's very possible I'm merely doing something stupid on my end. I'm happy to just merge if you believe things are working as expected.

@ChlodAlejandro
Contributor Author

ChlodAlejandro commented Sep 1, 2024

Looks like a bit of an oversight on my part. The Trove host to be used is set by the TROVE_HOST environment variable on the shell which brought the container up, regardless of the value of .env.local. By default, it's set to hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud, but if it's been set to 127.0.0.1, it'll use that instead.

I've added a TROVE_REMOTE_HOST variable in .env to indicate which host toolforge:ssh should connect to. This should decouple Docker from using TROVE_HOST when switching between a bare-metal/Docker setup. TROVE_HOST should still be kept at 127.0.0.1. After setting this variable, can you check if it works?
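As a rough illustration of the decoupling described above, here is a minimal sketch of how a script could pick the remote host. It is not the actual entrypoint script: the function name `resolve_remote_host`, the `DEFAULT_TROVE_HOST` variable, and the fallback to the default host are all assumptions made for the example. Only `TROVE_REMOTE_HOST` and the default hostname come from this thread.

```shell
# Hypothetical sketch; the real entrypoint may resolve this differently.
# DEFAULT_TROVE_HOST is an illustrative name, holding the default mentioned above.
DEFAULT_TROVE_HOST="hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud"

resolve_remote_host() {
  # Use TROVE_REMOTE_HOST when it is set and non-empty, else fall back
  # to the default. TROVE_HOST is deliberately not consulted here, so it
  # can stay at 127.0.0.1 for both bare-metal and Docker setups.
  printf '%s\n' "${TROVE_REMOTE_HOST:-$DEFAULT_TROVE_HOST}"
}

echo "toolforge:ssh would target: $(resolve_remote_host)"
```

The point of the sketch is only that the tunnel target and the host the app connects to are read from separate variables.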

n.b. You can get a shell into the container with docker compose exec copypatrol start bash. From here, you can apt update && apt install whatever you'd like to help you debug what's going on from within the container. You can also run docker compose exec copypatrol start ssh -vvv to get debug logging for the SSH client.

@MusikAnimal
Member

Still no dice :(

> You can get a shell into the container with docker compose exec copypatrol start bash. From here, you can apt update && apt install whatever you'd like to help you debug what's going on from within the container.

My limited Docker skills had already got me that far, but I guess I didn't know where to install ps! After adding the procps package, here's what I get looking for SSH connections (there may be a better way to do this, but I think this still picks up everything relevant):

root@229296c46c4e:/app# ps -ef | grep ssh
root          33       0  0 21:37 pts/1    00:00:00 sudo -u copypatrol-ssh symfony console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud -b 127.0.0.1 copypatrol-ssh
root          50      33  0 21:37 pts/2    00:00:00 sudo -u copypatrol-ssh symfony console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud -b 127.0.0.1 copypatrol-ssh
copypat+      51      50  0 21:37 pts/2    00:00:00 symfony console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud -b 127.0.0.1 copypatrol-ssh
copypat+      56      51  0 21:37 pts/2    00:00:00 /usr/local/bin/php bin/console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud -b 127.0.0.1 copypatrol-ssh

And on my host machine:

musikan+   14832  0.0  0.0   8100  5760 ?        S    Aug26   0:00 /usr/bin/ssh-agent -D -a /run/user/1000/keyring/.ssh
musikan+  886549  0.0  0.0 1994952 25600 pts/6   Sl+  17:37   0:00 docker compose exec copypatrol start ssh
musikan+  886573  0.0  0.0 2317388 44288 pts/6   Sl+  17:37   0:00 /usr/libexec/docker/cli-plugins/docker-compose compose exec copypatrol start ssh
root      886604  0.0  0.0   5256  3456 pts/1    Ss+  17:37   0:00 sudo -u copypatrol-ssh symfony console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud -b 127.0.0.1 copypatrol-ssh
root      886621  0.0  0.0   5256  1556 pts/2    Ss   17:37   0:00 sudo -u copypatrol-ssh symfony console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud -b 127.0.0.1 copypatrol-ssh
998       886622  0.0  0.0 1238884 13056 pts/2   Sl+  17:37   0:00 symfony console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud -b 127.0.0.1 copypatrol-ssh
998       886627  0.0  0.0  85820 26752 pts/2    S+   17:37   0:00 /usr/local/bin/php bin/console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud -b 127.0.0.1 copypatrol-ssh

Neither of these shows the expected SSH tunneling on ports 4711-4718 and 4721.
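A more direct check than grepping the process list is to probe the forwarded ports themselves. The sketch below is one possible way to do that, assuming bash is available (it relies on bash's `/dev/tcp` pseudo-device) and that the tunnel binds to 127.0.0.1. The helper names `tunnel_ports` and `port_open` are made up for this example.

```shell
# Ports expected to be forwarded: 4711-4718 (replicas) and 4721 (Trove).
tunnel_ports() {
  seq 4711 4718
  echo 4721
}

# Succeeds if something accepts TCP connections on 127.0.0.1:<port>.
# Uses bash's /dev/tcp pseudo-device; under a plain POSIX sh the redirect
# fails and every port is reported as closed.
port_open() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

for port in $(tunnel_ports); do
  if port_open "$port"; then
    echo "$port open"
  else
    echo "$port closed"
  fi
done
```

Run inside the container, this should report every port as closed while the tunnel is down, which would match the hang on port 4721.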

For comparison, running the toolforge:ssh command sans Docker, à la symfony console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud, gives:

musikan+  889445  0.2  0.0 723784 13824 pts/6    Sl+  17:42   0:00 symfony console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud
musikan+  889454 10.0  0.1 204872 70832 pts/6    S+   17:42   0:00 /usr/bin/php8.1 bin/console toolforge:ssh --trove=hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud
musikan+  889467  0.0  0.0   2896  1664 pts/6    S+   17:42   0:00 sh -c ssh -N -L 4711:s1.web.db.svc.eqiad.wmflabs:3306 -L 4712:s2.web.db.svc.eqiad.wmflabs:3306 -L 4713:s3.web.db.svc.eqiad.wmflabs:3306 -L 4714:s4.web.db.svc.eqiad.wmflabs:3306 -L 4715:s5.web.db.svc.eqiad.wmflabs:3306 -L 4716:s6.web.db.svc.eqiad.wmflabs:3306 -L 4717:s7.web.db.svc.eqiad.wmflabs:3306 -L 4718:s8.web.db.svc.eqiad.wmflabs:3306 -L 4721:hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud:3306 login.toolforge.org
musikan+  889468  0.7  0.0  17328  8576 pts/6    S+   17:42   0:00 ssh -N -L 4711:s1.web.db.svc.eqiad.wmflabs:3306 -L 4712:s2.web.db.svc.eqiad.wmflabs:3306 -L 4713:s3.web.db.svc.eqiad.wmflabs:3306 -L 4714:s4.web.db.svc.eqiad.wmflabs:3306 -L 4715:s5.web.db.svc.eqiad.wmflabs:3306 -L 4716:s6.web.db.svc.eqiad.wmflabs:3306 -L 4717:s7.web.db.svc.eqiad.wmflabs:3306 -L 4718:s8.web.db.svc.eqiad.wmflabs:3306 -L 4721:hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud:3306 login.toolforge.org

Is there some magic going on, or shouldn't the Docker setup also make use of SSH port forwarding?
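For reference, the forwarding arguments in the non-Docker process list above follow a regular pattern, which can be reproduced with a short sketch. The `forward_args` helper is hypothetical and only rebuilds the command shown in that output. It is not how toolforge:ssh itself is implemented.

```shell
# Build the -L arguments seen in the process list above:
# local ports 4711-4718 map to replica sections s1-s8, and 4721 maps to Trove.
forward_args() {
  trove_host="$1"
  args=""
  for i in 1 2 3 4 5 6 7 8; do
    args="$args -L 471$i:s$i.web.db.svc.eqiad.wmflabs:3306"
  done
  args="$args -L 4721:$trove_host:3306"
  # Drop the leading space before printing.
  printf '%s\n' "${args# }"
}

# Print (rather than run) the command the non-Docker setup ends up executing.
echo "ssh -N $(forward_args hxmnwriu2vm.svc.trove.eqiad1.wikimedia.cloud) login.toolforge.org"
```

If the Docker setup is working, a process with this shape should also show up in the container's process list.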

At any rate, my offer still stands: If you believe it's working as expected or I'm doing something dumb or both, I'll just merge :)

@ChlodAlejandro
Contributor Author

ChlodAlejandro commented Sep 8, 2024

@MusikAnimal It appears my approach here only ever worked because I had a User chlod line somewhere in my SSH config, which allowed SSH to pull my username specifically for Toolforge. This assurance doesn't exist for other users, which seems to be causing the issue here (notice how you're trying to log into Toolforge as copypatrol-ssh, which is the name of the internal user handling the SSH connection).

I've removed the part of the entrypoint script which attempts to gather the username from the SSH config. This means SSH itself will resolve the username if one appears in its configuration. If not, the user will have to provide the username manually, since info on who's running the docker compose exec command is not provided to the container. I've written that in the README.
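To see which login name SSH would resolve for Toolforge without connecting, `ssh -G` prints the client configuration after Host and Match blocks are evaluated. The sketch below filters that output. The `resolved_user` helper is a made-up name for this example, and the check assumes an OpenSSH client recent enough to support `-G`.

```shell
# Extract the "user" entry from `ssh -G <host>` output read on stdin.
# Matching the first field exactly avoids keys like "userknownhostsfile".
resolved_user() {
  awk '$1 == "user" { print $2; exit }'
}

# Only attempt the lookup when an ssh client is installed.
# `ssh -G` evaluates the config and exits; it does not open a connection.
if command -v ssh >/dev/null 2>&1; then
  ssh -G login.toolforge.org 2>/dev/null | resolved_user
fi
```

With no matching User line in the SSH config, this prints the local account name, which inside the container would be copypatrol-ssh rather than a Toolforge username.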

Moving forward, the command you should use is:

docker compose exec copypatrol start ssh musikanimal

which should eventually expand to

symfony console toolforge:ssh --trove="$TROVE_HOST" -b 127.0.0.1 musikanimal

This assumes musikanimal is your shell username.

I think it's a good thing that we're catching these issues prior to merging, so that other developers down the line, with varying configurations and OSes, don't run into them! As long as you're alright with trying to make this work, I'm also happy to work it out with you. :)
