Move worker native environment setup into a privileged task that runs on a base image with worker installed #122
Comments
How would this work in a redeployability scenario? That is, how would

Related, Gecko is moving toward defining the workerTypes (and all sorts of roles, hooks, and so on) using information in the ci-admin and ci-configuration repositories, so rather than this process editing workerType definitions in the provisioner directly, it would be better for it to output information to ci-configuration, where it can then be applied using the usual ci-admin process.
Some discussion about this happened in mozilla-releng/releng-rfcs#47, where an alternate (or perhaps precursor) idea is being talked about. I think there are a couple of points that are worth bringing over here explicitly, since they relate to this idea. A comment from me, replying to another comment, expressing concern about adding image building to the critical path of builds/shipping:
And a note from @moz-hwine about security boundaries changing:
(This is not meant to be stop energy, I just want to make sure that crucial points aren't missed as we continue talking about and planning image building improvements.)
So I woke up with jetlag with this crazy idea in my head.
Currently our AMI generation process for generic-worker relies on mechanics provided by EC2 to bootstrap our instances. We have some magic to get logs from this process into taskcluster logs, and it is tricky for the process which snapshots the instance to produce the AMI to know whether the installation steps were successful or not. Also the code to set up the environment needs to install the worker itself, which means it is possible for this to go wrong and to produce AMIs for workers that don't actually have the worker installed and functioning on them.
This idea I had was pretty vague, but I'm dumping it here initially so that I can iterate on it, and others can join in with the conversation if they wish.
Imagine that instead we would bootstrap base images of a particular OS with generic-worker. When Bug 1439588 (Add feature to support running Windows tasks in an elevated process (Administrator)) lands, we could introduce a mechanism whereby, given the necessary scopes, a user could submit a job which installs packages / performs environment setup in a task directly as Administrator, after which a cloud-specific mechanism snapshots the instance and produces an image for the worker type. This could be used as a mechanism for people to customise the worker types.
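To make the idea a bit more concrete, here is a rough sketch of what submitting and gating such a task might look like. Nothing here is an existing generic-worker API: the createWorkerType feature and the generic-worker:create-worker-type:<provisioner>/<workerType> scope are just the names floated in this issue, the list-style "features" field follows the way it is written here, and the function names, worker type names, command, and maxRunTime value are all made up for illustration. I am also assuming the scope names the worker type to be created, not the base worker type the task runs on.

```python
import json


def required_scope(provisioner_id, target_worker_type):
    """Scope pattern floated in this issue (hypothetical, not an existing scope)."""
    return f"generic-worker:create-worker-type:{provisioner_id}/{target_worker_type}"


def build_image_task(provisioner_id, base_worker_type, target_worker_type, setup_commands):
    """Build a task definition that runs on a base image and asks the worker
    to snapshot the instance afterwards. Field layout is an assumption."""
    return {
        "provisionerId": provisioner_id,
        # The task runs on the base image that only has the worker installed.
        "workerType": base_worker_type,
        "scopes": [required_scope(provisioner_id, target_worker_type)],
        "payload": {
            # Environment setup steps, to be run as Administrator.
            "command": list(setup_commands),
            # Hypothetical feature flag proposed in this issue.
            "features": ["createWorkerType"],
            "maxRunTime": 3600,
        },
    }


def may_create_worker_type(task, provisioner_id, target_worker_type):
    """Worker-side check: only snapshot if the feature was requested
    and the task carries the matching scope."""
    requested = "createWorkerType" in task["payload"].get("features", [])
    scope = required_scope(provisioner_id, target_worker_type)
    return requested and scope in task.get("scopes", [])


if __name__ == "__main__":
    task = build_image_task(
        "aws-provisioner-v1",        # example provisioner id
        "win2012-base",              # hypothetical base worker type
        "win2012-custom",            # hypothetical worker type to create
        ["choco install -y git"],    # hypothetical setup step
    )
    print(json.dumps(task, indent=2))
```

The point of the explicit check is that the worker would only snapshot the instance when both the feature and the matching scope are present, so an ordinary task on the base image could never produce a new image by accident.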
The workflow could look something like this:
("features": [ "createWorkerType" ]) in the task payload ("scopes": ["generic-worker:create-worker-type:<provisioner>/<workerType>"]).

I'm not 100% sure about all this at the moment; this is very much a brainstorming exercise around the idea, but the objectives I was trying to achieve were: