Replies: 22 comments 38 replies
-
I would like to present my use case. To ease things on
Please note that for the "Setup" parts we actually do not need any syncing; however, for the "Multihost test" part we need to have it. One additional note on syncing: at the meeting there was an example:
In this example the need for syncing is not obvious. In fact, such a need should be stated explicitly. I think that the default "sync" behavior could assume that tests with identical names should be synced, but that won't apply to
This idea could be extended further: different "groups" could be defined first and then later referenced by specific tests or steps, overriding a global plan default.
-
Currently I use just the simplest scenario we have used in beaker for ages, i.e. 1:1 test sets synchronized as However, I can easily imagine the following semi-multihost setup for clevis testing:
The thing is that the clevis tests may start only after the tang server is running and responding, i.e. the clevis SUT start needs to be synced to the tang readiness. Such a scenario would probably be done by running the tang setup in the prepare step on one machine and the tests in the execute step on the other machine, where the execute step of the tang machine would be empty. The syncing would be done on the steps only. Together with what I proposed in #969 (comment) it could look like:

```yaml
discover:
  - how: fmf
    url: ...
    tests:
      - /setup/some_service
    name: SETUP
  - how: fmf
    url: ...
    filter: tag:!setup
    name: TEST
prepare:
  how: tmt
  what: SETUP
  where: server
execute:
  how: tmt
  what: TEST
  where: client
```

or

```yaml
discover:
  - how: fmf
    url: ...
    tests:
      - /setup/some_service
    name: SETUP
    where: server
  - how: fmf
    url: ...
    filter: tag:!setup
    name: TEST
    where: client
prepare:
  how: tmt
  what: SETUP
execute:
  how: tmt
  what: TEST
```

depending on where we actually want to decide the assignment of tests to the specific SUT.
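A sketch of how the "start only after tang is responding" condition could be handled from the client side, independent of tmt: a small polling helper that waits until a service URL answers. This is only an illustration; the function name, URL, timeout and interval below are made up for the example and are not part of any tmt interface.

```python
import time
import urllib.error
import urllib.request


def wait_for_service(url, timeout=60.0, interval=1.0):
    """Poll `url` until it answers, giving up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Any successful HTTP response means the service is up.
            urllib.request.urlopen(url, timeout=interval)
            return True
        except (urllib.error.URLError, OSError):
            time.sleep(interval)  # not ready yet, retry
    return False
```

The client-side test could then call something like `wait_for_service("http://tang.example.com/adv")` before starting the clevis steps (hostname and path hypothetical).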
-
Another multihost approach would be the controller-based, or orchestrated, one. The idea is simple: one SUT would run special tests which would control the others, which would just wait for instructions. The setup would be similar to the previous use case, syncing on the steps:

```yaml
discover:
  - how: fmf
    url: ...
    filter: tag:!setup
provision:
  - name: CONTROLLER
    ...
  - name: SUT1
    ...
  - name: SUT2
    ...
execute:
  how: tmt
  where: CONTROLLER
```

This assumes there would be metadata available on the SUTs so that the tests running on the CONTROLLER would know the hostnames / IPs of the other SUTs.
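To illustrate what consuming such metadata might look like, here is a minimal sketch assuming tmt exported a simple role-to-hostname mapping file; the file format and everything else here is hypothetical, not an actual tmt interface.

```python
def load_topology(path):
    """Parse a hypothetical 'ROLE=hostname' file into a dict."""
    topology = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            role, _, host = line.partition("=")
            topology[role.strip()] = host.strip()
    return topology
```

A controller test could then look up, e.g., `topology["SUT1"]` to reach the first SUT.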
-
The sync specification and implementation discussion can also be found here: #1705
-
I am wondering - is there some implementation (of the original version of the specification) available for testing?
-
Controller approach - what is the idea? But I fail to understand how one would write it. Questions:
Possible use case: Should the above be a plan similar to the one below, with tmt capturing just the output of the test (/a-to-b) run on the controller? So the responsibility to log / capture everything would be on this test? In other words: tmt in the execute step would only connect to the controller. How about test dependencies? The controller will have a different set than the server/client. Should the burden of installing requires be moved to the prepare script? E.g. /a-to-b on the controller needs beakerlib, the server foo-server and the client foo-client.
-
I would like to follow up a bit on a topic discussed at yesterday's meeting, namely the question of parallelism of discover configs.
Discover says that those two tests should be run on the controller. What does that mean for the server and clients? Is there an assumption that these systems are blocked until the tests running on the controller are finished? Clearly we do not want to dispose of those systems while tests are running, but where in the multihost design does this behavior come from? It becomes even more unclear when multiple configs are present in a discover step (to simplify it I am using the other syntax with sync in the discover step).
What a user would like to get is to have the server and client systems synced at the beginning of each test from the "peer tests" set, and then have ALL systems synced at the beginning of the "controller tests", lasting until the very end. However, one could also understand this syntax differently. It could mean that "peer tests" are run just on the client and server, and since the controller is not required here it can proceed with the other config and run the "controller tests". This would clearly lead to undesired behavior. So the crucial question here is: Does Just to present a use case for the 1st behavior (without the controller role):
Here we benefit from the server and client configs being run in parallel.
This utilizes the parallelism and also the fact that server and client tests may have different names. It could in fact become handy, e.g. when the server part is doing some generic setup while the clients would be performing different tasks against this setup. I could also add more discover configs changing the setup on the server and then running various client tasks again. This would allow me to test multiple scenarios in a single test plan. I think that this behavior is even more important when Above I tried to present, using examples, why I believe it is crucial to define the expected behavior for the And just to make it even more complicated.
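The first behavior described above (server and client synced at the start of each peer test, everyone synced for the controller tests) can be sketched with per-group barriers. This is only an illustration of the desired ordering using plain Python threads standing in for guests, not tmt code:

```python
import threading

peer_barrier = threading.Barrier(2)   # server + client sync for the "peer tests"
all_barrier = threading.Barrier(3)    # everyone syncs for the "controller tests"

log = []
lock = threading.Lock()

def guest(name, runs_peer_tests):
    if runs_peer_tests:
        peer_barrier.wait()           # server and client start the peer test together
        with lock:
            log.append(("peer", name))
    all_barrier.wait()                # all three guests start the controller tests together
    with lock:
        log.append(("controller", name))

threads = [
    threading.Thread(target=guest, args=("server", True)),
    threading.Thread(target=guest, args=("client", True)),
    threading.Thread(target=guest, args=("controller", False)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The controller thread simply waits at the final barrier, so it is blocked, but not disposed of, while the peer tests run.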
-
The question from the last meeting discussion. In the example pasted below, which guests should be synchronized at the beginning? All guests, or the guests specified by
-
I also have another question, or proposal, for the
-
And the last topic for the discussion. What do you think about an implicit synchronization barrier between all step phases? I mean, all provision steps for all guests start at once, then all prepare steps start at once, then execute, report, etc. Do you have any preferences about this?
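As an illustration of what such an implicit barrier would mean, here is a minimal sketch with plain Python threads standing in for guests (not tmt code): each guest runs its steps in order, and a single reusable barrier makes every step start at once on all guests.

```python
import threading

GUESTS = ["server", "client1", "client2"]
STEPS = ["provision", "prepare", "execute", "report"]

barrier = threading.Barrier(len(GUESTS))  # automatically resets, so it is reused before every step
log = []
lock = threading.Lock()

def run_guest(name):
    for step in STEPS:
        # No guest enters a step until all guests have finished the previous one.
        barrier.wait()
        with lock:
            log.append((step, name))

threads = [threading.Thread(target=run_guest, args=(g,)) for g in GUESTS]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the run, `log` is strictly grouped by step: all provision entries come before any prepare entry, and so on, which is exactly the "all phases start at once" property in question.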
-
I've spent some time thinking about synchronization yesterday and I think we should consider making things as simple as possible. We didn't really gather many real-life scenarios (it would be nice to have them gathered somewhere - I mean not links to existing tests but rather an outline of how the tmt plan would look in some proposed/abstract syntax), and it therefore seems to me that we may be over-engineering the syntax without having a real need for high flexibility. I will be talking about this example
My question is: Is there a real need to run subsequent phases (or steps) in parallel? For the example above it means: do we need to be able to run the server setup and the client setup together? I think it will make the implementation more complicated and I am not sure this use case is really necessary/justified. We should make things simple in So my proposal goes a bit against what is mentioned by @sopos above. I would propose that each phase (one named block in discover above) would be synced at the beginning for all provisioned systems, independently of whether these systems are mentioned in The above applies to phases; what remains is sync between individual tests. For that I think we could utilize the options that have already been proposed (by name, index, etc.).
-
My take. I'd like to focus on some basics: multiple guests, and multiple phases to be run on each guest, with some parallelization. Let's start with the following tmt config, nothing fancy:

```yaml
discover:
  - name: server setup
    how: fmf
    tests:
      - /server/setup
    where:
      - server
  - name: client setup
    how: fmf
    tests:
      - /client/setup
    where:
      - client1
      - client2
  - name: the actual test
    how: fmf
    tests:
      - /test1
      - /test2
      - /test3
    where:
      - server
      - client1
      - client2
  - name: another test
    how: fmf
    tests:
      - /test4
      - /test5
    where:
      - client1
      - client2
  - name: yet another test
    how: fmf
    tests:
      - /test6
    where:
      - client1
```

I see the parallel aspect as being crucial and unavoidable. tmt can perform each guest queue after another, in a truly sequential manner, but then supporting the server/client test scenarios becomes much harder:
This seems to me like the very basic prototype we could land.
My point was, AFAICT, this would be the very basic prototype supporting both trivial scenarios: "run plan on multiple hosts" and the very basic "run tests on server and client". User libraries can take care of synchronizing individual tests as needed; tmt would guarantee that phases begin at the same time and can export the necessary variables to describe the topology. On top of this, we should be able to build more features: the bracketing @sopos suggested, which I accidentally illustrated in my previous diagrams, or any other custom synchronization or parallelization modes; the core implementation would exist and would already be testable for the basic use cases. Now, I'm adding yet another idea on top of the pile of ideas here, but I'd like to ask you to bear with me. I tried to add a diagram to illustrate the workflow, and I'd like to discuss various use cases or ideas and try to amend them with this proposal. If you see something that's an obvious blocker, please point it out.
-
The problem I see with such parallelism is that it is in fact trickier than it seems. First, we need to implement some logic that would decide which phases on which systems can run and which systems have to wait. E.g.
What should be the right sequence?
or
I know you will say the first one, but the problem is that it is not obvious at first sight, especially if more systems were involved. The other problem is the question of which systems should run a test script (
presents this discrepancy quite well, I guess. Do we want to run both actions in parallel (race condition present) or one phase after the other?
-
I am not saying there is no solution for the presented questions/problems. I am saying that what you are proposing is in fact not an MVP, since you are introducing additional complexity that is not necessary for multihost test execution but rather optimizes resource utilization. Our current multihost tests scheduled by wow don't produce such complicated recipes. I can't see a reason why the things you are proposing can't be added later.
-
Hello. With bits from #1790 and a minor adjustment I was able to successfully run our keylime multihost test on 3 (and also 4) guests using 1minutetip. Details are in the PR itself. There are still some things to improve (like logging in the execute step) but in general it seems to work.
-
A quick update on #1790:
-
A quick update on #1790: State of things
Loose ends
-
@happz Hi, I tested the latest Also, it seems to me that tests have not been truly executed in parallel on the assigned hosts. At least I am not able to tell that from the log; the output looks weird and it is not really obvious which system was running which command. TBH I liked the earlier parallel logging with a role/system prefix more.
-
I tested the previous packit build from #1790 (tmt-1.22.dev-1.20230321144934500513.pr1790.48.gc8f7556.fc37.noarch) and I confirm that it works very well for multihost tests using the /distribution/sync library (an alternative approach to using the rhts-test-sync and rhts-test-block commands). There were a couple of issues, mentioned already in the comments above:
There is one more additional issue affecting the /distribution/sync library. tmt does not use/export the TEST variable as Beaker does, and the library uses it to create synchronization files in the shared storage space. This can be fixed easily in the library. I used the following plan, executed from the libreswan test dist-git:
And executed it by:
Another issue related to the /distribution/sync beakerlib library is that beakerlib in tmt looks to github.com/beakerlib for libraries, but it won't find sync even though it is there, because it looks for it in the distribution directory; that can be fixed easily by moving the library for now (until tmt is able to find the libraries better). One more thing: interactive execution should be blocked when using parallel execution.
-
Yes. I think that in fact those vars match the role exactly (no extra uppercase conversion). See
-
Please vote for the preferred mapping of the guest
-
This discussion is dedicated to announcing new multihost support topics, issues and pull requests. If you are interested in this area you might want to subscribe to this discussion to get notified about new stuff available for feedback.