Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor test suite to support starting components asynchronously #157

Draft
wants to merge 2 commits into
base: trunk
Choose a base branch
from

Conversation

mszulcz-mitre
Copy link
Contributor

This pull request addresses Issue #127.

@mszulcz-mitre
Copy link
Contributor Author

mszulcz-mitre commented Aug 15, 2022

@HalosGhost So far, the only commit in this PR addresses the specific issue we previously discussed: the inability of the atomizer to wait to sync with watchtowers in atomizer_end_to_end_test.cpp. On my machine, at least, this commit makes the previous call to sleep between initializing the watchtowers and atomizer in the test unnecessary (line 50):

std::thread w_init_thread([&]() {
ASSERT_TRUE(m_ctl_watchtower->init());
});
std::this_thread::sleep_for(std::chrono::milliseconds(100));
ASSERT_TRUE(m_ctl_atomizer->init());

Is this sufficient to fix this test? I doubt it--but I don't have any way of knowing for certain right now. Unfortunately, I still can't reproduce the failures described by Issues #119 and #127. I've followed the instructions in #119:

  • The log level is already set to trace in two_phase_end_to_end_test.cpp.
  • I launch the tests though a shell in emacs.

I've also tried the following:

  • Running the tests in a docker container to slow things down (launched via emacs).
  • Running the tests in a docker container that's restricted to only use 1 CPU core to slow things down even more (launched via emacs).
  • Commenting out all the calls to std::this_thread::sleep_for(std::chrono::milliseconds(100)); (lines 38, 42, 43, 45):
    m_ctl_sentinel
    = std::make_unique<cbdc::sentinel_2pc::controller>(0,
    m_opts,
    m_logger);
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    ASSERT_TRUE(m_ctl_shard->init());
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    ASSERT_TRUE(m_ctl_coordinator->init());
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    ASSERT_TRUE(m_ctl_sentinel->init());
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    reload_sender();
    reload_receiver();
    std::this_thread::sleep_for(m_wait_interval);

I recall that you may have mentioned some other way to try to trigger the problems. Could you please remind me what you said?

Previously, the atomizer controller would make one attempt to connect to
watchtowers and would return false if the attempt failed.  This required
watchtowers to be initialized before atomizers, which required adding a
sleep to the atomizer end-to-end integration test so the watchtowers had time
to initialize.  This change enables the atomizer to make multiple attempts to
connect to watchtowers and precludes the need for the sleep in the
integration test.

Signed-off-by: Michael L. Szulczewski <[email protected]>
@mszulcz-mitre mszulcz-mitre force-pushed the refactor-tests-for-async-components branch from 139ec8d to 20d7ace Compare August 22, 2022 05:58
Previously, controllers for components such as the atomizer archiver,
sentinel, shard, and watchtower would make one attempt to connect to
other components and would return false if the attempt failed.  This
led to lots of failures when running
'docker compose --file docker-compose-atomizer.yml up'. This commit enables
components to make multiple attempts to connect.

Signed-off-by: Michael L. Szulczewski <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant