Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hotfix/memfile-ack-logic-deadlock #1795

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

rex-schilasky
Copy link
Contributor

Description

If the shm data transport layer acknowledgment feature is used and the ack time is set to a large value the publisher send may wait for a this long time on the ack event from the subscriber even the subscriber is no more existing (process is terminated for example). The SHM Datawriter Connect/Disconnect API is locked as long the Datawriter is waiting for the ack signal. That leads to a lock of the Registration layer finally.

Solution:

  1. Add RemoveSubscription to CDataWriterSHM to react on un-registrations from the registration layer in general
  2. Decouple Connect/Disconnect functionality of CSyncMemoryFile from SyncContent() to be able to modify the map of the memfile connection events even SyncContent() is waiting on a sync event.

… to "deadlock" in case of very long ack timeout setting
@rex-schilasky rex-schilasky added the cherry-pick-to-NONE Don't cherry-pick these changes label Nov 14, 2024
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

ecal/core/src/readwrite/shm/ecal_writer_shm.h Outdated Show resolved Hide resolved
rex-schilasky and others added 3 commits November 14, 2024 12:10
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
void CDataWriterSHM::RemoveSubscription(const std::string& host_name_, const int32_t process_id_, const std::string& topic_id_)
{
// we accept local disconnections only
if (host_name_ != m_attributes.host_name) return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that this is correct. If we communicate through docker images, we might have different host_name, but identical host_group.
Anyways, who applies the RemoveSubscription?
If you remove only subscriptions from IDs which have previously been subscribed, you don't need to do any checks here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick-to-NONE Don't cherry-pick these changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants