Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Actions: Reason for the default QoS for status_topic_qos? #1155

Open
mauropasse opened this issue May 2, 2024 · 2 comments
Open

Actions: Reason for the default QoS for status_topic_qos? #1155

mauropasse opened this issue May 2, 2024 · 2 comments

Comments

@mauropasse
Copy link
Contributor

The default durability QoS for status_topic for actions is TRANSIENT_LOCAL:

RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL,

While for the rest of action components, it's VOLATILE.

I'm trying to find the reason for the default policy TRANSIENT_LOCAL being set for action goal's "status" topic.
This policy means:

  • A late joiner subscription would receive messages published in the past by some publisher.

For Actions this means:

  • A newly created ActionClient would receive old goal statuses sent in the past by some ActionServer .
  • This can't happen as the ActionClient was not yet created, so ActionServer couldn't have sent status messages.
  • Bug: It might actually can happen - multiple Action Clients with same name - They all receive old statuses that the server sent to other clients with same name.

So it is necessary for some reason that I ignore, having TRANSIENT_LOCAL set for action goal's "status" topic?

I'm asking this since for some reason I had to change the action server to use VOLATILE, and now the clients don't find the servers when calling rcl_action_server_is_available(), since there is a mismatch in QoS between client and server, so number_of_publishers is zero in:

ret = rcl_subscription_get_publisher_count(
&(client->impl->status_subscription), &number_of_publishers);

If the default would be VOLATILE instead of TRANSIENT_LOCAL (which seems to make sense), there wouldn't be server discovery issues.

--- a/rcl_action/include/rcl_action/default_qos.h
+++ b/rcl_action/include/rcl_action/default_qos.h
@@ -28,7 +28,7 @@ static const rmw_qos_profile_t rcl_action_qos_profile_status_default =
-  RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL,
+  RMW_QOS_POLICY_DURABILITY_VOLATILE,
@mauropasse mauropasse changed the title Actions: Reason for the default QoS settings for status_topic_qos? Actions: Reason for the default QoS for status_topic_qos? May 2, 2024
@fujitatomoya
Copy link
Collaborator

@mauropasse as you mentioned, it seems that it should be good enough with reliable so that client will not miss the status from the server once goal request begins.

This can't happen as the ActionClient was not yet created, so ActionServer couldn't have sent status messages.

but, what if it does not complete the endpoint discovery yet? (status publisher and subscriber endpoints are created but not discovered, when the server sends the status message after goal request.) i think this is so unlikely but it could happen in theory? in that case, client could miss the status for the goal id until next status message is published by the server. there is a possibility the server could take a long time to execute the goal... that means the client cannot know the goal status for a while.

Bug: It might actually can happen - multiple Action Clients with same name - They all receive old statuses that the server sent to other clients with same name.

i think each action client checks the status message against the goal id. if the goal id on status message does not match the one the client has, it drops the message since it is not interesting for the client. technically in this multiple clients case, TRANSIENT_LOCAL does not work as expected (my expectation above) like single action client case above, unless the the status topic is controlled by @keyed.

i could be wrong, but this is what i think so far...

CC: @ros2/team

@mauropasse
Copy link
Contributor Author

Thanks for your reply @fujitatomoya

but, what if it does not complete the endpoint discovery yet? (status publisher and subscriber endpoints are created but not discovered, when the server sends the status message after goal request.)

If the endpoints are not discovered yet, I'm guessing rcl_action_server_is_available() would return "server not available", so the client should not make any goal requests.
Of course this API could not be called: wait_for_action_server() is not called, but client makes request anyway.

i think each action client checks the status message against the goal id.

Yes, that happens here. Nevertheless, the clients who didn't make goal requests shouldn't be receiving&discarding old statuses meant for other clients (similar to what's happening on ros2/rclcpp#2397)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants