
Support Events #103

Merged: 40 commits merged into rolling on Mar 26, 2024

Conversation

@Yadunund (Member) commented on Jan 31, 2024

This PR adds support for two independent event mechanisms:

  • Registering and triggering of user-defined callbacks when messages, requests, and responses arrive -> used by the EventsExecutor.
  • Registering and triggering of user-defined callbacks when the ROS graph changes -> QoS events such as MatchedEvent (see Support selective QoS based events #118).

How events function in rmw

  1. Event callbacks are set by the client library and are triggered by the rmw layer when a new message/request/response arrives. The client library can then call rmw_event_take() to get the event instead of waiting on condition variables in rmw_wait().
  2. Event callbacks for QoS events may also be set. Presently only publisher/subscription event statuses are defined in rmw/events.h. The rmw_publisher_event_init and rmw_subscription_event_init functions instantiate an event based on the event_type supplied. The RMW implementation has full discretion to initialize or reject (by returning a nullptr) an event initialization request based on which events it supports. The rmw_event_type_t definition enumerates the types of events that can be supported, while an internal events_map data structure defines which events are supported by rmw_zenoh. Once an event is initialized, rcl calls rmw_event_set_callback() to register a callback for that event. Then, in rmw_wait, the RMW implementation attaches a condition variable to the event. The RMW implementation's job is to track changes to the ROS graph that are relevant to the pub/sub entities created in the current session and for which callbacks have been registered. When a change is detected, an "event instance" is added to an event queue and the attached condition variable is used to notify rmw_wait that an event has occurred. Finally, upon notification, rcl invokes rmw_take to take ownership of the event. A minimal sketch of this queue/notify/take flow follows below.
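To make the queue/notify/take flow above concrete, here is a minimal, illustrative C++ sketch. It is not the actual rmw_zenoh_cpp code: the EventInstance type, member names, and helper methods are assumptions; only the general flow (callback registered via rmw_event_set_callback(), condition variable attached in rmw_wait(), events queued and later taken) comes from the description above.

#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical event record; the real implementation stores rmw event statuses.
struct EventInstance
{
  int event_type;
};

// Mirrors the shape of rmw_event_callback_t (user data + number of events).
using EventCallback = void (*)(const void * user_data, size_t number_of_events);

class EventQueue  // illustrative only
{
public:
  // Called when rcl registers a callback through rmw_event_set_callback().
  void set_callback(EventCallback callback, const void * user_data)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    callback_ = callback;
    user_data_ = user_data;
  }

  // Called from rmw_wait(): attach the wait set's condition variable.
  void attach_condition(std::condition_variable * condition)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    wait_condition_ = condition;
  }

  // Called when the graph cache detects a relevant change.
  void add_event(EventInstance event)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push_back(event);
    if (callback_) {
      callback_(user_data_, 1);       // trigger the user-registered callback
    }
    if (wait_condition_) {
      wait_condition_->notify_one();  // wake up rmw_wait()
    }
  }

  // Called from the take path so rcl can take ownership of the event.
  bool take_event(EventInstance & event_out)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (queue_.empty()) {
      return false;
    }
    event_out = queue_.front();
    queue_.pop_front();
    return true;
  }

private:
  std::mutex mutex_;
  std::deque<EventInstance> queue_;
  EventCallback callback_ = nullptr;
  const void * user_data_ = nullptr;
  std::condition_variable * wait_condition_ = nullptr;
};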

Changes

This PR makes a big change to the internal data structures used to store publisher, subscription, service, and client data. An approach where all of these classes inherit a common EventsBase class was discussed earlier, since the rmw event API bundles publisher and subscription events into a single rmw_event_type_t enum (see the sketch below) and a shared interface minimizes code duplication when implementing rmw_take and when setting/triggering callbacks. After review, the final design replaces that inheritance with composition: the data types now store EventsManager and DataCallbackManager members instead (see rmw_X_data_type_t below).
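For illustration, the bundling shows up in the fact that a single rmw_event_type_t enum carries both publisher and subscription event kinds, so a single lookup table can express what rmw_zenoh supports. The sketch below is an assumption of what such an events_map might look like; only ZENOH_EVENT_PUBLICATION_MATCHED is taken from the code quoted later in this review, and the other internal names and the container choice are made up here.

#include <unordered_map>

#include <rmw/event.h>  // declares rmw_event_type_t and the RMW_EVENT_* enumerators

// Internal event ids; only ZENOH_EVENT_PUBLICATION_MATCHED is confirmed by the
// snippet quoted in the review below, the rest are placeholders.
enum rmw_zenoh_event_type_t
{
  ZENOH_EVENT_REQUESTED_QOS_INCOMPATIBLE,
  ZENOH_EVENT_OFFERED_QOS_INCOMPATIBLE,
  ZENOH_EVENT_SUBSCRIPTION_MATCHED,
  ZENOH_EVENT_PUBLICATION_MATCHED
};

// Illustrative events_map: the rmw events this implementation supports and how
// they map onto the internal ids. Unsupported rmw events are simply absent.
static const std::unordered_map<rmw_event_type_t, rmw_zenoh_event_type_t> events_map = {
  {RMW_EVENT_REQUESTED_QOS_INCOMPATIBLE, ZENOH_EVENT_REQUESTED_QOS_INCOMPATIBLE},
  {RMW_EVENT_OFFERED_QOS_INCOMPATIBLE, ZENOH_EVENT_OFFERED_QOS_INCOMPATIBLE},
  {RMW_EVENT_SUBSCRIPTION_MATCHED, ZENOH_EVENT_SUBSCRIPTION_MATCHED},
  {RMW_EVENT_PUBLICATION_MATCHED, ZENOH_EVENT_PUBLICATION_MATCHED},
};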

This PR adds several new features and fixes bugs while making some big changes:

  • GraphCache
    • Allow event callbacks to be registered within GraphCache for local pubs/subs.
    • Data structure changes to track and store all details of any entity added to the graph, i.e. store liveliness::Entity entries.
    • Added mechanisms to keep track of event changes and trigger event callbacks.
    • struct TopicStats is removed to simplify nested data structures.
    • Fixed a bug where services and clients were not removed from the GraphCache when their node is deleted before its entities (this behavior is a bug in rclcpp, but at least the consequence is now properly handled).
    • No more lambdas.
    • No more hardcoded indices in liveliness.cpp; enums are used to keep track of segments in the liveliness tokens.
  • QoS events
    • RMW_EVENT_REQUESTED_QOS_INCOMPATIBLE: Supported, but this will never be triggered since there are no QoS incompatibilities in zenoh.
    • RMW_EVENT_SUBSCRIPTION_MATCHED: A match is reported when a sub is added to the graph and there are other pubs with the same topic and type. QoS does not need to be checked for the same reason as above.
    • RMW_EVENT_OFFERED_QOS_INCOMPATIBLE: Supported but never triggered, as stated above.
    • RMW_EVENT_PUBLICATION_MATCHED: A match is reported when a pub is added to the graph and there are other subs with the same topic and type. QoS does not need to be checked for the same reason as above.
    • QoS compatibility checking was also updated to produce a warning for a TRANSIENT_LOCAL pub paired with a VOLATILE sub. All other combinations are COMPATIBLE, reflecting the zenoh protocol's behavior.
  • Liveliness tokens now embed the zenoh session id, the unique id of the entity within the session, and the unique id of the node. If the entity is a node, its id is appended twice to keep the number of segments consistent. This allows:
    • Uniquely tracking any entity in the system via the combination of zenoh session id, node id, and entity id.
    • Mapping changes in the graph cache to events.
    • Uniquely tracking two nodes with the same name from the same session (i.e. loading two talker nodes into the same rclcpp_components::ComponentManager).
  • rmw_X_data_type_t (see the sketch after this list)
    • No longer inherit EventsBase 🎉
    • Pub & sub data types now store an EventsManager class. (There are no events defined for services and clients in rmw/events.h.)
    • Sub, client & service data types store a DataCallbackManager class. (Publishers do not need to trigger any data callbacks.)
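A schematic C++ sketch of that layout follows. The EventsManager and DataCallbackManager names come from this PR, but their members, methods, and the struct names below are simplified assumptions, not the real definitions in rmw_zenoh_cpp/src/detail/.

#include <cstddef>
#include <mutex>

// Simplified stand-in for the class described above.
class EventsManager
{
  // Per-entity QoS event queues, user callbacks, and attached condition
  // variables (see the earlier flow sketch for the general shape).
};

class DataCallbackManager
{
public:
  // Set via rmw_subscription_set_on_new_message_callback() and friends.
  void set_callback(const void * user_data, void (*callback)(const void *, size_t))
  {
    std::lock_guard<std::mutex> lock(mutex_);
    user_data_ = user_data;
    callback_ = callback;
  }

  // Invoked by the rmw layer when new data arrives.
  void trigger_callback()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (callback_) {
      callback_(user_data_, 1);
    }
  }

private:
  std::mutex mutex_;
  const void * user_data_ = nullptr;
  void (*callback_)(const void *, size_t) = nullptr;
};

// Publishers only need QoS events; they never deliver data to callbacks.
struct PublisherData
{
  EventsManager events_mgr;
};

// Subscriptions need both QoS events and new-message callbacks.
struct SubscriptionData
{
  EventsManager events_mgr;
  DataCallbackManager data_callback_mgr;
};

// Services and clients only need data callbacks: rmw/events.h defines no
// events for them.
struct ServiceData
{
  DataCallbackManager data_callback_mgr;
};

struct ClientData
{
  DataCallbackManager data_callback_mgr;
};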

Testing

DataCallback tests: This functionality is utilized by the experimental::EventsExecutor in rclcpp.

EventsExecutor support can be tested by running any of the demo executables from demo_nodes_cpp. This PR is helpful for building the executables to run the EventsExecutor.
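As a quick smoke test, something along these lines can be used to spin a node with the experimental EventsExecutor. This assumes the Rolling-era header path rclcpp/experimental/executors/events_executor/events_executor.hpp; the node and topic below are made up for illustration.

#include <memory>

#include <rclcpp/rclcpp.hpp>
// Experimental header path as of Rolling in early 2024; it may move in future releases.
#include <rclcpp/experimental/executors/events_executor/events_executor.hpp>
#include <std_msgs/msg/string.hpp>

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = std::make_shared<rclcpp::Node>("events_executor_smoke_test");
  auto logger = node->get_logger();
  auto sub = node->create_subscription<std_msgs::msg::String>(
    "chatter", 10,
    [logger](std_msgs::msg::String::ConstSharedPtr msg) {
      RCLCPP_INFO(logger, "I heard: '%s'", msg->data.c_str());
    });

  // The EventsExecutor drives execution from the data callbacks that rclcpp
  // registers through the rmw layer (e.g.
  // rmw_subscription_set_on_new_message_callback) rather than from rmw_wait().
  rclcpp::experimental::executors::EventsExecutor executor;
  executor.add_node(node);
  executor.spin();

  rclcpp::shutdown();
  return 0;
}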

Matched events

Follow the instructions to run matched_event_detect from demo_nodes_cpp: https://github.com/ros2/demos/tree/rolling/demo_nodes_cpp#matched-event-detect

Note: I'm seeing some flaky outputs from the demo_nodes_cpp executable, and I think it's coming from PublisherBase in rclcpp.

Nodes with the same name from the same zenoh session

Save this script into a Python file, source ROS 2, and execute it:

import rclpy
from rclpy.node import Node
from rclpy.executors import SingleThreadedExecutor


class MyNode(Node):
    def __init__(self):
        # Both instances deliberately use the same node name.
        super().__init__("my_node")


def main(args=None):
    rclpy.init(args=args)
    node1 = MyNode()
    node2 = MyNode()
    # Spin both nodes, which share one zenoh session, in a single executor.
    executor = SingleThreadedExecutor()
    executor.add_node(node1)
    executor.add_node(node2)
    try:
        executor.spin()
    finally:
        rclpy.shutdown()


if __name__ == "__main__":
    main()

In a second terminal run ros2 node list. You should see two /my_node nodes. Without the changes here this would not be the case, as the GraphCache was only capable of handling a single node name per zenoh session.

$ ros2 node list
WARNING: Be aware that there are nodes in the graph that share an exact name, which can have unintended side effects.
/my_node
/my_node

Incompatible QoS

The nodes in ros2/demos are not relevant here since Zenoh does not have the same QoS behaviors as DDS, and hence the logic in those executables does not apply.

Future work

  • Implement incompatible type events (I started on this but realized the PR is getting too big, so I will save that for later).
  • Implement message lost events.

Some other takeaways:

  • The rmw API should really provide two separate enums, rmw_publisher_events_type_t and rmw_subscription_events_type_t, as the current API does not enforce that events registered for publishers are only of the "publisher types" and vice versa for subscriptions. A rough sketch of such a split follows after this list.
  • With this PR, RViz is finally supported, although IMO RViz should not require events support from every RMW implementation in order to function.
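A rough sketch of the suggested split (purely hypothetical; these types do not exist in rmw today, and only the events discussed in this PR are listed):

// Hypothetical split of rmw's single event enum; names are illustrative only.
typedef enum rmw_publisher_event_type_e
{
  RMW_PUBLISHER_EVENT_OFFERED_QOS_INCOMPATIBLE,
  RMW_PUBLISHER_EVENT_PUBLICATION_MATCHED
} rmw_publisher_event_type_t;

typedef enum rmw_subscription_event_type_e
{
  RMW_SUBSCRIPTION_EVENT_REQUESTED_QOS_INCOMPATIBLE,
  RMW_SUBSCRIPTION_EVENT_SUBSCRIPTION_MATCHED
} rmw_subscription_event_type_t;

// rmw_publisher_event_init() would then only accept rmw_publisher_event_type_t,
// and rmw_subscription_event_init() only rmw_subscription_event_type_t, making
// it a compile-time error to register a publisher event on a subscription.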

@delete-merged-branch bot deleted the branch rolling on February 7, 2024 16:24
@Yadunund changed the base branch from yadu/qos-durability to rolling on February 8, 2024 01:41
* Nest TopicQoSMap within TopicDataMap
* Count matched pub/subs based on qos compatibility
* Rename TopicDataMap to TopicTypeMap
@clalancette (Collaborator) left a comment:

I did a first pass review. I didn't review the changes to the GraphCache in any detail; I'll do that next.

My biggest concern has to do with the class hierarchy we are introducing here. I've left a more detailed comment inline.

@Yadunund marked this pull request as ready for review on February 26, 2024 14:46

@Yadunund (Member, Author) commented on Feb 26, 2024

@clalancette I've overhauled this PR and updated the description as well. Keeping track of lost messages is the only major event yet to be supported, but I think this PR is getting way too big. I will work on that once this is reviewed and merged, as it may also require changes from your GID PRs.
On that note, one thing I did in this PR was to track the hash of the liveliness token's keyexpr as a GUID for the entity. I intentionally did not call it a GID to mark the difference. One thing we could do is replace the std::size_t hash value with an rmw_gid_t type instead, but still generate the GID based on the unique hash. Let me know!
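For context, a minimal sketch of the idea being floated here: hash the liveliness keyexpr and, hypothetically, pack that hash into an rmw_gid_t. This is not what the PR currently does; the helper name and the packing scheme are assumptions.

#include <cstring>
#include <functional>
#include <string>

#include <rmw/types.h>  // rmw_gid_t, RMW_GID_STORAGE_SIZE

// Hypothetical helper: derive an rmw_gid_t from the liveliness keyexpr hash.
rmw_gid_t gid_from_keyexpr(const std::string & keyexpr, const char * implementation_identifier)
{
  const std::size_t hash = std::hash<std::string>{}(keyexpr);
  rmw_gid_t gid{};
  gid.implementation_identifier = implementation_identifier;
  static_assert(sizeof(hash) <= RMW_GID_STORAGE_SIZE, "hash must fit in gid storage");
  std::memcpy(gid.data, &hash, sizeof(hash));  // remaining bytes stay zeroed
  return gid;
}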

Comment on lines +48 to +51
};

/// Helper value to indicate the maximum index of events supported.
#define ZENOH_EVENT_ID_MAX rmw_zenoh_event_type_t::ZENOH_EVENT_PUBLICATION_MATCHED
@clalancette (Collaborator) commented:

This is pretty minor, but our typical style here is to add a "SENTINEL" value at the end of the enum and check against that instead. So the suggested change is to replace

};

/// Helper value to indicate the maximum index of events supported.
#define ZENOH_EVENT_ID_MAX rmw_zenoh_event_type_t::ZENOH_EVENT_PUBLICATION_MATCHED

with

  ZENOH_EVENT_INVALID
};

And then the checks in event.cpp would change to >= ZENOH_EVENT_INVALID.

@Yadunund (Member, Author) replied:

I see. If we adopt this approach, the array lengths of these members would become ZENOH_EVENT_INVALID, e.g. std::condition_variable * event_conditions_[ZENOH_EVENT_INVALID]{nullptr}, which presents a readability problem IMO. If you think it's valid, I'm happy to make the change as suggested.
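For reference, a compact sketch of the sentinel pattern under discussion (the enum is abridged; the event_conditions_ member is the example quoted in the comment above):

// Abridged illustration; see the PR for the full list of events.
enum rmw_zenoh_event_type_t
{
  ZENOH_EVENT_PUBLICATION_MATCHED,
  ZENOH_EVENT_INVALID  // sentinel: one past the last valid event
};

// The validity checks in event.cpp would then compare against the sentinel:
inline bool event_id_is_valid(rmw_zenoh_event_type_t id)
{
  return id < ZENOH_EVENT_INVALID;
}

// and fixed-size per-event members would be sized by it, e.g.
// std::condition_variable * event_conditions_[ZENOH_EVENT_INVALID]{nullptr};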

@Yadunund (Member, Author) commented:
@clalancette this is ready for another review.

@Yadunund merged commit 5e9a06c into rolling on Mar 26, 2024
3 of 5 checks passed
@delete-merged-branch bot deleted the yadu/events branch on March 26, 2024 12:16