Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak from enclave #311

Open
jimmy-mcelwain opened this issue Dec 10, 2024 · 10 comments · May be fixed by #319
Open

Memory leak from enclave #311

jimmy-mcelwain opened this issue Dec 10, 2024 · 10 comments · May be fixed by #319

Comments

@jimmy-mcelwain
Copy link

jimmy-mcelwain commented Dec 10, 2024

Describe the bug
Context ros2/rcl#1198 and Yaskawa-Global/motoros2#35

I believe that there is a possible memory leak in rmw_microxrcedds. In the micro-ROS rcl repo, the following lines duplicate a string and dynamically allocate the memory if RCL_MICROROS_COMPLETE_IMPL is set.

https://github.com/micro-ROS/rcl/blob/041a67a2e102e9c48ae2580f19b2677016fcf86a/rcl/src/rcl/init.c#L265-L273

This memory which is dynamically allocated for enclave via rcutils_strdup is deallocated in other (full) rmw implementations. For example:

https://github.com/ros2/rmw_fastrtps/blob/1128b57d336dfa8bf462a5f9147291f73faa8ed3/rmw_fastrtps_shared_cpp/src/rmw_init.cpp#L116

However it is never deallocated in rmw_microxrcedds. I believe that this is only a problem if RCL_MICROROS_COMPLETE_IMPL is set, because that is the only case where rcutils_strdup runs and sets aside dynamic memory.

I do not believe that the solution is as simple as freeing the memory in rmw_init_options_fini, because the code below copies the enclave data to static memory, and thus cannot be freed. I might be wrong about that.

https://github.com/micro-ROS/rcl/blob/041a67a2e102e9c48ae2580f19b2677016fcf86a/rcl/src/rcl/init.c#L105-L110

To Reproduce
I do not have a MWE, but it is apparent in MotoROS2 using the debug listener script. edit - I have a MWE below

Expected behaviour
I expect the memory set aside by rcutils_strdup to be freed by the rmw, as it is with other rmws.

System information (please complete the following information):

  • YRC1000
  • ROS 2: I am currently using Jazzy. I believe that the problem exists on Humble as well, though.
  • Version: I am currently using a fork off of 2.4.3

Please let me know if I need to provide any more information, or if I am doing something improperly. Thank you.

@jimmy-mcelwain
Copy link
Author

jimmy-mcelwain commented Dec 10, 2024

Here is a MWE. It's a silly program that doesn't accomplish anything other than to demonstrate this problem, but when I use valgrind I get the following results:

....
==18276== 222 bytes in 111 blocks are definitely lost in loss record 17 of 18
==18276==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==18276==    by 0x492E85C: rcutils_strndup (in /opt/ros/humble/lib/librcutils.so)
==18276==    by 0x495382D: rcl_init (in /opt/ros/humble/lib/librcl.so)
==18276==    by 0x48613D5: rclc_support_init_with_options (in /opt/ros/humble/lib/librclc.so)
==18276==    by 0x10942D: Ros_StateServer_TimerCallback (in /home/jimmy/everything_motoros2/Other/reproduce/install/lib/reproduce/reproduce)
==18276==    by 0x4961679: rcl_timer_call (in /opt/ros/humble/lib/librcl.so)
==18276==    by 0x4866078: ??? (in /opt/ros/humble/lib/librclc.so)
==18276==    by 0x486B0BF: rclc_executor_spin_some (in /opt/ros/humble/lib/librclc.so)
==18276==    by 0x486C0AE: rclc_executor_spin (in /opt/ros/humble/lib/librclc.so)
==18276==    by 0x109771: main (in /home/jimmy/everything_motoros2/Other/reproduce/install/lib/reproduce/reproduce)
.....

This suggests that it is the rcutils_strdup in rcl_init

@fujitatomoya
Copy link

@jimmy-mcelwain thanks for recreating the issue here, this does seem like a problem.

@pablogs9
Copy link
Member

I would need to check this deeper but the for of rcl that we maintain for micro-ROS is intended to be built with RCL_MICROROS_COMPLETE_IMPL to not defined. In a sense, this flag excludes the code that micro-ROS does not need from mainstream rcl.

In any case, I wonder why rmw is supposed to deallocate memory being allocated by rcl?

@gavanderhoorn
Copy link
Contributor

I would need to check this deeper but the for of rcl that we maintain for micro-ROS is intended to be built with RCL_MICROROS_COMPLETE_IMPL to not defined. In a sense, this flag excludes the code that micro-ROS does not need from mainstream rcl.

we understand.

But seeing as rcl maintainers suggest this is an issue with the RMW, we thought we'd report it anyway.

In any case, I wonder why rmw is supposed to deallocate memory being allocated by rcl?

That is also something I'm not sure about.

Perhaps @clalancette and/or @fujitatomoya can say something about that.

@clalancette
Copy link

Honestly, this seems like a bug in rcl (or, at least, in the expectations between rcl and rmw). I agree that it doesn't make sense for the rcl layer to allocate it, and the rmw layer to free it.

To fix it, we have to either:

  1. Leave the strndup in rcl_init, move the free into __cleanup_context, and remove the free from all of the RMW implementations, or
  2. Move the strndup out of rcl_init and into rmw_init, and leave the free in the RMW.

I would actually probably go for the latter; while it is annoying to have to duplicate that code among the various RMWs, I think it also makes more sense since this code is manipulating the rmw_init_options structure. But I don't have a strong opinion either way.

@fujitatomoya
Copy link

In any case, I wonder why rmw is supposed to deallocate memory being allocated by rcl?

yeah, thanks for pointing this out, i focused on memory leak issue. this is not good.

the thing is that some data of rmw_init_options_t needs to be updated via rcl_init before rmw_init, and a part of those is enclave which needs to be copied to rmw_init_options_t.

Move the strndup out of rcl_init and into rmw_init, and leave the free in the RMW.

i am also inclined to this option, it is important to keep the clear responsibility for memory management for maintenance.

@clalancette rmw_init requires rmw_init_options_t, so probably we need to have rmw_init(free)_options_enclave to allocate(deallocate) the memory in rmw implementation. what do you think?

i can patch this if we are good to go. i can also give it a shot to rmw_microxrcedds, as far as i see static memory management is done only on rmw_init_options_t structure, i think we need to call deallocator before returning that memory slot to the static buffer. @pablogs9 @gavanderhoorn am I understanding correctly? actually this is 1st time to see this code... 😅 )

@pablogs9
Copy link
Member

pablogs9 commented Dec 12, 2024

@fujitatomoya that's it. I guess that checking if init_options->enclave != nullptr and deallocating it using the default allocator would work.

When this responsibility is moved to rmw implementations, the correct option is not to allocate nor deallocate this as rmw_microxrcedds does not handle SROS2 security features.

@fujitatomoya
Copy link

@pablogs9 can you also close this since #318 is merged.

@jimmy-mcelwain
Copy link
Author

@fujitatomoya thank you, I appreciate it!

Are these changes expected to be backported and added to future humble and jazzy releases? Or is this change likely just for future ROS2 distributions?

@fujitatomoya
Copy link

No i do not think so. rolling fix adds the new interfaces, so unlikely backported to already released distro. let me see how jazzy and humble can be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants