Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak: message_filters::Synchronizer heap-use-after-free #66

Open
hanyin-intel opened this issue Nov 19, 2021 · 5 comments
Open
Assignees

Comments

@hanyin-intel
Copy link

Hi,

I'm working on ROS2 with Foxy version and in my project I find memory leak in message_filters::Synchronizer when testing with Sanitizer tool. So I write a small test case, then find that when shutting down the node, there is 8 bytes heap-use-after-free issue in message_filters::Signal1<sensor_msgs::msg::Image_<std::allocator<void> > >::removeCallback. Here is the code.

#include <signal.h>
#include <thread>
#include <rclcpp/rclcpp.hpp>
#include <message_filters/subscriber.h>
#include <message_filters/sync_policies/approximate_time.h>
#include <message_filters/time_synchronizer.h>
#include <sensor_msgs/msg/image.hpp>

class SLAMNode: public rclcpp::Node
{
public:
    SLAMNode(): rclcpp::Node("test"), sync_policy_(10),
    image_sync_(sync_policy_) {
        auto subscribe_image_topic = [this](ImageSubscriberFilter &sub, std::string topic, const std::string = "raw") {
        sub.subscribe(this, topic, rclcpp::SensorDataQoS().get_rmw_qos_profile());
    };
        subscribe_image_topic(sub_image_, "image");
        subscribe_image_topic(sub_depth_, "depth");
        image_sync_.connectInput(sub_image_, sub_depth_);
        image_sync_.registerCallback(
            std::bind(&SLAMNode::rgbd_callback, this, std::placeholders::_1, std::placeholders::_2));
    };
    ~SLAMNode() {};

    void rgbd_callback(const sensor_msgs::msg::Image::ConstSharedPtr msgRGB,
                       const sensor_msgs::msg::Image::ConstSharedPtr msgD) {
        std::cout << "Callback " << std::endl;
    }

private:
    typedef message_filters::Subscriber<sensor_msgs::msg::Image> ImageSubscriberFilter;
    typedef message_filters::sync_policies::ApproximateTime<sensor_msgs::msg::Image, sensor_msgs::msg::Image> sync_policy;
    sync_policy sync_policy_;
    message_filters::Synchronizer<sync_policy> image_sync_;
    ImageSubscriberFilter sub_image_, sub_depth_, sub_right_image_, sub_mono_;
};

int main(int argc, char **argv)
{
    rclcpp::init(argc, argv);
    std::shared_ptr<SLAMNode> node = std::make_shared<SLAMNode>();

    rclcpp::spin(node);
    rclcpp::shutdown();

    return EXIT_SUCCESS;
}

Regards,
Han

@hanyin-intel
Copy link
Author

The OS distribution I'm working on is Ubuntu 20.04 and the GCC version is 9.3.0. You can get all the necessary files from this repo and also get the log output in the repo root directory. The command we use to build the package is colcon build --packages-select simple_test --cmake-args "-DUSE_SANITIZER=ON" and the command-line to run it is ros2 run simple_test simple_test.

@clalancette
Copy link
Contributor

I took a bit of a look at this.

The problem is in the order of destruction between message_filter::Subscriber and message_filter::Synchronizer. If you have the Synchronizer before the Subscriber in the class members (as you do above), then the Synchronizer gets constructed before Subscriber, but the Subscriber gets destroyed before the Synchronizer. That ends up causing the problem, because the destruction of the Synchronizer tries to unsubscribe from everything attached to it. Since the Subscriber has already been destroyed, you get the use-after-free.

If you reverse the order of the Synchronizer and the Subscriber (so that the Synchronizer comes last in the class member variables), then the problem goes away.

That said, this is pretty subtle and pretty fragile. It would be best if the Synchronizer took a reference to the Subscribers that are being attached to it, so that they would not be destroyed while the Synchronizer still needs them. That would require a change in API from:

Synchronizer::connectInput(F0 & f0, F1 & f1)

to something like:

Synchronizer::connectInput(std::shared_ptr<F0> f0, std::shared_ptr<F1> f1)

Another alternative would be to have Synchronizer register some kind of "deletion" callback with the Subscriber class, which the Subscriber class would call when it is being destroyed. This callback would then delete the Subscriber right then from the Synchronizer.

There may be other ways we could go here. @ros2/team thoughts?

@tfoote
Copy link
Contributor

tfoote commented Dec 1, 2021

The reference counted pointer is likely the cleanest. We just need to make sure it's not too easy to make a loop of dependencies that will block destruction.

@hanyin-intel
Copy link
Author

hanyin-intel commented Dec 1, 2021

Hi @clalancette, thanks for your reply, the heap-use-after-free problem goes away after I reverse the order but a new memory leak problem is generated. Here is what I got:

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x7faa393cebc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
    #1 0x7faa390bc892  (/opt/ros/foxy/lib/librcl.so+0x15892)
    #2 0x7faa391cfb9a in rclcpp::graph_listener::GraphListener::GraphListener(std::shared_ptr<rclcpp::Context>) (/opt/ros/foxy/lib/librclcpp.so+0xe4b9a)
    #3 0x7faa391f3cbb in std::shared_ptr<rclcpp::graph_listener::GraphListener> rclcpp::Context::get_sub_context<rclcpp::graph_listener::GraphListener, std::shared_ptr<rclcpp::Context> >(std::shared_ptr<rclcpp::Context>&&) (/opt/ros/foxy/lib/librclcpp.so+0x108cbb)

And I see your discussion on this related issue, so is that means I need to re-build rclcpp from source on foxy branch to fix this new problem?

@clalancette clalancette self-assigned this Feb 4, 2022
@clalancette
Copy link
Contributor

I'm actually going to leave this open, since we do need to address the connecting inputs problem.

@clalancette clalancette reopened this Apr 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants