Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wjwwood] feat(MultiThreadedExecutor): Added ability to handle exceptions from threads #2505

Open
wants to merge 3 commits into
base: rolling
Choose a base branch
from

Conversation

wjwwood
Copy link
Member

@wjwwood wjwwood commented Apr 16, 2024

Modified version of #2382

Janosch Machowinski and others added 2 commits April 13, 2024 12:29
…threads

This commit adds external exception handling for the worker threads,
allowing application code to implement custom exception handling.

feat(Executor): Added spin API with exception handler.


Signed-off-by: Janosch Machowinski <J.Machowinski@cellumation.com>
Signed-off-by: Janosch Machowinski <j.machowinski@cellumation.com>
@wjwwood wjwwood self-assigned this Apr 16, 2024
Copy link
Contributor

mergify bot commented Apr 16, 2024

⚠️ The sha of the head commit of this PR conflicts with #2382. Mergify cannot evaluate rules on this PR. ⚠️

Signed-off-by: William Woodall <william@osrfoundation.org>
@wjwwood
Copy link
Member Author

wjwwood commented Apr 16, 2024

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@wjwwood
Copy link
Member Author

wjwwood commented Apr 16, 2024

@jmachowinski unfortunately I did not get this one in under the wire, nightly CI has the farm locked up now (we should consider putting the deadline just before nightly jobs start...), but ultimately this set of changes were not ideal I think, maybe you had a big reason for passing it as an argument for spin(), but if you didn't I think the change to allow it in the executor options was probably a better approach that was less disruptive. The trade-off, however, is that it's up to each executor to make sure they account for the exception handler in the options, whereas the new signature in this approach makes that hard to ignore.

However, there still may be a path to the ExecutorOptions change in jazzy, but it will require an ABI change before the release, but one I think we really should make, which is to add a pimpl pointer in the Executor base class as well as one in the ExecutorOptions. That would allow us to extend the options and executor APIs without breaking ABI in existing releases if needed. I'll have to talk to the other maintainers about it, but maybe we can go that route.

@jmachowinski
Copy link
Contributor

@wjwwood Thanks for the effort you put into the PRs !

Your latest changes will break debugging. As soon as you catch an exception, and rethrow it, the only backtrack you will get will point to the rethrow. That was the reason for the code duplication that was going on.

@wjwwood
Copy link
Member Author

wjwwood commented Apr 16, 2024

I don't fully understand what you mean, but I can have a look tomorrow. Should be something we can fix after the API freeze.

@jmachowinski
Copy link
Contributor

I don't fully understand what you mean, but I can have a look tomorrow. Should be something we can fix after the API freeze.

#include <stdexcept>

void origin()
{
    throw std::runtime_error("Error in Origin");
}

void rethrow_fun()
{
    try
    {
        origin();
    }
    catch(const std::exception &e)
    {
        throw;
    }
}

int main()
{
    rethrow_fun();

    return 0;
}
(gdb) r
Starting program: /home/jm/ros2/ros2_ws/example/a.out 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
terminate called after throwing an instance of 'std::runtime_error'
  what():  Error in Origin

Program received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737352586176) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737352586176) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140737352586176) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140737352586176, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7842476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff78287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff7ca2b9e in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff7cae20c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007ffff7cae277 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007ffff7cae52b in __cxa_rethrow () from /lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00005555555552c0 in rethrow_fun() ()
#10 0x00005555555552ea in main ()
(gdb) 
MultiThreadedExecutor::spin()
{
  spin([](const std::exception & e) {throw e;});
}

With the construct above, you will only get stack traces, pointing to the lambda inside of spin, but no one pointing to the origin of the throw. This makes debugging almost impossible.

But yes, changing this should have no effect on the external API/ABI

@wjwwood
Copy link
Member Author

wjwwood commented Apr 16, 2024

We decided to not break the API freeze for these prs, but I did get tentative approval to add a pimpl pointer to the Executor and ExecutorOptions classes (ABI but not API breaking), which should let us address this problem between now and the release or just after the release, using the pimpl and new APIs (which are allowed).

@alsora
Copy link
Collaborator

alsora commented Jun 22, 2024

@wjwwood is there any update on this?
It looks that we could just undo the changes to the spin function done in your last commit, and this should be good to go (like this 05a8b4c#diff-fd2db21239de2963e2248c8d3ef4042ebd4174ef3e4f9dbd252a375303398796R30)

const std::function<void(const std::exception & e)> & exception_handler)
{
try {
function();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is @jmachowinski trying to explain with #2505 (comment), i do not think this is gonna useful stack trace either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants