The add_analyzer fails to clean up...sometimes #50

Open
stonier opened this issue Aug 31, 2016 · 4 comments

@stonier

stonier commented Aug 31, 2016

When add_analyzers exits, it uses the bond mechanism to shut down the bond, which triggers unloading of the analyzers on the aggregator side.

However, it looks like the bondpy mechanism isn't reliably ensuring that the aggregator gets triggered. This results in an error message when you reload the same analyzers, because they were never unloaded:

[ERROR] [WallTime: 1472626077.441982] add_analyzers did not add any analyzers to diagnostic aggregator: Requested load from namespace /diagnostics/navi_common_diagnostic_analyzers which is already in use

Difficult to reproduce with small tests. Right now I'm only getting it on a robot with a lot of software running. Even there, it does not occur 100% of the time.

@stonier
Author

stonier commented Aug 31, 2016

Investigating the bond mechanisms in bondpy.

@stonier
Author

stonier commented Aug 31, 2016

Issue in bond_core#14

@stonier
Author

stonier commented Sep 1, 2016

Investigating what nodelets do... they have the same problems even though it's C++. However, the loaders there make a service call to the nodelet manager just before breaking the bond, i.e. they are not relying on 1) the bond informing the other end or 2) the bond breaking to make sure things work.

Could do the same here.
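
A rough sketch of why the nodelet-style ordering helps, using plain Python stand-ins rather than the real bondpy or diagnostic_aggregator APIs. Everything here (`FakeAggregator`, `LossyBond`, the example namespace) is hypothetical and only models the ordering: ask the aggregator to unload explicitly, then break the bond, so a lost bond notification no longer leaves the namespace in use.

```python
class FakeAggregator:
    """Hypothetical stand-in for the aggregator side (not the real API)."""

    def __init__(self):
        self.loaded = set()

    def load(self, namespace):
        # Mirrors the reported failure: loading a namespace that was never unloaded.
        if namespace in self.loaded:
            raise RuntimeError("namespace %s is already in use" % namespace)
        self.loaded.add(namespace)

    def unload(self, namespace):
        # Idempotent, so an explicit unload followed by a bond break is harmless.
        self.loaded.discard(namespace)

    def on_bond_broken(self, namespace):
        self.unload(namespace)


class LossyBond:
    """Hypothetical bond whose break notification may never reach the other end."""

    def __init__(self, aggregator, namespace, delivers_break):
        self.aggregator = aggregator
        self.namespace = namespace
        self.delivers_break = delivers_break

    def break_bond(self):
        if self.delivers_break:
            self.aggregator.on_bond_broken(self.namespace)


def shutdown_relying_on_bond(bond):
    # Current behaviour as described in this issue: only the bond signals the unload.
    bond.break_bond()


def shutdown_with_explicit_unload(aggregator, bond):
    # Nodelet-style ordering: explicit request first, bond break second.
    aggregator.unload(bond.namespace)
    bond.break_bond()


def reload_succeeds(explicit):
    """Load, shut down over a bond that loses its notification, then reload."""
    aggregator = FakeAggregator()
    namespace = "/diagnostics/example_analyzers"  # made-up namespace
    aggregator.load(namespace)
    bond = LossyBond(aggregator, namespace, delivers_break=False)
    if explicit:
        shutdown_with_explicit_unload(aggregator, bond)
    else:
        shutdown_relying_on_bond(bond)
    try:
        aggregator.load(namespace)
        return True
    except RuntimeError:
        return False
```

In this model `reload_succeeds(False)` hits the "already in use" path, while `reload_succeeds(True)` does not. In a real node the explicit step would presumably be a service call to the aggregator, which is not shown here.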

@trainman419
Contributor

We've encountered some similar issues when using the add_analyzers service, and the most common case that I've seen is the bondcpp subscriber dropping messages and prematurely dropping the entire add_analyzers node.

All of the add_analyzers nodes connect to the same subscriber, and if they send all of their messages at the same time, it can temporarily overwhelm the bond's subscriber queue. The subscriber queue size is 30, and I've seen this happen with about a dozen add_analyzers nodes running (it looks like the bond publishers are sending at 2 Hz, but the aggregator queue is only serviced at 1 Hz).

I'm working on a patch that moves each bond to its own topic to alleviate this.
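
For a feel of the numbers in the comment above, here is a back-of-the-envelope model of a shared, bounded subscriber queue. It assumes the figures quoted there (queue size 30, publishers at 2 Hz, servicing at 1 Hz) and that every node's messages land in one queue that is only emptied once per service interval. The function name and parameters are made up for illustration, and the model ignores any other traffic on the topic.

```python
from collections import deque


def dropped_per_interval(num_nodes, publish_hz=2.0, service_period_s=1.0, queue_size=30):
    """Count messages that overflow a bounded queue between two service passes."""
    queue = deque(maxlen=queue_size)
    arrivals = int(num_nodes * publish_hz * service_period_s)
    dropped = 0
    for message_id in range(arrivals):
        if len(queue) == queue.maxlen:
            # A full deque with maxlen silently discards its oldest entry on append.
            dropped += 1
        queue.append(message_id)
    return dropped
```

Under these assumptions a dozen nodes produce 24 messages per interval, just under the limit of 30, so nothing is dropped until servicing is delayed: `dropped_per_interval(12, service_period_s=1.5)` loses 6 messages, and 16 nodes overflow even with on-time servicing. That would be consistent with the problem only showing up on a heavily loaded robot, though this is a simplified model and not a measurement.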
