Watchdog timer causes issues when updating code via Android Studio #555

Open
FromenActual opened this issue Jan 11, 2023 · 6 comments

@FromenActual

I'm mentoring a couple teams this season, and both have run into problems when updating their code with Android Studio, where the Robot Controller app takes a few minutes to launch after updating. Makes for some long programming sessions when you need to tweak a bunch of values.

I believe this is caused by a watchdog timer on the Rev Control Hubs that attempts to restart the RC app if the app fails to report that it's alive for 10 seconds. The problem is that the app takes roughly 10 seconds to update, so the watchdog tries to restart the app while it's still being updated, which puts it into some weird error state that can take minutes to recover from.

I haven't dug super deep into this myself; I met with someone who has dug into this and explained it to me, so this may be a game of telephone 😉 They found a way to increase the watchdog timer from 10 seconds to 20 seconds, and that completely solved the problem, so that's pretty definitively the cause. I believe the watchdog process is called FtcAccessPointService, and only exists on the Rev Control Hubs (the 4 Google results for "FtcAccessPointService" all point to the Control Hub OS).

IMO the best solution would be to have the watchdog detect when the app is being updated, and disable itself until the update is finished. I'm not sure how difficult that is to implement, so an alternative solution could be to simply increase the timer value. For now, my workaround has been to have the Rev Hardware Client open while working, and force launch the app after updating.
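To make the suspected race concrete, here is a toy simulation of a heartbeat watchdog. It is only a sketch under the assumptions described above: the `Watchdog` class, its fields, and the 0.5-second polling step are all hypothetical and are not taken from the Control Hub OS. It compares the three cases discussed: a 10-second timeout, a 20-second timeout, and a watchdog that pauses itself while an update is in progress.

```python
from dataclasses import dataclass


@dataclass
class Watchdog:
    """Toy model of a heartbeat watchdog; every name here is hypothetical."""
    timeout_s: float = 10.0        # restart the app after this long without a heartbeat
    update_aware: bool = False     # if True, never fire while an update is running
    last_heartbeat_s: float = 0.0
    update_in_progress: bool = False

    def heartbeat(self, now_s: float) -> None:
        self.last_heartbeat_s = now_s

    def begin_update(self) -> None:
        self.update_in_progress = True

    def end_update(self, now_s: float) -> None:
        # Reset the countdown so the freshly installed app gets a full window.
        self.update_in_progress = False
        self.last_heartbeat_s = now_s

    def should_restart(self, now_s: float) -> bool:
        if self.update_aware and self.update_in_progress:
            return False
        return now_s - self.last_heartbeat_s >= self.timeout_s


def restart_fires_during_update(dog: Watchdog, update_duration_s: float) -> bool:
    """Return True if the watchdog would fire while the update is still running."""
    dog.heartbeat(0.0)   # the app's last sign of life, just before the update
    dog.begin_update()
    t = 0.0
    fired = False
    while t <= update_duration_s:   # poll the watchdog every 0.5 s
        if dog.should_restart(t):
            fired = True
            break
        t += 0.5
    dog.end_update(update_duration_s)
    return fired
```

With an update that takes 10.5 seconds, `restart_fires_during_update(Watchdog(timeout_s=10.0), 10.5)` returns `True`, while both `Watchdog(timeout_s=20.0)` and `Watchdog(timeout_s=10.0, update_aware=True)` return `False`. A faster update (say 8 seconds) never trips the 10-second watchdog, which would fit the problem showing up mostly on slower laptops.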

@alan412

alan412 commented Jan 11, 2023 via email

@cmacfarl cmacfarl added the bug Something isn't working label Jan 11, 2023
@FromenActual
Author

I managed to get a log when this issue occurred. We were working on updating an OpMode, and had multiple successful uploads from Android Studio while connected wirelessly with ADB. However, we eventually hit the problem again, where the RC app didn't launch for a long time (we stopped waiting after a couple of minutes). I opened up Logcat and saved the contents here:

fail_log.txt

I'm not familiar enough to understand everything in the log, but some things I've noticed:

  • 11:13:12 - I believe that's when Android Studio started updating the app.
  • 11:13:22.767 (~10 seconds later) - the FtcAccessPointService detects the RC app hasn't been alive for 10 seconds, and attempts to relaunch it.
  • 11:13:23.864 - an uncaught null pointer is thrown in the Window Manager.

It seems to me the RC app was still being updated when FtcAccessPointService attempted to relaunch it. My guess is that the Window Manager tried to render something that wasn't yet updated, causing the null pointer exception. However, that doesn't explain why the service didn't attempt to restart the app again 10 seconds later.
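For anyone checking the timing in their own logs, the gaps between the three events above can be computed with a few lines of Python. This is just a convenience sketch: the helper name `seconds_between` is made up, and the only inputs are the timestamps quoted in this comment.

```python
from datetime import datetime


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two same-day HH:MM:SS or HH:MM:SS.mmm timestamps."""
    def parse(stamp: str) -> datetime:
        fmt = "%H:%M:%S.%f" if "." in stamp else "%H:%M:%S"
        return datetime.strptime(stamp, fmt)
    return (parse(end) - parse(start)).total_seconds()


update_started = "11:13:12"        # Android Studio begins updating the app
relaunch_attempt = "11:13:22.767"  # FtcAccessPointService tries to relaunch
null_pointer = "11:13:23.864"      # null pointer in the Window Manager

print(round(seconds_between(update_started, relaunch_attempt), 3))  # 10.767
print(round(seconds_between(relaunch_attempt, null_pointer), 3))    # 1.097
```

So the relaunch attempt comes about 10.8 seconds after the update started, and the null pointer follows about 1.1 seconds after that.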

I've also noticed something called FtcRobotControllerWatchdogService. It seems that each time we updated the code, that service was force stopped and a restart was scheduled in 1000 ms. However, there's only one instance in the log where it appears to actually start, at 11:12:15.923. Curiously, that's right before we encountered the problem with the app not launching, so perhaps that's related? Like I said, I don't fully understand everything here, so I may be off-base.

I tried to recreate the problem with my own laptop, but have been unsuccessful; it's only occurred when my students upload code from their laptops. Mine is relatively new and faster than theirs, so this problem may be more prevalent on older/slower machines.

I hope this is helpful! I'll keep trying to track this down when I can, though time is short with our competitions coming up.

@lukasmwerner

Hi All,

We've been dealing with similar issues and have also been experiencing weird glitches during program execution. We found so far that "running" the program once before actually running it eliminates the issues.

It is totally possible that our issues could be unrelated but we can confirm encountering similar upload delays. To reduce that delay we started using scrcpy to remote control the control hub screen to manually launch the app.

Hope this helps,
Lukas
-Mentor/Alum #4511

@FromenActual
Author

> We found so far that "running" the program once before actually running it eliminates the issues.

I think we've been running our OpMode between updates, and have still encountered this problem. But my memory could be wrong, so I'll try to keep track of whether we're running the OpMode at least once to see whether there's a correlation.

> we started using scrcpy to remote control the control hub screen

Haven't heard of this before, I'll have to try it out! It's also possible to launch the app via the Rev Hardware Client, which is what I've had my students use for simplicity.
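If neither tool is handy, the app can probably also be launched over the existing ADB connection with `adb shell am start`. The sketch below only builds and runs that command. The package and activity names are assumptions based on the stock FtcRobotController project and may not match every team's build, so check them against your own AndroidManifest.xml first.

```python
import subprocess
from typing import List

# Assumed identifiers from the stock FtcRobotController project; verify them
# against your own AndroidManifest.xml before relying on this.
RC_PACKAGE = "com.qualcomm.ftcrobotcontroller"
RC_ACTIVITY = "org.firstinspires.ftc.robotcontroller.internal.FtcRobotControllerActivity"


def launch_command(package: str = RC_PACKAGE, activity: str = RC_ACTIVITY) -> List[str]:
    """Build the adb invocation that asks Android to start the given activity."""
    return ["adb", "shell", "am", "start", "-n", f"{package}/{activity}"]


def launch_rc_app() -> int:
    """Run the command against whichever device adb is currently connected to."""
    return subprocess.run(launch_command(), check=False).returncode
```

Calling `launch_rc_app()` after an upload returns adb's exit code. It needs `adb` on the PATH and an active connection to the Control Hub.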

@lukasmwerner

Little update: after an unfortunate qualifier, we found some cabling issues that may have contributed to our weird program glitches.

@lukasmwerner

Ok one last update... It turns out that there were a multitude of issues we encountered.

  1. Defective Tamiya connectors on our battery that caused intermittent power outages
  2. Using some function in OpenCV that caused a memory crash after 10 seconds

My suspicion is that as more crashes occurred (in our case because of OpenCV bugs), the wait times grew because of some exponential backoff timer somewhere, which led to longer waits after uploading from Android Studio.
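To illustrate how quickly such a timer could stretch from seconds to minutes, here is a generic exponential backoff calculation. The base delay, crash count, and cap are made-up numbers for illustration only. Nothing in this thread confirms what the Control Hub actually uses, or that it uses backoff at all.

```python
def backoff_delays(base_s: float, crashes: int, cap_s: float) -> list:
    """Delay before each successive restart: base, 2*base, 4*base, ... up to a cap."""
    return [min(base_s * 2 ** n, cap_s) for n in range(crashes)]


# Hypothetical values: 1 s base delay, 8 crashes in a row, capped at 2 minutes.
print(backoff_delays(1.0, 8, 120.0))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 120.0]
```

With doubling, it only takes a handful of repeated crashes before the next restart is more than a minute away, which would be consistent with waits of a couple of minutes.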

5 participants