
Problem with NDPluginFile when AutoSave=1 #291

Closed
MarkRivers opened this issue Sep 28, 2017 · 3 comments

Comments

@MarkRivers
Member

When using the NDFileTIFF plugin with AutoSave=1 and a high frame rate in ADSimDetector, the IOC crashes.

The problem has been traced to two different threads writing files: the NDFileTIFF_Plugin1 thread (which is correct) and the main NDFileTiff asyn port thread (which is incorrect). It was almost certainly introduced by the changed behavior of the busy record asyn device support in the current master branch of epics-modules/busy. The record now processes whenever it gets a callback. When it is processing because of a callback, it should not call the driver.

NDPluginFile sets NDWriteFile to 1 and does callbacks each time it writes a file in autosave mode. It then sets NDWriteFile to 0 when the write is complete. This gives feedback in the OPI that a write operation is in progress. The problem is that if the callbacks come too quickly, the busy record does call the driver. A second thread then calls the TIFF library at the same time as the plugin thread, which leads to corruption and crashes.
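To make the callback pattern concrete, here is a minimal sketch of the autosave feedback sequence described above. It is a simplified model, not the NDPluginFile code: `AutoSavePluginModel`, `saveFrame`, `setWriteFile`, and `callbacksFor` are hypothetical names, and the file write is replaced by a counter.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified model of a file plugin in autosave mode.
// None of these names are the real NDPluginFile API.
class AutoSavePluginModel {
public:
    using Callback = std::function<void(int)>;

    explicit AutoSavePluginModel(Callback notify) : notify_(std::move(notify)) {}

    // Handle one incoming frame.
    void saveFrame() {
        setWriteFile(1);   // feedback: a write is in progress
        ++filesWritten_;   // stand-in for the call into the file library
        setWriteFile(0);   // feedback: the write is complete
    }

    int filesWritten() const { return filesWritten_; }
    int writeFile() const { return writeFile_; }

private:
    void setWriteFile(int value) {
        assert(value == 0 || value == 1);  // the flag only takes 0 or 1
        writeFile_ = value;
        notify_(value);    // stand-in for the parameter callback to the record
    }

    Callback notify_;
    int writeFile_ = 0;
    int filesWritten_ = 0;
};

// Collect the callback values produced by saving `frames` frames.
inline std::vector<int> callbacksFor(int frames) {
    std::vector<int> seen;
    AutoSavePluginModel plugin([&seen](int v) { seen.push_back(v); });
    for (int i = 0; i < frames; ++i) plugin.saveFrame();
    return seen;
}
```

In this model every saved frame produces two callbacks (a 1 followed by a 0), so the callback rate seen by the record is twice the frame rate.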

I have temporarily fixed the problem with 9c14ce9. This removes the calls to set NDWriteFile to 1, so the busy record won't get these callbacks.

The correct solution is to fix the asyn device support for the busy record so that it does not call the driver due to callbacks, even if they come at a very fast rate. It currently detects when multiple callbacks have occurred since it last processed, but it does not handle that case correctly.
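The intended behavior can be sketched as follows. This is only an illustration of the idea, not the actual asyn device support code for the busy record: `FakeDriver`, `BusyDeviceSupportModel`, `onDriverCallback`, and `processRecord` are hypothetical names, and the queue of pending callback values is an assumption about how fast callbacks could be tracked.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical stand-in for the driver; it only counts writes.
struct FakeDriver {
    int writeCount = 0;
    void write(int value) {
        assert(value == 0 || value == 1);  // a busy record holds 0 or 1
        ++writeCount;
    }
};

// Hypothetical, simplified model of output-record device support.
class BusyDeviceSupportModel {
public:
    explicit BusyDeviceSupportModel(FakeDriver &driver) : driver_(driver) {}

    // Called from the driver's callback thread with a new readback value.
    // Each callback is assumed to trigger one later call to processRecord().
    void onDriverCallback(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(value);
    }

    // Called when the record processes. Returns the value the record holds
    // afterwards.
    int processRecord(int recordValue) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (!pending_.empty()) {
                // Processing was caused by a callback: take the readback
                // value and do NOT write it to the driver.
                int value = pending_.front();
                pending_.pop_front();
                return value;
            }
        }
        // No callback pending: this is a genuine write from the record.
        driver_.write(recordValue);
        return recordValue;
    }

private:
    FakeDriver &driver_;
    std::mutex mutex_;
    std::deque<int> pending_;
};
```

The point of the queue is that two callbacks arriving before the record processes still lead to two callback-driven processing passes, and neither pass reaches the driver. Only a pass with no pending callback is treated as a write.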

@ulrikpedersen
Member

Is this related to #287?

@MarkRivers
Member Author

> Is this related to #287?

It might be the same problem. What I am seeing is that when the busy record gets multiple callbacks in quick succession, it can mistakenly send the readback value to the driver during record processing, rather than just processing the record with the new value and not sending it to the driver. When it sends 1 to the driver, two threads can end up talking to the file library (TIFF, HDF5, etc.) at the same time, which causes crashes.

#287 is a problem at iocInit. I am not sure whether there are multiple callbacks there, but it is possible. It will be interesting to see if #287 goes away once the busy record is fixed.

The correct fix is in asyn device support.

@MarkRivers
Member Author

The busy record asyn device support has been fixed. The problem no longer occurs. I tested with the TIFF file plugin with AutoSave=1 and the simDetector running at ~500 frames/s.
