A propagator-based approach to timeout a solve? #473
Comments
This approach should work. You could even add an empty clause to backtrack to the top level right away. Do you watch all literals? You can also set the check mode to fixed point, and add a clause on the propagation fixed point. @BenKaufmann, it sounds a bit strange that it is faster to stop the search like this. Do you think one could simply improve the implementation?
Thanks!!! To be clear, this approach does not make it faster to stop the search -- the stopping time is negligible for both cases from what I saw. What my approach does is:
It's strange that this makes the second run more stable. I have no idea why this is the case. Could you share a minimal working example showing the difference? I cannot promise that we will take a look right away, but we'll keep it on the stack. Regarding the check mode, have a look at:
Thanks!! I'll send you the files tomorrow.
Here are the files: files.zip
Thanks for the files. We'll try to investigate at some point.
Thanks! I have an update that will hopefully simplify your investigation. I found that the strange behavior occurs with established benchmarks as well. I am attaching a new archive that contains (a) a simpler C++ program and (b) instances from an ASPCOMP 2014 benchmark domain where the issue occurs. (The issue occurs in other benchmark domains as well, but it is less evident.) Unlike the file I previously sent you, this new C++ program processes asp files rather than aspif files -- I did so because I wanted to rule out that the issue might be due to how I was loading the aspif files. You can reproduce my results with:
For your convenience, the instances already include the problem encoding. Sample run:
produces an output such as:
Similarly to my previous program, the output includes the outcome of two consecutive solves executed incrementally. The first solve uses the timeout specified by the second command-line argument (5 sec in the example) and the second solve has a fixed 60 sec timeout. The important information is on the two "best costs:" lines. The first one reports the cost of the best answer set found during the first solve. The second one reports the cost of the best answer set found during the second (incremental) solve. You can see in the example output how the second solve fails to get a cost close to that of the first solve in spite of having a substantially longer timeout and being executed incrementally.
@mbalduccini @rkaminsk I investigated this issue today and AFAICS it all boils down to heuristic effects. The only conceptual difference between interrupting the search via a call to
Regarding heuristics, the issue is twofold. First, because of the timeout, the result of the first search is highly non-deterministic. Second, the suggested way that conflicting clauses are added in the propagator influences certain heuristics. As written, the propagator adds conflicting clauses with a literal from the highest decision level. When resolving the conflict, the solver will backtrack without saving phases of literals from the highest level. Furthermore, it will then add a new volatile clause thereby potentially bumping scores of the contained variables. Finally, depending on the problem and search state, the propagator might need to add multiple such clauses before the step becomes unsat.
To get a better understanding of what is happening, I replaced the time-based approach with a deterministic conflict-based approach. I.e. instead of stopping the search after a given time, I stopped the search after a fixed number of conflicts (e.g. 8500/50000 for the CrossingMinimization problems). With this and the simpler empty-clause based propagator, the two approaches produce the exact same search spaces. For example, with
Finally, as mentioned by @rkaminsk in #453, option
In summary, I don't think there is an easy/obvious "fix" for the observed behavior. We could think about adding some kind of "cancel/resume heuristic" and/or adding special handling for the case where the logic program is not changed between two consecutive solve calls.
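The conflict-based stopping described above can be modelled without any solver. The following is a minimal sketch under stated assumptions: `ConflictBudget` and `run_until_budget` are hypothetical names, not part of clingo or clasp, and the toy search simply declares every third step a conflict. How the conflict count is obtained from the real solver is not specified in this thread. The point is only that a conflict budget stops the search at the same place on every run, whereas a wall-clock timeout does not.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of clingo or clasp): counts conflicts and
// reports when a fixed budget has been used up.
class ConflictBudget {
public:
    explicit ConflictBudget(std::uint64_t limit) : limit_(limit) {
        assert(limit > 0 && "a budget of zero would stop before any search");
    }

    // Call once per conflict; returns true once the budget is exhausted.
    bool on_conflict() {
        if (seen_ < limit_) {
            ++seen_;
        }
        return exhausted();
    }

    bool exhausted() const { return seen_ >= limit_; }
    std::uint64_t seen() const { return seen_; }

private:
    std::uint64_t limit_;
    std::uint64_t seen_ = 0;
};

// Toy search loop standing in for the solver (an assumption for illustration):
// every third step is treated as a conflict. Returns the step at which the
// search stopped, either because the budget ran out or max_steps was reached.
std::uint64_t run_until_budget(ConflictBudget &budget, std::uint64_t max_steps) {
    std::uint64_t step = 0;
    while (step < max_steps) {
        ++step;
        const bool conflict = (step % 3 == 0);
        if (conflict && budget.on_conflict()) {
            break;  // deterministic stop: depends only on the conflict count
        }
    }
    return step;
}
```

With a budget of 8500 conflicts (the smaller of the two values mentioned above), two independent runs of the toy search stop at the same step, which is the property that makes the two interruption mechanisms comparable.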
Hi,
In my investigation of the unexpected performance results I mentioned in #453, I attempted an alternative way of triggering a timeout in Control::solve() instead of using SolveHandle::cancel(). The approach, which I detail below, is motivated by the study of the implementation of SolveHandle::cancel() and especially of Solver::setStopConflict(). In experiments on (a small number of) challenging optimization problems, my approach almost completely eliminated the issues I had previously reported in #453, where I used SolveHandle::cancel(). I was wondering if you see any problems with my approach and in general if you have any feedback or thoughts about it.
The approach is:
- [...] Control::solve(), I register a propagator via clingo_control_register_propagator() (it should be possible to use the C++ equivalent with the same results)
- [...] timeout_triggered to true and wait forever with SolveHandle.wait(-1.0)
- [...] timeout_triggered is true, the propagator that I registered:
  - [...] lit, from the change list that clingo passed to the propagator
  - [...] { -lit } of type clingo_clause_type_volatile
  - [...] clingo_propagate_control_propagate()
  - [...] clingo_propagate_control_propagate() failed)
- SolveHandle.wait(-1.0) eventually returns and the main branch sets timeout_triggered back to false
- Control::solve() can now be called again as needed
The intuition behind the approach is that, after timeout_triggered is set to true, the propagator causes all propagation attempts to fail and the solve eventually terminates. Then, when Control::solve() is called again, the clauses that the propagator had added are automatically removed since they were marked volatile, and thus the computation can resume without any permanent impact from the timeout.
Thanks!
Marcello
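The flag handshake in this post can be illustrated with a small library-free model. This is a sketch under stated assumptions, not the program from the attached files: `ToySolver`, its `propagate` and `solve` members, and `solve_with_timeout` are hypothetical names that do not correspond to the clingo API. The step loop merely stands in for the solver invoking the propagator, and the vector stands in for volatile clauses that are discarded when the next solve starts.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for the solver plus the registered propagator.
struct ToySolver {
    std::atomic<bool> timeout_triggered{false};  // set by the main branch
    std::vector<int> volatile_clauses;           // unit clauses, one literal each

    // Models the propagate callback: once the flag is set, add the clause
    // { -lit } for a literal from the change list and report failure.
    bool propagate(int lit) {
        if (!timeout_triggered.load()) {
            return true;  // no timeout requested: propagation succeeds
        }
        volatile_clauses.push_back(-lit);  // contradicts the assigned literal
        return false;                      // propagation failed, search stops
    }

    // Models one solve call. Volatile clauses from the previous call are
    // dropped first. Returns true if the toy search ran to completion.
    bool solve(long max_steps) {
        assert(max_steps >= 0);
        volatile_clauses.clear();
        for (long step = 1; step <= max_steps; ++step) {
            const int lit = static_cast<int>(step % 1000) + 1;
            if (!propagate(lit)) {
                return false;  // interrupted by the timeout flag
            }
            std::this_thread::sleep_for(std::chrono::microseconds(50));
        }
        return true;
    }
};

// Main-branch side of the handshake: start the solve, raise the flag when the
// timeout expires, wait for the solve to end, then reset the flag.
bool solve_with_timeout(ToySolver &solver, long max_steps,
                        std::chrono::milliseconds timeout) {
    auto result = std::async(std::launch::async, [&solver, max_steps] {
        return solver.solve(max_steps);
    });
    if (result.wait_for(timeout) == std::future_status::timeout) {
        solver.timeout_triggered = true;
    }
    const bool finished = result.get();  // plays the role of waiting forever
    solver.timeout_triggered = false;    // the next solve starts unaffected
    return finished;
}
```

In this model a first call with a short timeout returns false and leaves one conflicting clause behind, while a second call on the same object clears that clause and can run to completion. It does not model the heuristic side effects discussed in the comments (phase saving, score bumping), which are specific to the real solver.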