What's the best way to get the best path seen over the entire search in DPW? By that I mean the sequence of `s,a,r`'s with the best total reward encountered over all samples, which probably occurs during a rollout.
If it doesn't currently exist, how can I implement it?
I see that there is a new `action_info` architecture where extra info can be returned from `action`. But you don't get the rollout portion of the sequence, because it is hidden in `estimate_value`, which calls `RandomSolver`. So is the easiest way to write my own rollout function that wraps the existing one? Or is there a better way?
Thanks!
Hmm... yeah, that seems kind of hard right now :/ I definitely didn't plan for it when writing. Do you even know how to get the portion of the trajectory from the tree search? Since `simulate` is called recursively, it seems like you would have to pass more arguments into `simulate` to keep track of the trajectory.
If you just want the rollout portion, yes, you would just need to implement a new type for `estimate_value` that keeps track of such things. For the entire trajectory, including the part that traverses the tree, I think you will have to write your own version of `MCTS.simulate` and maybe a few other functions. You could still use the existing tree structures, etc. though.
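To make the "pass more arguments into `simulate`" idea concrete, here is a rough, language-agnostic sketch in Python. It does not use the MCTS.jl API at all: the toy model `step`, the `BestPath` tracker, and the `simulate`/`rollout`/`search` functions are all hypothetical names made up for illustration, and action selection is plain random choice standing in for UCB with progressive widening. The only point is that one trajectory list is threaded through both the tree part and the rollout part, and the best one is kept.

```python
import random
from dataclasses import dataclass, field

ACTIONS = (-1, 1)  # hypothetical toy action set


@dataclass
class BestPath:
    """Keeps the highest-return trajectory offered so far."""
    total: float = float("-inf")
    steps: list = field(default_factory=list)  # list of (s, a, r) tuples

    def offer(self, steps):
        total = sum(r for _, _, r in steps)
        if total > self.total:
            self.total, self.steps = total, list(steps)


def step(s, a):
    """Toy generative model (assumption): integer walk, reward 1 on reaching 3."""
    sp = s + a
    return sp, (1.0 if sp == 3 else 0.0)


def rollout(s, depth, rng, traj):
    """Random rollout that appends every (s, a, r) to traj and returns its return."""
    total = 0.0
    for _ in range(depth):
        a = rng.choice(ACTIONS)
        sp, r = step(s, a)
        traj.append((s, a, r))
        total += r
        s = sp
    return total


def simulate(s, depth, tree, rng, traj, best):
    """Recursive simulation; traj is the extra argument carrying the path so far."""
    if depth == 0:
        best.offer(traj)
        return 0.0
    if s not in tree:
        # New leaf: create the node, then estimate its value with a recorded rollout.
        tree[s] = {a: [0, 0.0] for a in ACTIONS}  # a -> [visits, mean value]
        q = rollout(s, depth, rng, traj)
        best.offer(traj)
        return q
    a = rng.choice(ACTIONS)  # placeholder for UCB / progressive widening
    sp, r = step(s, a)
    traj.append((s, a, r))
    q = r + simulate(sp, depth - 1, tree, rng, traj, best)
    n, v = tree[s][a]
    tree[s][a] = [n + 1, v + (q - v) / (n + 1)]  # running mean update
    return q


def search(s0, n_iter=200, depth=5, seed=0):
    """Run n_iter simulations from s0 and return the best trajectory seen."""
    rng = random.Random(seed)
    tree, best = {}, BestPath()
    for _ in range(n_iter):
        simulate(s0, depth, tree, rng, [], best)
    return best


if __name__ == "__main__":
    best = search(0)
    print(best.total, best.steps)
```

In the real package the same shape would mean a custom `simulate` that carries the trajectory, plus an `estimate_value` replacement that records its rollout steps instead of hiding them.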