Best path seen over entire search #49

rcnlee · 2018-08-08T01:35:57Z

What's the best way to get the best path seen over the entire search in DPW? By that I mean the sequence of (s, a, r) tuples with the highest total reward encountered over all samples, which probably occurs during a rollout.

If it doesn't currently exist, how can I implement it?

I see that there is a new action_info architecture where extra info can be returned from action. But you don't get the rollout portion of the sequence because it is hidden in estimate_value which calls RandomSolver. So is the easiest way to write my own rollout function that wraps the existing one? Or is there a better way?

Thanks!

zsunberg · 2018-08-09T02:38:22Z

Hmm... yeah, that seems kind of hard right now :/ I definitely didn't plan for it when writing. Do you even know how to get the tree-search portion of the trajectory? Since simulate is called recursively, it seems like you would have to pass more arguments into simulate to keep track of the trajectory.

If you just want the rollout portion, yes, you would just need to implement a new type for estimate_value that keeps track of such things, but I think you will have to write your own version of MCTS.simulate and maybe a few other functions to keep track of the entire trajectory including when it traverses the tree. You could still use the existing tree structures, etc. though.
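To make the idea concrete, here is a minimal sketch of the bookkeeping described above, written in Python rather than Julia and not using the MCTS.jl API. Everything in it (ChainMDP, BestPath, rollout, simulate, search) is a hypothetical name made up for illustration. It uses plain UCT on a toy chain problem with undiscounted returns, not DPW, and only shows how a trajectory prefix could be threaded through the recursive simulate call and extended by the rollout.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class BestPath:
    """Keeps the highest-return (s, a, r) sequence offered so far."""
    total: float = -math.inf
    steps: list = field(default_factory=list)

    def offer(self, steps):
        total = sum(r for _, _, r in steps)  # undiscounted return (assumption)
        if total > self.total:
            self.total, self.steps = total, list(steps)


class ChainMDP:
    """Toy problem (hypothetical): walk on 0..4, reward 1 whenever the next state is 4."""
    actions = (-1, 1)

    def step(self, s, a):
        sp = min(max(s + a, 0), 4)
        return sp, (1.0 if sp == 4 else 0.0)


def rollout(mdp, s, depth, prefix, rng):
    """Random rollout that appends its own (s, a, r) steps to the shared prefix."""
    total = 0.0
    for _ in range(depth):
        a = rng.choice(mdp.actions)
        sp, r = mdp.step(s, a)
        prefix.append((s, a, r))
        total += r
        s = sp
    return total


def simulate(mdp, tree, s, depth, prefix, best, rng, c=1.0):
    """Recursive UCT simulation; prefix is the extra argument carrying the path so far."""
    if depth == 0:
        best.offer(prefix)  # full path stayed inside the tree
        return 0.0
    if s not in tree:
        # New node: estimate its value with a rollout, which completes the path.
        tree[s] = {a: [0, 0.0] for a in mdp.actions}  # action -> [visits, q]
        q = rollout(mdp, s, depth, prefix, rng)
        best.offer(prefix)  # tree portion + rollout portion
        return q
    node = tree[s]
    n_total = sum(n for n, _ in node.values())

    def ucb(a):
        n, q = node[a]
        return math.inf if n == 0 else q + c * math.sqrt(math.log(n_total) / n)

    a = max(mdp.actions, key=ucb)
    sp, r = mdp.step(s, a)
    prefix.append((s, a, r))  # record the in-tree step before recursing
    q = r + simulate(mdp, tree, sp, depth - 1, prefix, best, rng, c)
    node[a][0] += 1
    node[a][1] += (q - node[a][1]) / node[a][0]  # running mean of returns
    return q


def search(mdp, s0, n_iter=200, depth=6, seed=0):
    """Run n_iter simulations from s0 and return the best path seen in any of them."""
    rng = random.Random(seed)
    tree, best = {}, BestPath()
    for _ in range(n_iter):
        simulate(mdp, tree, s0, depth, [], best, rng)
    return best
```

Calling search(ChainMDP(), 0) returns a BestPath whose steps field holds the full (s, a, r) sequence, covering both the part that traversed the tree and the part generated by the rollout. In a real solver the same pattern would need a custom estimate_value type and a modified simulate, as described above.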
