Can EpsGreedyPolicy get correct action, if actions depend on state? #446
Unanswered
NeroBlackstone asked this question in Q&A
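Good point - I am not sure why that policy does not use the state-dependent actions. In the meantime, a `FunctionPolicy` can do this:

```julia
# `m` (the problem) and `greedy` (a function returning the greedy action
# for a state) are assumed to be defined already.
function special_eps_greedy(s)
    # With probability 0.05, explore among the actions available in state s.
    if rand() < 0.05
        return rand(actions(m, s))
    end
    return greedy(s)
end

policy = FunctionPolicy(special_eps_greedy)
```

It would be nice to fix this, though.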
Hi, I have checked the source code of the built-in `EpsGreedyPolicy` in `POMDPTools`, and I think it can only select actions from the full action space. But if the actions that can be taken are limited by the state (a function returns the available actions from the action space, depending on the current state), this `EpsGreedyPolicy` can't select the correct action, because it samples from the full action space.

Is there any built-in function to do this?

If it is not implemented, I'm willing to contribute a policy. (I may need some help.)
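
To make the behaviour asked about here concrete, the following is a minimal package-free sketch of epsilon-greedy selection restricted to the actions available in the current state. It is not the `POMDPTools` implementation. The toy problem and the names `available_actions`, `q_value`, and `state_eps_greedy` are hypothetical and chosen only for illustration; `available_actions(s)` stands in for a state-dependent action function.

```julia
using Random

# Hypothetical toy problem: states are the integers 1..5 on a line.
# `available_actions` stands in for a state-dependent action function.
function available_actions(s::Int)
    s == 1 && return [:right]   # cannot move left at the left edge
    s == 5 && return [:left]    # cannot move right at the right edge
    return [:left, :right]
end

# Hypothetical action-value estimates; any q(s, a) would do here.
q_value(s::Int, a::Symbol) = a == :right ? Float64(s) : -Float64(s)

# Epsilon-greedy restricted to the actions available in state `s`.
function state_eps_greedy(rng::AbstractRNG, s::Int; epsilon::Float64 = 0.05)
    acts = available_actions(s)
    if rand(rng) < epsilon
        return rand(rng, acts)  # explore, but only among available actions
    end
    # Exploit: pick the best action among the available ones.
    return acts[argmax([q_value(s, a) for a in acts])]
end

rng = MersenneTwister(1)
# With epsilon = 1.0 every choice is exploratory, yet state 1 only allows :right.
chosen = [state_eps_greedy(rng, 1; epsilon = 1.0) for _ in 1:100]
```

The point of the sketch is that both the exploratory branch and the greedy branch draw from the same state-restricted action set, so an unavailable action is never returned.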