The bandit slippery walk problem is a reinforcement learning problem in which an agent must learn to navigate a 7-state environment in order to reach a goal state. The environment is slippery, so the ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results