I am new to reinforcement learning. Lately, I have been learning Q-learning from the following tutorial.
Is Q-learning still possible if the environment is dynamic? Using the environment of the tutorial as an example: in some states of the environment it is possible to go from room 0 to room 1, but not from room 0 to room 4.
Is it possible to use Q-learning in a problem like that? If not, is there an algorithm that handles dynamic environments?
Asked By : user655561
Answered By : Tim
In general you can't change the state and action spaces (the rows and columns of your reward matrix), since that changes the underlying model that you are sampling with Q-learning. It'd be similar to changing the population you're sampling from in the middle of a statistical study. Hence, you have to define a set of states and actions that doesn't change in order to use Q-learning. Instead of the raw house graph, you'd need something else: ideally, a set of states and actions that stays fixed as the house's topology changes. E.g. the action could be the strategy to use, chosen given the number of rooms in the house and the total number of edges (note: this would probably be a poor choice, but I needed an example).
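One way to make this concrete, as a minimal sketch: keep every room as a state and "try to enter room a" as the action, so the spaces never change, and treat a currently closed door as a no-op with a small penalty. The layout below mimics the tutorial's six-room house with room 5 as the goal, but the door set, rewards, and learning constants are illustrative assumptions, not taken from the tutorial.

    import random

    import numpy as np

    N_ROOMS, GOAL = 6, 5                 # fixed state space: rooms 0..5
    ALPHA, GAMMA, EPS = 0.1, 0.8, 0.2    # assumed learning constants

    # Doors as directed (room, room) pairs; this set may change over time,
    # but the state and action spaces (and so Q's shape) never do.
    BASE_DOORS = {(0, 4), (4, 0), (4, 3), (3, 4), (3, 1), (1, 3),
                  (1, 5), (5, 1), (4, 5), (5, 4), (2, 3), (3, 2)}

    def step(state, action, doors):
        """Action a means "try to enter room a". A closed door is a
        no-op with a small penalty, so blocked moves stay in the model."""
        if (state, action) in doors:
            return action, (100.0 if action == GOAL else 0.0)
        return state, -1.0

    def train(Q, doors, episodes=500):
        """Epsilon-greedy tabular Q-learning; updates Q in place."""
        for _ in range(episodes):
            state = random.randrange(N_ROOMS)
            for _ in range(100):          # cap steps per episode
                if state == GOAL:
                    break
                action = (random.randrange(N_ROOMS) if random.random() < EPS
                          else int(np.argmax(Q[state])))
                nxt, reward = step(state, action, doors)
                Q[state, action] += ALPHA * (reward + GAMMA * Q[nxt].max()
                                             - Q[state, action])
                state = nxt
        return Q

    Q = train(np.zeros((N_ROOMS, N_ROOMS)), BASE_DOORS)
    print(np.round(Q))

Because the Q matrix always has the same 6x6 shape, opening or closing a door changes which moves currently succeed, not the dimensions of the model being sampled.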
All that said, Q-learning, albeit a little slow, is pretty robust to some types of disturbances. If the model doesn't change significantly (e.g. you add a room) and you are exploring a large state space (statistically), then your older policies (solved Q matrices) might be nice breadcrumb trails for your new algorithm's exploration strategy, and the algorithm may re-learn much faster. However, I don't know about the theoretical guarantees (look into terms like "cross learning").
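To illustrate the breadcrumb idea, here is a hypothetical continuation of the sketch above: the 0-4 door closes and a 0-1 door opens (as in the question), and the old Q table is reused as the starting point instead of zeros. Whether the warm start actually re-learns faster is an empirical question in this sketch, not a guarantee.

    # Continuing the sketch above: the 0-4 door closes and a 0-1 door opens,
    # but the state and action spaces are unchanged.
    NEW_DOORS = (BASE_DOORS - {(0, 4), (4, 0)}) | {(0, 1), (1, 0)}

    Q_cold = train(np.zeros((N_ROOMS, N_ROOMS)), NEW_DOORS)  # from scratch
    Q_warm = train(Q.copy(), NEW_DOORS)  # old Q matrix as breadcrumb trail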
Best Answer from Stack Exchange
Question Source : http://cs.stackexchange.com/questions/24729