Question from Artificial Intelligence: A Modern Approach by Russell and Norvig (Exercise 2.1).
Suppose that the performance measure is concerned with just the first $T$ time steps of the environment and ignores everything thereafter. Show that a rational agent's action may depend not just on the state of the environment but also on the time step it has reached.
This question is extremely confusing to me. My initial thought is that this is obvious. A rational agent wants to maximize its performance, and the first $T$ time steps are a factor in the performance measure. So, for instance, being in state $A$ at time step 1 can score differently from being in state $A$ at time step 2, since in the latter case the state of the environment at step 1 also counts toward the performance measure. Since the performance measure can differ, the rational agent may choose different actions.
Perhaps that is the answer, but I am still confused about why it matters that the performance measure is concerned with only a finite sequence of initial time steps. My interpretation of the question seems to make that irrelevant: all that matters is that the performance measure depends on the history in some way.
Can anyone help clarify what is happening in this question?
Asked By : zrbecker
Answered By : Anton
As you said, the answer is fairly obvious; the purpose of the exercise is to make sure the reader has understood that part of the chapter. You are right that the agent's actions can differ from one time step to another within the first $T$ steps. They can also differ after that period, because from then on the agent's actions no longer have any value under the performance measure.
One example is a car agent that has to cover as much distance as possible with a limited amount of fuel. If only the distance covered during the first hour counts, the car could drive at full throttle (exhausting the fuel very quickly). However, if there were no time limit, the car agent should choose the speed that maximizes the distance covered per unit of fuel spent.
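The car example can be made concrete with a small sketch. Everything in it is an assumption made up for illustration, not something taken from the book: the two actions `fast` and `slow`, their distance and fuel numbers, and the helper name `best_action`. It searches for the action that maximizes the distance covered in the steps that still count, given the remaining fuel (the state) and the current time step.

```python
from functools import lru_cache

# Hypothetical toy model: action -> (distance gained, fuel burned) per time step.
ACTIONS = {"fast": (2, 4), "slow": (1, 1)}


def best_action(fuel, t, horizon):
    """Return (best distance still achievable, action to take now).

    Only time steps t < horizon count toward the performance measure.
    """

    @lru_cache(maxsize=None)
    def value(f, step):
        if step >= horizon:
            # Past the horizon nothing is scored, so no action has any value.
            return 0, None
        best = (0, "idle")  # fallback when no action is affordable
        for name, (dist, cost) in ACTIONS.items():
            if cost <= f:
                total = dist + value(f - cost, step + 1)[0]
                if total > best[0]:
                    best = (total, name)
        return best

    return value(fuel, t)


print(best_action(4, 0, 4))  # same fuel, first step of a 4-step horizon
print(best_action(4, 3, 4))  # same fuel, last step of a 4-step horizon
```

Under these made-up numbers, with 4 units of fuel the sketch picks `slow` at step 0 (four steps still count, so economy wins) and `fast` at step 3 (only one step still counts, so saving fuel is pointless). The state is the same in both calls; only the time step differs, which is the dependence the exercise asks for.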
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/14351