World's most popular travel blog for travel bloggers.

[Solved]: Minimax algorithm when all the options are the same

, , No Comments
Problem Detail: 

tl;dr What does the Minimax algorithm do when all its options are the same?

Consider this Minimax tree: (Green means the ends, orange minimize, blue maximize) enter image description here (Source: Myself)

Imagine this is the tree for a game (tic-tac-toe, maybe?). What should the computer do? Pick one at random? Pick the one which on average has the best score (the second one)?

Background

I'm making a program to play Quoridor, and I figure that the Minimax algorithm is the best choice (with some pruning of the really bad ones; otherwise, it would be way too big). The problem is, every choice you can make (the beginning ones in particular) have a chance to loose, and every loss has the same value (-1). So how can I pick the best moves, if all the options are the same?

Asked By : APCoding

Answered By : Carlos Linares López

Shaull's answer is absolutely correct, and by referring Zermelo's theorem it is pointing in the right direction.

However, beyond the observations done on the rationality of your opponent, the point here is that Minimax is pretty good for strategical games where a single mistake could lead to a disaster, e.g., chess ---where just overlooking a single move can lead to checkmate. On the other hand, if no such strategical flaws are likely (e.g., Omweso and I think Quoridor also to some extent) then Monte-Carlo sampling is a much better option.

For a thorough study on the topic, see:

Raghuram Ramanujan and Bart Selman. Trade-Offs in Sampling-Based Adversarial Planning. Proceedings of the Twenty-First International Conference on Automated Planning and Scheduling. Freiburg, Germany, 2011.

(which was actually awarded a honorable mention for best student paper). In this case, the purpose of such algorithm is to follow good lines of action instead of preventing a loss. Monte-Carlo sampling is also the best option to use in case your branching factor is too large (so that minimax and any of its variants such as alpha-beta become infeasible), e.g., Go.

Be aware however, that one of the options used much often in Monte-Carlo sampling, UCT, does not seem to be so good in the end. For a good discussion on the topic, see:

Carmel Domshlak, Zohar Feldman. To UCT, or not to UCT?. Proceedings of the Sixth Annual Symposium on Combinatorial Search. Leavenworth, Washington, USA, 2013.

Hope this helps,

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/53665

3.2K people like this

 Download Related Notes/Documents

0 comments:

Post a Comment

Let us know your responses and feedback