Minimax algorithm incorrectly acting upon recognizing win/loss scenarios
Question:
The following minimax algorithm in python is meant to find the values of possible moves in a given connect four board. It is called by another function, computer_move
, which is also shown below.
Expected behavior: minimax
will return evaluations of a position based on a board state and a player.
-
If the player is one, it returns the evaluation of the highest value move.
-
If the player is two, it returns the evaluation of the lowest value move.
minimax
is called by computer_move
, which gets the returned values and chooses the best from them.
Actual behavior: computer_move
seems to successfully choose the best move from minimax
‘s returned values, but minimax does not properly evaluate different moves. Specifically, winning cases or cases where a win can be prevented are not properly evaluated. Intermediate situations have unknown behavior because the static evaluation function has not been implemented yet. However, in cases where a win is one-off by either player, the algorithm fails to react correctly.
I’ve tried switching various signs and swapping mins/maxes in both functions, but this did not seem to resolve the issue. Careful use of print statements also showed me that computer_move
is correctly processing returned values, but minimax
is returning incorrect values, suggesting that there’s some error with the algorithm. However, it seems to be a textbook minimax algorithm, at least as far as I can see.
Does anyone have some suggestions for what the issue might be? Thanks!
Minimax function:
def minimax(board, depth, alpha, beta, player, move):
# Base cases
if check_winning(board, 1) or check_winning(board, 2):
player = player*2-3 #Convert to +/-1
return player*math.inf
elif depth==0:
return static_eval(board)
# Player 1 finds highest value move
if player==1:
max_eval = -math.inf
for move in range(COLUMNS):
if valid_move(board, move):
make_move(board, move, player)
eval = minimax(board, depth-1, alpha, beta, 2, move)
unmake_move(board, move)
max_eval = max(max_eval, eval)
alpha = max(alpha, eval)
if beta <= alpha:
break
return max_eval
# Player 2 finds lowest (for player one) value move
else:
min_eval = math.inf
for move in range(COLUMNS):
if valid_move(board, move):
make_move(board, move, player)
#print_board(board)
eval = minimax(board, depth-1, alpha, beta, 1, move)
unmake_move(board, move)
min_eval = min(min_eval, eval)
beta = min(beta, eval)
if beta <= alpha:
break
return min_eval
Calling the minimax function and implementing the best move:
# Get list of move values, and choose highest/lowest depending on player
def computer_move(board, player, difficulty=3):
move_vals=[]
for move in range(COLUMNS):
if(valid_move(board, move)):
move_vals.append(minimax(board,
difficulty,
-math.inf,
math.inf,
player,
move))
else:
move_vals.append(-math.inf)
min_val = min(move_vals)
max_val = max(move_vals)
if(player==1):
move = move_vals.index(max_val)
else:
move = move_vals.index(min_val)
make_move(board, move, player)
print('Here is the computer's move:')
print_board(board)
Answers:
One problem is here:
if check_winning(board, 1) or check_winning(board, 2):
player = player*2-3 #Convert to +/-1
return player*math.inf
In your case you are sending back a score of -infininty if player 1 is winning and it is his turn. Split the cases up into different statements instead which will make it easier to debug. Also you don’t need depth == 0, you can check for when the board is full instead and then it is a draw if that is the case.
if check_winning(board, current_player):
return math.inf
if check_winning(board, the_other_player):
return -math.inf
if is_board_full(board):
return 0
You probably also want to include the depth to always go for the shortest possible win/longest possible lose, and you also don’t need to have such large values as inf:
if check_winning(board, current_player):
return (10+depth)
if check_winning(board, the_other_player):
return -(10+depth)
if is_board_full(board):
return 0
It turns out there were two issues:
- Because the
minimax
function is hardcoded to have player one maximize and player two minimize, player one winning must return a positive value, and player two winning must always return a negative value
- The
computer_move
function needs to add the potential moves to the board before calling the minimax
function, and unmake them after
Here is the updated code:
def minimax(board, depth, alpha, beta, player, previous_move):
# Change 1
if check_winning(board, 1):
return 100+depth
if check_winning(board, 2):
return -(100+depth)
elif depth==0:
return static_eval(board)
# Player 1 finds highest value move
if player==1:
max_move_eval = -math.inf
for move in range(COLUMNS):
if valid_move(board, move):
make_move(board, move, player)
move_eval = minimax(board, depth-1, alpha, beta, 2, move)
unmake_move(board, move)
max_move_eval = max(max_move_eval, move_eval)
alpha = max(alpha, move_eval)
if beta <= alpha:
break
return max_move_eval
# Player 2 finds lowest (for player one) value move
else:
min_move_eval = math.inf
for move in range(COLUMNS):
if valid_move(board, move):
#print_board(board)
make_move(board, move, player)
move_eval = minimax(board, depth-1, alpha, beta, 1, move)
unmake_move(board, move)
min_move_eval = min(min_move_eval, move_eval)
beta = min(beta, move_eval)
if beta <= alpha:
break
return min_move_eval
def computer_move(board, difficulty, player):
other_player = int(not bool(player-1))+1
move_vals=[]
for move in range(COLUMNS):
if(valid_move(board, move)):
# Change 2
make_move(board, move, player)
move_vals.append(minimax(board,
difficulty,
-math.inf,
math.inf,
other_player,
move))
unmake_move(board, move)
else:
if(player==2):
move_vals.append(-math.inf)
if(player==1):
move_vals.append(math.inf)
min_val = min(move_vals)
max_val = max(move_vals)
if(player==1):
move = move_vals.index(max_val)
else:
move = move_vals.index(min_val)
make_move(board, move, player)
print('Here is the computer's move:')
print_board(board)
The following minimax algorithm in python is meant to find the values of possible moves in a given connect four board. It is called by another function, computer_move
, which is also shown below.
Expected behavior: minimax
will return evaluations of a position based on a board state and a player.
-
If the player is one, it returns the evaluation of the highest value move.
-
If the player is two, it returns the evaluation of the lowest value move.
minimax
is called by computer_move
, which gets the returned values and chooses the best from them.
Actual behavior: computer_move
seems to successfully choose the best move from minimax
‘s returned values, but minimax does not properly evaluate different moves. Specifically, winning cases or cases where a win can be prevented are not properly evaluated. Intermediate situations have unknown behavior because the static evaluation function has not been implemented yet. However, in cases where a win is one-off by either player, the algorithm fails to react correctly.
I’ve tried switching various signs and swapping mins/maxes in both functions, but this did not seem to resolve the issue. Careful use of print statements also showed me that computer_move
is correctly processing returned values, but minimax
is returning incorrect values, suggesting that there’s some error with the algorithm. However, it seems to be a textbook minimax algorithm, at least as far as I can see.
Does anyone have some suggestions for what the issue might be? Thanks!
Minimax function:
def minimax(board, depth, alpha, beta, player, move):
# Base cases
if check_winning(board, 1) or check_winning(board, 2):
player = player*2-3 #Convert to +/-1
return player*math.inf
elif depth==0:
return static_eval(board)
# Player 1 finds highest value move
if player==1:
max_eval = -math.inf
for move in range(COLUMNS):
if valid_move(board, move):
make_move(board, move, player)
eval = minimax(board, depth-1, alpha, beta, 2, move)
unmake_move(board, move)
max_eval = max(max_eval, eval)
alpha = max(alpha, eval)
if beta <= alpha:
break
return max_eval
# Player 2 finds lowest (for player one) value move
else:
min_eval = math.inf
for move in range(COLUMNS):
if valid_move(board, move):
make_move(board, move, player)
#print_board(board)
eval = minimax(board, depth-1, alpha, beta, 1, move)
unmake_move(board, move)
min_eval = min(min_eval, eval)
beta = min(beta, eval)
if beta <= alpha:
break
return min_eval
Calling the minimax function and implementing the best move:
# Get list of move values, and choose highest/lowest depending on player
def computer_move(board, player, difficulty=3):
move_vals=[]
for move in range(COLUMNS):
if(valid_move(board, move)):
move_vals.append(minimax(board,
difficulty,
-math.inf,
math.inf,
player,
move))
else:
move_vals.append(-math.inf)
min_val = min(move_vals)
max_val = max(move_vals)
if(player==1):
move = move_vals.index(max_val)
else:
move = move_vals.index(min_val)
make_move(board, move, player)
print('Here is the computer's move:')
print_board(board)
One problem is here:
if check_winning(board, 1) or check_winning(board, 2):
player = player*2-3 #Convert to +/-1
return player*math.inf
In your case you are sending back a score of -infininty if player 1 is winning and it is his turn. Split the cases up into different statements instead which will make it easier to debug. Also you don’t need depth == 0, you can check for when the board is full instead and then it is a draw if that is the case.
if check_winning(board, current_player):
return math.inf
if check_winning(board, the_other_player):
return -math.inf
if is_board_full(board):
return 0
You probably also want to include the depth to always go for the shortest possible win/longest possible lose, and you also don’t need to have such large values as inf:
if check_winning(board, current_player):
return (10+depth)
if check_winning(board, the_other_player):
return -(10+depth)
if is_board_full(board):
return 0
It turns out there were two issues:
- Because the
minimax
function is hardcoded to have player one maximize and player two minimize, player one winning must return a positive value, and player two winning must always return a negative value - The
computer_move
function needs to add the potential moves to the board before calling theminimax
function, and unmake them after
Here is the updated code:
def minimax(board, depth, alpha, beta, player, previous_move):
# Change 1
if check_winning(board, 1):
return 100+depth
if check_winning(board, 2):
return -(100+depth)
elif depth==0:
return static_eval(board)
# Player 1 finds highest value move
if player==1:
max_move_eval = -math.inf
for move in range(COLUMNS):
if valid_move(board, move):
make_move(board, move, player)
move_eval = minimax(board, depth-1, alpha, beta, 2, move)
unmake_move(board, move)
max_move_eval = max(max_move_eval, move_eval)
alpha = max(alpha, move_eval)
if beta <= alpha:
break
return max_move_eval
# Player 2 finds lowest (for player one) value move
else:
min_move_eval = math.inf
for move in range(COLUMNS):
if valid_move(board, move):
#print_board(board)
make_move(board, move, player)
move_eval = minimax(board, depth-1, alpha, beta, 1, move)
unmake_move(board, move)
min_move_eval = min(min_move_eval, move_eval)
beta = min(beta, move_eval)
if beta <= alpha:
break
return min_move_eval
def computer_move(board, difficulty, player):
other_player = int(not bool(player-1))+1
move_vals=[]
for move in range(COLUMNS):
if(valid_move(board, move)):
# Change 2
make_move(board, move, player)
move_vals.append(minimax(board,
difficulty,
-math.inf,
math.inf,
other_player,
move))
unmake_move(board, move)
else:
if(player==2):
move_vals.append(-math.inf)
if(player==1):
move_vals.append(math.inf)
min_val = min(move_vals)
max_val = max(move_vals)
if(player==1):
move = move_vals.index(max_val)
else:
move = move_vals.index(min_val)
make_move(board, move, player)
print('Here is the computer's move:')
print_board(board)