Why doesn't minimax choose the optimal solution in this situation?

Question

I'm doing the Tic-Tac-Toe project for the CS50 course.
While using minimax, I found that in some situations it doesn't find the optimal solution.

Here is my code:

"""
Tic Tac Toe Player
"""
import copy
import math

X = "X"
O = "O"
EMPTY = None


def initial_state():
    """
    Returns starting state of the board.
    """
    return [[EMPTY, EMPTY, EMPTY],
            [EMPTY, EMPTY, EMPTY],
            [EMPTY, EMPTY, EMPTY]]

board = initial_state()

def player(board):
    """
    Returns player who has the next turn on a board.
    """
    numO = 0
    numX = 0
    for i in range(len(board)):
        for j in range(len(board[i])):
            if board[i][j] == O:
                numO += 1
            elif board[i][j] == X:
                numX += 1
    return X if numO == numX else O


def actions(board):
    """
    Returns set of all possible actions (i, j) available on the board.
    """
    possact = set()
    for i in range(len(board)):
        for j in range(len(board[i])):
            if board[i][j] == EMPTY:
                possact.add((i, j))
    return possact


def result(board, action):
    """
    Returns the board that results from making move (i, j) on the board.
    """
    boardcopy = copy.deepcopy(board)
    boardcopy[action[0]][action[1]] = player(board)
    return boardcopy
    

def winner(board):
    """
    Returns the winner of the game, if there is one.
    """

    for i in range(3):
        wonO = True
        wonX = True
        for j in range(3):
            if board[i][j] == O or board[i][j] == EMPTY:
                wonX = False
            if board[i][j] == X or board[i][j] == EMPTY:
                wonO = False
        if wonX:
            return X
        if wonO:
            return O

    for j in range(3):
        wonO = True
        wonX = True
        for i in range(3):
            if board[i][j] == X or board[i][j] == EMPTY:
                wonO = False
            if board[i][j] == O or board[i][j] == EMPTY:
                wonX = False
        if wonX:
            return X
        if wonO:
            return O

    diag1 = ''
    diag2 = ''
    j = 2

    for i in range(3):
        diag1 += str(board[i][i])
        diag2 += str(board[i][j])
        j -= 1

    if diag1 == 'XXX' or diag2 == 'XXX':
        return X
    elif diag1 == 'OOO' or diag2 == 'OOO':
        return O


def terminal(board):
    """
    Returns True if game is over, False otherwise.
    """
    if winner(board) == X:
        return True
    elif winner(board) == O:
        return True

    for i in range(len(board)):
        for j in range(len(board[i])):
            if board[i][j] == EMPTY:
                return False
    return True    

    
def utility(board):
    """
    Returns 1 if X has won the game, -1 if O has won, 0 otherwise.
    """
    resB = winner(board)
    if resB == X:
        return 1
    elif resB == O:
        return -1
    else:
        return 0


def minimax(board):
    """
    Returns the optimal action for the current player on the board.
    """
    if terminal(board):
        return None
    Max = float("-inf")
    Min = float("inf")

    if player(board) == X:
        return Max_Value(board, Max, Min)[1]
    else:
        return Min_Value(board, Max, Min)[1]

def Max_Value(board, Max, Min):
    move = None
    if terminal(board):
        return [utility(board), None]
    v = float('-inf')
    for action in actions(board):
        test = Min_Value(result(board, action), Max, Min)[0]
        Max = max(Max, test)
        if test > v:
            v = test
            move = action
        if Max >= Min:
            break
    return [v, move]

def Min_Value(board, Max, Min):
    move = None
    if terminal(board):
        return [utility(board), None]
    v = float('inf')
    for action in actions(board):
        test = Max_Value(result(board, action), Max, Min)[0]
        Min = min(Min, test)
        if test < v:
            v = test
            move = action
        if Max >= Min:
            break
    return [v, move]

Here is the situation (the computer plays as O):
picture of 5th move
The optimal move is the bottom-middle cell,
but it chooses this: picture of 6th move
The computer eventually wins, but not in the optimal way.

Why doesn't minimax choose the optimal move, and how can I fix it?

Asked By: Goldenhat


Answer 1

I have not looked at whether your code correctly implements minimax, but I can explain why such results are to be expected.

There may be several paths through the game tree that lead to a node with the same utility value. The minimax algorithm does not make a distinction between quick wins and slow wins; it takes any path that results in a guaranteed win.

A common way to solve this is to assign a lower utility to slower wins. For example, set the utility of a win to 1000 - depth. Conversely, the utility of a loss should be set to -1000 + depth to also make the algorithm prefer drawing out an unavoidable loss for as long as possible. (It’s also good to keep the evaluation function symmetric in case you want to use negamax.)
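
As a rough illustration of the depth-adjusted utility, here is a minimal, self-contained sketch. It is not the asker's code: the names WIN_SCORE and search are hypothetical, the helpers are simplified stand-ins for the CS50 functions, and alpha-beta pruning is left out to keep it short.

```python
X, O, EMPTY = "X", "O", None
WIN_SCORE = 1000  # hypothetical constant; any value larger than the maximum depth works

# All eight winning lines: three rows, three columns, two diagonals.
LINES = ([[(r, c) for c in range(3)] for r in range(3)]
         + [[(r, c) for r in range(3)] for c in range(3)]
         + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]])


def winner(board):
    """Return X or O if either has three in a row, else None."""
    for line in LINES:
        a, b, c = (board[r][col] for r, col in line)
        if a is not EMPTY and a == b == c:
            return a
    return None


def player(board):
    """X moves first, so X is to move whenever the counts are equal."""
    flat = [cell for row in board for cell in row]
    return X if flat.count(X) == flat.count(O) else O


def actions(board):
    return [(r, c) for r in range(3) for c in range(3) if board[r][c] is EMPTY]


def result(board, action):
    new_board = [row[:] for row in board]
    new_board[action[0]][action[1]] = player(board)
    return new_board


def terminal(board):
    return winner(board) is not None or not actions(board)


def utility(board, depth):
    """Score a terminal board; quicker wins score further from zero."""
    won = winner(board)
    if won == X:
        return WIN_SCORE - depth
    if won == O:
        return -WIN_SCORE + depth
    return 0


def search(board, depth=0):
    """Plain minimax returning (value, move); depth counts plies from the root."""
    if terminal(board):
        return utility(board, depth), None
    maximizing = player(board) == X
    best_value, best_move = None, None
    for action in actions(board):
        value, _ = search(result(board, action), depth + 1)
        better = (best_move is None
                  or (value > best_value if maximizing else value < best_value))
        if better:
            best_value, best_move = value, action
    return best_value, best_move


# O to move; the bottom-middle cell (2, 1) completes the middle column at once.
board = [[X, O, X],
         [EMPTY, O, EMPTY],
         [EMPTY, EMPTY, X]]
print(search(board))  # (-999, (2, 1))
```

With a flat utility of -1 for every O win, any winning line would tie with the immediate one, so the move returned would depend only on iteration order. Subtracting the depth breaks that tie in favor of the shortest win.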

Answered By: Thomas

Answer 2

You can add a win check before running the minimax algorithm: if any available move wins the game immediately, choose it.
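
A minimal sketch of such a pre-check, assuming a 3x3 board stored as nested lists. The helper name immediate_win is hypothetical, and the winner function here is a simplified stand-in rather than the one from the question.

```python
X, O, EMPTY = "X", "O", None


def winner(board):
    """Return X or O if either has three in a row, else None."""
    lines = [[(r, c) for c in range(3)] for r in range(3)]        # rows
    lines += [[(r, c) for r in range(3)] for c in range(3)]       # columns
    lines += [[(i, i) for i in range(3)],                         # diagonals
              [(i, 2 - i) for i in range(3)]]
    for line in lines:
        cells = [board[r][c] for r, c in line]
        if cells[0] is not EMPTY and cells.count(cells[0]) == 3:
            return cells[0]
    return None


def immediate_win(board, mark):
    """Return a move that wins for `mark` right now, or None if there is none."""
    for r in range(3):
        for c in range(3):
            if board[r][c] is EMPTY:
                trial = [row[:] for row in board]  # copy so the input is untouched
                trial[r][c] = mark
                if winner(trial) == mark:
                    return (r, c)
    return None


board = [[X, O, X],
         [EMPTY, O, EMPTY],
         [EMPTY, EMPTY, X]]
print(immediate_win(board, O))  # (2, 1)
```

Inside minimax, one would call this first with the current player's mark and return the move if it is not None, falling back to the full search otherwise. Note that this only covers wins that are one move away; it does not make the search prefer a three-move win over a five-move win.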

Answered By: Abdullah0f
