List is getting changed when manipulated inside function

Question:

I have a function that recursively scans a list of objects (dicts) and collects the text of all objects with the same y0. This all works fine; however, when I manipulate the list inside the function, it also changes the list that was passed in, despite the fact that I make a copy of the list to prevent this issue.

That means that when you run the code given, it changes tmp to have different text values. I know the error is in line 26, where it sets BOLD_OBJ["text"] to the output of the recursive function; however, I’m not sure why, considering it should be manipulating the copy of the list.

def recursiveScanText(BOLD_OBJ_LIST:list, Y_VALUE: int, output: list):
    if BOLD_OBJ_LIST[0]["y0"] == Y_VALUE:
        output.append(BOLD_OBJ_LIST[0]["text"])
        BOLD_OBJ_LIST.pop(0)
        if BOLD_OBJ_LIST == []:
            return output
        output = recursiveScanText(BOLD_OBJ_LIST, Y_VALUE, output)
        return output
    else:
        return output
 
def mergeSimilarText(BOLD_OBJ_LIST: list):
    """Merges the objects of a list of objects if they are at a similar (±5) Y coordinate"""
    OUTPUT = []
    RECURSIVE_SCAN_OUTPUT = []
    BOLD_OBJ_LIST = BOLD_OBJ_LIST.copy()
 
    for BOLD_OBJ_INDEX in range(len(BOLD_OBJ_LIST)):
        if len(BOLD_OBJ_LIST) > 0 and BOLD_OBJ_INDEX < len(BOLD_OBJ_LIST):
            BOLD_OBJ = BOLD_OBJ_LIST[0]
 
            BOLD_CHAR_STRING = recursiveScanText(BOLD_OBJ_LIST, BOLD_OBJ_LIST[BOLD_OBJ_INDEX]["y0"], RECURSIVE_SCAN_OUTPUT)
 
            RECURSIVE_SCAN_OUTPUT = []
 
            BOLD_OBJ["text"] = "".join(BOLD_CHAR_STRING)
            OUTPUT.append(BOLD_OBJ)
 
    return OUTPUT
 
tmp = [
{'y0': 762.064, 'text': '177'}, 
{'y0': 762.064,  'text': '7'}, 
{'y0': 114.8281, 'text': 'Q'}, 
{'y0': 114.8281, 'text': 'u'}, 
{'y0': 114.8281, 'text': 'e'}, 
{'y0': 114.8281, 'text': 's'}, 
{'y0': 114.8281, 'text': 't'}, 
{'y0': 114.8281, 'text': 'i'}, 
{'y0': 114.8281, 'text': 'o'}, 
{'y0': 114.8281, 'text': 'n'}, 
{'y0': 114.8281, 'text': ' '}, 
{'y0': 114.8281, 'text': '1'}, 
{'y0': 114.8281, 'text': '7'}, 
{'y0': 114.8281, 'text': ' '}, 
{'y0': 114.8281, 'text': 'c'}, 
{'y0': 114.8281, 'text': 'o'}, 
{'y0': 114.8281, 'text': 'n'}, 
{'y0': 114.8281, 'text': 't'}, 
{'y0': 114.8281, 'text': 'i'}, 
{'y0': 114.8281, 'text': 'n'}, 
{'y0': 114.8281, 'text': 'u'}, 
{'y0': 114.8281, 'text': 'e'}, 
{'y0': 114.8281, 'text': 's'}, 
{'y0': 114.8281, 'text': ' '}, 
{'y0': 114.8281, 'text': 'o'}, 
{'y0': 114.8281, 'text': 'n'}, 
{'y0': 114.8281, 'text': ' '}, 
{'y0': 114.8281, 'text': 'p'}, 
{'y0': 114.8281, 'text': 'a'}, 
{'y0': 114.8281, 'text': 'g'}, 
{'y0': 114.8281, 'text': 'e'}, 
{'y0': 114.8281, 'text': ' '}, 
{'y0': 114.8281,'text': '9'}]
 
print(mergeSimilarText(tmp))
print(tmp)

Some notes: I have tried changing BOLD_OBJ_LIST = BOLD_OBJ_LIST.copy() to tmp = BOLD_OBJ_LIST.copy() and that still doesn’t fix it. Also, I don’t need to deepcopy as it is an array of dicts not an array of arrays

Asked By: Toby Clark


Answers:

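copy makes a shallow copy of your list: the new list is a separate list object, but its elements are references to the same dicts as in the original. Dicts are mutable, so when you assign BOLD_OBJ["text"] you are changing a dict that both the copy and tmp refer to. Popping from the copied list does not affect tmp; changing a dict inside it does.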
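A small illustration of what a shallow copy does and does not share. The names original and shallow are made up for this demonstration and are not from the question's code:

```python
original = [{"y0": 1.0, "text": "a"}, {"y0": 1.0, "text": "b"}]
shallow = original.copy()

# The two lists are separate objects, so popping from one
# leaves the other untouched.
shallow.pop(0)
print(len(original), len(shallow))  # 2 1

# The elements, however, are the very same dict objects.
print(shallow[0] is original[1])  # True

# Mutating a dict through the copy is therefore visible in the original.
shallow[0]["text"] = "changed"
print(original[1]["text"])  # changed
```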

You need to use deepcopy:

from copy import deepcopy

BOLD_OBJ_LIST = deepcopy(BOLD_OBJ_LIST)

This will recursively create a copy of all your elements.
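As a rough check, the sketch below shows that a deep copy keeps the original dicts untouched. It also shows one possible alternative that avoids copying altogether by building new dicts instead of modifying the existing ones. The helper merge_by_y0 is hypothetical, not part of the question's code, and it assumes that only consecutive items with exactly equal y0 should be merged:

```python
from copy import deepcopy
from itertools import groupby

tmp = [
    {"y0": 762.064, "text": "177"},
    {"y0": 762.064, "text": "7"},
    {"y0": 114.8281, "text": "Q"},
    {"y0": 114.8281, "text": "u"},
]

# deepcopy duplicates the dicts as well as the list,
# so edits made through the copy stay in the copy.
independent = deepcopy(tmp)
independent[0]["text"] = "1777"
print(tmp[0]["text"])  # 177


def merge_by_y0(objs):
    """Hypothetical helper: merge runs of consecutive items sharing a y0.

    Each result is a brand-new dict, so the input is never modified.
    """
    return [
        {"y0": y0, "text": "".join(obj["text"] for obj in group)}
        for y0, group in groupby(objs, key=lambda obj: obj["y0"])
    ]


merged = merge_by_y0(tmp)
print(merged)  # [{'y0': 762.064, 'text': '1777'}, {'y0': 114.8281, 'text': 'Qu'}]
print(tmp[0])  # {'y0': 762.064, 'text': '177'}
```

Creating new dicts sidesteps the aliasing problem entirely, at the cost of dropping any other keys the original objects might carry unless they are copied over explicitly.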

Answered By: Wolric