Python: Mutables in class states, what's going on under the hood?

Question:

I’m currently doing an online course on OOP in Python and one of the skill tests was to write a password generator class. Below you’ll find the recommended answer.

import re
import random
from string import ascii_letters, punctuation
from copy import copy

class Password:
    
    SAMPLES = {
        "letters": list(ascii_letters),
        "numbers": list(range(10)),
        "punctuation": list(punctuation)
    }
    
    DEFAULT_SETTINGS = {
        "low": 8,
        "mid": 12,
        "high": 16
    }
    
    @classmethod
    def show_input_universe(cls):
        return cls.SAMPLES
    
    
    def _generate_password(self):
        
        population = self.SAMPLES["letters"]
        length = self.length or self.DEFAULT_SETTINGS.get(self.strength)
        
        if self.strength == "high":
            population += self.SAMPLES["numbers"] + self.SAMPLES["punctuation"]
        elif self.strength == "mid":
            population += self.SAMPLES["numbers"]
        else:
            pass
        
        # map(lambda x: str(x), random...)
        self.password = "".join(map(str, random.choices(population, k=int(length))))
    
    
    def __init__(self, strength="mid", length=None): #None so there's no need for a value
        self.strength = strength
        self.length = length
        
        self._generate_password()

When I use that code to create instances and generate passwords of high or mid strength, the underlying SAMPLES["letters"] list is modified. As a result, low-strength passwords generated afterwards draw from the same enlarged character pool as the other password strengths.

I could even see that effect when calling the show_input_universe method. The other lists were added to the original list.

But why is that?

I understand that population = self.SAMPLES["letters"] creates the population variable which stores a pointer to the self.SAMPLES["letters"] list.

But then how exactly does the concatenation work:

if self.strength == "high":
    population += self.SAMPLES["numbers"] + self.SAMPLES["punctuation"]

The solution to this is the following:

from copy import copy

population = copy(self.SAMPLES["letters"])

This creates a copy of the initial list, so only the copy is modified when a password is generated.

Asked By: maosi100


Answers:

The line:

population += self.SAMPLES["numbers"] + self.SAMPLES["punctuation"]

does not just rebind the population variable; it mutates the object that population currently references, which is self.SAMPLES["letters"]. For a list, that is the same as doing:

population.extend(self.SAMPLES["numbers"] + self.SAMPLES["punctuation"])

or:

self.SAMPLES["letters"].extend(self.SAMPLES["numbers"] + self.SAMPLES["punctuation"])
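To see the effect in isolation, here is a minimal sketch. The Demo class and its grow method are hypothetical stand-ins for Password and _generate_password, reduced to the two lines that matter:

```python
class Demo:
    # Hypothetical stand-in for Password.SAMPLES (class-level, shared state)
    SAMPLES = {"letters": ["a", "b"], "numbers": [0, 1]}

    def grow(self):
        population = self.SAMPLES["letters"]   # alias to the shared list, no copy
        population += self.SAMPLES["numbers"]  # in-place extend of that shared list
        return population


first = Demo().grow()
second = Demo().grow()

print(first is Demo.SAMPLES["letters"])  # True: same list object
print(Demo.SAMPLES["letters"])           # ['a', 'b', 0, 1, 0, 1]
```

Each call extends the same class-level list, so the numbers pile up across instances.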

Note that you don’t need to import anything special to make a copy of a list; you can just use the list constructor to make a new list out of any iterable (including another list):

population = list(self.SAMPLES["letters"])

or you can use the built-in list.copy method:

population = self.SAMPLES["letters"].copy()
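Putting that together, one possible corrected version of the class from the question might look like the sketch below. It is a trimmed rewrite for illustration, not the course's official solution; the only behavioral change is the copy on the first line of _generate_password.

```python
import random
from string import ascii_letters, punctuation


class Password:
    SAMPLES = {
        "letters": list(ascii_letters),
        "numbers": list(range(10)),
        "punctuation": list(punctuation),
    }
    DEFAULT_SETTINGS = {"low": 8, "mid": 12, "high": 16}

    def __init__(self, strength="mid", length=None):
        self.strength = strength
        self.length = length
        self._generate_password()

    def _generate_password(self):
        # Copy first, so the class-level list is never mutated.
        population = list(self.SAMPLES["letters"])
        length = self.length or self.DEFAULT_SETTINGS.get(self.strength)

        if self.strength == "high":
            population += self.SAMPLES["numbers"] + self.SAMPLES["punctuation"]
        elif self.strength == "mid":
            population += self.SAMPLES["numbers"]

        self.password = "".join(map(str, random.choices(population, k=int(length))))


Password("high")
Password("mid")
low = Password("low")
print(len(Password.SAMPLES["letters"]))  # 52: the shared list is unchanged
print(low.password.isalpha())            # True: letters only
```

Here += still mutates a list in place, but that list is now a per-call copy, so Password.SAMPLES stays intact.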

Answered By: Samwise

The problem is here:

population = self.SAMPLES["letters"]

In Python, variables are references that point to a chunk of data*. A reference in this case is like a sticky note with a name written on it. You can attach several of them to the same chunk of data, and access that same chunk of data by any of the names of the variables that point to it. Assigning something to a different variable never** makes a copy.

Thus, the population variable is just a reference to the exact same list as self.SAMPLES["letters"]. So when you mutate population, you are also mutating self.SAMPLES["letters"]. This is why making a copy fixes the problem: because now they are two separate lists, so mutating one does not mutate the other.
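The sticky-note model can be checked with a few lines. This is a small sketch using made-up names (letters, independent) rather than the class from the question:

```python
letters = ["a", "b"]

population = letters           # a second name for the same list
print(population is letters)   # True

population.append("c")         # mutate through one name...
print(letters)                 # ['a', 'b', 'c']  ...and the other name sees it

independent = letters.copy()   # a separate list with the same contents
independent.append("d")
print(letters)                 # ['a', 'b', 'c']  unaffected by the copy
```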

As shown in the other answer, you can also make a copy of the list by invoking list() on it, or by slicing with [:]. All of these produce an equivalent shallow copy:

population = list(self.SAMPLES["letters"])

population = self.SAMPLES["letters"][:]

from copy import copy
population = copy(self.SAMPLES["letters"])
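As a quick check, the sketch below compares these techniques (plus list.copy) on a throwaway list named original. Each result is equal to the source list but is a distinct object:

```python
from copy import copy

original = ["a", "b", "c"]

# Four ways to make a shallow copy of a list
copies = [list(original), original[:], copy(original), original.copy()]

for duplicate in copies:
    # Equal contents, different object
    print(duplicate == original, duplicate is original)  # True False

copies[0] += [1, 2, 3]   # mutating a copy in place...
print(original)          # ['a', 'b', 'c']  ...leaves the original alone
```

All four are shallow copies, which is enough here because the list elements (strings and integers) are immutable.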

*There are some exceptions in specific cases, but this is the correct mental model in general.

**Again, exceptions are possible, but none that are relevant here.

Answered By: shadowtalker

The type of population is list, and list implements the data model hook __iadd__ for in-place sequence concatenation. This method also returns the existing instance:

>>> population = [0, 1]
>>> new = population.__iadd__([2, 3])
>>> new is population
True
>>> population
[0, 1, 2, 3]

So, the augmented assignment statement += mutates the original list, which is one of the values of the dict in the class namespace. It’s in Password.SAMPLES – shared between all instances. Even though population is a local variable inside your method, there is still shared state.

Note that if list did not implement __iadd__, then your code would work as you expected, because the augmented assignment would fall back to using __add__, concatenating and returning a new instance. Replace population += other with population = population + other to see that behavior.
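The difference can be observed directly. The following sketch uses illustrative variable names and adds tuple as an example of a built-in sequence that has no __iadd__, so += on it takes the __add__ fallback described above:

```python
# += on a list calls list.__iadd__: the existing object is mutated
in_place = [0, 1]
in_place_alias = in_place
in_place += [2, 3]
print(in_place_alias)              # [0, 1, 2, 3]
print(in_place is in_place_alias)  # True

# x = x + y calls list.__add__: a new list is built and the name is rebound
rebound = [0, 1]
rebound_alias = rebound
rebound = rebound + [2, 3]
print(rebound_alias)               # [0, 1]
print(rebound is rebound_alias)    # False

# tuple defines no __iadd__, so += falls back to __add__ and rebinds
print(hasattr(list, "__iadd__"), hasattr(tuple, "__iadd__"))  # True False
numbers = (0, 1)
numbers_alias = numbers
numbers += (2, 3)
print(numbers_alias)               # (0, 1)
```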

Answered By: wim