Python dataclasses.dataclass reference to variable instead of instance variable
Question:
The default values in the constructors for c1 and c2 should produce new instance variables for a and b. Instead, it looks like c1.a and c2.a are referencing the same variable. Is @dataclass creating a class variable? That does not seem consistent with the intended functionality, and I cannot find anything about class variables in the documentation, so I think this is a bug. Can someone explain how to fix it? Should I report it as a bug on the Python tracker?
I know this issue must be related to mutability, since the b attribute (which is just a float) shows the expected/desired behavior while the a attribute (which is a user-defined, mutable object) behaves like a shared reference.
Thanks!
Code:
from dataclasses import dataclass

@dataclass
class VS:
    v: float  # value
    s: float  # scale factor

    def scaled_value(self):
        return self.v * self.s

@dataclass
class Container:
    a: VS = VS(1, 1)
    b: float = 1

c1 = Container()
c2 = Container()
print(c1)
print(c2)
c1.a.v = -999
c1.b = -999
print(c1)
print(c2)
Output:
Container(a=VS(v=1, s=1), b=1)
Container(a=VS(v=1, s=1), b=1)
Container(a=VS(v=-999, s=1), b=-999)
Container(a=VS(v=-999, s=1), b=1)
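What @dataclass does with a plain default value is, in effect, store it as a class attribute. The sharing seen above can be reproduced with an ordinary class and no dataclasses at all (a minimal sketch of the same mechanism):

```python
class VS:
    def __init__(self, v, s):
        self.v, self.s = v, s

class Container:
    a = VS(1, 1)  # class attribute: ONE object shared by every instance

    def __init__(self):
        self.b = 1  # instance attribute: a fresh value per instance

c1, c2 = Container(), Container()
print(c1.a is c2.a)  # True: both names refer to the single VS object
c1.a.v = -999
print(c2.a.v)        # -999: mutating through c1 is visible through c2
c1.b = -999
print(c2.b)          # 1: rebinding c1.b does not touch c2.b
```

Rebinding c1.b = -999 creates a new instance attribute that shadows nothing shared, which is why the float field appears to behave "by value".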
Answers:
Thanks Eric S for providing an explanation:
c1 and c2 share the same instance of a. This is the mutable default argument problem: https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments
Use a default_factory to create a new VS for each container.
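The linked gotcha is the same one that bites plain functions: a default value is evaluated once, at definition time, not once per call. A minimal illustration:

```python
def append_to(x, acc=[]):  # acc is created ONCE, when the def runs
    acc.append(x)
    return acc

print(append_to(1))  # [1]
print(append_to(2))  # [1, 2] -- the same list object persists across calls
```

A dataclass field default like `a: VS = VS(1, 1)` is evaluated the same way: once, when the class body runs.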
The default_factory does not allow me to have a unique set of default VS values for multiple attributes, since the VS defaults would need to be defined in the VS dataclass itself. For example, if I wanted a to default to VS(1, 1) but wanted another VS attribute to default to VS(1, 2), default_factory=VS does not help me. So I found a workaround, which is to create a dict of keyword entries and pass a deepcopy into my Container() constructor (note that if I do not pass a deep copy, I get the same issue as above). Here is my final code snippet and the output:
Code:
from dataclasses import dataclass, field
from copy import deepcopy

@dataclass
class VS:
    v: float = 1  # value
    s: float = 1  # scale factor

    def scaled_value(self):
        return self.v * self.s

@dataclass
class Container:
    a: VS = field(default_factory=VS)
    b: float = 1

ip = {'a': VS(2, 1), 'b': 1}
c1 = Container(**deepcopy(ip))
c2 = Container(**deepcopy(ip))
print(c1)
print(c2)
c1.a.v = 0
c1.b = 0
print(c1)
print(c2)
Output:
Container(a=VS(v=2, s=1), b=1)
Container(a=VS(v=2, s=1), b=1)
Container(a=VS(v=0, s=1), b=0)
Container(a=VS(v=2, s=1), b=1)
In the OP’s original example, a single VS object is created when the Container class is defined. That one object is then shared across all instances of the Container class. This is a problem because user-defined classes such as VS produce mutable objects, so changing a in any Container object changes a in every other Container object.
You want to generate a new VS object every time a Container is instantiated. The default_factory argument of the field function is a good way to do this, and passing a lambda function allows it all to be done inline.
I added a c member variable to Container with another VS instance to illustrate that the members are independent when done this way.
from dataclasses import dataclass, field

@dataclass
class VS:
    v: float  # value
    s: float  # scale factor

    def scaled_value(self):
        return self.v * self.s

# Use a zero-argument lambda as the default factory function.
@dataclass
class Container:
    a: VS = field(default_factory=lambda: VS(1, 1))
    b: float = 1
    c: VS = field(default_factory=lambda: VS(1, 2))

c1 = Container()
c2 = Container()
print(c1)
print(c2)
c1.a.v = -999
c1.c.s = -999
print(c1)
print(c2)
Output:
Container(a=VS(v=1, s=1), b=1, c=VS(v=1, s=2))
Container(a=VS(v=1, s=1), b=1, c=VS(v=1, s=2))
Container(a=VS(v=-999, s=1), b=1, c=VS(v=1, s=-999))
Container(a=VS(v=1, s=1), b=1, c=VS(v=1, s=2))
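It is worth noting that dataclasses already guard against the most common form of this mistake: a bare list, dict, or set default raises ValueError at class-definition time and points you at default_factory (and, if I recall correctly, newer Python versions extended this check to any unhashable default, which would include instances of an ordinary dataclass like VS). A short sketch of the guard and the factory-based fix:

```python
from dataclasses import dataclass, field

# A bare mutable built-in as a default is rejected outright.
try:
    @dataclass
    class Bad:
        items: list = []  # ValueError: mutable default ... use default_factory
except ValueError as e:
    print(e)

# The factory runs once per instance, so nothing is shared.
@dataclass
class Good:
    items: list = field(default_factory=list)

g1, g2 = Good(), Good()
g1.items.append(1)
print(g1.items)  # [1]
print(g2.items)  # []
```

The original code slipped past this guard only because VS is a user-defined class rather than one of the built-in containers the check looks for.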