Multiple constructors: the Pythonic way?

Question:

I have a container class that holds data. When the container is created, there are different methods to pass data.

  1. Pass a file which contains the data
  2. Pass the data directly via arguments
  3. Don’t pass data; just create an empty container

In Java, I would create three constructors. Here’s how it would look like if it were possible in Python:

class Container:

    def __init__(self):
        self.timestamp = 0
        self.data = []
        self.metadata = {}

    def __init__(self, file):
        f = file.open()
        self.timestamp = f.get_timestamp()
        self.data = f.get_data()
        self.metadata = f.get_metadata()

    def __init__(self, timestamp, data, metadata):
        self.timestamp = timestamp
        self.data = data
        self.metadata = metadata

In Python, I see three obvious solutions, but none of them is pretty:

A: Using keyword arguments:

def __init__(self, **kwargs):
    if 'file' in kwargs:
        ...
    elif 'timestamp' in kwargs and 'data' in kwargs and 'metadata' in kwargs:
        ...
    else:
        ... create empty container

B: Using default arguments:

def __init__(self, file=None, timestamp=None, data=None, metadata=None):
    if file:
        ...
    elif timestamp and data and metadata:
        ...
    else:
        ... create empty container

C: Only provide constructor to create empty containers. Provide methods to fill containers with data from different sources.

def __init__(self):
    self.timestamp = 0
    self.data = []
    self.metadata = {}

def add_data_from_file(file):
    ...

def add_data(timestamp, data, metadata):
    ...

Solutions A and B are basically the same. I don’t like doing the if/else, especially since I have to check if all arguments required for this method were provided. A is a bit more flexible than B if the code is ever to be extended by a fourth method to add data.

Solution C seems to be the nicest, but the user has to know which method he requires. For example: he cant do c = Container(args) if he doesn’t know what args is.

Whats the most Pythonic solution?

Asked By: Johannes

||

Answers:

You can’t have multiple methods with same name in Python. Function overloading – unlike in Java – isn’t supported.

Use default parameters or **kwargs and *args arguments.

You can make static methods or class methods with the @staticmethod or @classmethod decorator to return an instance of your class, or to add other constructors.

I advise you to do:

class F:

    def __init__(self, timestamp=0, data=None, metadata=None):
        self.timestamp = timestamp
        self.data = list() if data is None else data
        self.metadata = dict() if metadata is None else metadata

    @classmethod
    def from_file(cls, path):
       _file = cls.get_file(path)
       timestamp = _file.get_timestamp()
       data = _file.get_data()
       metadata = _file.get_metadata()       
       return cls(timestamp, data, metadata)

    @classmethod
    def from_metadata(cls, timestamp, data, metadata):
        return cls(timestamp, data, metadata)

    @staticmethod
    def get_file(path):
        # ...
        pass

⚠ Never have mutable types as defaults in python. ⚠
See here.

Answered By: glegoux

What are the system goals for this code? From my standpoint, your critical phrase is but the user has to know which method he requires. What experience do you want your users to have with your code? That should drive the interface design.

Now, move to maintainability: which solution is easiest to read and maintain? Again, I feel that solution C is inferior. For most of the teams with whom I’ve worked, solution B is preferable to A: it’s a little easier to read and understand, although both readily break into small code blocks for treatment.

Answered By: Prune

I’m not sure if I understood right but wouldn’t this work?

def __init__(self, file=None, timestamp=0, data=[], metadata={}):
    if file:
        ...
    else:
        self.timestamp = timestamp
        self.data = data
        self.metadata = metadata

Or you could even do:

def __init__(self, file=None, timestamp=0, data=[], metadata={}):
    if file:
        # Implement get_data to return all the stuff as a tuple
        timestamp, data, metadata = f.get_data()

    self.timestamp = timestamp
    self.data = data
    self.metadata = metadata

Thank to Jon Kiparsky advice theres a better way to avoid global declarations on data and metadata so this is the new way:

def __init__(self, file=None, timestamp=None, data=None, metadata=None):
    if file:
        # Implement get_data to return all the stuff as a tuple
        with open(file) as f:
            timestamp, data, metadata = f.get_data()

    self.timestamp = timestamp or 0
    self.data = data or []
    self.metadata = metadata or {}
Answered By: Gabriel Ecker

The most pythonic way is to make sure any optional arguments have default values. So include all arguments that you know you need and assign them appropriate defaults.

def __init__(self, timestamp=None, data=[], metadata={}):
    timestamp = time.now()

An important thing to remember is that any required arguments should not have defaults since you want an error to be raised if they’re not included.

You can accept even more optional arguments using *args and **kwargs at the end of your arguments list.

def __init__(self, timestamp=None, data=[], metadata={}, *args, **kwards):
    if 'something' in kwargs:
        # do something
Answered By: Soviut

Most Pythonic would be what the Python standard library already does. Core developer Raymond Hettinger (the collections guy) gave a talk on this, plus general guidelines for how to write classes.

Use separate, class-level functions to initialize instances, like how dict.fromkeys() isn’t the class initializer but still returns an instance of dict. This allows you to be flexible toward the arguments you need without changing method signatures as requirements change.

Answered By: Arya McCarthy

You can’t have multiple constructors, but you can have multiple aptly-named factory methods.

class Document(object):

    def __init__(self, whatever args you need):
        """Do not invoke directly. Use from_NNN methods."""
        # Implementation is likely a mix of A and B approaches. 

    @classmethod
    def from_string(cls, string):
        # Do any necessary preparations, use the `string`
        return cls(...)

    @classmethod
    def from_json_file(cls, file_object):
        # Read and interpret the file as you want
        return cls(...)

    @classmethod
    def from_docx_file(cls, file_object):
        # Read and interpret the file as you want, differently.
        return cls(...)

    # etc.

You can’t easily prevent the user from using the constructor directly, though. (If it is critical, as a safety precaution during development, you can analyze the call stack in the constructor and check that the call is made from one of the expected methods.)

Answered By: 9000

If you are on Python 3.4+ you can use the functools.singledispatch decorator to do this (with a little extra help from the methoddispatch decorator that @ZeroPiraeus wrote for his answer):

class Container:

    @methoddispatch
    def __init__(self):
        self.timestamp = 0
        self.data = []
        self.metadata = {}

    @__init__.register(File)
    def __init__(self, file):
        f = file.open()
        self.timestamp = f.get_timestamp()
        self.data = f.get_data()
        self.metadata = f.get_metadata()

    @__init__.register(Timestamp)
    def __init__(self, timestamp, data, metadata):
        self.timestamp = timestamp
        self.data = data
        self.metadata = metadata
Answered By: Sean Vieira