Lazy loading of class attributes

Question:

Class Foo has a bar, and it is not loaded until it is accessed. Further accesses to bar should incur no overhead.

class Foo(object):

    def get_bar(self):
        print "initializing"
        self.bar = "12345"
        self.get_bar = self._get_bar
        return self.bar

    def _get_bar(self):
        print "accessing"
        return self.bar

Is it possible to do something like this using properties or, better yet, attributes, instead of using a getter method?

The goal is to lazy load without overhead on all subsequent accesses…

Asked By: whats canasta

||

Answers:

Sure, just have your property set an instance attribute that is returned on subsequent access:

class Foo(object):
    _cached_bar = None 

    @property
    def bar(self):
        if not self._cached_bar:
            self._cached_bar = self._get_expensive_bar_expression()
        return self._cached_bar

The property descriptor is a data descriptor (it implements __get__, __set__ and __delete__ descriptor hooks), so it’ll be invoked even if a bar attribute exists on the instance, with the end result that Python ignores that attribute, hence the need to test for a separate attribute on each access.

You can write your own descriptor that only implements __get__, at which point Python uses an attribute on the instance over the descriptor if it exists:

class CachedProperty(object):
    def __init__(self, func, name=None):
        self.func = func
        self.name = name if name is not None else func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, instance, class_):
        if instance is None:
            return self
        res = self.func(instance)
        setattr(instance, self.name, res)
        return res

class Foo(object):
    @CachedProperty
    def bar(self):
        return self._get_expensive_bar_expression()

If you prefer a __getattr__ approach (which has something to say for it), that’d be:

class Foo(object):
    def __getattr__(self, name):
        if name == 'bar':
            bar = self.bar = self._get_expensive_bar_expression()
            return bar
        return super(Foo, self).__getattr__(name)

Subsequent access will find the bar attribute on the instance and __getattr__ won’t be consulted.

Demo:

>>> class FooExpensive(object):
...     def _get_expensive_bar_expression(self):
...         print 'Doing something expensive'
...         return 'Spam ham & eggs'
... 
>>> class FooProperty(FooExpensive):
...     _cached_bar = None 
...     @property
...     def bar(self):
...         if not self._cached_bar:
...             self._cached_bar = self._get_expensive_bar_expression()
...         return self._cached_bar
... 
>>> f = FooProperty()
>>> f.bar
Doing something expensive
'Spam ham & eggs'
>>> f.bar
'Spam ham & eggs'
>>> vars(f)
{'_cached_bar': 'Spam ham & eggs'}
>>> class FooDescriptor(FooExpensive):
...     bar = CachedProperty(FooExpensive._get_expensive_bar_expression, 'bar')
... 
>>> f = FooDescriptor()
>>> f.bar
Doing something expensive
'Spam ham & eggs'
>>> f.bar
'Spam ham & eggs'
>>> vars(f)
{'bar': 'Spam ham & eggs'}

>>> class FooGetAttr(FooExpensive):
...     def __getattr__(self, name):
...         if name == 'bar':
...             bar = self.bar = self._get_expensive_bar_expression()
...             return bar
...         return super(Foo, self).__getatt__(name)
... 
>>> f = FooGetAttr()
>>> f.bar
Doing something expensive
'Spam ham & eggs'
>>> f.bar
'Spam ham & eggs'
>>> vars(f)
{'bar': 'Spam ham & eggs'}
Answered By: Martijn Pieters

Sure it is, try:

class Foo(object):
    def __init__(self):
        self._bar = None # Initial value

    @property
    def bar(self):
        if self._bar is None:
            self._bar = HeavyObject()
        return self._bar

Note that this is not thread-safe. cPython has GIL, so it’s a relative issue, but if you plan to use this in a true multithread Python stack (say, Jython), you might want to implement some form of lock safety.

Answered By: Stefano Sanfilippo

There are some problems with the current answers. The solution with a property requires that you specify an additional class attribute and has the overhead of checking this attribute on each look up. The solution with __getattr__ has the issue that it hides this attribute until first access. This is bad for introspection and a workaround with __dir__ is inconvenient.

A better solution than the two proposed ones is utilizing descriptors directly. The werkzeug library has already a solution as werkzeug.utils.cached_property. It has a simple implementation so you can directly use it without having Werkzeug as dependency:

_missing = object()

class cached_property(object):
    """A decorator that converts a function into a lazy property.  The
    function wrapped is called the first time to retrieve the result
    and then that calculated result is used the next time you access
    the value::

        class Foo(object):

            @cached_property
            def foo(self):
                # calculate something important here
                return 42

    The class has to have a `__dict__` in order for this property to
    work.
    """

    # implementation detail: this property is implemented as non-data
    # descriptor.  non-data descriptors are only invoked if there is
    # no entry with the same name in the instance's __dict__.
    # this allows us to completely get rid of the access function call
    # overhead.  If one choses to invoke __get__ by hand the property
    # will still work as expected because the lookup logic is replicated
    # in __get__ for manual invocation.

    def __init__(self, func, name=None, doc=None):
        self.__name__ = name or func.__name__
        self.__module__ = func.__module__
        self.__doc__ = doc or func.__doc__
        self.func = func

    def __get__(self, obj, type=None):
        if obj is None:
            return self
        value = obj.__dict__.get(self.__name__, _missing)
        if value is _missing:
            value = self.func(obj)
            obj.__dict__[self.__name__] = value
        return value
Answered By: schlamar
Categories: questions Tags:
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.