How to chain attribute lookups that might return None in Python?

Question:

My problem is a general one: how to chain a series of attribute lookups when one of the intermediate ones might return None. Since I ran into this problem while trying to use Beautiful Soup, I’m going to ask it in that context.

Beautiful Soup parses an HTML document and returns an object that can be used to access the structured content of that document. For example, if the parsed document is in the variable soup, I can get its title with:

title = soup.head.title.string

My problem is that if the document doesn’t have a title, then soup.head.title returns None and the subsequent string lookup throws an exception. I could break up the chain as:

x = soup.head
x = x.title if x else None
title = x.string if x else None

but this, to my eye, is verbose and hard to read.

I could write:

title = soup.head and soup.head.title and soup.head.title.string

but that is verbose and inefficient.

One solution I’ve thought of, which I think is possible, would be to create an object (call it nil) that would return None for any attribute lookup. This would allow me to write:

title = ((soup.head or nil).title or nil).string

but this is pretty ugly. Is there a better way?
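
For concreteness, the nil object described above could be sketched roughly as follows. This is only an illustration: the class name Nil and the stand-in soup object built from SimpleNamespace are assumptions made for the example, not part of Beautiful Soup.

```python
from types import SimpleNamespace


class Nil(object):
    """Hypothetical placeholder whose attribute lookups all return None."""

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails, which here
        # means for every ordinary attribute name.
        return None


nil = Nil()

# Stand-in for a parsed document that has a <head> but no <title>.
soup = SimpleNamespace(head=SimpleNamespace(title=None))

title = ((soup.head or nil).title or nil).string
print(title)
```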

Asked By: David Hull

Answers:

You might be able to use reduce for this:

>>> class Foo(object): pass
... 
>>> a = Foo()
>>> a.foo = Foo()
>>> a.foo.bar = Foo()
>>> a.foo.bar.baz = Foo()
>>> a.foo.bar.baz.qux = Foo()
>>> 
>>> reduce(lambda x,y:getattr(x,y,''),['foo','bar','baz','qux'],a)
<__main__.Foo object at 0xec2f0>
>>> reduce(lambda x,y:getattr(x,y,''),['foo','bar','baz','qux','quince'],a)
''

In Python 3.x, reduce has moved to the functools module, so it has to be imported with from functools import reduce.
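
A Python 3 version of the same idea might look like the sketch below. The helper name get_chain is made up for this example, the SimpleNamespace object is only a stand-in for a real parsed document, and None is used as the fallback value instead of the empty string.

```python
from functools import reduce
from types import SimpleNamespace


def get_chain(obj, names, default=None):
    """Follow each attribute name in turn; a missing link yields default."""
    # getattr's third argument is returned when the attribute is absent,
    # so a None in the middle of the chain simply propagates to the end.
    return reduce(lambda cur, name: getattr(cur, name, default), names, obj)


# Stand-ins for a document with and without a title.
with_title = SimpleNamespace(
    head=SimpleNamespace(title=SimpleNamespace(string='Foo')))
without_title = SimpleNamespace(head=SimpleNamespace(title=None))

print(get_chain(with_title, ['head', 'title', 'string']))
print(get_chain(without_title, ['head', 'title', 'string']))
```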


I suppose you could also do this with a simpler function:

def attr_getter(item, attributes):
    for a in attributes:
        try:
            item = getattr(item,a)
        except AttributeError:
            return None #or whatever on error
    return item

Finally, I suppose the nicest way to do this is something like:

try:
    title = foo.bar.baz.qux
except AttributeError:
    title = None

Answered By: mgilson

The most straightforward way is to wrap the lookup in a try/except block.

try:
    title = soup.head.title.string
except AttributeError:
    print("Title doesn't exist!")

There’s really no reason to test at each level when removing each test would raise the same exception in the failure case. I would consider this idiomatic in Python.
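
If the same lookup pattern shows up in many places, the try/except can be folded into a small helper. The sketch below is only an illustration: safe is a made-up name, and the SimpleNamespace object stands in for a real soup.

```python
from types import SimpleNamespace


def safe(getter, default=None):
    """Call getter(); return default if the chain hits a missing attribute."""
    try:
        return getter()
    except AttributeError:
        return default


# Stand-in for a parsed document that has a <head> but no <title>.
soup = SimpleNamespace(head=SimpleNamespace(title=None))

# The lambda delays the lookup so that it runs inside the try block.
title = safe(lambda: soup.head.title.string)
print(title)
```

One caveat with this approach is that it also hides an AttributeError raised for any other reason inside the chain, such as a misspelled attribute name.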

Answered By: jeffknupp

One solution would be to wrap the outer object inside a Proxy that handles None values for you. See below for a beginning implementation.

import unittest

class SafeProxy(object):

    def __init__(self, instance):
        self.__dict__["instance"] = instance

    def __eq__(self, other):
        return self.instance==other

    def __call__(self, *args, **kwargs):
        return self.instance(*args, **kwargs)

    # TODO: Implement other special members

    def __getattr__(self, name):
        if hasattr(self.__dict__["instance"], name):
            return SafeProxy(getattr(self.instance, name))

        if name=="val":
            return lambda: self.instance

        return SafeProxy(None)

    def __setattr__(self, name, value):
        setattr(self.instance, name, value)


# Simple stub for creating objects for testing
class Dynamic(object):
    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            self.__setattr__(name, value)

    def __setattr__(self, name, value):
        self.__dict__[name] = value


class Test(unittest.TestCase):

    def test_nestedObject(self):
        inner = Dynamic(value="value")
        middle = Dynamic(child=inner)
        outer = Dynamic(child=middle)
        wrapper = SafeProxy(outer)
        self.assertEqual("value", wrapper.child.child.value)
        self.assertEqual(None, wrapper.child.child.child.value)

    def test_NoneObject(self):
        self.assertEqual(None, SafeProxy(None))

    def test_stringOperations(self):
        s = SafeProxy("string")
        self.assertEqual("String", s.title())
        self.assertEqual(type(""), type(s.val()))

if __name__=="__main__":
    unittest.main()

NOTE: I am personally not sure whether I would use this in an actual project, but it makes an interesting experiment, and I put it here to get people’s thoughts on it.

Answered By: TAS

Here is another potential technique, which hides the assignment of the intermediate value in a method call. First we define a class to hold the intermediate value:

class DataHolder(object):
    def __init__(self, value=None):
        self.v = value

    def g(self):
        # Get the stored value.
        return self.v

    def s(self, value):
        # Store the value and return it, so the call can sit in an "and" chain.
        self.v = value
        return value

x = DataHolder(None)

Then we use it to store the result of each link in the chain of calls:

import bs4

for html in ('<html><head></head><body></body></html>',
             '<html><head><title>Foo</title></head><body></body></html>'):
    soup = bs4.BeautifulSoup(html, 'html.parser')
    print(x.s(soup.head) and x.s(x.g().title) and x.s(x.g().string))
    # or
    print(x.s(soup.head) and x.s(x.v.title) and x.v.string)

I don’t consider this a good solution, but I’m including it here for completeness.
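
On Python 3.8 and later, assignment expressions (the := operator) can play the role of the holder object. The following is a rough sketch under that assumption: title_of is a made-up name, and the SimpleNamespace objects stand in for real soup objects.

```python
from types import SimpleNamespace


def title_of(soup):
    # Each := binds an intermediate result, so every link in the chain
    # is looked up only once; "and" stops at the first falsy value.
    return (head := soup.head) and (title := head.title) and title.string


with_title = SimpleNamespace(
    head=SimpleNamespace(title=SimpleNamespace(string='Foo')))
without_title = SimpleNamespace(head=SimpleNamespace(title=None))

print(title_of(with_title))
print(title_of(without_title))
```

Like the "and" chain it replaces, this returns the first falsy value it meets, which is not necessarily None.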

Answered By: David Hull

This is how I handled it, with inspiration from @TAS and from the question "Is there a Python library (or pattern) like Ruby's andand?":

class Andand(object):
    def __init__(self, item=None):
        self.item = item

    def __getattr__(self, name):
        try:
            item = getattr(self.item, name)
            return item if name == 'item' else Andand(item)
        except AttributeError:
            return Andand()

    def __call__(self):
        return self.item


title = Andand(soup).head.title.string()

Answered By: reubano

My best shot at handling None attributes partway through a chain like this is to use pydash, as in this sample code from repl.it:

import pydash
title = pydash.get(soup, 'head.title.string', None)

Answered By: Nam G VU

I’m running Python 3.9:

Python 3.9.2 (tags/v3.9.2:1a79785, Feb 19 2021, 13:44:55) [MSC v.1928 64 bit (AMD64)]

The "and" keyword solves my problem:

memo[v] = short_combo and short_combo.copy()

From what I gather, this is not Pythonic, and you should handle the exception instead.
However, in my solution the None ambiguity exists within the function, and in this scenario I think it would be poor practice to handle exceptions that occur ~50% of the time.
Were I outside the function and calling it, I would handle the exception.
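
To spell out why the line above works, here is a minimal sketch. The function name copy_or_none is made up for the example and is not taken from my actual code.

```python
def copy_or_none(value):
    """Return a copy of value, or value itself when it is falsy (e.g. None)."""
    # "and" evaluates to its left operand when that operand is falsy,
    # so .copy() is never called on None.
    return value and value.copy()


original = [1, 2, 3]
copied = copy_or_none(original)
missing = copy_or_none(None)

print(copied, copied is original)
print(missing)
```

One thing to watch for is that any falsy value short-circuits, not only None, so an empty list comes back as the same object rather than as a copy.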

Answered By: Bardia 'Luviz' Jedi