What is the Python way of chaining maps and filters?

Question:

I’m currently learning Python (coming from other languages like JavaScript and Ruby). I am very used to chaining a bunch of transformations / filters, but I’m pretty sure that’s not the right way to do it in Python: filter takes a lambda before the enumerable, so writing a long / multi-line function looks really weird and chaining them means putting them in reverse order which isn’t readable.

What would be the “Python way” of writing the maps and filters in this JavaScript function?

let is_in_stock = function() /* ... */
let as_item = function() /* ... */

let low_weight_items = shop.inventory
    .map(as_item)
    .filter(is_in_stock)
    .filter(item => item.weight < 1000)
    .map(item => {
        if (item.type == "cake") {
            let catalog_item = retrieve_catalog_item(item.id);

            return {
                id: item.id,
                weight: item.weight,
                barcode: catalog_item.barcode
            };
        } else {
            return default_transformer(item);
        }
    });

I understand that I might use a list comprehension for the first map and the next two filters, but I am not sure how to do the last map and how to put everything together.

Thank you!

Asked By: Edward


Answers:

One good way to do this is to combine multiple filters/maps into a single generator comprehension. In cases where this can’t be done, define an intermediate variable for the intermediate map/filter you need, instead of trying to force the maps into a single chain. For instance:

def is_in_stock(x):
    ...
def as_item(x):
    ...
def transform(item):
    if item.type == "cake":
        catalog_item = retrieve_catalog_item(item.id)
        return {
            "id": item.id,
            "weight": item.weight,
            "barcode": catalog_item.barcode
        }
    else:
        return default_transformer(item)

items = (as_item(item) for item in shop.inventory)
low_weight_items = (transform(item) for item in items if is_in_stock(item) and item.weight < 1000)

Note that the actual application of the maps and filters is all done in the last two lines. The earlier part is just defining the functions that encode the maps and filters.

The second generator comprehension does the last two filters and the map all together. Using generator comprehensions means that each original item in inventory will be mapped/filtered lazily. It won’t pre-process the entire list, so it is likely to perform better if the list is large.
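To see the laziness concretely, here is a minimal sketch. The as_item below is a hypothetical stand-in that only records its calls, and the list of numbers stands in for shop.inventory; neither comes from the real shop code.

```python
calls = []  # records which raw values have been mapped so far

def as_item(raw):
    # Hypothetical stand-in for the real as_item: log the call, then convert.
    calls.append(raw)
    return raw * 10

inventory = [1, 2, 3, 4]  # stand-in for shop.inventory

items = (as_item(raw) for raw in inventory)
low = (item for item in items if item < 30)

nothing_ran_yet = (calls == [])  # building the generators maps nothing
first = next(low)                # pulls a single value through both stages
calls_after_first = list(calls)  # snapshot: only one raw value was mapped
rest = list(low)                 # drains the remainder of the pipeline
```

Only when a value is requested (with next, list, or a for loop) does any item flow through the chain, and each item passes through every stage before the next one is touched.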

Note that Python has no equivalent to defining long functions inline as in your JavaScript example, because a lambda is limited to a single expression. You can’t write that complex map (the one with item.type == "cake") inline. Instead, as shown in my example, you must define it as a separate function, just as you did with is_in_stock and as_item.

(The reason the first map was split is that later filters can’t act on the mapped data until after it’s mapped. It could be combined into one, but that would require manually redoing the as_item map inside the comprehension:

low_weight_items = (transform(as_item(item)) for item in shop.inventory if is_in_stock(as_item(item)) and as_item(item).weight < 1000)

It’s clearer to just separate out that map.)
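If a single expression is still preferred, one possible middle ground is to iterate over map(as_item, ...) in the for clause, so the map runs once per item without a named intermediate. The helpers and data below are hypothetical stand-ins, not the real shop code.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the question's helpers and data.
def as_item(raw):
    return SimpleNamespace(**raw)

def is_in_stock(item):
    return item.stock > 0

def transform(item):
    return {"id": item.id, "weight": item.weight}

inventory = [
    {"id": 1, "weight": 500, "stock": 3},
    {"id": 2, "weight": 1500, "stock": 1},
    {"id": 3, "weight": 200, "stock": 0},
]

low_weight_items = (
    transform(item)
    for item in map(as_item, inventory)  # as_item runs once per item, lazily
    if is_in_stock(item) and item.weight < 1000
)
result = list(low_weight_items)
```

In Python 3 the built-in map is lazy, so this behaves like the two-generator version above.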

Answered By: BrenBarn

Use iterators. In Python 3, map and filter already return lazy iterators; in Python 2 you need to use itertools.imap and itertools.ifilter instead.

# Python 3: the built-in map and filter are already lazy.
# Python 2: use itertools.imap and itertools.ifilter here instead.
m = map
f = filter

def final_map_fn(item):
    if item.type == "cake":
        catalog_item = retrieve_catalog_item(item.id)
        return {
            "id": item.id,
            "weight": item.weight,
            "barcode": catalog_item.barcode}
    else:
        return default_transformer(item)

items = m(as_item, shop.inventory)  # note: it does not loop yet
instockitems = f(is_in_stock, items)  # still hasn't actually looped over anything
weighteditems = (item for item in instockitems if item.weight < 1000)  # still no loop (this is a generator)
final_items = m(final_map_fn, weighteditems)  # still has not looped over a single item in the list
results = list(final_items)  # evaluated now, with a single loop
Answered By: Joran Beasley

Defining your own functional composition meta-function is pretty easy. Once you have that, chaining functions together is also very easy.

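import functools

def compose(*functions):
    # Left-to-right composition: compose(f, g)(x) == g(f(x))
    return functools.reduce(lambda f, g: lambda x: g(f(x)), functions)

def make_map(map_fn):
    return lambda iterable: (map_fn(it) for it in iterable)

def make_filter(filter_fn):
    return lambda iterable: (it for it in iterable if filter_fn(it))

# Every stage takes an iterable and returns an iterable.
pipeline = compose(make_map(as_item),
                   make_filter(is_in_stock),
                   make_filter(lambda item: item.weight < 1000),
                   make_map(lambda item: ({'id': item.id,
                                           'weight': item.weight,
                                           'barcode': retrieve_catalog_item(item.id).barcode}
                                          if item.type == "cake"
                                          else default_transformer(item))))
low_weight_items = pipeline(shop.inventory)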

If you’re not already familiar with iterators, it’s worth brushing up on them (something like http://excess.org/article/2013/02/itergen1/ is a good start).
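Composition order is easy to get backwards, so here is a small self-contained sketch with plain numbers. It assumes a hypothetical helper named pipe that applies its stages left to right, and each stage takes and returns an iterable; the names are illustrative only.

```python
import functools

def pipe(*stages):
    # Apply stages left to right: pipe(f, g)(x) == g(f(x)).
    return functools.reduce(lambda f, g: lambda x: g(f(x)), stages)

def make_map(fn):
    return lambda iterable: (fn(it) for it in iterable)

def make_filter(pred):
    return lambda iterable: (it for it in iterable if pred(it))

double = make_map(lambda n: n * 2)
keep_big = make_filter(lambda n: n > 4)

# Doubling first lets 3 and 4 pass the filter as 6 and 8.
double_then_filter = list(pipe(double, keep_big)([1, 2, 3, 4]))

# Filtering first removes everything, so nothing is left to double.
filter_then_double = list(pipe(keep_big, double)([1, 2, 3, 4]))
```

Swapping f(g(x)) for g(f(x)) inside the reduce is the only difference between right-to-left and left-to-right composition, so it is worth checking which one a given helper implements before building a pipeline on it.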

Answered By: metaperture

If you don’t mind using a package, this is another way to do it using https://github.com/EntilZha/PyFunctional

from functional import seq

def as_item(x):
    # Implementation here
    return x

def is_in_stock(x):
    # Implementation
    return True

def transform(item):
    if item.type == "cake":
        catalog_item = retrieve_catalog_item(item.id)
        return {
            'id': item.id,
            'weight': item.weight,
            'barcode': catalog_item.barcode
        }
    else:
        return default_transformer(item)

low_weight_items = seq(inventory)\
    .map(as_item)\
    .filter(is_in_stock)\
    .filter(lambda item: item.weight < 1000)\
    .map(transform)

As mentioned earlier, Python lets you use lambda expressions, but they aren’t as flexible as closures in JavaScript, since they are limited to a single expression. Another annoying Python thing is the need for backslashes to continue a chain across lines. That being said, I think the above most closely mirrors what you originally posted.

Disclaimer: I am the author of the above package

Answered By: EntilZha
def is_in_stock(item):
    ...

def as_item(item):
    ...

def get_low_weight_items(items):
    for item in items:
        item = as_item(item)
        if not is_in_stock(item):
            continue
        if item.weight < 1000:
            if item.type == "cake":
                catalog_item = retrieve_catalog_item(item.id)
                yield {
                    "id": item.id,
                    "weight": item.weight,
                    "barcode": catalog_item.barcode,
                }
            else:
                yield default_transformer(item)


low_weight_items = list(get_low_weight_items(shop.inventory))
Answered By: Oleh Prypin

We can use Pyterator: https://github.com/remykarem/pyterator (disclaimer: I’m the author). It’s very similar to @EntilZha’s library.

pip install git+https://github.com/remykarem/pyterator#egg=pyterator

Define functions

def is_in_stock(x):
    pass

def as_item(x):
    pass

def transform_cake_noncake(item):
    pass

then

from pyterator import iterate

low_weight_items = (
    iterate(shop.inventory)
    .map(as_item)
    .filter(is_in_stock)
    .filter(lambda item: item.weight < 1000)
    .map(transform_cake_noncake)
    .to_list()
)

Note that all the operations like map and filter are lazily evaluated. So you need to call .to_list() to evaluate the pipeline. Otherwise, it remains as an Iterator (which you can later use in a for-loop etc.).

It is also similar to Fluentpy (https://github.com/dwt/fluent).

Answered By: remykarem

You can somewhat achieve this using the walrus operator in a generator comprehension.

low_weight_items = (
    z
    for x in [
        Item(1, 100, "cake"),
        Item(2, 1000, "cake"),
        Item(3, 900, "cake"),
        Item(4, 10000, "cake"),
        Item(5, 100, "bread"),
    ]
    if (y := as_item(x))
    if is_in_stock(y)
    if y.weight < 1000
    if (z := transform(y))
)

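But you have to assign different variables (x/y/z in the example), as you can’t rebind the comprehension’s iteration variable with the walrus operator. Also note that each `if (name := ...)` clause doubles as a filter, so an item is dropped whenever as_item or transform returns a falsy value. The walrus operator requires Python 3.8 or later.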
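The filtering side effect of `if (name := ...)` is easy to miss, so here is a minimal sketch with plain integers (the data is made up for illustration). It also shows the `for name in [value]` idiom as one possible way to bind an intermediate value without filtering on it.

```python
values = [0, 1, 2, 3]

# The walrus clause is also a condition: when x - 1 is 0 (falsy), the item is dropped.
with_walrus = [y for x in values if (y := x - 1)]

# Iterating over a one-element list binds y without testing its truthiness.
with_for_binding = [y for x in values for y in [x - 1]]
```

With the walrus form, the result for x == 1 silently disappears. That is harmless when the mapped values are always truthy objects, but it would drop results such as 0, an empty string, or an empty dict.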


Full example

def as_item(x):
    return x

def is_in_stock(x):
    return True

class Item:
    def __init__(self, id, weight, type):
        self.id = id
        self.weight = weight
        self.type = type

class CatalogItem:
    def __init__(self, id, barcode):
        self.id = id
        self.barcode = barcode

def retrieve_catalog_item(id):
    return CatalogItem(id, "123456789")

def default_transformer(item):
    return item

def transform(item):
    if item.type == "cake":
        catalog_item = retrieve_catalog_item(item.id)
        return {
            'id': item.id,
            'weight': item.weight,
            'barcode': catalog_item.barcode,
        }
    else:
        return default_transformer(item)

low_weight_items = (
    z
    for x in [
        Item(1, 100, "cake"),
        Item(2, 1000, "cake"),
        Item(3, 900, "cake"),
        Item(4, 10000, "cake"),
        Item(5, 100, "bread"),
    ]
    if (y := as_item(x))
    if is_in_stock(y)
    if y.weight < 1000
    if (z := transform(y))
)

for item in low_weight_items:
    print(item)
Answered By: CervEd

from functools import reduce

class my_list(list):
    def filter(self, func):
        return my_list(filter(func, self))
    def map(self, func):
        return my_list(map(func, self))
    def reduce(self, func):
        return reduce(func, self)

temp = my_list([1,2,3,4,5,6,7])
print(temp.filter(lambda n: n%2==0).map(lambda n: n*2).reduce(lambda a,b: a+b))

You can use inheritance to achieve the same thing in Python if you want to build on the built-in filter and map functions and functools.reduce.
Here I have created a class called my_list that extends list. I wrap my list in my_list and then call the map, filter and reduce methods defined on the class, passing a function as the argument.

I know that this will be creating a fresh object every time these three methods are invoked. If there is any way to bypass multiple object creation do let me know.

Answered By: Sahil Yadav

You could also create your own class like this.
You pass an iterable to this stream class and give it methods that delegate the needed operations to the existing map and filter functions and the like.

class stream:
    def __init__(self, iterable):
        try:
            self.iterator = iter(iterable)
        except Exception:
            raise TypeError(f'{iterable} is not iterable but {type(iterable)}')

    def map(self, f):
        self.iterator = map(f, self.iterator)
        return self

    def filter(self, f):
        self.iterator = filter(f, self.iterator)
        return self

    def foreach(self, f):
        for x in self.iterator:
            f(x)

if __name__ == '__main__':
    # Parentheses let the method chain span several lines.
    (stream([1, 2, 3, 4, 5])
        .map(lambda x: x * 2)
        .filter(lambda x: x > 4)
        .foreach(lambda x: print(x)))
Answered By: Marko