# Python list comprehensions to create multiple lists

## Question:

I want to create two lists listOfA and listOfB to store indices of A and B from another list s.

s=['A','B','A','A','A','B','B']

Output should be two lists

listOfA=[0,2,3,4]
listOfB=[1,5,6]

I am able to do this with two statements.

listOfA=[idx for idx,x in enumerate(s) if x=='A']
listOfB=[idx for idx,x in enumerate(s) if x=='B']

However, I want to do it in only one iteration using list comprehensions only.
Is it possible to do it in a single statement?
something like listOfA,listOfB=[--code goes here--]

Sort of; the key is to generate a 2-element list that you can then unpack:

listOfA, listOfB = [[idx for idx, x in enumerate(s) if x == c] for c in 'AB']

That said, I think it’s pretty daft to do it that way; an explicit loop is much more readable.

The very definition of a list comprehension is to produce one list object. Your 2 list objects are of different lengths even; you’d have to use side-effects to achieve what you want.

Don’t use list comprehensions here. Just use an ordinary loop:

listOfA, listOfB = [], []

for idx, x in enumerate(s):
    target = listOfA if x == 'A' else listOfB
    target.append(idx)

This leaves you with just one loop to execute; it will beat any two list comprehensions, at least until the developers find a way to make list comprehensions build a list twice as fast as a loop with separate list.append() calls.

I’d pick this any day over a nested list comprehension just to be able to produce two lists on one line. As the Zen of Python states:

A nice approach to this problem is to use defaultdict. As @Martijn already said, a list comprehension is not the right tool to produce two lists. Using defaultdict lets you segregate the indices in a single iteration. Moreover, your code would not be limited to two values: every distinct element gets its own list.

>>> from collections import defaultdict
>>> s=['A','B','A','A','A','B','B']
>>> listOf = defaultdict(list)
>>> for idx, elem in enumerate(s):
...     listOf[elem].append(idx)
...
>>> listOf['A'], listOf['B']
([0, 2, 3, 4], [1, 5, 6])
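To illustrate that the same single pass handles any number of distinct values, here is a minimal sketch. The helper name `indices_by_value` and the three-letter sample input are assumptions made for this example, not part of the answer above.

```python
from collections import defaultdict


def indices_by_value(items):
    """Map each distinct value to the list of positions where it occurs."""
    positions = defaultdict(list)
    for idx, elem in enumerate(items):
        positions[elem].append(idx)
    # Convert to a plain dict so a later lookup of a missing key
    # raises KeyError instead of silently creating an empty list.
    return dict(positions)


groups = indices_by_value(['A', 'B', 'C', 'A', 'C', 'B', 'B'])
print(groups)  # {'A': [0, 3], 'B': [1, 5, 6], 'C': [2, 4]}
```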

What you’re trying to do isn’t exactly impossible, it’s just complicated, and probably wasteful.

If you want to partition an iterable into two iterables, and the source is a list or other re-usable iterable, you’re probably better off just doing it in two passes, as in your question.

(Even if the source is an iterator, if the output you want is a pair of lists, not a pair of lazy iterators, either use Martijn’s answer, or do two passes over list(iterator).)

But if you really need to lazily partition an arbitrary iterable into two iterables, there’s no way to do that without some kind of intermediate storage.

Let’s say you partition [1, 2, -1, 3, 4, -2] into positives and negatives. Now you try to next(negatives). That ought to give you -1, right? But it can’t do that without consuming the 1 and the 2. Which means when you try to next(positives), you’re going to get 3 instead of 1. So, the 1 and 2 need to get stored somewhere.
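The lost values are easy to reproduce. This is a minimal sketch, assuming the two halves are naive generator expressions that share one iterator with no buffering; the names `shared`, `first_negative`, and `first_positive` are illustrative only.

```python
data = [1, 2, -1, 3, 4, -2]

# Both filters pull from the SAME one-shot iterator, with no buffering.
shared = iter(data)
positives = (x for x in shared if x >= 0)
negatives = (x for x in shared if x < 0)

# Asking for the first negative consumes 1 and 2 and throws them away.
first_negative = next(negatives)
# The positives filter resumes after -1, so 1 and 2 are gone for good.
first_positive = next(positives)

print(first_negative, first_positive)  # -1 3
```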

Most of the cleverness you need is wrapped up inside itertools.tee. If you just make positives and negatives into two teed copies of the same iterator, then filter them both, you’re done.

In fact, this is one of the recipes in the itertools docs:

from itertools import filterfalse, tee

def partition(pred, iterable):
    'Use a predicate to partition entries into false entries and true entries'
    # partition(is_odd, range(10)) --> 0 2 4 6 8   and  1 3 5 7 9
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)

(If you can’t understand that, it’s probably worth writing it out explicitly, with either two generator functions sharing an iterator and a tee via a closure, or two methods of a class sharing them via self. It should be a couple dozen lines of code that doesn’t require anything tricky.)
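One possible way to write it out explicitly is sketched here with two generators sharing an iterator and a pair of queues through a closure. This is an illustration of the idea, not how itertools.tee is actually implemented; the names `partition_explicit`, `_side`, `trueq`, and `falseq` are assumptions made for the example.

```python
from collections import deque


def partition_explicit(pred, iterable):
    """Lazily split iterable into (false entries, true entries)."""
    source = iter(iterable)
    trueq, falseq = deque(), deque()  # values waiting for the other side

    def _side(own_queue, other_queue, wanted):
        while True:
            if own_queue:
                # Serve values the other generator already pulled for us.
                yield own_queue.popleft()
                continue
            try:
                value = next(source)
            except StopIteration:
                return
            if bool(pred(value)) == wanted:
                yield value
            else:
                # Not ours: store it so the other generator still sees it.
                other_queue.append(value)

    return _side(falseq, trueq, False), _side(trueq, falseq, True)


evens, odds = partition_explicit(lambda n: n % 2, range(10))
odd_values = list(odds)    # pulls everything; evens pile up in falseq
even_values = list(evens)  # served entirely from the queue
print(even_values, odd_values)
```

Draining `odds` first forces every even number into `falseq`, which is exactly the intermediate storage the argument above says cannot be avoided.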

And you can even get partition as an import from a third-party library like more_itertools.

Now, you can use this in a one-liner:

lst = [1, 2, -1, 3, 4, -2]
# The predicate comes first, and the recipe returns the false entries first.
negatives, positives = partition(lambda x: x >= 0, lst)

… and you’ve got an iterator over all the positive values, and an iterator over all of the negative values. They look like they’re completely independent, but together they only do a single pass over lst—so it works even if you assign lst to a generator expression or a file or something instead of a list.
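Applied to the list from the question, the recipe might be used as in the sketch below. The variable names `pairs`, `a_pairs`, and `b_pairs` are assumptions for this example, and the recipe is repeated so the snippet runs on its own.

```python
from itertools import filterfalse, tee


def partition(pred, iterable):
    """itertools recipe: returns (false entries, true entries)."""
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)


s = ['A', 'B', 'A', 'A', 'A', 'B', 'B']

# enumerate() is a one-shot iterator, so a naive second pass would see nothing.
pairs = enumerate(s)
b_pairs, a_pairs = partition(lambda pair: pair[1] == 'A', pairs)

listOfA = [idx for idx, _ in a_pairs]
listOfB = [idx for idx, _ in b_pairs]
print(listOfA, listOfB)  # [0, 2, 3, 4] [1, 5, 6]
```

Since the end result here is two plain lists, the explicit loop from the earlier answer remains simpler; this only shows that the recipe copes with a one-shot source.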

So, why isn’t there some kind of shortcut syntax for this? Because it would be pretty misleading.

A comprehension takes no extra storage. That’s the reason generator expressions are so great—they can transform a lazy iterator into another lazy iterator without storing anything.

But this takes O(N) storage. Imagine all of the numbers are positive, but you try to iterate negatives first. What happens? Every number has to be pulled from the source and buffered for positives before negatives can report that it is empty. In fact, that O(N) could even be infinite (e.g., try it on itertools.count()).
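The hidden buffering can be made visible with a source that records what has been pulled from it. This is a minimal sketch; the `pulled` list and the `source` generator are hypothetical helpers added for the demonstration, and the recipe is repeated so the snippet runs on its own.

```python
from itertools import filterfalse, tee


def partition(pred, iterable):
    """itertools recipe: returns (false entries, true entries)."""
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)


pulled = []  # records every value actually taken from the source


def source():
    for x in [1, 2, 3, 4, 5]:
        pulled.append(x)
        yield x


negatives, positives = partition(lambda x: x >= 0, source())

# There are no negatives, so this search drains the whole source.
first_negative = next(negatives, None)
pulled_so_far = len(pulled)

# Nothing was lost: tee kept all five values for the other iterator.
remaining_positives = list(positives)
print(first_negative, pulled_so_far, remaining_positives)
```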

That’s fine for something like itertools.tee, a function stuck in a module that most novices don’t even know about, and which has nice docs that can explain what it does and make the costs clear. But doing it with syntactic sugar that made it look just like a normal comprehension would be a different story.

For those who live on the edge 😉

listOfA, listOfB = [[i for i in cur_list if i is not None] for cur_list in zip(*[(idx,None) if value == 'A' else (None,idx) for idx,value in enumerate(s)])]