How can I get a flat result from a list comprehension instead of a nested list?
Question:
I have a list A
, and a function f
which takes an item of A
and returns a list. I can use a list comprehension to convert everything in A
like [f(a) for a in A]
, but this returns a list of lists. Suppose my input is [a1,a2,a3]
, resulting in [[b11,b12],[b21,b22],[b31,b32]]
.
How can I get the flattened list [b11,b12,b21,b22,b31,b32]
instead? In other words, in Python, how can I get what is traditionally called flatmap
in functional programming languages, or SelectMany
in .NET?
(In the actual code, A
is a list of directories, and f
is os.listdir
. I want to build a flat list of subdirectories.)
See also: How do I make a flat list out of a list of lists? for the more general problem of flattening a list of lists after it’s been created.
Answers:
You could try itertools.chain()
, like this:
import itertools
import os
dirs = ["c:\usr", "c:\temp"]
subs = list(itertools.chain(*[os.listdir(d) for d in dirs]))
print subs
itertools.chain()
returns an iterator, hence the passing to list()
.
subs = []
map(subs.extend, (os.listdir(d) for d in dirs))
(but Ants’s answer is better; +1 for him)
You can find a good answer in the itertools
recipes:
import itertools
def flatten(list_of_lists):
return list(itertools.chain.from_iterable(list_of_lists))
Google brought me next solution:
def flatten(l):
if isinstance(l,list):
return sum(map(flatten,l))
else:
return l
You can have nested iterations in a single list comprehension:
[filename for path in dirs for filename in os.listdir(path)]
which is equivalent (at least functionally) to:
filenames = []
for path in dirs:
for filename in os.listdir(path):
filenames.append(filename)
You could just do the straightforward:
subs = []
for d in dirs:
subs.extend(os.listdir(d))
You can concatenate lists using the normal addition operator:
>>> [1, 2] + [3, 4]
[1, 2, 3, 4]
The built-in function sum
will add the numbers in a sequence and can optionally start from a specific value:
>>> sum(xrange(10), 100)
145
Combine the above to flatten a list of lists:
>>> sum([[1, 2], [3, 4]], [])
[1, 2, 3, 4]
You can now define your flatmap
:
>>> def flatmap(f, seq):
... return sum([f(s) for s in seq], [])
...
>>> flatmap(range, [1,2,3])
[0, 0, 1, 0, 1, 2]
Edit: I just saw the critique in the comments for another answer and I guess it is correct that Python will needlessly build and garbage collect lots of smaller lists with this solution. So the best thing that can be said about it is that it is very simple and concise if you’re used to functional programming 🙂
>>> from functools import reduce # not needed on Python 2
>>> list_of_lists = [[1, 2],[3, 4, 5], [6]]
>>> reduce(list.__add__, list_of_lists)
[1, 2, 3, 4, 5, 6]
The itertools
solution is more efficient, but this feels very pythonic.
import itertools
x=[['b11','b12'],['b21','b22'],['b31']]
y=list(itertools.chain(*x))
print y
itertools will work from python2.3 and greater
The question proposed flatmap
. Some implementations are proposed but they may unnecessary creating intermediate lists. Here is one implementation that’s based on iterators.
def flatmap(func, *iterable):
return itertools.chain.from_iterable(map(func, *iterable))
In [148]: list(flatmap(os.listdir, ['c:/mfg','c:/Intel']))
Out[148]: ['SPEC.pdf', 'W7ADD64EN006.cdr', 'W7ADD64EN006.pdf', 'ExtremeGraphics', 'Logs']
In Python 2.x, use itertools.map
in place of map
.
If listA=[list1,list2,list3]
flattened_list=reduce(lambda x,y:x+y,listA)
This will do.
You can use pyxtension:
from pyxtension.streams import stream
stream([ [1,2,3], [4,5], [], [6] ]).flatMap() == range(7)
This is the most simple way to do it:
def flatMap(array):
return reduce(lambda a,b: a+b, array)
The ‘a+b’ refers to concatenation of two lists
I have a list A
, and a function f
which takes an item of A
and returns a list. I can use a list comprehension to convert everything in A
like [f(a) for a in A]
, but this returns a list of lists. Suppose my input is [a1,a2,a3]
, resulting in [[b11,b12],[b21,b22],[b31,b32]]
.
How can I get the flattened list [b11,b12,b21,b22,b31,b32]
instead? In other words, in Python, how can I get what is traditionally called flatmap
in functional programming languages, or SelectMany
in .NET?
(In the actual code, A
is a list of directories, and f
is os.listdir
. I want to build a flat list of subdirectories.)
See also: How do I make a flat list out of a list of lists? for the more general problem of flattening a list of lists after it’s been created.
You could try itertools.chain()
, like this:
import itertools
import os
dirs = ["c:\usr", "c:\temp"]
subs = list(itertools.chain(*[os.listdir(d) for d in dirs]))
print subs
itertools.chain()
returns an iterator, hence the passing to list()
.
subs = []
map(subs.extend, (os.listdir(d) for d in dirs))
(but Ants’s answer is better; +1 for him)
You can find a good answer in the itertools
recipes:
import itertools
def flatten(list_of_lists):
return list(itertools.chain.from_iterable(list_of_lists))
Google brought me next solution:
def flatten(l):
if isinstance(l,list):
return sum(map(flatten,l))
else:
return l
You can have nested iterations in a single list comprehension:
[filename for path in dirs for filename in os.listdir(path)]
which is equivalent (at least functionally) to:
filenames = []
for path in dirs:
for filename in os.listdir(path):
filenames.append(filename)
You could just do the straightforward:
subs = []
for d in dirs:
subs.extend(os.listdir(d))
You can concatenate lists using the normal addition operator:
>>> [1, 2] + [3, 4]
[1, 2, 3, 4]
The built-in function sum
will add the numbers in a sequence and can optionally start from a specific value:
>>> sum(xrange(10), 100)
145
Combine the above to flatten a list of lists:
>>> sum([[1, 2], [3, 4]], [])
[1, 2, 3, 4]
You can now define your flatmap
:
>>> def flatmap(f, seq):
... return sum([f(s) for s in seq], [])
...
>>> flatmap(range, [1,2,3])
[0, 0, 1, 0, 1, 2]
Edit: I just saw the critique in the comments for another answer and I guess it is correct that Python will needlessly build and garbage collect lots of smaller lists with this solution. So the best thing that can be said about it is that it is very simple and concise if you’re used to functional programming 🙂
>>> from functools import reduce # not needed on Python 2
>>> list_of_lists = [[1, 2],[3, 4, 5], [6]]
>>> reduce(list.__add__, list_of_lists)
[1, 2, 3, 4, 5, 6]
The itertools
solution is more efficient, but this feels very pythonic.
import itertools
x=[['b11','b12'],['b21','b22'],['b31']]
y=list(itertools.chain(*x))
print y
itertools will work from python2.3 and greater
The question proposed flatmap
. Some implementations are proposed but they may unnecessary creating intermediate lists. Here is one implementation that’s based on iterators.
def flatmap(func, *iterable):
return itertools.chain.from_iterable(map(func, *iterable))
In [148]: list(flatmap(os.listdir, ['c:/mfg','c:/Intel']))
Out[148]: ['SPEC.pdf', 'W7ADD64EN006.cdr', 'W7ADD64EN006.pdf', 'ExtremeGraphics', 'Logs']
In Python 2.x, use itertools.map
in place of map
.
If listA=[list1,list2,list3]
flattened_list=reduce(lambda x,y:x+y,listA)
This will do.
You can use pyxtension:
from pyxtension.streams import stream
stream([ [1,2,3], [4,5], [], [6] ]).flatMap() == range(7)
This is the most simple way to do it:
def flatMap(array):
return reduce(lambda a,b: a+b, array)
The ‘a+b’ refers to concatenation of two lists