python map string split list

Question:

I am trying to map the str.split function over a list of strings. Namely, I would like to split every string in a list where all the strings follow the same format. Any idea how to do that with map in Python? For example, let's assume we have a list like this:

a = ['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']

I want to split each string on the space (split(" ")) using map, to get a list like this:

[['2011-12-22', '46:31:11'], ['2011-12-20', '20:19:17'], ['2011-12-20', '01:09:21']]
Asked By: pacodelumberg


Answers:

Use map in conjunction with a function. A neat way is to use a lambda function:

>>> a=['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']
>>> list(map(lambda s: s.split(), a))  # list() is needed on Python 3, where map returns an iterator
[['2011-12-22', '46:31:11'], ['2011-12-20', '20:19:17'],
 ['2011-12-20', '01:09:21']]
Answered By: phihag

map(lambda x: x.split(), a)

works, but using a list comprehension

[x.split() for x in a] 

is much clearer in this case.
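A minimal sketch of why the comprehension tends to read better, assuming Python 3: the separator, a split limit, and any post-processing all fit in the same expression. The names pairs and dates are illustrative only.

```python
a = ['2011-12-22 46:31:11', '2011-12-20 20:19:17', '2011-12-20 01:09:21']

# Split each string once on the space and keep the halves as a (date, time) tuple
pairs = [tuple(x.split(" ", 1)) for x in a]

# Post-processing fits in the same expression: keep only the date part
dates = [x.split(" ")[0] for x in a]

print(pairs)
print(dates)
```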

Answered By: Russell Dias

Though it isn’t well known, there is a function designed just for this purpose, operator.methodcaller:

>>> from operator import methodcaller
>>> a = ['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']
>>> list(map(methodcaller("split", " "), a))
[['2011-12-22', '46:31:11'], ['2011-12-20', '20:19:17'], ['2011-12-20', '01:09:21']]

This technique is faster than equivalent approaches using lambda expressions.
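To make the mechanism concrete, here is a small sketch (assuming Python 3) of what methodcaller builds: a callable that invokes the named method, with the given arguments, on whatever object it receives. The variable names are illustrative.

```python
from operator import methodcaller

a = ['2011-12-22 46:31:11', '2011-12-20 20:19:17', '2011-12-20 01:09:21']

# methodcaller("split", " ") returns a callable f such that f(s) == s.split(" ")
split_on_space = methodcaller("split", " ")

# map() is lazy in Python 3, so list() is used to materialise the results
via_methodcaller = list(map(split_on_space, a))
via_lambda = list(map(lambda s: s.split(" "), a))

# Further positional arguments are forwarded as well, here maxsplit=1
split_once = methodcaller("split", "-", 1)
year_and_rest = split_once(a[0])  # ['2011', '12-22 46:31:11']
```

Both via_methodcaller and via_lambda hold the same nested list, so the two forms are interchangeable wherever a one-argument callable is expected.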

Answered By: Raymond Hettinger

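Community wiki answer comparing the timings of the approaches given in the other answers (a Python 2 session: print is a statement and map returns a list):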

>>> from timeit import Timer
>>> t = {}
>>> t['methodcaller'] = Timer("map(methodcaller('split', ' '), a)", "from operator import methodcaller; a=['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']")
>>> t['lambda'] = Timer("map(lambda s: s.split(), a)", "a = ['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']")
>>> t['listcomp'] = Timer("[s.split() for s in a]", "a = ['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']")
>>> for name, timer in t.items():
...     print '%s: %.2f usec/pass' % (name, 1000000 * timer.timeit(number=100000)/100000)
... 
listcomp: 2.08 usec/pass
methodcaller: 2.87 usec/pass
lambda: 3.10 usec/pass
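On Python 3 the session above would need two changes: print is a function, and map returns a lazy iterator, so the map-based statements would only measure the cost of creating the iterator. A sketch of an equivalent comparison, assuming Python 3, wraps each map call in list(); the timed statements otherwise mirror the ones above, and absolute numbers will differ by machine and interpreter version.

```python
from timeit import timeit

setup = (
    "from operator import methodcaller\n"
    "a = ['2011-12-22 46:31:11', '2011-12-20 20:19:17', '2011-12-20 01:09:21']"
)

# list(...) forces the lazy map object to actually perform the splits
statements = {
    "methodcaller": "list(map(methodcaller('split', ' '), a))",
    "lambda": "list(map(lambda s: s.split(), a))",
    "listcomp": "[s.split() for s in a]",
}

number = 100000
results = {}
for name, stmt in statements.items():
    seconds = timeit(stmt, setup=setup, number=number)
    results[name] = 1000000 * seconds / number  # microseconds per pass
    print('%s: %.2f usec/pass' % (name, results[name]))
```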
Answered By: kojiro

This is how I do it:

>>> a=['2011-12-22 46:31:11','2011-12-20 20:19:17', '2011-12-20 01:09:21']
>>> list(map(str.split, a))  # list() is needed on Python 3, where map returns an iterator
[['2011-12-22', '46:31:11'], ['2011-12-20', '20:19:17'], ['2011-12-20', '01:09:21']]

This only works when you know you have a list of str (i.e. not just a list of objects that implement a split method compatible with str's). It also relies on the default behaviour of split(), which splits on any run of whitespace, rather than x.split(' '), which splits on space characters only (i.e. not tabs, newlines, or other whitespace), because this approach gives you no way to pass another argument. For anything more complex than this, I would use a list comprehension.
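The difference between the two split behaviours is easy to see on input that contains a tab or a doubled space. A minimal sketch, assuming Python 3; the samples list is made up for illustration:

```python
samples = ['2011-12-22 46:31:11', '2011-12-22\t46:31:11', '2011-12-22  46:31:11']

# Default split(): any run of whitespace counts as a single separator
default_split = list(map(str.split, samples))

# split(' '): only the space character separates, so a tab is left alone
# and two consecutive spaces produce an empty string between them
space_split = [s.split(' ') for s in samples]

print(default_split)
print(space_split)
```

Here default_split gives ['2011-12-22', '46:31:11'] for all three strings, while space_split leaves the tab-separated string whole and yields ['2011-12-22', '', '46:31:11'] for the doubled space.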

Answered By: Ben