Remove empty strings from a list of strings

Question:

I want to remove all empty strings from a list of strings in Python.

My idea looks like this:

while '' in str_list:
    str_list.remove('')

Is there a more Pythonic way to do this?

Asked By: zerodx


Answers:

Using a list comprehension is the most Pythonic way:

>>> strings = ["first", "", "second"]
>>> [x for x in strings if x]
['first', 'second']

If the list must be modified in-place, because there are other references which must see the updated data, then use a slice assignment:

strings[:] = [x for x in strings if x]
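
The difference between slice assignment and plain rebinding can be seen with a second reference to the same list. A minimal sketch; the names `alias` and `rebound` are illustrative only:

```python
# Two names referring to the same list object.
strings = ["first", "", "second"]
alias = strings

# Slice assignment replaces the contents of the existing list object,
# so every reference to it sees the change.
strings[:] = [x for x in strings if x]
print(alias)             # ['first', 'second']
print(alias is strings)  # True

# Plain assignment rebinds the name to a new list; `alias` still
# refers to the old object, which is left untouched.
rebound = ["a", "", "b"]
alias = rebound
rebound = [x for x in rebound if x]
print(alias)             # ['a', '', 'b']
```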
Answered By: Ib33X

I would use filter:

str_list = filter(None, str_list)
str_list = filter(bool, str_list)
str_list = filter(len, str_list)
str_list = filter(lambda item: item, str_list)

In Python 3, filter returns an iterator, so it should be wrapped in a call to list():

str_list = list(filter(None, str_list))
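
Because the Python 3 `filter` object is a lazy, single-use iterator, it matters when it is turned into a list. A short sketch of that behaviour (the variable names are illustrative):

```python
str_list = ["first", "", "second"]

lazy = filter(None, str_list)   # nothing has been filtered yet
first_pass = list(lazy)         # consumes the iterator
second_pass = list(lazy)        # the iterator is now exhausted

print(first_pass)    # ['first', 'second']
print(second_pass)   # []
```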
Answered By: livibetter

Depending on the size of your list, it may be most efficient if you use list.remove() rather than create a new list:

l = ["1", "", "3", ""]

while True:
  try:
    l.remove("")
  except ValueError:
    break

This has the advantage of not creating a new list, but the disadvantage that every call to remove() searches from the beginning of the list. Unlike the while '' in l loop proposed in the question, however, it searches only once per occurrence of '' rather than twice. (There is certainly a way to keep the best of both methods, but it is more complicated.)
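
One possible way to get both properties (no new list and a single pass over the data) is to compact the list with a write index and then truncate the tail. This is only a sketch of that idea; `remove_empty_in_place` is a hypothetical helper name:

```python
def remove_empty_in_place(items):
    """Remove '' entries from `items` in place with one pass over the list."""
    write = 0                      # next position to fill with a kept item
    for item in items:
        if item != "":
            items[write] = item    # only overwrites positions already read
            write += 1
    del items[write:]              # drop the leftover tail
    return items


l = ["1", "", "3", ""]
same_object = remove_empty_in_place(l)
print(l)                 # ['1', '3']
print(same_object is l)  # True
```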

Answered By: Andrew Jaffe

filter actually has a special option for this:

filter(None, sequence)

It will filter out all elements that evaluate to False. No need to use an actual callable here such as bool, len and so on.

It is just as fast as map(bool, …).
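
To make "evaluate to False" concrete, here is a small sketch with a mixed list (the sample data is made up). Note that `' '` and `'0'` are non-empty strings and therefore survive:

```python
mixed = ["a", "", " ", None, 0, [], "0"]

kept = list(filter(None, mixed))
print(kept)  # ['a', ' ', '0']

# filter(None, ...) and filter(bool, ...) keep the same elements.
same_as_bool = kept == list(filter(bool, mixed))
print(same_as_bool)  # True
```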

Answered By: Ivo van der Wijk

Use filter:

newlist=filter(lambda x: len(x)>0, oldlist) 

The drawback of filter, as has been pointed out, is that it is slower than the alternatives; calling a lambda for every element is also usually costly.

Or you can go for the simplest and most explicit approach, a plain loop:

# listtext is assumed to be the original list, possibly containing empty items
newlist = []
for item in listtext:
    if item:
        newlist.append(str(item))
# str() can be dropped if every item in the original list is already a string

This is the most intuitive of the methods and runs in decent time.

Answered By: Aamir Mushtaq

Instead of if x, I would use if x != '' in order to eliminate only empty strings. Like this:

str_list = [x for x in str_list if x != '']

This preserves None values in your list. If the list also contains integers and 0 is among them, it is preserved as well.

For example,

>>> str_list = [None, '', 0, "Hi", '', "Hello"]
>>> [x for x in str_list if x != '']
[None, 0, 'Hi', 'Hello']
Answered By: thiruvenkadam

>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']

>>> ' '.join(lstr).split()
['hello', 'world']

>>> list(filter(None, lstr))  # list() is needed on Python 3
['hello', ' ', 'world', ' ']

Timing comparison:

>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
4.226747989654541
>>> timeit('filter(None, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.0278358459472656

Notice that filter(None, lstr) does not remove whitespace-only strings such as ' '; it only prunes '', whereas ' '.join(lstr).split() removes both.

Using filter() after removing the spaces from each string takes much longer:

>>> timeit('filter(None, [l.replace(" ", "") for l in lstr])', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
18.101892948150635
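
On Python 3, `filter()` is lazy, so timing the bare call measures only the creation of the iterator. A hedged sketch of a comparison that forces each result into a list follows; the repeat count is arbitrary, and no particular timings are claimed here:

```python
from timeit import timeit

setup = "lstr = ['hello', '', ' ', 'world', ' ']"

# Each statement builds a complete list so the work is actually done.
statements = {
    "join/split": "' '.join(lstr).split()",
    "filter(None)": "list(filter(None, lstr))",
    "filter(str.strip)": "list(filter(str.strip, lstr))",
}

timings = {
    label: timeit(stmt, setup=setup, number=100_000)
    for label, stmt in statements.items()
}

for label, seconds in timings.items():
    print(f"{label:>18}: {seconds:.4f} s")
```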
Answered By: Aziz Alto

The reply from @Ib33X is great. If you also want to remove strings that are empty after stripping, you need to call the strip method as well. Otherwise whitespace-only strings such as " " are kept, because they count as non-empty. This can be achieved as follows:

strings = ["first", "", "second ", " "]
[x.strip() for x in strings if x.strip()]

The answer for this will be ["first", "second"].

If you want to use filter instead, you can write

list(filter(lambda item: item.strip(), strings))

This drops the same entries but returns the surviving strings unstripped, so the result is ["first", "second "].
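
The two forms differ in what they keep: the comprehension stores the stripped text, while `filter` uses `strip()` only as a test and returns the original strings. A small sketch; the assignment expression (`:=`) assumes Python 3.8 or later and is used here only to avoid calling `strip()` twice:

```python
strings = ["first", "", "second ", " "]

# Keep the stripped text of every string that is non-empty after stripping.
stripped = [s for x in strings if (s := x.strip())]

# Keep the original strings whose stripped form is non-empty.
unstripped = list(filter(lambda item: item.strip(), strings))

print(stripped)    # ['first', 'second']
print(unstripped)  # ['first', 'second ']
```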

Answered By: ssi-anik

As reported by Aziz Alto, filter(None, lstr) does not remove whitespace-only strings such as ' ', but if you are sure lstr contains only strings you can use filter(str.strip, lstr):

>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']
>>> ' '.join(lstr).split()
['hello', 'world']
>>> list(filter(str.strip, lstr))  # list() is needed on Python 3
['hello', 'world']

Timing comparison on my PC:

>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.356455087661743
>>> timeit('filter(str.strip, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
5.276503801345825

The fastest solution for removing both '' and whitespace-only strings such as ' ' remains ' '.join(lstr).split().

As reported in a comment, the situation is different if your strings contain spaces between words:

>>> lstr = ['hello', '', ' ', 'world', '    ', 'see you']
>>> lstr
['hello', '', ' ', 'world', '    ', 'see you']
>>> ' '.join(lstr).split()
['hello', 'world', 'see', 'you']
>>> list(filter(str.strip, lstr))  # list() is needed on Python 3
['hello', 'world', 'see you']

You can see that filter(str.strip, lstr) preserves strings that contain spaces, whereas ' '.join(lstr).split() splits them into separate words.
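
`filter(str.strip, lstr)` raises `TypeError` as soon as the list contains a non-string such as `None`. If that can happen, one option is an explicit predicate. `is_nonblank` below is a hypothetical helper, and keeping non-string values unchanged is an assumption made for this sketch:

```python
def is_nonblank(value):
    """Return False only for strings that are empty or whitespace-only."""
    if isinstance(value, str):
        return bool(value.strip())
    return True  # assumption: non-string values are kept as they are


lstr = ['hello', '', ' ', None, 'see you', 0]
cleaned = [v for v in lstr if is_nonblank(v)]
print(cleaned)  # ['hello', None, 'see you', 0]
```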

Answered By: Paolo Melchiorre

Summary of the best answers:

1. Eliminate empties WITHOUT stripping:

That is, all-space strings are retained:

slist = list(filter(None, slist))

PROs:

  • simplest;
  • fastest (see benchmarks below).

2. To eliminate empties after stripping …

2.a … when strings do NOT contain spaces between words:

slist = ' '.join(slist).split()

PROs:

  • small code
  • fast
    (BUT not fastest with big datasets due to memory, contrary to @paolo-melchiorre's results)

2.b … when strings contain spaces between words:

slist = list(filter(str.strip, slist))

PROs:

  • fastest;
  • understandability of the code.
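
Options 1 and 2.b can be folded into one small helper. This is a sketch only; the name `remove_empty` and its `strip` flag are made up for illustration:

```python
def remove_empty(strings, strip=False):
    """Return a new list without empty strings.

    With strip=True, whitespace-only strings are dropped as well; the
    surviving strings themselves are returned unchanged.
    """
    if strip:
        return list(filter(str.strip, strings))
    return list(filter(None, strings))


slist = ["hello", "", " ", "see you"]
print(remove_empty(slist))              # ['hello', ' ', 'see you']
print(remove_empty(slist, strip=True))  # ['hello', 'see you']
```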

Benchmarks on a 2018 machine:

## Build test-data
#
import random, string
nwords = 10000
maxlen = 30
null_ratio = 0.1
rnd = random.Random(0)                  # deterministic results
words = [' ' * rnd.randint(0, maxlen)
         if rnd.random() > (1 - null_ratio)
         else
         ''.join(rnd.choices(string.ascii_letters, k=rnd.randint(0, maxlen)))  # rnd, not the global RNG, keeps this deterministic
         for _i in range(nwords)
        ]

## Test functions
#
def nostrip_filter(slist):
    return list(filter(None, slist))

def nostrip_comprehension(slist):
    return [s for s in slist if s]

def strip_filter(slist):
    return list(filter(str.strip, slist))

def strip_filter_map(slist): 
    return list(filter(None, map(str.strip, slist))) 

def strip_filter_comprehension(slist):  # waste memory
    return list(filter(None, [s.strip() for s in slist]))

def strip_filter_generator(slist):
    return list(filter(None, (s.strip() for s in slist)))

def strip_join_split(slist):  # words without(!) spaces
    return ' '.join(slist).split()

## Benchmarks
#
%timeit nostrip_filter(words)
142 µs ± 16.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit nostrip_comprehension(words)
263 µs ± 19.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter(words)
653 µs ± 37.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter_map(words)
642 µs ± 36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter_comprehension(words)
693 µs ± 42.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_filter_generator(words)
750 µs ± 28.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit strip_join_split(words)
796 µs ± 103 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
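
When reading these timings, note that the benchmarked functions are not interchangeable: `strip_filter` returns the original strings, `strip_filter_map` returns stripped copies, and `strip_join_split` also splits on inner whitespace. A small check on hand-written sample data (not the benchmark data):

```python
def strip_filter(slist):
    return list(filter(str.strip, slist))

def strip_filter_map(slist):
    return list(filter(None, map(str.strip, slist)))

def strip_join_split(slist):
    return ' '.join(slist).split()


sample = ['hello', '', '  ', ' padded ', 'see you']

print(strip_filter(sample))      # ['hello', ' padded ', 'see you']
print(strip_filter_map(sample))  # ['hello', 'padded', 'see you']
print(strip_join_split(sample))  # ['hello', 'padded', 'see', 'you']
```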
Answered By: ankostis

Keep in mind that some approaches may unintentionally remove the white space inside a string that you want to keep.
If you have this list

_text_list = ['hello world', ' ', '', 'hello']

and what you want is ['hello world', 'hello'],

first strip each item so that any whitespace-only string becomes an empty string:

space_to_empty = [x.strip() for x in _text_list]

then remove the empty strings from the list:

space_clean_list = [x for x in space_to_empty if x]
Answered By: Reihan_amn

Match using a regular expression and filter:

import re

lstr = ['hello', '', ' ', 'world', ' ']
# Keep only strings that start with an ASCII letter or digit
r = re.compile('^[A-Za-z0-9]+')
results = list(filter(r.match, lstr))
print(results)  # ['hello', 'world']
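
The pattern above keeps only strings that start with an ASCII letter or digit, so entries such as `'-x'` or `'émile'` would be dropped too. If the goal is just to discard empty and whitespace-only strings, one alternative is to search for any non-whitespace character. A sketch with made-up sample data:

```python
import re

lstr = ['hello', '', ' ', '-x', 'émile', '\t']

has_text = re.compile(r'\S')   # matches any non-whitespace character
results = [s for s in lstr if has_text.search(s)]
print(results)  # ['hello', '-x', 'émile']
```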
Answered By: Golden Lion

You can use something like this:

test_list = [i for i in test_list if i]

where test_list is the list from which you want to remove empty elements.

Answered By: Aditya