Regular expression usage in glob.glob?

Question:

import glob

list = glob.glob(r'*abc*.txt') + glob.glob(r'*123*.txt') + glob.glob(r'*a1b*.txt')

for i in list:
  print(i)

This code works to list files in the current folder which have 'abc', '123' or 'a1b' in their names.

How would I use one glob to perform this function?

Asked By: user1561868

Answers:

The easiest way would be to filter the glob results yourself. Here is how to do it using a list comprehension:

import glob
res = [f for f in glob.glob("*.txt") if "abc" in f or "123" in f or "a1b" in f]
for f in res:
    print(f)
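
If the list of substrings grows, the same filter can be written with any(). The following is only a sketch: the helper name filtered_glob, the SUBSTRINGS tuple and the temporary directory with its file names are all made up for illustration. It also checks only the file name, not the directory part of the path.

```python
import glob
import os
import tempfile

SUBSTRINGS = ("abc", "123", "a1b")  # the three substrings from the question


def filtered_glob(directory, substrings=SUBSTRINGS):
    """Hypothetical helper: .txt names in `directory` containing any substring."""
    # glob.escape() keeps special characters in the directory name literal.
    paths = glob.glob(os.path.join(glob.escape(directory), "*.txt"))
    names = (os.path.basename(p) for p in paths)
    return sorted(n for n in names if any(s in n for s in substrings))


# Demo on a throwaway directory so the result does not depend on your files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("xabc.txt", "123.txt", "plain.txt", "a1b.log"):
        open(os.path.join(tmp, name), "w").close()
    res = filtered_glob(tmp)

print(res)
```

Here a1b.log is dropped by the glob pattern and plain.txt by the substring filter.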

You could also use a regexp and no glob:

import os
import re
path = "."  # directory to search; "." is the current folder, as in the question
res = [f for f in os.listdir(path) if re.search(r'(abc|123|a1b).*\.txt$', f)]
for f in res:
    print(f)

(By the way, naming a variable list is a bad idea, since it shadows Python's built-in list type.)
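
One detail worth checking with the regex approach is the dot before txt. An unescaped dot matches any character, so it is usually safer to write \.txt. A small sketch on a hypothetical list of names (used instead of os.listdir() so the result is reproducible) shows the difference:

```python
import re

# Hypothetical file names standing in for a directory listing.
names = ["abc.txt", "x123y.txt", "a1b.txt", "other.txt", "abc_txt", "abc.txt.bak"]

strict = re.compile(r"(abc|123|a1b).*\.txt$")  # escaped dot: literal ".txt" suffix
loose = re.compile(r"(abc|123|a1b).*.txt$")    # unescaped dot: any character before "txt"

strict_hits = [n for n in names if strict.search(n)]
loose_hits = [n for n in names if loose.search(n)]

print(strict_hits)
print(loose_hits)
```

The loose pattern also accepts abc_txt, which is probably not what you want.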

Answered By: Schnouki

Here is a ready-to-use way of doing this, based on the other answers. It is not the fastest approach, but it works as described:

import os
import re

def reglob(path, exp, invert=False):
    """glob.glob()-style searching which uses a regex

    :param path: Directory to search
    :param exp: Regular expression for the filename
    :param invert: Invert match to return non-matching files
    """

    m = re.compile(exp)

    if invert:
        res = [f for f in os.listdir(path) if not m.search(f)]
    else:
        res = [f for f in os.listdir(path) if m.search(f)]

    # Build full paths; a list is returned (map() would be lazy on Python 3)
    return [os.path.join(path, f) for f in res]
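
A possible usage sketch follows. The function is repeated in compact form so the snippet runs on its own, and the temporary directory and file names are invented for the demo.

```python
import os
import re
import tempfile


def reglob(path, exp, invert=False):
    """Compact restatement of the function above, so this snippet runs alone."""
    m = re.compile(exp)
    return [os.path.join(path, f) for f in os.listdir(path)
            if bool(m.search(f)) != bool(invert)]


pattern = r"(abc|123|a1b).*\.txt$"

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical files, created only for this demonstration.
    for name in ("abc.txt", "notes.txt", "x123.txt"):
        open(os.path.join(tmp, name), "w").close()

    matched = sorted(os.path.basename(p) for p in reglob(tmp, pattern))
    others = sorted(os.path.basename(p) for p in reglob(tmp, pattern, invert=True))

print(matched)
print(others)
```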

Answered By: sleepycal

import glob
import os

for filename in glob.iglob(os.path.join(path_to_directory, "*.txt")):
    if any(s in os.path.basename(filename) for s in ("abc", "123", "a1b")):
        print(filename)

Answered By: R.Camilo

I’m surprised that no answers here used filter.

import os
import re

def glob_re(pattern, strings):
    return filter(re.compile(pattern).match, strings)

filenames = glob_re(r'.*(abc|123|a1b).*\.txt$', os.listdir())

This accepts any iterable of strings, including lists, tuples, and dicts (if all keys are strings). If you want to support partial matches, you could change .match to .search. Note that in Python 3 filter returns a lazy iterator rather than a list, so if you need to index the results or go over them more than once, convert the result to a list yourself, or wrap the return statement with list(…).
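
To make those two points concrete, here is a small sketch on a hypothetical list of names (standing in for os.listdir() output), assuming Python 3. It shows that the result can only be consumed once, and how .match differs from .search for a pattern without a leading .* part.

```python
import re


def glob_re(pattern, strings):
    # Same one-liner as above; on Python 3 filter() produces matches lazily.
    return filter(re.compile(pattern).match, strings)


# Hypothetical names, not read from disk.
names = ["abc.txt", "report_123.txt", "a1b.csv", "readme.md"]

lazy = glob_re(r".*(abc|123|a1b).*\.txt$", names)
is_list = isinstance(lazy, list)  # False on Python 3: it is a filter object
as_list = list(lazy)              # consumes the iterator
again = list(lazy)                # nothing left the second time

# .match only succeeds at the start of the string; .search looks anywhere.
anchored = list(filter(re.compile(r"123").match, names))
anywhere = list(filter(re.compile(r"123").search, names))

print(as_list, again)
print(anchored, anywhere)
```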

Answered By: Evan