How does the star * behave in Python's regex vs. glob pattern search?
Question:
I am trying to understand how glob works, and how the star * behaves in a glob pattern compared with how it behaves in a regex.
For example, I know glob.glob(r"C:\Users\Me\MyFolder\*.txt") would match and return any files with the extension .txt in the given folder, but how would that compare to finding text files using a regex? Is one better in performance than the other, with higher speed or lower memory use?
Answers:
In a glob pattern, * means “match zero or more characters” within a file name. In a regex, by contrast, * is a quantifier that repeats the preceding element zero or more times, so the regex equivalent of the glob * is .* rather than * on its own.
So if we assume a directory containing the files apple, cherry, custard and green_apple, you can get lists of files like this:
import glob
# glob.glob returns matches in arbitrary order, so sort them for stable output
print("glob.glob('a*') -> {}".format(sorted(glob.glob('a*'))))  # names starting with 'a'
print("glob.glob('*a*') -> {}".format(sorted(glob.glob('*a*'))))  # names containing an 'a' anywhere
print("glob.glob('apple*') -> {}".format(sorted(glob.glob('apple*'))))  # names starting with 'apple'
print("glob.glob('*apple*') -> {}".format(sorted(glob.glob('*apple*'))))  # names containing 'apple'
This would return
glob.glob('a*') -> ['apple']
glob.glob('*a*') -> ['apple', 'custard', 'green_apple']
glob.glob('apple*') -> ['apple']
glob.glob('*apple*') -> ['apple', 'green_apple']
This is a very simplistic view of what you can do with glob.glob.
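To make the comparison with regex concrete, here is a minimal sketch that applies both kinds of pattern to the same list of names, without touching the file system. It uses fnmatch, the standard-library module that implements the shell-style wildcard matching glob relies on. The helper names glob_match and regex_match and the names in the files list are made up for illustration.

```python
import fnmatch
import re

# The same file names as in the example above (no real files needed).
names = ["apple", "cherry", "custard", "green_apple"]


def glob_match(pattern, candidates):
    """Hypothetical helper: keep names matching a shell-style wildcard pattern."""
    return [n for n in candidates if fnmatch.fnmatchcase(n, pattern)]


def regex_match(pattern, candidates):
    """Hypothetical helper: keep names that the regex matches in full."""
    return [n for n in candidates if re.fullmatch(pattern, n)]


# Glob: '*' on its own stands for any run of characters.
glob_a_star = glob_match("a*", names)

# Regex: the equivalent needs '.' (any character) in front of the '*' quantifier.
regex_a_dot_star = regex_match(r"a.*", names)

# Regex 'a*' means "zero or more 'a' characters", so no name here matches in full.
regex_a_star = regex_match(r"a*", names)

# fnmatch.translate converts a glob pattern into an equivalent regex string.
translated = re.compile(fnmatch.translate("*apple*"))
via_translate = [n for n in names if translated.match(n)]

# The '*.txt' case from the question, on made-up file names.
files = ["notes.txt", "data.csv", "txt_backup"]
glob_txt = glob_match("*.txt", files)        # '.' is a literal dot in a glob
regex_txt = regex_match(r".*\.txt", files)   # '.' has to be escaped in a regex

print(glob_a_star, regex_a_dot_star, regex_a_star)
print(via_translate)
print(glob_txt, regex_txt)
```

Note that fnmatch only tests strings. glob.glob additionally lists the directory for you and, by default, does not let * match a leading dot in a file name, so the two are not interchangeable in every case.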