Detecting a year in a list of strings

Question:

I have a list of strings like this:

words = ['hello', 'world', 'name', '1', '2018']

I am looking for the fastest way (Python 3.6) to detect a year “word” in the list. For example, “2018” is a year; “1” is not. Let’s define the acceptable year range as 2000-2020.

Possible solution

Check if the word is a number ('2018'.isdigit()), then convert it to int and check that it falls in the valid range.
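As a rough sketch of that idea (the helper name `is_year` and its `first`/`last` parameters are made up for illustration), the check could look like the following. It uses `str.isdecimal()` rather than `str.isdigit()`, because `isdigit()` also accepts characters such as '²' that `int()` rejects, and it assumes a year is always written with exactly four characters:

```python
def is_year(word, first=2000, last=2020):
    """Return True if word is a 4-character decimal string within [first, last].

    Hypothetical helper: the name and the default bounds are assumptions
    taken from the range stated in the question.
    """
    # isdecimal() guarantees int() will not raise; the length check
    # rejects zero-padded strings such as '02018'.
    return len(word) == 4 and word.isdecimal() and first <= int(word) <= last


words = ['hello', 'world', 'name', '1', '2018']
years = [w for w in words if is_year(w)]
print(years)  # ['2018']
```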

What is the fastest way to do it in Python?

Asked By: No1Lives4Ever


Answers:

Concatenate the list into one string with a separator character, then use a regex to search it.

For example:

import re
word_tmp = " ".join(words)
# \b is a word boundary; the pattern accepts 2000-2019 or exactly 2020
re.search(r"\b20(?:[01]\d|20)\b", word_tmp)
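Note that `re.search` only reports the first match. If every year in the list is needed, a minimal sketch along the same lines could use `re.findall` with a precompiled pattern. The name `YEAR_RE` and the extra sample words are assumptions for illustration, and the pattern is written to accept only 2000-2020:

```python
import re

# Word-bounded pattern: 2000-2019 via [01]\d, plus 2020 as a special case.
YEAR_RE = re.compile(r"\b20(?:[01]\d|20)\b")

words = ['hello', 'world', 'name', '1', '2018', '2021', '2000']

# Join with a space so each list item stays a separate "word" for \b.
years = YEAR_RE.findall(" ".join(words))
print(years)  # ['2018', '2000']
```

This assumes no list item itself contains a space or punctuation next to a year, since a string such as 'x-2018' would also match.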
Answered By: Sraw

You can build a set of your valid years (as strings). Then loop through each of the words you want to test to check if it is a valid year:

words = ['hello', 'world', 'name', '1', '2018']
# range() excludes its end value, so 2021 is needed to include 2020
valid_years = {str(x) for x in range(2000, 2021)}

for word in words:
    if word in valid_years:
        print(word)

As Martijn Pieters mentioned in the comments, a set is the fastest option here because membership tests take O(1) time on average:

Sets let you test for membership in O(1) time; using a list has a linear O(length_of_list) cost.
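To get a feel for the difference, here is a small, unscientific `timeit` sketch comparing set and list membership. The function names and the enlarged word list are made up for the comparison, and the actual timings will vary by machine, so none are quoted here:

```python
import timeit

valid_years_set = {str(y) for y in range(2000, 2021)}
valid_years_list = [str(y) for y in range(2000, 2021)]

# Repeat the sample data so the loop does a measurable amount of work.
words = ['hello', 'world', 'name', '1', '2018'] * 1000


def years_via_set(words):
    # Hash lookup per word: O(1) on average.
    return [w for w in words if w in valid_years_set]


def years_via_list(words):
    # Linear scan of the 21-item list per word.
    return [w for w in words if w in valid_years_list]


# Both approaches must agree before the timings mean anything.
assert years_via_set(words) == years_via_list(words)

set_time = timeit.timeit(lambda: years_via_set(words), number=20)
list_time = timeit.timeit(lambda: years_via_list(words), number=20)
print(f"set:  {set_time:.4f}s")
print(f"list: {list_time:.4f}s")
```

With only 21 valid years the gap is modest; it grows as the collection of valid values gets larger.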


EDIT:

As you can see in the comments, there are many different ways of generating the set of valid_years. As long as the data structure is a set, the membership test will be the fastest way of doing what you want.

You can read more here:

Answered By: Adam Jaamour