Check for matches and match type for multi-word strings against an array of words

Question:

I have a fixed array of words and a set of strings, and I want to check whether each string contains a match against the array of words.
I also want to determine the type of match, out of four possibilities:

  • single word, exact match
  • multiple words, one of them exact match
  • single word, partial match
  • multiple words, partial match

I have the checks for the first 3, but I am struggling to get the 4th type. I am also wondering whether this can be done better, more Pythonically, or more efficiently.

a = ['1234','tes','1234 abc','tes abc']
b = ['1234','testing12','test']

def match_string(a, b):
    if [a for x in b if a.lower() == x.lower()]:
        match_type = 'exact - single'
    elif [a for x in b if a.lower() in x.lower()]:
        match_type = 'partial - single'
    elif [a for x in b if x.lower() in a.lower()]:
        match_type = 'exact - multiple'
    #add check for 4th type; 'partial - multiple'
    else:
        match_type = 'no match'
        
    return match_type

for string in a:
    print(match_string(string, b))

Desired output is 'exact - single', 'partial - single', 'exact - multiple', 'partial - multiple'.

Asked By: Chrisvdberge

Answers:

You don't need a separate loop for every condition. First, split the input string into words with str.split(). Then iterate over those words and check whether your fixed list of words contains each one exactly. If it doesn't, iterate over the fixed list and check whether any of its entries contains the word as a substring.

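def match_string(x, y):
    w = x.split()  # split the input string into words
    for i in w:
        if i in y:
            # the word is identical to an entry in the word list
            if len(w) > 1:
                return "exact - multiple"
            else:
                return "exact - single"
        else:
            # otherwise look for an entry that contains the word as a substring
            for j in y:
                if i in j:
                    if len(w) > 1:
                        return "partial - multiple"
                    else:
                        return "partial - single"
    return "no match"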

Usage:

a = "1234", "tes", "1234 abc", "tes abc", "dfdfd"
b = "1234", "testing12", "test"

for s in a:
    print(s, "|", match_string(s, b))

Output:

1234 | exact - single
tes | partial - single
1234 abc | exact - multiple
tes abc | partial - multiple
dfdfd | no match
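
Note that the function above returns as soon as any word matches in any way, so an input such as "tes 1234" is reported as a partial match even though its second word is an exact one, and the comparison is case-sensitive, unlike the code in the question. If exact matches should take priority, a minimal sketch of one possible variant could look like the following. It assumes a case-insensitive comparison is wanted; the name `match_string_exact_first` and its parameter names are hypothetical and chosen only for illustration.

```python
def match_string_exact_first(text, words):
    """Classify how `text` matches `words`, preferring exact over partial matches."""
    # Assumption: comparison is case-insensitive, as in the question's code.
    tokens = text.lower().split()
    lowered = [w.lower() for w in words]
    kind = "single" if len(tokens) == 1 else "multiple"

    # Exact: at least one token is identical to an entry of the word list.
    if any(t in lowered for t in tokens):
        return f"exact - {kind}"
    # Partial: at least one token is a substring of some entry.
    if any(t in w for t in tokens for w in lowered):
        return f"partial - {kind}"
    return "no match"
```

Both checks look at every word before deciding, so the order of the words in the input no longer affects the result. For the five sample strings above, this variant should give the same classifications as the original function.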
Answered By: Olvin Roght