Intersect split string with partial words on list (possibly with regex)

Question:

I have two lists:

keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']

I need to count how many words in splitSentences begin with one of the strings in keywords. In my example, that would be 2 ("critical" matches "critic" and "argument" matches "argu").

The problem is that set(keywords).intersection(splitSentences) returns an empty set, because it only matches whole strings. I tried prefixing every word in keywords with ^, but the result is still empty.

Apologies, I'm quite new to Python. I'm working in a Jupyter notebook.

Asked By: Carrol


Answers:

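With regex (this prints one count per keyword):

import re

for i in keywords:
    count = 0
    # re.match already anchors at the start of the string, so '^' is optional;
    # re.escape keeps any special characters in the keyword literal.
    pref = '^' + re.escape(i)
    for word in splitSentences:
        if re.match(pref, word):
            count += 1
    print(count)  # number of words starting with this keyword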
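
If a single total is wanted instead of one count per keyword (the question asks for 2), one possible sketch is to join the keywords into a single alternation pattern. The names pattern and total are only illustrative, and this assumes the matching should be case-sensitive:

```python
import re

keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']

# Build one pattern such as 'critic|argu|dog|cat'; re.escape keeps
# any special characters in a keyword literal.
pattern = re.compile('|'.join(re.escape(k) for k in keywords))

# pattern.match only succeeds at the start of each word.
total = sum(1 for word in splitSentences if pattern.match(word))
print(total)  # 2
```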

The semi one liner:

for i in keywords:
    print(sum([1 for word in splitSentences if word.startswith(i)]))

The one liner:

print({el:sum([1 for word in splitSentences if word.startswith(el)]) for el in keywords})
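
For the single total the question asks for, a regex-free sketch is also possible, since str.startswith accepts a tuple of prefixes. The names prefixes and total are only illustrative, and the comparison is assumed to be case-sensitive:

```python
keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']

# str.startswith returns True if the word starts with any prefix in the tuple.
prefixes = tuple(keywords)

# True counts as 1 when summed, so this is the number of matching words.
total = sum(word.startswith(prefixes) for word in splitSentences)
print(total)  # 2
```
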
Answered By: BiRD