How to use two variables with one function in Python?

Question:

I want to create a new column in a df that shows one of two options when a function is executed.

I have two lists:

lista = ["A", "B", "C", "D"]
listb = ["would not", "use to", "manage to", "when", "did not"]

I want to find the first word from lista that appears in the text and return it in a new column called "Effect". If none is found, then search for values from listb and return the first one encountered along with the next 2 words that follow it.


I have tried something like this:

def matcher(Description):
    for i in lista:
        if i in Description:
            return i
    return "Not found"

def matcher(Description):
    for j in listb:
        if j in Description:
            return j + 1
    return "Not found"

df["Effect"] = df.apply(lambda i: matcher(i["Description"]), axis=1)
df["Effect"] = df.apply(lambda j: matcher(j["Description"]), axis=1)
Asked By: Victor Leon


Answers:

You can do both at once:

def matcher(Description):
    w = [i for i in lista if i in Description]
    w.extend( [i for i in listb if i in Description] )
    if not w:
        return "Not found"
    else:
        return ' '.join(w)

df["Effect"] = df.apply(lambda i: matcher(i["Description"]), axis=1)
Answered By: Tim Roberts

The code below should do what you want to achieve:

def matcher(sentence):
    match_list = [substr for substr in lista 
                      if substr in [ word 
                                 for word in sentence.replace(',',' ').split(" ")]]
    if match_list: # list with items evaluates to True, empty list to False
        return match_list[0]
    match_list = [substr for substr in listb if ' '+substr+' ' in sentence]
    if match_list:
        substr = match_list[0]
        return substr + " " + sentence.split(substr)[-1].replace(',',' ').strip().split(" ")[0]
    return "Not found"

df["Effect"] = df.Description.apply(matcher)

If the sentences contain punctuation other than ',', consider replacing .replace(',',' ') with a regular expression substitution that turns every non-letter character in the sentence into a space, so that words are guaranteed to stay separated. Be aware that some unusual combinations of substrings and sentences can have unexpected side effects.
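
The regular expression suggestion above could be sketched as follows. This is only an illustration under stated assumptions: the names normalize and matcher_regex are hypothetical, the pattern [^A-Za-z]+ treats only ASCII letters as word characters (so digits, apostrophes and accented letters are also turned into spaces), and the example sentences are made up.

```python
import re

lista = ['A', 'B', 'C', 'D']
listb = ["would not", "use to", "manage to", "when", "did not"]

def normalize(sentence):
    # Replace every run of non-letter characters with a single space,
    # so words stay separated whatever punctuation stood between them.
    # Assumption: only ASCII letters count as part of a word.
    return re.sub(r"[^A-Za-z]+", " ", sentence).strip()

def matcher_regex(sentence):
    clean = normalize(sentence)
    words = clean.split(" ")
    # First look for a whole-word match from lista.
    for item in lista:
        if item in words:
            return item
    # Pad with spaces so phrases at the start or end still match as whole words.
    padded = " " + clean + " "
    for phrase in listb:
        if " " + phrase + " " in padded:
            # Words following the first occurrence of the phrase.
            after = padded.split(" " + phrase + " ", 1)[1].split()
            return " ".join([phrase] + after[:1])
    return "Not found"

print(matcher_regex("It was noted that (A) was present; staff were notified."))  # A
print(matcher_regex("The product did not: inject as expected."))  # did not inject
```

Because the sentence is normalized before matching, punctuation such as parentheses, semicolons or colons no longer hides a match.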

UPDATE: the code below adds any number of words after the substring matched from listb (as requested in the comments), along with explanations of how the code works:

lista = ['A', 'B', 'C', 'D']
listb = ["unable to", "would not", "was not", "did not", "there is not", "could not", "failed to", "use to", "manage to", "when"]
# ^-- listb extended with phrases from another question on the same subject

# For example, given the following text:
sentence1 = "During procedure it was noted that A, was present and were notified to deparment."
# the text contains A, so only the value A is returned in the new column.
sentence2 = "During procedure it was noted that product did not inject as expected."
# In the text above I want to find "did not" and return it along with
# the next N words ("did not inject" for N=1 and "did not inject as" for N=2).

def matcher(sentence, no_words=1):
    # First find a match from lista: 
    match_list = [substr for substr in lista 
                      if substr in [ word 
                                 for word in sentence.replace(',',' ').split(" ")]]
    if match_list: # list with items evaluates to True, empty list to False
        return match_list[0] # if match found in lista exit function with return

    # There was no match from lista so find a match from listb:
    match_list = [substr for substr in listb if ' '+substr+' ' in sentence]
    if match_list:
        substr = match_list[0]
        # Return substr along with additional words from the sentence:
        # sentence.split(substr) splits the sentence on substr, and the index [-1]
        # takes the text after the last occurrence of substr ([1] gives the same
        # result when substr occurs only once).
        # .replace(',',' ') handles words separated by ',' instead of ' '.
        # .strip() removes whitespace at the start and end of the extracted text.
        # .split(" ") creates a list of the words after substr, the slice
        # [0:no_words] takes the first 'no_words' words from this list, and
        # ' '.join() joins them into one string that is appended to substr:
        return substr + " " + ' '.join(sentence.split(substr)[-1].replace(',',' ').strip().split(" ")[0:no_words])

    # There was no match from lista or listb (no value has been returned yet), so:
    return "Not found"

print(matcher(sentence1))
print(matcher(sentence2)) # no_words=1 is default
print(matcher(sentence2, 2))

The code above outputs:

A
did not inject
did not inject as
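
Note that the matcher above picks the first phrase in listb order, not the phrase that occurs earliest in the sentence. If the earliest occurrence is what is wanted, one possible variant is sketched below. The function name first_phrase_by_position and the example sentence3 are hypothetical and only serve as an illustration.

```python
listb = ["unable to", "would not", "was not", "did not", "there is not",
         "could not", "failed to", "use to", "manage to", "when"]

def first_phrase_by_position(sentence, phrases, no_words=1):
    # Pad with spaces so phrases only match as whole words.
    padded = " " + sentence.replace(",", " ") + " "
    # Collect (position, phrase) for every phrase found in the sentence.
    hits = []
    for phrase in phrases:
        pos = padded.find(" " + phrase + " ")
        if pos != -1:
            hits.append((pos, phrase))
    if not hits:
        return "Not found"
    # The smallest position is the phrase that occurs first in the sentence.
    pos, phrase = min(hits)
    # Skip the leading space, the phrase and the trailing space.
    following = padded[pos + len(phrase) + 2:].split()
    return " ".join([phrase] + following[:no_words])

sentence3 = "The product did not inject and was not used."
print(first_phrase_by_position(sentence3, listb))     # did not inject
print(first_phrase_by_position(sentence3, listb, 2))  # did not inject and
```

Here "was not" comes before "did not" in listb, but "did not" occurs first in the sentence, so it is the phrase that gets returned.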
Answered By: Claudio