Extract words from a list that meet certain conditions using Python

Question:

I am trying to extract words from a list of strings when they meet certain conditions. For each line that ends with ")", I want to find the word that starts after a "." or a space and ends just before "(", and replace it with function_call.

I know I can’t use the startswith and endswith methods because the words don’t begin with any fixed prefix. That is why I am using the re library, but my script still does not run.

import re
data = ["int k = b.k(parcel)",
"int k = kon(parcel)",
"int a", 
"int bds",
"obtain.appendFrom(parcel, dataPosition2, readInt2)",
"obtain desFrom(package, dataPosition2, readInt2)",
"int abd(callme)",
"int.dbd(callyou)",
"int throw new UnsupportedOperationException(you)",
"int throw new.UnsupportedOperationException(me)"]

for i in data:
    para = re.findall(r"*[ .]\s(\w+)\s*[)]" ,i) # start from space and dot and endwith ")"
    i = i.replace(para,"function_call")
    
for i in data:
    print(i)

I want output like this:

int k = b.function_call(parcel)
int k = function_call(parcel)
int a 
int bds
obtain.function_call(parcel, dataPosition2, readInt2)
obtain function_call(package, dataPosition2, readInt2)
int function_call(callme)
int.function_call(callyou)
int throw new function_call(you)
int throw new.function_call(me)

Asked By: ya xi er

Answers:

To extract words from a list of strings that meet certain conditions, you can use a regular expression (regex) to match the pattern of the words that you want to extract.

Here’s an example of how you could use the re library to extract words that start with a . or a space and end with a ( (the code was originally posted as an image and is not reproduced here).

This function loops through each string in the input string_list, and uses a regex to find all the words that start with a . or a space and end with a (. It then adds these words to a list called words, which is returned at the end of the function.
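The code for this answer did not survive as text, so the following is only a minimal sketch of a function matching the description, not the answerer’s original. The function name extract_words and the exact pattern are assumptions; string_list and words are the names used in the description.

```python
import re

def extract_words(string_list):
    """Collect words that follow a '.' or a space and end right before '('."""
    words = []
    for line in string_list:
        # (?<=[ .]) requires a '.' or a space just before the word;
        # (?=\() requires '(' immediately after it. Neither is part of the match.
        words.extend(re.findall(r"(?<=[ .])\w+(?=\()", line))
    return words

sample = [
    "int k = b.k(parcel)",
    "int a",
    "obtain desFrom(package, dataPosition2, readInt2)",
]
print(extract_words(sample))  # ['k', 'desFrom']
```

Note that this only extracts the words; it does not perform the replacement the question asks for.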

Answered By: S.Rodgers

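Use re.sub to replace a segment matched by a regexp. You can’t use the return value of findall as an argument to str.replace in the first place, since findall returns a list rather than a string. Also, i = i.replace(...) will not modify the string in the list: strings are immutable, so replace returns a new string, and the assignment only rebinds the loop variable i.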
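Both problems can be reproduced in isolation. The snippet below is an illustrative sketch with a single made-up input line; the variable names are chosen for the example only.

```python
import re

line = "int abd(callme)"

# re.findall returns a list of matches, not a single string.
found = re.findall(r"(\w+)\s*\(", line)
print(found)  # ['abd']

# str.replace expects a string as its first argument, so passing the list fails.
raised_type_error = False
try:
    line.replace(found, "function_call")
except TypeError as exc:
    raised_type_error = True
    print("TypeError:", exc)

# Rebinding the loop variable leaves the list itself untouched.
data = ["int abd(callme)"]
for i in data:
    i = i.replace("abd", "function_call")
print(data)  # ['int abd(callme)']
```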

So, here’s a version that uses a list comprehension to run a regexp replacement on all strings to result in a new list:

import re

data = [
    "int k = b.k(parcel)",
    "int k = kon(parcel)",
    "int a",
    "int bds",
    "obtain.appendFrom(parcel, dataPosition2, readInt2)",
    "obtain desFrom(package, dataPosition2, readInt2)",
    "int abd(callme)",
    "int.dbd(callyou)",
    "int throw new UnsupportedOperationException(you)",
    "int throw new.UnsupportedOperationException(me)",
]

fixed_data = [
    # Replace the word directly before "(" (plus any whitespace) with function_call
    re.sub(r"(\w+)\s*\(", "function_call(", i)
    for i in data
]

for i in fixed_data:
    print(i)

Answered By: AKX
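If the extra conditions from the question matter (only lines ending with ")", and only words preceded by a "." or a space), a stricter variant could look like the sketch below. The names CALL_NAME and rename_calls are hypothetical, and the lookbehind/lookahead pattern is one possible reading of the requirements, not part of either answer.

```python
import re

# A word preceded by '.' or a space and followed (after optional spaces) by '('.
CALL_NAME = re.compile(r"(?<=[ .])\w+(?=\s*\()")

def rename_calls(lines, new_name="function_call"):
    """Rename call-like words, but only on lines that end with ')'."""
    return [
        CALL_NAME.sub(new_name, line) if line.endswith(")") else line
        for line in lines
    ]

lines = ["int k = b.k(parcel)", "int a", "int.dbd(callyou)"]
for line in rename_calls(lines):
    print(line)
```

Unlike the plain re.sub version, this leaves a call at the very start of a line (with no preceding space or dot) unchanged, which may or may not be what is wanted.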