Regular Expression to remove a specific word which is not followed by a space
Question:
WFH
STOPPED AT 2023 02 21 13 08 31
DURATION 01 50 56
NUMBER OF PARTICIPANTS 1
PARTICIPANTS
I want to remove the PARTICIPANTS that is on its own line, but doing so also removes the other PARTICIPANTS, i.e. the one in 'NUMBER OF PARTICIPANTS 1'. I don't want that.
I have tried many regexes but failed to get the desired output. Either both of them get removed or neither.
Help is really appreciated.
Thank you in advance.
Answers:
Using a regex negative lookahead, you can require that the word is not followed by whitespace.
import re
# The (\n)? lets the pattern also match and remove the newline before the detected word, if present. Remove it if you want to keep the newline.
regex = r"(\n)?PARTICIPANTS(?!\s)"
string = "WFH\nSTOPPED AT 2023 02 21 13 08 31\nDURATION 01 50 56\nNUMBER OF PARTICIPANTS 1\nPARTICIPANTS"
new_string = re.sub(regex, "", string)
# result of new_string
'WFH\nSTOPPED AT 2023 02 21 13 08 31\nDURATION 01 50 56\nNUMBER OF PARTICIPANTS 1'
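One caveat with a whitespace lookahead: the `\s` class also matches a newline, so a standalone PARTICIPANTS line that is followed by more lines would not be removed. A minimal sketch, assuming the input may continue after the word (the sample `text` with its trailing `END` line is made up for illustration), is to forbid only a literal space instead:

```python
import re

# Hypothetical input: the standalone word is NOT the last line.
text = "NUMBER OF PARTICIPANTS 1\nPARTICIPANTS\nEND"

# (?!\s) rejects the standalone word too, because "\n" counts as whitespace.
with_whitespace_class = re.sub(r"\n?PARTICIPANTS(?!\s)", "", text)

# (?! ) only rejects a following space, so the standalone line is removed.
with_literal_space = re.sub(r"\n?PARTICIPANTS(?! )", "", text)

print(repr(with_whitespace_class))
print(repr(with_literal_space))
```

Which variant fits depends on whether the word can ever be followed by a tab or other whitespace in the real data.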
You can use the multiline flag so that the ^ and $ anchors match at the start and end of each line. This way, you can replace the string "PARTICIPANTS" only when it’s on its own line.
re.sub(r"(?m)^PARTICIPANTS$", "", your_str)
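As a rough sketch of the multiline approach applied to the sample text from the question: the variable names and the optional leading `\n?` (which also drops the line break before the word, so no empty line is left behind) are assumptions for illustration rather than part of the answer itself.

```python
import re

text = (
    "WFH\n"
    "STOPPED AT 2023 02 21 13 08 31\n"
    "DURATION 01 50 56\n"
    "NUMBER OF PARTICIPANTS 1\n"
    "PARTICIPANTS"
)

# (?m) makes ^ and $ match at the start and end of every line,
# so only a line consisting solely of PARTICIPANTS is matched.
word_removed = re.sub(r"(?m)^PARTICIPANTS$", "", text)

# Variant (assumption): also consume the newline before the word.
line_removed = re.sub(r"(?m)\n?^PARTICIPANTS$", "", text)

print(repr(word_removed))
print(repr(line_removed))
```

The first call leaves a trailing newline where the word used to be; the second removes the whole line.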