Remove text between two certain characters (multiple occurrences)
Question:
I want to remove the text between the character "-" and the string "\n"
(including the delimiters themselves).
For example, for string = "hi.-hello\n good morning"
the result I want to get is string = "hi. good morning"
and for string = "hi.-hello\n good morning -axq\n"
the result I want to get is string = "hi. good morning axq"
I found these examples (as references for how to adapt one to my case):
>>> import re
>>> text = "hi.)hello| good morning"
>>> re.sub(r"(?<=\)).*?(?=\|)", "", text)
'hi.)| good morning'
and also this one:
>>> import re
>>> x = "This is a sentence. (once a day) [twice a day]"
>>> re.sub(r"([\(\[]).*?([\)\]])", r"\g<1>\g<2>", x)
'This is a sentence. () []'
and this one:
>>> import re
>>> x = "This is a sentence. (once a day) [twice a day]"
>>> re.sub(r"[\(\[].*?[\)\]]", "", x)
'This is a sentence.  '
But I still can’t work out the syntax for my case. I would also like to learn the general pattern, so that I can customize it for other delimiters.
Answers:
This works when you want to delete the text between a single pair of delimiters, e.g. ("-", "\n"). It relies on "." not matching a newline by default, so ".*" cannot run past the first "\n" after each "-". Deleting text between several different pairs takes a closer look at how the function really works.
>>> import re
>>> text = "hi.-hello\n good morning and a good-long \n day"
>>> re.sub(r"-.*\n", "", text)
'hi. good morning and a good day'
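For the general case asked about in the question, one possible sketch is to build the pattern from the two delimiters with re.escape, so that characters such as "(" or "|" need no manual escaping. The helper name remove_between is made up for this illustration, and using re.DOTALL is an assumption that a removed span may cross line breaks when the closing delimiter is not itself a newline.

```python
import re

def remove_between(text, start, end):
    """Remove every start...end span, delimiters included (hypothetical helper)."""
    # re.escape makes both delimiters literal; the lazy .*? stops at the
    # nearest closing delimiter instead of the last one.
    pattern = re.escape(start) + r".*?" + re.escape(end)
    # Assumption: re.DOTALL lets "." match newlines, so a span may cross lines.
    return re.sub(pattern, "", text, flags=re.DOTALL)

print(remove_between("hi.-hello\n good morning", "-", "\n"))
# hi. good morning
```

Because the quantifier is lazy, several occurrences in one string are removed separately rather than as one long span.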
Edit: I have found the trick for several symbol pairs:
>>> text = "hi.-hello\n good morning and a good-long \n day (delete this), bye"
>>> result = re.sub(r"[\(\-].*?[\n\)]", "", text)
>>> print(result)
hi. good morning and a good day , bye
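For several pairs, put all the opening symbols into one bracketed class and all the closing symbols into another: [<remove from>].*?[<remove to>]. Inside the brackets, each symbol to remove is written in the form \<symbol to remove (start or end)>. In this example these are \- and \n (or \( and \)).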
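One caveat with the two bracketed classes is that any opening symbol can be closed by any closing symbol, so a "-" could end at a ")". If that matters, a possible alternative is one alternation branch per pair. The sketch below assumes the delimiters are not nested; remove_between_pairs is a hypothetical name, not an existing function.

```python
import re

def remove_between_pairs(text, pairs):
    """Remove each start...end span for the given (start, end) pairs (hypothetical helper)."""
    # One branch per pair, so "-" is only closed by "\n" and "(" only by ")".
    branches = [re.escape(start) + r".*?" + re.escape(end) for start, end in pairs]
    pattern = "|".join(branches)
    # Assumption: re.DOTALL so that a span such as (...) may cross line breaks.
    return re.sub(pattern, "", text, flags=re.DOTALL)

text = "hi.-hello\n good morning and a good-long \n day (delete this), bye"
print(remove_between_pairs(text, [("-", "\n"), ("(", ")")]))
# hi. good morning and a good day , bye
```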