Remove part of string within a loop in python
Question:
Keep in mind that this is within a loop.
How can I remove everything after the "?" so that "something_else_1" gets deleted?
Url_before = "https:www.something.com?something_else_1"
Url_wanted = "https:www.something.com?"
In practice it looks kinda like this:
find_href = driver.find_elements(By.CSS_SELECTOR, 'img.MosaicAsset-module__thumb___yvFP5')
with open("URLS/text_urls.txt", "a+") as textFile:
    for my_href in find_href:
        textFile.write(str(my_href.get_attribute("src")) + "#do_something_to_remove_part_after_?_in_find_href" + "\n")
Answers:
Use re:
import re
Url_before = "https://media.gettyimages.com/photos/grilled-halibut-with-spinach-leeks-and-pine-nuts-picture-id503337620?k=20&m=503337620&s=612x612&w=0&h=3G6G_9rzGuNYLOm9EG4yiZkGWNWS7yadVoAen2N80IQ="
re.sub(r'\?.*', '', Url_before) + "?"  # raw string avoids an invalid-escape warning; .* also handles a trailing "?"
'https://media.gettyimages.com/photos/grilled-halibut-with-spinach-leeks-and-pine-nuts-picture-id503337620?'
Alternatively, you could split the string on "?" and keep the first part:
Url_before.split("?")[0] + "?" # again adding the question mark
'https://media.gettyimages.com/photos/grilled-halibut-with-spinach-leeks-and-pine-nuts-picture-id503337620?'
EDIT: Added + "?" because I realised you wanted to keep it.
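To tie this back to the loop in the question, here is a minimal sketch of the split approach applied to several URLs. The helper name strip_query and the sample list srcs are made up for illustration; in the real code the strings would come from my_href.get_attribute("src").

```python
def strip_query(url):
    """Return url up to and including the first "?" (hypothetical helper)."""
    return url.split("?")[0] + "?"

# Hypothetical stand-ins for the values of my_href.get_attribute("src")
srcs = [
    "https://www.something.com/a?something_else_1",
    "https://www.something.com/b?something_else_2",
]

# Apply the helper to every URL, as the loop in the question would
cleaned = [strip_query(src) for src in srcs]
for url in cleaned:
    print(url)
```

Inside the original loop, the same call would wrap the attribute value before it is written to the file, e.g. textFile.write(strip_query(str(my_href.get_attribute("src"))) + "\n").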
Provided there’s only one instance of "?" in the string and you want to remove everything after it, you could find the index of this character with
i = Url_before.index("?")
and then remove everything after it:
Url_wanted = Url_before[:i+1]
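One caveat: str.index raises ValueError when the string contains no "?", which could stop the loop on the first URL without a query part. A small sketch of one way around that, using str.partition (the function name keep_up_to_question_mark is just an illustrative choice, not something from the question):

```python
def keep_up_to_question_mark(url):
    """Keep everything up to and including the first "?".

    If there is no "?", the URL is returned unchanged, because
    partition() then yields an empty separator.
    """
    head, sep, _tail = url.partition("?")
    return head + sep

print(keep_up_to_question_mark("https:www.something.com?something_else_1"))
print(keep_up_to_question_mark("https:www.something.com"))
```

Unlike the regex and split versions, this does not append a "?" to URLs that never had one; whether that is what you want depends on how the file is used later.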