How to repeat just a certain part of the function in python if a condition is met?

Question:

I am writing a web scraping script that does different things depending on what was scraped from the website.

So far everything works, but sometimes the website randomly loads slowly and I get the text "Loading" in my result.

If that happens, I want the script to wait a few seconds and scrape again, then use the new results to do the things under the else: part of the function.

Problem is, I have no idea how to make this work. I Googled a bit and it seems a while loop is the solution, but I cannot figure out how to implement it in my code (because the action I want to repeat is inside the very same function). Or is there a better way to do it?

Here is my code:

import json
import time

def Webscraping(page_url):
    # Webscraping the URL from user input
    ...

def MakeTableFromData(input):
    # Input some data, print out a table made from the data
    ...

def Work(page_url):
    scrapped_text = Webscraping(page_url)
    scrapped_string = ''.join(scrapped_text)
    newlistings1 = scrapped_string.split('\n')
    # Turn the scraped text into a list

    with open(r'C:\Users\User\Documents\NEW_LISTINGS1.json', 'w', encoding='UTF-8') as f:
        json.dump(newlistings1, f, ensure_ascii=False)

    if "Sorry but you need to complete the captcha test to continue" in newlistings1:
        print("Captcha Test")
        exit()
    
    elif "No match" in newlistings1:
        print("No listing currently")

    elif "Loading" in newlistings1:
        time.sleep(3)
        # I don't know how to write this part
        # Wait a bit and repeat the whole process of web scraping, then do everything underneath else:

    else:
        # Do things with data
        MakeTableFromData(newlistings1)

        with open(r'C:\Users\User\Documents\old_listing1.json', 'w', encoding='UTF-8') as f:
            json.dump(newest1, f, ensure_ascii=False)
        print("Updated old_listing1.json")

page_url = input("Enter the link.")
Work(page_url)
Asked By: RonaLightfoot


Answers:

A simple way would be to do this recursively, so that the scraping logic is always the same. To do so, you just have to call your function Work again under that condition:

    elif "Loading" in newlistings1:
        time.sleep(3)
        Work(page_url)

However, there is an associated risk: what if the webpage is broken and you always end up in the "Loading" case? Then you'd have infinite recursion. There are many ways to solve this. You could add a parameter count to the function: def Work(page_url, count=0). The first thing the body should do is check that count is under a threshold (say, 3) and, if not, return (to break the infinite call chain). Then your condition would call the function with Work(page_url, count + 1).

def Work(page_url, count=0):
    if count > 3:
        return
    # ... scrape the page and build newlistings1 as before ...
    if "Loading" in newlistings1:
        time.sleep(3)
        Work(page_url, count + 1)

This is just one way and probably not the best one, but it would work fine.

Answered By: Arthur Bricq