Python Beautiful Soup I Want to Go Inside of A Tag Element

Question

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# url (the first results page) and headers are assumed to be defined earlier (not shown).
while True:
    print(url)
    response = requests.get(url, headers=headers)
    # print(response.status_code)
    soup = BeautifulSoup(response.content, 'html.parser')
    footer = soup.select_one('li.page-item.nb.active')
    print(footer.text.strip())
    for tags in soup.find_all('h6'):
        print(tags)
        # tags = soup.select_one('h6>a')  # <<< here I want to follow the link inside each h6 element and get data from that page
    next_page = soup.select_one('li.page-item.next>a')
    if next_page:
        next_url = next_page.get('href')
        url = urljoin(url, next_url)
    else:
        break

Hi guys, I want to extract data from the current page, follow the clickable link inside each h6 tag to its own page, extract data there, and then repeat for the next page of results. I cannot figure out how to structure the for loops to do this. Please help, thank you. I have already updated the code.

Asked By: New DigitalCreatives


Answer 1

Take the first result on the URL you provided as an example.

Notice that the href of the link inside its h6 tag, /people/232-lee-min-ho, is a relative sublink.

All you have to do is scrape the sublink and append it to the site's base URL:

new_link = "https://mydramalist.com" + sublink

This gives you the full link: https://mydramalist.com/people/232-lee-min-ho
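As an alternative to string concatenation, the same link can be built with urljoin from the standard library, which the question's code already uses for pagination. This is only a minimal sketch: it assumes the href is the relative path shown above and that the link was found on the search page used in the example.

```python
from urllib.parse import urljoin

# Page the link was found on (the search URL used in the example).
base_url = "https://mydramalist.com/search?adv=people&na=3&so=popular&page=1"
# Relative href taken from the <a> inside the h6 tag.
sublink = "/people/232-lee-min-ho"

# urljoin resolves the relative path against the page URL.
new_link = urljoin(base_url, sublink)
print(new_link)  # https://mydramalist.com/people/232-lee-min-ho
```

urljoin also handles hrefs that are already absolute, so it is a little more forgiving than plain concatenation.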

Now call requests.get(new_link) to retrieve the contents of that page.

Example code:

import requests
from bs4 import BeautifulSoup

url = 'https://mydramalist.com/search?adv=people&na=3&so=popular&page=1'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
for link in soup.find_all("h6", {"class": "text-primary title"}):
    # Each h6 wraps an <a> whose href is a relative sublink.
    sublink = link.find("a").get("href")
    print(sublink)
    new_link = "https://mydramalist.com" + sublink
    # Fetch and parse the linked detail page.
    response2 = requests.get(new_link)
    soup2 = BeautifulSoup(response2.content, 'html.parser')
    # Do all your searches on soup2 here, e.g. soup2.find(...)

You should be able to get the rest.
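To combine this with the pagination loop from the question, one possible arrangement is sketched below. It is not a definitive implementation: crawl and fetch are hypothetical names introduced here, fetch stands for any callable that takes a URL and returns the page HTML, and the CSS selectors are taken from the question's code on the assumption that they match the site's markup.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def crawl(start_url, fetch):
    """Yield (detail_url, detail_soup) for every h6 link on every results page.

    `fetch` is a hypothetical callable: it takes a URL and returns HTML.
    """
    url = start_url
    while url:
        soup = BeautifulSoup(fetch(url), "html.parser")

        # Inner loop: visit each detail page linked from an <h6> on this page.
        for link in soup.select("h6 > a[href]"):
            detail_url = urljoin(url, link["href"])
            yield detail_url, BeautifulSoup(fetch(detail_url), "html.parser")

        # Outer loop: follow the "next" link, or stop when there is none.
        next_page = soup.select_one("li.page-item.next > a")
        if next_page and next_page.get("href"):
            url = urljoin(url, next_page["href"])
        else:
            url = None
```

With requests, fetch could be something like lambda u: requests.get(u, headers=headers).content, and the caller would loop over crawl(url, fetch) and run its searches on each detail_soup.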

Answered By: Sin Han Jinn
