Unable to get the price of a product on Amazon when using Beautiful Soup in Python

Question:

I am trying to track the price of a product using Beautiful Soup, but whenever I run this code I get a 6-digit number, which I assume has something to do with reCAPTCHA. I have tried numerous times and checked the headers, the URL, and the tags, but nothing seems to work.

from bs4 import BeautifulSoup
import requests
from os import environ
import lxml


headers = {
    "User-Agent": environ.get("User-Agent"),
    "Accept-Language": environ.get("Accept-Language")
}

amazon_link_address = "https://www.amazon.in/Razer-Basilisk-Wired-Gaming-RZ01-04000100-R3M1/dp/B097F8H1MC/?_encoding=UTF8&pd_rd_w=6H9OF&content-id=amzn1.sym.1f592895-6b7a-4b03-9d72-1a40ea8fbeca&pf_rd_p=1f592895-6b7a-4b03-9d72-1a40ea8fbeca&pf_rd_r=1K6KK6W05VTADEDXYM3C&pd_rd_wg=IobLb&pd_rd_r=9fcac35b-b484-42bf-ba79-a6fdd803abf8&ref_=pd_gw_ci_mcx_mr_hp_atf_m"
response = requests.get(url=amazon_link_address, headers=headers)

soup = BeautifulSoup(response.content, features="lxml").prettify()

price = soup.find("a-price-whole")
print(price)
Asked By: JR. JOE


Answers:

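"a-price-whole" is a class on a span tag, not a tag name, so find("a-price-whole") cannot match it. On top of that, .prettify() returns a plain string, so soup.find(...) in your code is really str.find, which returns a character offset into the HTML. That is most likely the 6-digit number you are seeing, not a captcha code. This solution worked for me: I parsed the response without .prettify(), changed your "find" to a "find_all", and scanned through all of the spans until I found the class you are searching for, then used get_text() on that span to print the price. Hope this helps!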

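# Parse without .prettify() so that soup stays a BeautifulSoup object
soup = BeautifulSoup(response.content, features="lxml")

itemPrice = None  # stays None if no matching span is found
price = soup.find_all("span")
for i in price:
    try:
        # Match spans whose class list is exactly ['a-price-whole']
        if i['class'] == ['a-price-whole']:
            # Drop the trailing "." that follows the whole-number part
            itemPrice = f"${i.get_text()[:-1]}"
            break
    except KeyError:
        # This span has no class attribute, so skip it
        continue

print(itemPrice)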
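
If you would rather not loop over every span, Beautiful Soup can filter by class directly. Below is a minimal sketch of that idea, not a drop-in scraper: SAMPLE_HTML is a made-up, simplified stand-in for the page markup, extract_whole_price is a hypothetical helper name, and I use the built-in "html.parser" so the snippet runs without lxml. It also shows why calling find on the prettified string gives back a number.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the price markup on a product page.
SAMPLE_HTML = """
<div id="corePrice">
  <span class="a-price">
    <span class="a-price-symbol">&#8377;</span>
    <span class="a-price-whole">4,299<span class="a-price-decimal">.</span></span>
  </span>
</div>
"""


def extract_whole_price(html):
    """Return the whole-number part of the price as text, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ filters on the CSS class; the first argument is the tag name.
    tag = soup.find("span", class_="a-price-whole")
    if tag is None:
        # No price span found, e.g. a captcha or error page was returned.
        return None
    # get_text() includes the nested decimal separator, so strip it off.
    return tag.get_text(strip=True).rstrip(".")


# The pitfall from the question: prettify() returns a str, so .find() on it
# is str.find and yields a character offset rather than a tag.
pretty = BeautifulSoup(SAMPLE_HTML, "html.parser").prettify()
offset = pretty.find("a-price-whole")
print(type(pretty).__name__, type(offset).__name__)

print(extract_whole_price(SAMPLE_HTML))
print(extract_whole_price("<html><body>no price here</body></html>"))
```

Returning None when the span is missing makes it easy to tell a blocked or changed page apart from a real price, instead of hitting a NameError later.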

Answered By: Ethan Chartrand