BeautifulSoup and Amazon.co.uk

Question:

I am trying to parse Amazon to compile a list of prices, as part of a bigger project relating to statistics. However, I am stumped. Can anyone review my code and tell me where I went wrong?

#!/usr/bin/python
# -*- coding:  utf-8 -*-
import mechanize
from bs4 import BeautifulSoup

URL_00 = "http://www.amazon.co.uk/Call-Duty-Black-Ops-PS3/dp/B007WPF7FE/ref=sr_1_2?ie=UTF8&qid=1352117194&sr=8-2"

bro = mechanize.Browser()
resp = bro.open(URL_00)
html = resp.get_data()
soup_00 = BeautifulSoup(html)
price = soup_00.find('b', {'class':'priceLarge'})
print price #this should return at the very least the text enclosed in a tag

According to the screenshot, what I wrote above should work, shouldn’t it?

http://i.imgur.com/bPVe1.png (cannot post an image as a newbie..)

Well, all I get in the printout is "[]". If I change the second-to-last line to this:

price = soup_00.find('b', {'class':'priceLarge'}).contents[0].string

or

price = soup_00.find('b', {'class':'priceLarge'}).text

I get a "NoneType" error.

I am quite confused as to why this is happening. Chrome reports the page encoding as UTF-8, which matches the encoding declared on line 2 of my script.
I have also tried changing it to ISO (as per the inner HTML of the page), but this makes zero difference, so I am positive encoding is not the issue here.

Also, I don’t know if this is relevant at all, but my system locale on Linux is UTF-8; that should not cause a problem, should it?

Asked By: NopeNopeNope


Answers:

There’s no need to do this, as Amazon provide an API:

https://affiliate-program.amazon.co.uk/gp/advertising/api/detail/main.html

The Product Advertising API helps you advertise Amazon products using product search and look up capability, product information and features such as Customer Reviews, Similar Products, Wish Lists and New and Used listings.

More detail here: Amazon API library for Python?

I’m using the API and it is so much easier and more reliable than scraping the data from the webpage, even with BS. You will also get access to a list of prices for new, second-hand, etc., and not just the “headline” price.
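
If you do stay with scraping, the "NoneType" error from the question is worth guarding against: find() returns None when no tag matches, so calling .text or .contents on the result fails. One possible cause (an assumption, not something verified here) is that the HTML the script downloads differs from what the browser displays. The following is only a minimal sketch in Python 3; the HTML fragments and the extract_price helper are hypothetical stand-ins, not Amazon's real markup or part of any library.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragments standing in for a downloaded page.
PAGE_WITH_PRICE = '<div><b class="priceLarge">29.99</b></div>'
PAGE_WITHOUT_PRICE = '<div><span class="other">no price here</span></div>'


def extract_price(html):
    """Return the text of the first <b class="priceLarge"> tag, or None."""
    # Name the parser explicitly so the result does not depend on
    # which optional parsers happen to be installed.
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("b", {"class": "priceLarge"})
    if tag is None:
        # No matching tag: report that instead of raising AttributeError.
        return None
    return tag.get_text(strip=True)


print(extract_price(PAGE_WITH_PRICE))
print(extract_price(PAGE_WITHOUT_PRICE))
```

When the helper returns None, printing the downloaded HTML (or saving it to a file) and searching it for the class name shows whether the tag is present in what the script actually received.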

Answered By: Paul Collingwood