Python code scraping data from Kickstarter fails after several iterations
Question:
I am trying to scrape data from Kickstarter. The code works, but it raises the following error on page 15 (you might get the error on a different page, since the page content is dynamic):
Traceback (most recent call last):
  File "C:\Users\lenovo\kick.py", line 30, in <module>
    csvwriter.writerow(row)
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\uff5c' in position 27: character maps to <undefined>
What might be the issue? Any suggestions?
from urllib.request import urlopen
from bs4 import BeautifulSoup
import json
import csv

KICKSTARTER_SEARCH_URL = "https://www.kickstarter.com/discover/advanced?category_id=16&sort=newest&seed=2502593&page={}"
DATA_FILE = "kickstarter.csv"

csvfile = open(DATA_FILE, 'w')
csvwriter = csv.writer(csvfile, delimiter=',')

page_start = 0
while True:
    url = KICKSTARTER_SEARCH_URL.format(page_start)
    print(url)
    response = urlopen(url)
    html = response.read()
    soup = BeautifulSoup(html, 'html.parser')
    project_details_divs = soup.findAll('div', {"class":"js-react-proj-card"})
    if len(project_details_divs) == 0:
        break
    for div in project_details_divs:
        project = json.loads(div['data-project'])
        row = [project["id"],project["name"],project["goal"],project["pledged"]]
        csvwriter.writerow(row)
    page_start += 1

csvfile.close()
Answers:
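The file is opened with the platform's default encoding (cp1252 here, as the traceback shows), which cannot represent the character U+FF5C found in one of the project names. Add the encoding argument to your open() call. That is, change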
csvfile = open(DATA_FILE, 'w')
into
csvfile = open(DATA_FILE, 'w', encoding='utf-8')
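To see why the encoding matters, here is a minimal sketch that reproduces the failure without any scraping. It only tries to encode the character named in the traceback with both codecs; the helper name can_encode is made up for this illustration.

```python
# U+FF5C (FULLWIDTH VERTICAL LINE) is the character named in the traceback.
ch = "\uff5c"

def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

print(can_encode(ch, "cp1252"))  # False: cp1252 has no mapping for this character
print(can_encode(ch, "utf-8"))   # True: UTF-8 can encode any Unicode character
```

This suggests the error depends only on which project names happen to appear on a page, which would explain why it shows up on different pages for different runs.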
That said, the more idiomatic practice is to open the file with a context manager, which closes it automatically even if an error occurs:
with open(DATA_FILE, 'w', encoding='utf-8') as csvfile:
    # ...
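As a rough sketch of how the writing part might look with the context manager, the following writes a made-up row containing the problematic character to a temporary file. The function name write_rows, the sample row, and the demo file name are assumptions for illustration only. The newline='' argument is not part of the answer above; it is what the csv module documentation recommends when opening a file for a writer, and it avoids blank lines between rows on Windows.

```python
import csv
import os
import tempfile

def write_rows(path, rows):
    """Write rows to a UTF-8 encoded CSV file (hypothetical helper)."""
    # The with-block closes the file even if writerow raises.
    with open(path, "w", encoding="utf-8", newline="") as csvfile:
        csvwriter = csv.writer(csvfile, delimiter=",")
        for row in rows:
            csvwriter.writerow(row)

# Made-up sample data in the same shape as the question's rows:
# id, name, goal, pledged. The name contains U+FF5C.
sample_rows = [[1, "Alpha\uff5cBeta", 1000, 250.0]]

demo_path = os.path.join(tempfile.gettempdir(), "kickstarter_demo.csv")
write_rows(demo_path, sample_rows)
```

In the original script, the whole while loop would go inside the with-block, and the explicit csvfile.close() call would no longer be needed.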