Python HTTP request with headers returns a 403 error on a cloud server but works on my machine

Question:

To summarize the issue I need help with:

  • I wrote a Python program that sends a GET request to
    https://bx.in.th/api/pairing/
  • The program works fine on my machine (Mac OS X).
  • When it runs on a DigitalOcean Ubuntu droplet, it fails with an HTTP 403
    Forbidden error.
  • I spent a day researching. Most answers suggest modifying the request
    headers; I tried all of them without success.

Here is simplified source code that reproduces the problem:

import urllib.request
import json

url = 'https://bx.in.th/api/pairing/'

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
    'Accept-Encoding': 'none',
    'Accept-Language': 'en-US,en;q=0.5',
    'Connection': 'keep-alive'
}

request = urllib.request.Request(url, headers=headers)

response = urllib.request.urlopen(request)

print(response.read())
print()
print(response.getheaders())

The expected output is:

b'{"1":{"pairing_id":1,"primary_currency":"THB","secondary_currency":"BTC"},"21":{"pairing_id":21,"primary_currency":"THB","secondary_currency":"ETH"},"22":{"pairing_id":22,"primary_currency":"THB","secondary_currency":"DAS"},"23":{"pairing_id":23,"primary_currency":"THB","secondary_currency":"REP"},"20":{"pairing_id":20,"primary_currency":"BTC","secondary_currency":"ETH"},"4":{"pairing_id":4,"primary_currency":"BTC","secondary_currency":"DOG"},"6":{"pairing_id":6,"primary_currency":"BTC","secondary_currency":"FTC"},"24":{"pairing_id":24,"primary_currency":"THB","secondary_currency":"GNO"},"13":{"pairing_id":13,"primary_currency":"BTC","secondary_currency":"HYP"},"2":{"pairing_id":2,"primary_currency":"BTC","secondary_currency":"LTC"},"3":{"pairing_id":3,"primary_currency":"BTC","secondary_currency":"NMC"},"26":{"pairing_id":26,"primary_currency":"THB","secondary_currency":"OMG"},"14":{"pairing_id":14,"primary_currency":"BTC","secondary_currency":"PND"},"5":{"pairing_id":5,"primary_currency":"BTC","secondary_currency":"PPC"},"19":{"pairing_id":19,"primary_currency":"BTC","secondary_currency":"QRK"},"15":{"pairing_id":15,"primary_currency":"BTC","secondary_currency":"XCN"},"7":{"pairing_id":7,"primary_currency":"BTC","secondary_currency":"XPM"},"17":{"pairing_id":17,"primary_currency":"BTC","secondary_currency":"XPY"},"25":{"pairing_id":25,"primary_currency":"THB","secondary_currency":"XRP"},"8":{"pairing_id":8,"primary_currency":"BTC","secondary_currency":"ZEC"}}'

[('Date', 'Sun, 13 Aug 2017 09:27:02 GMT'), ('Content-Type', 'text/javascript'), ('Content-Length', '1485'), ('Connection', 'close'), ('Set-Cookie', '__cfduid=d51c37ea835bae4a0c892e91f34f7bc131502616422; expires=Mon, 13-Aug-18 09:27:02 GMT; path=/; domain=.bx.in.th; HttpOnly'), ('Cache-Control', 'max-age=86400'), ('Expires', 'Mon, 14 Aug 2017 09:27:02 GMT'), ('Strict-Transport-Security', 'max-age=0'), ('X-Content-Type-Options', 'nosniff'), ('Server', 'cloudflare-nginx'), ('CF-RAY', '38daa2e36e0a836b-BKK')]

The error I get when running the code on the droplet:

Traceback (most recent call last):
  File "api-call.py", line 17, in <module>
    response = urllib.request.urlopen(request)
  File "/usr/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 472, in open
    response = meth(req, response)
  File "/usr/lib/python3.5/urllib/request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.5/urllib/request.py", line 510, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
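
The traceback only shows the status line. A small diagnostic sketch like the one below may help narrow things down by printing what came back with the 403. It assumes the error response carries the same kind of headers as the successful one (for example Server and CF-RAY); the helper names summarize_http_error and fetch_or_explain are made up for illustration.

```python
import urllib.error
import urllib.request


def summarize_http_error(err, body_limit=500):
    """Collect the status, selected headers, and the start of the body from an HTTPError."""
    return {
        'code': err.code,
        # These headers are only present if the responding server sets them.
        'server': err.headers.get('Server'),
        'cf_ray': err.headers.get('CF-RAY'),
        # An HTTPError is also a file-like response, so its body can be read.
        'body': err.read(body_limit).decode('utf-8', errors='replace'),
    }


def fetch_or_explain(request):
    """Return the response body, or print details of the error response and re-raise."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        print(summarize_http_error(err))
        raise
```

Calling fetch_or_explain(request) in place of urllib.request.urlopen(request) on both machines should show whether the 403 body looks like it comes from the API itself or from something in front of it.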

Thank you!

Answers:

You need to use a reliable proxy service such as Luminati.
I was also getting a 403 status, but the request works through a Luminati proxy.
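
For reference, a rough sketch of routing the question's urllib.request call through a proxy. The proxy URL is a placeholder and the helper names build_proxy_opener and fetch are only for illustration; substitute whatever endpoint and credentials your provider gives you.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends http and https requests through proxy_url."""
    handler = urllib.request.ProxyHandler({'http': proxy_url, 'https': proxy_url})
    return urllib.request.build_opener(handler)


def fetch(url, proxy_url, timeout=10):
    """Fetch url through the proxy and return the raw response body."""
    request = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    opener = build_proxy_opener(proxy_url)
    with opener.open(request, timeout=timeout) as response:
        return response.read()


# Example call (placeholder proxy URL, not a real endpoint):
# body = fetch('https://bx.in.th/api/pairing/',
#              'http://USERNAME:PASSWORD@proxy.example.com:8080')
```

Passing the opener around instead of calling urllib.request.install_opener keeps the proxy limited to these requests rather than changing the process-wide default.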

Answered By: jis0324

I had a similar problem on DigitalOcean.

The solution is to sign up for a proxy service and use it. Note: Luminati is now Bright Data (brightdata.com).

Their example is below.

I suggest using Python’s requests module and then setting your call like this:

import requests

proxies = {'http': 'http://brd-customer-hl_234567a0-zone-isp:[email protected]:22225',
           'https': 'http://brd-customer-hl_234567a0-zone-isp:[email protected]:22225'}
url = 'https://bx.in.th/api/pairing/'
headers = {'User-Agent': 'Mozilla/5.0 etc'}
r = requests.get(url, headers=headers, proxies=proxies, timeout=10)

r.status_code # should be 200, not 403

Use r.text or r.json() to read the API data from the response object.

Strictly speaking, only the https proxy entry is needed for this example, but it is good practice to include both.
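
If you would rather not hardcode the proxy URL twice, something along these lines could build the dict for you. This is only a sketch: build_proxies and fetch_pairings are hypothetical helper names, the host and port you pass in come from your own provider, and the credentials are percent-encoded on the assumption that they may contain characters such as '@' or ':'.

```python
from urllib.parse import quote


def build_proxies(user, password, host, port):
    """Return a requests-style proxies dict for a single HTTP proxy endpoint."""
    # Percent-encode credentials so characters like '@' or ':' do not break the URL.
    auth = '{}:{}'.format(quote(user, safe=''), quote(password, safe=''))
    proxy_url = 'http://{}@{}:{}'.format(auth, host, port)
    # The same proxy URL is used for both http and https targets.
    return {'http': proxy_url, 'https': proxy_url}


def fetch_pairings(proxies, timeout=10):
    """Call the API through the proxy and return the decoded JSON."""
    import requests  # third-party package: pip install requests

    r = requests.get('https://bx.in.th/api/pairing/',
                     headers={'User-Agent': 'Mozilla/5.0'},
                     proxies=proxies, timeout=timeout)
    r.raise_for_status()  # raises requests.HTTPError on a 403 or any other 4xx/5xx
    return r.json()
```

Using r.raise_for_status() means a 403 fails loudly instead of being parsed as if it were API data.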

Answered By: InnocentBystander