Scraping a table on the Barchart website using Python

Question:

Scraping an AJAX web page using python and requests

I used the script from the link above to get a table from the Barchart website, and it recently stopped working with the error message {'error': {'message': 'The payload is invalid.', 'code': 400}}. I guess some of the field names have changed, but I am pretty new to web scraping and I couldn't figure out how to fix it. Any suggestions?

import requests

geturl = 'https://www.barchart.com/futures/quotes/CLJ19/all-futures'
apiurl = 'https://www.barchart.com/proxies/core-api/v1/quotes/get'

# Headers for the initial page request.
getheaders = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9',
    'cache-control': 'max-age=0',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36'
}

getpay = {
    'page': 'all'
}

# Load the quote page first so the session picks up the site's cookies.
s = requests.Session()
r = s.get(geturl, params=getpay, headers=getheaders)

# Headers for the internal API call; the XSRF token is taken from the session cookies.
headers = {
    'accept': 'application/json',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9',
    'referer': 'https://www.barchart.com/futures/quotes/CLJ19/all-futures?page=all',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36',
    'x-xsrf-token': s.cookies.get_dict()['XSRF-TOKEN']
}

payload = {
    'fields': 'symbol,contractSymbol,lastPrice,priceChange,openPrice,highPrice,lowPrice,previousPrice,volume,openInterest,tradeTime,symbolCode,symbolType,hasOptions',
    'list': 'futures.contractInRoot',
    'root': 'CL',
    'meta': 'field.shortName,field.type,field.description',
    'hasOptions': 'true',
    'raw': '1'
}

r = s.get(apiurl, params=payload, headers=headers)
j = r.json()
print(j)

OUT: {'error': {'message': 'The payload is invalid.', 'code': 400}}

Asked By: Tony Tang


Answers:

This happened to me too. The website loads the table from an internal API, and the XSRF token cookie it sets is URL-encoded, so the value has to be decoded before it is sent back as a header; otherwise you get this error.

Try this solution:

1- Import the unquote function at the beginning of your code:

from urllib.parse import unquote

2- Change this line:

'x-xsrf-token': s.cookies.get_dict()['XSRF-TOKEN']

to this:

'x-xsrf-token': unquote(unquote(s.cookies.get_dict()['XSRF-TOKEN']))
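
For reference, here is a minimal sketch of the corrected request flow, reusing the geturl, apiurl, getpay, getheaders and payload definitions from the question unchanged; the only change is decoding the XSRF-TOKEN cookie before sending it back as the x-xsrf-token header:

from urllib.parse import unquote
import requests

s = requests.Session()

# First request: load the quote page so the session picks up the XSRF-TOKEN cookie.
s.get(geturl, params=getpay, headers=getheaders)

# The cookie value is percent-encoded (possibly twice), so decode it before
# sending it back to the internal API as the x-xsrf-token header.
token = unquote(unquote(s.cookies.get_dict()['XSRF-TOKEN']))

headers = {
    'accept': 'application/json',
    'referer': geturl + '?page=all',
    'user-agent': getheaders['user-agent'],
    'x-xsrf-token': token
}

r = s.get(apiurl, params=payload, headers=headers)
print(r.json())

The double unquote simply covers the case where the cookie value has been percent-encoded twice; on a string with no remaining percent-escapes, unquote is a no-op.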
Answered By: Ahmed Sabry