Pull CME price data into Python 3.6.8


I am relatively new to Python so I apologize if this is a ‘bush league’ question.

I am trying to retrieve the WTI futures prices from this website:

Which libraries should I be using? How will I need to adjust the output when it is pulled from the website?

I am currently working in Python 3.6.8 with the pandas, numpy, requests, urllib3, BeautifulSoup, and json libraries. I am not sure whether these are the right libraries and, if they are, which functions I should be using.

Here is a basic version of the code:

import urllib3
from bs4 import BeautifulSoup

wtiFutC = 'https://www.cmegroup.com/trading/energy/crude-oil/west-texas-intermediate-wti-crude-oil-calendar-swap-futures_quotes_globex.html'
http = urllib3.PoolManager()
response2 = http.request('GET', wtiFutC)
print(type(response2.data)) #check the type of the data produced - bytes
print(response2.data) #prints out the data

soup2 = BeautifulSoup(response2.data.decode('utf-8'), features='html.parser')
print(type(soup2)) #check the type of the data produced - 'bs4.BeautifulSoup'
print(soup2) #prints out the BeautifulSoup version of the data

I want a way to see the ‘Last’ price for the WTI future for the whole curve. Instead I am seeing something like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 

<!--[if (gt IE 9) |!(IE)]><!-->
<html class="cmePineapple no-js" lang="en" xml_lang="en" 

Any help or direction would be greatly appreciated. Thank you so much! 🙂

Asked By: Unknown1984



The data on that page is generated by JavaScript, so a plain HTTP client such as requests only receives the initial HTML, without the quotes.
If it is not a big deal, I suggest looking for a different data source that uses little or no JavaScript. You can then fetch the page source with requests and extract the data from it.

You can extract the data with libraries like BeautifulSoup or re (or even pandas in some cases), then feed it to numpy or pandas if you want to analyze it or run calculations.
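As a rough illustration of that extraction step, here is a minimal sketch that uses only re from the standard library. The HTML snippet, its column layout, and the prices in it are made up for the example; a real page will be structured differently, and BeautifulSoup is usually more robust than a regular expression once the markup gets messy.

```python
import re

# Hypothetical static HTML, standing in for a page that needs no JavaScript.
sample_html = """
<table>
  <tr><th>Month</th><th>Last</th></tr>
  <tr><td>JUL 2019</td><td>53.26</td></tr>
  <tr><td>AUG 2019</td><td>53.49</td></tr>
</table>
"""

# Match rows made of exactly two <td> cells: contract month and last price.
# The header row uses <th>, so it is skipped automatically.
row_pattern = re.compile(
    r"<tr>\s*<td>(.*?)</td>\s*<td>(.*?)</td>\s*</tr>", re.S
)

# Convert each price string to a float so it is ready for numpy or pandas.
last_prices = [
    (month, float(price)) for month, price in row_pattern.findall(sample_html)
]
print(last_prices)
```

The resulting list of (month, price) tuples can be passed straight to pandas.DataFrame with a columns argument if you want to work with it as a table.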

Otherwise, I suggest you take a look at Selenium, which drives a real browser and can therefore run the page's JavaScript.

Answered By: Xosrov

Use Requests-HTML. It’s a great resource if you’re already familiar with requests.

Answered By: Kamikaze_goldfish

Use the same endpoint the page itself calls, and parse the column of interest (and the expiration date) out of the JSON it returns:

import requests

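# Same JSON endpoint the quotes page requests in the browser.
# The trailing "_" value looks like a cache-busting timestamp.
r = requests.get('https://www.cmegroup.com/CmeWS/mvc/Quotes/Future/4707/G?quoteCodes=null&_=1560171518204').json()
# Pair each contract's expiration date with its last price.
last_quotes = [(item['expirationDate'], item['last']) for item in r['quotes']]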
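If you want numbers rather than raw values for the whole curve, something like the sketch below may help. It assumes the 'last' field arrives as a string and may hold a non-numeric placeholder for contracts that have not traded; the sample payload, its date format, and the parse_last helper are hypothetical and only mimic the shape used above.

```python
# Hypothetical payload shaped like the response above; the real response
# may carry more fields or different placeholder values.
sample = {
    "quotes": [
        {"expirationDate": "20190620", "last": "53.26"},
        {"expirationDate": "20190722", "last": "-"},
        {"expirationDate": "20190820", "last": "53.61"},
    ]
}


def parse_last(value):
    """Return the price as a float, or None if it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


# Full curve, with None where no numeric price is available.
curve = [(q["expirationDate"], parse_last(q["last"])) for q in sample["quotes"]]

# Only the contracts that actually have a numeric last price.
traded = [(date, price) for date, price in curve if price is not None]
print(traded)
```

With the real response, replace sample with r. The traded list can also go into pandas.DataFrame(traded, columns=['expirationDate', 'last']) for further analysis.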
Answered By: QHarr