Pull CME price data into Python 3.6.8
Question:
I am relatively new to Python so I apologize if this is a ‘bush league’ question.
I am trying to retrieve the WTI futures prices from this website:
https://www.cmegroup.com/trading/energy/crude-oil/west-texas-intermediate-wti-crude-oil-calendar-swap-futures_quotes_globex.html
Which libraries should I be using? How will I need to adjust the output when it is pulled from the website?
I am currently working in Python 3.6.8 with the pandas, numpy, requests, urllib3, BeautifulSoup, and json libraries. I am not sure whether these are the right libraries and, if they are, which functions I should be using.
Here is a basic version of the code:
import urllib3
from bs4 import BeautifulSoup

wtiFutC = 'https://www.cmegroup.com/trading/energy/crude-oil/west-texas-intermediate-wti-crude-oil-calendar-swap-futures_quotes_globex.html'
http = urllib3.PoolManager()
response2 = http.request('GET', wtiFutC)
print(type(response2.data))  # check the type of the data produced - bytes
print(response2.data)  # prints out the data
soup2 = BeautifulSoup(response2.data.decode('utf-8'), features='html.parser')
print(type(soup2))  # check the type of the data produced - 'bs4.BeautifulSoup'
print(soup2)  # prints out the BeautifulSoup version of the data
I want a way to see the ‘Last’ price for the WTI future for the whole curve. Instead I am seeing something like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--[if (gt IE 9) |!(IE)]><!-->
<html class="cmePineapple no-js" lang="en" xml_lang="en"
>
<!--<![endif]-->
Any help or direction would be greatly appreciated. Thank you so much! 🙂
Answers:
The data on that page is generated by JavaScript, which makes it hard to extract with packages like requests, since they only fetch the raw HTML and do not run scripts.
If it’s not a big deal, I suggest you look for a different data source that uses little or no JavaScript, then use requests to get the page source and extract the data from it.
You can extract the data using libraries like BeautifulSoup or re (or even pandas in some cases), and feed it to libraries like numpy or pandas if you want to analyze and do calculations on the data.
Otherwise, I suggest you take a look at Selenium for JavaScript support.
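For the static-page route, here is a minimal sketch of the extraction step using re. The HTML string is made up for illustration and merely stands in for a page whose table is already present in the source; it is not the CME markup. On a real page a parser such as BeautifulSoup is usually more robust than a regular expression.

```python
import re

# Made-up static HTML (an assumption for illustration, not CME's markup).
html = """
<table>
  <tr><th>Month</th><th>Last</th></tr>
  <tr><td>JUL 2019</td><td>53.26</td></tr>
  <tr><td>AUG 2019</td><td>53.41</td></tr>
</table>
"""

# Capture the two <td> cells of each data row; the <th> header row does not match.
row_pattern = re.compile(r"<tr>\s*<td>(.*?)</td>\s*<td>(.*?)</td>\s*</tr>", re.S)

# Convert the price text to float so it can be passed on to numpy or pandas.
rows = [(month, float(last)) for month, last in row_pattern.findall(html)]
print(rows)
```

With a real static page, the html string would instead come from requests.get(url).text.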
Use Requests-HTML. It’s a great resource if you’re already familiar with requests.
Use the same endpoint the page itself calls, and parse the column of interest (and the date) out of the JSON:
import requests
r = requests.get('https://www.cmegroup.com/CmeWS/mvc/Quotes/Future/4707/G?quoteCodes=null&_=1560171518204').json()
last_quotes = [(item['expirationDate'], item['last']) for item in r['quotes']]
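To go from that list to numbers you can plot or compute with, something along these lines might help. It is only a sketch: the helper name last_price_curve is hypothetical, the sample payload is made up, and it assumes the response has the shape used in the snippet above (a 'quotes' list whose items carry 'expirationDate' and 'last'). Treating 'last' as text that may hold a placeholder such as "-" is also an assumption.

```python
def last_price_curve(payload):
    """Return [(expirationDate, last price as float or None), ...].

    Hypothetical helper; assumes payload looks like
    {'quotes': [{'expirationDate': ..., 'last': ...}, ...]}.
    """
    curve = []
    for item in payload.get("quotes", []):
        raw = str(item.get("last", "")).replace(",", "")
        try:
            price = float(raw)
        except ValueError:
            price = None  # assumed placeholder (e.g. "-") when there is no last trade
        curve.append((item.get("expirationDate"), price))
    return curve


# Made-up sample in the assumed shape; real values come from the endpoint.
sample = {
    "quotes": [
        {"expirationDate": "20190731", "last": "53.26"},
        {"expirationDate": "20190830", "last": "-"},
    ]
}
curve = last_price_curve(sample)
print(curve)
```

With the live response you would call last_price_curve(r), where r is the parsed JSON from the requests.get(...).json() call above.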