How to fix the new "Unable to read URL" error in Python for Yahoo Finance
Question:
I have been using this code to extract (scrape) stock prices from Yahoo Finance for the last year, but now it produces an error. Does anyone know why this is happening and how to fix it?
# Importing necessary packages
from pandas_datareader import data as web
import datetime as dt
import matplotlib.pyplot as plt
import pandas as pd
import os
import numpy as np
# Stock selection from Yahoo Finance
stock = input("Enter stock symbol or ticker symbol (e.g. General Electric is 'GE'): ")
# Visualizing the stock over time and setting up the dataframe
start_date = (dt.datetime.now() - dt.timedelta(days=40000)).strftime("%m-%d-%Y")
df = web.DataReader(stock, data_source='yahoo', start=start_date)
#THE ERROR IS ON THIS LINE^
plt.plot(df['Close'])
plt.title('Stock Prices Over Time',fontsize=14)
plt.xlabel('Date',fontsize=14)
plt.ylabel('Mid Price',fontsize=14)
plt.show()
RemoteDataError: Unable to read URL: https://finance.yahoo.com/quote/MCD/history?period1=-1830801600&period2=1625284799&interval=1d&frequency=1d&filter=history
Response Text:
b’n n n n Yahoon n n n html {n height: 100%;n }n body {n background: #fafafc url(https://s.yimg.com/nn/img/sad-panda-201402200631.png) 50% 50%;n background-size: cover;n height: 100%;n text-align: center;n font: 300 18px "helvetica neue", helvetica, verdana, tahoma, arial, sans-serif;n }n table {n height: 100%;n width: 100%;n table-layout: fixed;n border-collapse: collapse;n border-spacing: 0;n border: none;n }n h1 {n font-size: 42px;n font-weight: 400;n color: #400090;n }n p {n color: #1A1A1A;n }n #message-1 {n font-weight: bold;n margin: 0;n }n #message-2 {n display: inline-block;n *display: inline;n zoom: 1;n max-width: 17em;n _width: 17em;n }n n n document.write(‘&test=’+encodeURIComponent(‘%’)+'” width=”0px” height=”0px”/>’);var beacon = new Image();beacon.src="//bcn.fp.yahoo.com/p?s=1197757129&t="+ne…
Answers:
I had the same problem. At some recent point pandas-datareader (pdr) stopped working with Yahoo (again). The alternatives I tried all had drawbacks: AlphaVantage doesn’t carry all the stocks that Yahoo does; the googlefinance package only gets current quotes as far as I can tell, not time series; the yahoo-finance package doesn’t work (or I failed to get it to work); Econdb sends back an oddly formed dataframe (maybe this is fixable); and Quandl has a paywall on non-US stocks.
So because I’m cheap, I looked into the Yahoo CSV download functionality and came up with this, which returns a df pretty much like pdr does:
import pandas as pd
from datetime import datetime as dt
import calendar
import io
import requests
# Yahoo history csv base url
yBase = 'https://query1.finance.yahoo.com/v7/finance/download/'
yHeaders = {
    'Accept': 'text/csv;charset=utf-8'
}
def getYahooDf(ticker, startDate, endDate=None):  # dates in ISO format
    start = dt.fromisoformat(startDate)  # To datetime.datetime object
    fromDate = calendar.timegm(start.utctimetuple())  # To Unix timestamp format used by Yahoo
    if endDate is None:
        end = dt.now()
    else:
        end = dt.fromisoformat(endDate)
    toDate = calendar.timegm(end.utctimetuple())
    params = {
        'period1': str(fromDate),
        'period2': str(toDate),
        'interval': '1d',
        'events': 'history',
        'includeAdjustedClose': 'true'
    }
    response = requests.request("GET", yBase + ticker, headers=yHeaders, params=params)
    if response.status_code < 200 or response.status_code > 299:
        return None
    else:
        csv = io.StringIO(response.text)
        df = pd.read_csv(csv, index_col='Date')
        return df
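The date handling in getYahooDf is the part that is easiest to get wrong, so here is a minimal sketch of just that step using only the standard library. The helper names iso_to_unix and unix_to_iso are hypothetical and not part of any package; the sketch assumes, as the function above does, that a date without a timezone should be read as UTC.

```python
import calendar
from datetime import datetime, timezone


def iso_to_unix(iso_date: str) -> int:
    """Convert an ISO date string, read as UTC, to a Unix timestamp (hypothetical helper)."""
    parsed = datetime.fromisoformat(iso_date)
    # calendar.timegm interprets the time tuple as UTC, so no local offset is applied
    return calendar.timegm(parsed.utctimetuple())


def unix_to_iso(timestamp: int) -> str:
    """Render a Unix timestamp as a UTC calendar date, useful for checking a period1/period2 value."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).date().isoformat()


print(iso_to_unix("2021-07-01"))
print(unix_to_iso(1625097600))
```

Dates before 1970 come out as negative timestamps, which is why the URL in the question has a negative period1 when the start date is 40000 days in the past.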
It also works if you set headers on a session object and pass that session to the data reader (here the session is a cached one, which is handy anyway):
import datetime
import requests_cache
from pandas_datareader import data as web
# cache lifetime; 3 days is only an example value
expire_after = datetime.timedelta(days=3)
session = requests_cache.CachedSession(cache_name='cache', backend='sqlite', expire_after=expire_after)
# just add headers to your session and provide it to the reader
session.headers = { 'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0', 'Accept': 'application/json;charset=utf-8' }
# stock_names, start and end are whatever you already pass to DataReader
data = web.DataReader(stock_names, 'yahoo', start, end, session=session)
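The same header idea can be tried without requests_cache. Below is a minimal sketch that uses the standard library's urllib instead of requests, so it runs with no extra packages. It only builds the request and does not send it. The helper name build_request is hypothetical, and the assumption that Yahoo rejects requests because of the default client User-Agent is mine, not something Yahoo documents.

```python
from urllib.request import Request

# Browser-like User-Agent string, copied from the session example above
BROWSER_UA = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"


def build_request(url: str, user_agent: str = BROWSER_UA) -> Request:
    """Build (but do not send) a GET request carrying browser-like headers (hypothetical helper)."""
    headers = {
        "User-Agent": user_agent,
        "Accept": "text/csv;charset=utf-8",
    }
    return Request(url, headers=headers)


req = build_request("https://query1.finance.yahoo.com/v7/finance/download/GE")
# Inspect what would be sent before going anywhere near the network
print(req.get_method(), req.full_url)
print(req.header_items())
```

To actually fetch the data you would pass req to urllib.request.urlopen and feed the decoded response text to pd.read_csv via io.StringIO, as in the getYahooDf answer.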
I use this code to download data from Yahoo:
import pandas as pd
stock_ticker = 'GE'  # example symbol, use your own
# astype('int64') gives nanoseconds since the epoch; // 10**9 converts to seconds (Unix timestamp)
start = pd.to_datetime(['2007-01-01']).astype('int64')[0]//10**9
end = pd.to_datetime(['2020-12-31']).astype('int64')[0]//10**9
url = 'https://query1.finance.yahoo.com/v7/finance/download/' + stock_ticker + '?period1=' + str(start) + '&period2=' + str(end) + '&interval=1d&events=history'
df = pd.read_csv(url)
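Building the URL by string concatenation works for plain symbols, but symbols with special characters (index tickers such as ^GSPC, for example) need escaping. A minimal sketch of the same URL built with urllib.parse follows. The helper name build_history_url is hypothetical, and the query parameters are simply the ones used in the snippet above, not a documented Yahoo API.

```python
from urllib.parse import quote, urlencode

Y_BASE = "https://query1.finance.yahoo.com/v7/finance/download/"


def build_history_url(ticker: str, period1: int, period2: int, interval: str = "1d") -> str:
    """Assemble the CSV download URL with a properly escaped ticker and query string."""
    query = urlencode({
        "period1": period1,   # start of range, Unix seconds
        "period2": period2,   # end of range, Unix seconds
        "interval": interval,
        "events": "history",
    })
    # safe="" escapes every reserved character in the ticker, including "^" and "/"
    return Y_BASE + quote(ticker, safe="") + "?" + query


# 1167609600 is 2007-01-01 and 1609372800 is 2020-12-31, both at 00:00 UTC
print(build_history_url("GE", 1167609600, 1609372800))
print(build_history_url("^GSPC", 1167609600, 1609372800))
```

The resulting string can be passed to pd.read_csv exactly as before.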
If you are using Google Colab, first upgrade the libraries:
!pip install --upgrade pandas-datareader
!pip install --upgrade pandas
Don’t forget to restart the workspace (runtime) afterwards and re-run your code. Hope it works! 🙂
pip install yfinance
import yfinance as yf
TWTR = yf.Ticker('TWTR')
ticker = TWTR.history(period='1y')[['Open', 'High', 'Low', 'Close', 'Volume']] # return is
!pip install yfinance
import yfinance as yf
start_date = '2010-01-01'
end_date = '2022-03-04'
df = yf.download('AAPL', start=start_date, end=end_date)
print(df)