How do you make faster API calls using multithreading without using requests in Python?
Question:
I’m trying to retrieve historical stock data for every company in the S&P 500. The problem is that it is taking a really long time to get the data.
from ApiStuff import ApiStuff
import fundamentalanalysis as fa
import pickle

# Load the list of S&P 500 tickers; close the file handle explicitly.
with open('S&P500_TICKERS.dat', 'rb') as f:
    tickers = pickle.load(f)

api_key = ApiStuff.api_key

data_from_tickers = []
for ticker in tickers:
    # One blocking API call per ticker, so roughly 500 sequential requests.
    balance_sheet_annually = fa.balance_sheet_statement(ticker, api_key, period="annual")
    data_from_tickers.append(balance_sheet_annually)
I searched online for ways to speed this up, but the solutions I found use other modules (e.g. requests, aiohttp) to make the data retrieval faster, and I am dependent on this module (fundamentalanalysis) for fundamental data.
Is there a way to keep using this module and still make the API requests faster via those methods?
Answers:
You can certainly do this with multiple threads or processes; concurrent.futures is made for exactly this kind of need. That said, this is also a great opportunity to take advantage of open source: the source for fundamentalanalysis is available on GitHub. The function you're using, balance_sheet_statement, is very straightforward. It essentially consists of a GET request, a couple of data mappings, and the construction of a Pandas dataframe.
Replicating that logic with aiohttp or requests is going to be easier than wrangling the multiprocessing modules!
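To illustrate, a minimal requests-based version of that logic might look like the sketch below. The financialmodelingprep endpoint path and the JSON response shape are assumptions inferred from what the fundamentalanalysis package wraps, not verified here; check the package source for the real details.

```python
import requests
import pandas as pd

BASE_URL = "https://financialmodelingprep.com/api/v3"


def build_url(statement, ticker, api_key, period="annual"):
    # Assumed endpoint layout; confirm against the fundamentalanalysis source.
    return f"{BASE_URL}/{statement}/{ticker}?period={period}&apikey={api_key}"


def balance_sheet_statement(ticker, api_key, period="annual"):
    """Fetch one ticker's balance sheet and build a dataframe from the JSON."""
    url = build_url("balance-sheet-statement", ticker, api_key, period)
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # Assumes the API returns a list of per-period records with a "date" field.
    return pd.DataFrame(response.json()).set_index("date").T
```

Once the logic is in your own function like this, swapping requests for aiohttp (or wrapping it in a thread pool) is straightforward.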
If fundamentalanalysis is thread-safe, you can get concurrency by replacing the for-loop with:
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=10) as e:
    data_from_tickers = list(e.map(
        lambda t: fa.balance_sheet_statement(t, api_key, period="annual"),
        tickers,
    ))
The maximum number of workers can be adjusted; for I/O-bound work like this, the API's rate limits matter more than your CPU count.
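With ~500 tickers, a single failed request would abort the whole run under `e.map`. A slightly fuller sketch of the same ThreadPoolExecutor approach records per-ticker failures instead; `fetch` here is a stand-in for `lambda t: fa.balance_sheet_statement(t, api_key, period="annual")`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(tickers, fetch, max_workers=10):
    """Run fetch(ticker) across a thread pool, collecting results and failures."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every ticker up front, then handle each as it finishes.
        futures = {pool.submit(fetch, t): t for t in tickers}
        for future in as_completed(futures):
            ticker = futures[future]
            try:
                results[ticker] = future.result()
            except Exception as exc:
                # Record the failure and keep going with the rest.
                errors[ticker] = exc
    return results, errors
```

After the run, `errors` tells you exactly which tickers to retry rather than re-fetching all of them.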