How to avoid duplicate processing in a Python API server?


Suppose a function detect_primes is expensive to call, and I would like to avoid repeated calls to it with duplicate parameters. What should I do?

Caching alone does not help, because the function can be called concurrently by different requests. If both requests find no cached value, both will go on to execute the expensive function.

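from typing import Dict, List
from flask import Flask, request

app = Flask(__name__)

def detect_primes(nums: List[int]) -> Dict[int, bool]:
    """Detect whether each number in a list is prime (expensive)."""
    ...  # body omitted in the question

@app.route('/detect', methods=['GET'])
def search():
    args = request.args
    nums = list(map(int, args.get('nums', '').split(',')))
    return detect_primes(nums)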

For example, suppose one user requests 13,14,15 and another user requests 15,16.
The answers are {"13": true, "14": false, "15": false} and {"15": false, "16": false}.

I would like to avoid calling detect_primes with both [13, 14, 15] and [15, 16], since 15 would then be processed twice. Ideally both requests should wait for a single call with [13, 14, 15, 16] (or two calls with [13, 14, 15] and [16]), and each should return its respective results.

The choice of web framework is not important to me; you can assume it is Flask or FastAPI.

EDIT: I am not sure how this question is a duplicate of, or is answered in, "Are global variables thread-safe in Flask? How do I share data between requests?". As explained above, a cache alone can't solve this (be it an in-memory Python cache, an external cache, or a database). I am happy to be proven wrong by an answer.

Asked By: kakarukeys



I would still recommend caching, at least a dict of already computed values that is consulted before calling detect_primes, so that every input number with a known result is skipped. Access to dict elements is fast as long as the dict is not huge.
Try to make access to the dict of computed values asynchronous, maybe with Redis.

Something like this:

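shared_dict = {}  # number -> bool; could live in Redis instead

async def is_in_dict(num):
    # hypothetical async lookup; swap in a Redis call if needed
    return num in shared_dict

async def search():
    args = request.args
    nums = list(map(int, args.get('nums', '').split(',')))
    computed_values = {}
    to_compute_values = []
    for num in nums:
        if await is_in_dict(num):
            computed_values[num] = shared_dict[num]
        else:
            to_compute_values.append(num)
    new_values = detect_primes(to_compute_values)
    shared_dict.update(new_values)
    # join the two dicts (the | operator needs Python 3.9+)
    return new_values | computed_values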
Answered By: Roman

As per FastAPI’s documentation:

when you declare a path operation function with normal def instead of
async def, it is run in an external threadpool that is then awaited,
instead of being called directly (as it would block the server).

Thus, when you use def instead of async def the server processes requests concurrently.

In your case, and since you describe it as "Ideally both requests should wait for…", you could define the search endpoint with async def. Async routes run on the main thread (the event loop), and the server processes such requests sequentially, as long as there is no await call to some coroutine (i.e., async function) inside the route. If there is one, await passes control back to the event loop, allowing the loop to run other tasks/requests (please have a look at this answer for more details and solutions). In this way, you could use a dictionary to cache previously computed numbers and quickly look up a specific number in subsequent requests. You could also limit the size of the dictionary, using a similar approach to this. An example is given below. You can test it through the auto-generated OpenAPI docs, or by requesting the endpoint URL directly.

from fastapi import FastAPI, Query
from typing import List, Dict

app = FastAPI()
d = {}

def is_prime(n) -> bool:
    # check whether 'n' is prime or not (simple trial division)
    return n > 1 and all(n % k for k in range(2, int(n ** 0.5) + 1))

def detect_primes(nums: List[int]) -> Dict[int, bool]:
    res = {}
    for n in nums:
        if n in d:
            res[n] = d.get(n)
            print(f'{n} found in dict')
        else:
            is_n_Prime = is_prime(n)
            res[n] = is_n_Prime
            d[n] = is_n_Prime
    return res

@app.get('/detect')  # path borrowed from the question's Flask route
async def search(nums: List[int] = Query(...)):
    return detect_primes(nums)
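
The size limit mentioned earlier could look roughly like the sketch below. The approach linked in the answer is not reproduced here; this is just one common way to do it, evicting the least recently used entry with collections.OrderedDict. The class name BoundedCache and the MAX_ENTRIES value are hypothetical.

```python
from collections import OrderedDict
from typing import Optional

MAX_ENTRIES = 10_000  # hypothetical limit; tune to your memory budget


class BoundedCache:
    """A dict-like cache that evicts the least recently used entry when full."""

    def __init__(self, max_entries: int = MAX_ENTRIES) -> None:
        self._data: "OrderedDict[int, bool]" = OrderedDict()
        self._max = max_entries

    def get(self, n: int) -> Optional[bool]:
        if n not in self._data:
            return None               # None means "not cached"
        self._data.move_to_end(n)     # mark as most recently used
        return self._data[n]

    def put(self, n: int, value: bool) -> None:
        self._data[n] = value
        self._data.move_to_end(n)
        if len(self._data) > self._max:
            self._data.popitem(last=False)  # drop the least recently used entry
```

In the example above, d could then be a BoundedCache instance, with d.get(n) checked against None instead of using n in d.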

If, however, you are required to use await in your async def route (which would cause requests to be processed concurrently), you could use, for example, a Semaphore object to control access to the dictionary, as described here. However, if you plan on having multiple workers active at the same time, keep in mind that each worker has its own objects and variables (workers don't share the same memory), so you should rather consider using database storage or key-value stores (caches) such as Redis (have a look at the answers here and here). You may also want to try aioredlock, which allows "creating distributed locks between workers (processes)", as described here.

Answered By: Chris