Queue.Queue vs. collections.deque

Question:

I need a queue which multiple threads can put stuff into, and multiple threads may read from.

Python has at least two queue classes, Queue.Queue and collections.deque, with the former seemingly using the latter internally. Both claim to be thread-safe in the documentation.

However, the Queue docs also state:

collections.deque is an alternative
implementation of unbounded queues
with fast atomic append() and
popleft() operations that do not
require locking.

Which I guess I don’t quite unterstand: Does this mean deque isn’t fully thread-safe after all?

If it is, I may not fully understand the difference between the two classes. I can see that Queue adds blocking functionality. On the other hand, it loses some deque features like support for the in-operator.

Accessing the internal deque object directly, is

x in Queue().deque

thread-safe?

Also, why does Queue employ a mutex for it’s operations when deque is thread-safe already?

Asked By: miracle2k

||

Answers:

deque is thread-safe. “operations that do not require locking” means that you don’t have to do the locking yourself, the deque takes care of it.

Taking a look at the Queue source, the internal deque is called self.queue and uses a mutex for accessors and mutations, so Queue().queue is not thread-safe to use.

If you’re looking for an “in” operator, then a deque or queue is possibly not the most appropriate data structure for your problem.

Answered By: brian-brazil

Queue.Queue and collections.deque serve different purposes. Queue.Queue is intended for allowing different threads to communicate using queued messages/data, whereas collections.deque is simply intended as a datastructure. That’s why Queue.Queue has methods like put_nowait(), get_nowait(), and join(), whereas collections.deque doesn’t. Queue.Queue isn’t intended to be used as a collection, which is why it lacks the likes of the in operator.

It boils down to this: if you have multiple threads and you want them to be able to communicate without the need for locks, you’re looking for Queue.Queue; if you just want a queue or a double-ended queue as a datastructure, use collections.deque.

Finally, accessing and manipulating the internal deque of a Queue.Queue is playing with fire – you really don’t want to be doing that.

Answered By: Keith Gaughan

If all you’re looking for is a thread-safe way to transfer objects between threads, then both would work (both for FIFO and LIFO). For FIFO:

Note:

  • Other operations on deque might not be thread safe, I’m not sure.
  • deque does not block on pop() or popleft() so you can’t base your consumer thread flow on blocking till a new item arrives.

However, it seems that deque has a significant efficiency advantage. Here are some benchmark results in seconds using CPython 2.7.3 for inserting and removing 100k items

deque 0.0747888759791
Queue 1.60079066852

Here’s the benchmark code:

import time
import Queue
import collections

q = collections.deque()
t0 = time.clock()
for i in xrange(100000):
    q.append(1)
for i in xrange(100000):
    q.popleft()
print 'deque', time.clock() - t0

q = Queue.Queue(200000)
t0 = time.clock()
for i in xrange(100000):
    q.put(1)
for i in xrange(100000):
    q.get()
print 'Queue', time.clock() - t0
Answered By: Jonathan Livni

(seems I don’t have reputation to comment…)
You need to be careful which methods of the deque you use from different threads.

deque.get() appears to be threadsafe, but I have found that doing

for item in a_deque:
   process(item)

can fail if another thread is adding items at the same time.
I got an RuntimeException that complained “deque mutated during iteration”.

Check collectionsmodule.c to see which operations are affected by this

For information there is a Python ticket referenced for deque thread-safety (https://bugs.python.org/issue15329).
Title “clarify which deque methods are thread-safe”

Bottom line here: https://bugs.python.org/issue15329#msg199368

The deque’s append(), appendleft(), pop(), popleft(), and len(d)
operations are thread-safe in CPython. The append methods have a
DECREF at the end (for cases where maxlen has been set), but this
happens after all of the structure updates have been made and the
invariants have been restored, so it is okay to treat these operations
as atomic.

Anyway, if you are not 100% sure and you prefer reliability over performance, just put a like Lock 😉

Answered By: BadWolf

All single-element methods on deque are atomic and thread-safe. All other methods are thread-safe too. Things like len(dq), dq[4] yield momentary correct values. But think e.g. about dq.extend(mylist): you don’t get a guarantee that all elements in mylist are filed in a row when other threads also append elements on the same side – but thats usually not a requirement in inter-thread communication and for the questioned task.

So a deque is ~20x faster than Queue (which uses a deque under the hood) and unless you don’t need the “comfortable” synchronization API (blocking / timeout), the strict maxsize obeyance or the “Override these methods (_put, _get, ..) to implement other queue organizations” sub-classing behavior, or when you take care of such things yourself, then a bare deque is a good and efficient deal for high-speed inter-thread communication.

In fact the heavy usage of an extra mutex and extra method ._get() etc. method calls in Queue.py is due to backwards compatibility constraints, past over-design and lack of care for providing an efficient solution for this important speed bottleneck issue in inter-thread communication. A list was used in older Python versions – but even list.append()/.pop(0) was & is atomic and threadsafe …

Answered By: kxr

Adding notify_all() to each deque append and popleft results in far worse results for deque than the 20x improvement achieved by default deque behavior:

deque + notify_all: 0.469802
Queue:              0.667279

@Jonathan modify his code a little and I get the benchmark using cPython 3.6.2 and add condition in deque loop to simulate the behaviour Queue do.

import time
from queue import Queue
import threading
import collections

mutex = threading.Lock()
condition = threading.Condition(mutex)
q = collections.deque()
t0 = time.clock()
for i in range(100000):
    with condition:
        q.append(1)
        condition.notify_all()
for _ in range(100000):
    with condition:
        q.popleft()
        condition.notify_all()
print('deque', time.clock() - t0)

q = Queue(200000)
t0 = time.clock()
for _ in range(100000):
    q.put(1)
for _ in range(100000):
    q.get()
print('Queue', time.clock() - t0)

And it seems the performance limited by
this function condition.notify_all()

collections.deque is an alternative implementation of unbounded queues with fast atomic append() and popleft() operations that do not require locking.
docs Queue

Answered By: nikan1996