zeromq: how to prevent infinite wait?

Question:

I just got started with ZMQ. I am designing an app whose workflow is:

  1. one of many clients (who have random PULL addresses) PUSH a request to a server at 5555
  2. the server is forever waiting for client PUSHes. When one comes, a worker process is spawned for that particular request. Yes, worker processes can exist concurrently.
  3. When that process completes it’s task, it PUSHes the result to the client.

I assume that the PUSH/PULL architecture is suited for this. Please correct me on this.


But how do I handle these scenarios?

  1. the client_receiver.recv() will wait for an infinite time when server fails to respond.
  2. the client may send request, but it will fail immediately after, hence a worker process will remain stuck at server_sender.send() forever.

So how do I setup something like a timeout in the PUSH/PULL model?


EDIT: Thanks user938949’s suggestions, I got a working answer and I am sharing it for posterity.

Asked By: Jesvin Jose

||

Answers:

If you are using zeromq >= 3.0, then you can set the RCVTIMEO socket option:

client_receiver.RCVTIMEO = 1000 # in milliseconds

But in general, you can use pollers:

poller = zmq.Poller()
poller.register(client_receiver, zmq.POLLIN) # POLLIN for recv, POLLOUT for send

And poller.poll() takes a timeout:

evts = poller.poll(1000) # wait *up to* one second for a message to arrive.

evts will be an empty list if there is nothing to receive.

You can poll with zmq.POLLOUT, to check if a send will succeed.

Or, to handle the case of a peer that might have failed, a:

worker.send(msg, zmq.NOBLOCK)

might suffice, which will always return immediately – raising a ZMQError(zmq.EAGAIN) if the send could not complete.

Answered By: minrk

This was a quick hack I made after I referred user938949’s answer and http://taotetek.wordpress.com/2011/02/02/python-multiprocessing-with-zeromq/ . If you do better, please post your answer, I will recommend your answer.

For those wanting lasting solutions on reliability, refer http://zguide.zeromq.org/page:all#toc64

Version 3.0 of zeromq (beta ATM) supports timeout in ZMQ_RCVTIMEO and ZMQ_SNDTIMEO. http://api.zeromq.org/3-0:zmq-setsockopt

Server

The zmq.NOBLOCK ensures that when a client does not exist, the send() does not block.

import time
import zmq
context = zmq.Context()

ventilator_send = context.socket(zmq.PUSH)
ventilator_send.bind("tcp://127.0.0.1:5557")

i=0

while True:
    i=i+1
    time.sleep(0.5)
    print ">>sending message ",i
    try:
        ventilator_send.send(repr(i),zmq.NOBLOCK)
        print "  succeed"
    except:
        print "  failed"

Client

The poller object can listen in on many recieving sockets (see the “Python Multiprocessing with ZeroMQ” linked above. I linked it only on work_receiver. In the infinite loop, the client polls with an interval of 1000ms. The socks object returns empty if no message has been recieved in that time.

import time
import zmq
context = zmq.Context()

work_receiver = context.socket(zmq.PULL)
work_receiver.connect("tcp://127.0.0.1:5557")

poller = zmq.Poller()
poller.register(work_receiver, zmq.POLLIN)

# Loop and accept messages from both channels, acting accordingly
while True:
    socks = dict(poller.poll(1000))
    if socks:
        if socks.get(work_receiver) == zmq.POLLIN:
            print "got message ",work_receiver.recv(zmq.NOBLOCK)
    else:
        print "error: message timeout"
Answered By: Jesvin Jose

The send wont block if you use ZMQ_NOBLOCK, but if you try closing the socket and context, this step would block the program from exiting..

The reason is that the socket waits for any peer so that the outgoing messages are ensured to get queued.. To close the socket immediately and flush the outgoing messages from the buffer, use ZMQ_LINGER and set it to 0..

Answered By: Adobri

If you’re only waiting for one socket, rather than create a Poller, you can do this:

if work_receiver.poll(1000, zmq.POLLIN):
    print "got message ",work_receiver.recv(zmq.NOBLOCK)
else:
    print "error: message timeout"

You can use this if your timeout changes depending on the situation, instead of setting work_receiver.RCVTIMEO.

Answered By: Mathieu Longtin
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.