job queue implementation for python
Question:
Do you know/use any distributed job queue for Python? Can you share links or tools?
Answers:
You probably want to look at multiprocessing’s Queue. It is included in Python 2.6; for earlier versions of Python, get it from PyPI.
Standard library documentation: http://docs.python.org/library/multiprocessing.html
On PyPI: http://pypi.python.org/pypi/multiprocessing
In addition to multiprocessing there’s also the Celery project, if you’re using Django.
There’s also “bucker” by Sylvain Hellegouarch which you can find here:
It describes itself like this:
- bucker is a queue system that supports multiple storage backends for the queue (memcached and Amazon SQS for now) and is driven by XML messages sent over a TCP connection between a client and the queue server.
Look at beanstalkd
redqueue?
It’s implemented in Python on the Tornado framework, speaks the memcached protocol, and can optionally persist to log files.
It can also behave like beanstalkd, supporting the reserve/delete style of the memcache protocol as well.
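The reserve/delete idea can be sketched with a toy in-memory class (this is a stand-in to illustrate the semantics, not redqueue’s or beanstalkd’s actual implementation): a consumer *reserves* a job, and the job is only removed once the consumer explicitly *deletes* it after finishing, so a crashed worker’s job can be released back to the ready pool.

```python
from collections import deque

class ReserveDeleteQueue:
    """Toy model of beanstalkd-style reserve/delete semantics."""
    def __init__(self):
        self._ready = deque()
        self._reserved = {}
        self._next_id = 0

    def put(self, body):
        self._next_id += 1
        self._ready.append((self._next_id, body))
        return self._next_id

    def reserve(self):
        job_id, body = self._ready.popleft()
        self._reserved[job_id] = body   # held until delete() or release()
        return job_id, body

    def delete(self, job_id):
        del self._reserved[job_id]      # job finished successfully

    def release(self, job_id):
        # worker failed: put the job back at the front of the ready pool
        self._ready.appendleft((job_id, self._reserved.pop(job_id)))

q = ReserveDeleteQueue()
q.put("send-email")
jid, body = q.reserve()
q.release(jid)            # pretend the worker crashed
jid2, body2 = q.reserve() # a second worker picks the same job up
q.delete(jid2)
print(body2)  # send-email
```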
It’s a year late, but this is something I’ve hacked together to make a queue of Processes, running only X of them at a time: http://github.com/goosemo/job_queue
Pyres is a Resque clone built in Python. Resque is used by GitHub as their message queue. Both use Redis as the queue backend and provide a web-based monitoring application.
Also, there is the Unix ‘at’ command.
For more info:
man at
If you think that Celery is too heavy for your needs then you might want to look at the simple distributed task queue:
I developed a tool which can be used for distributing jobs on different clients / servers:
https://pypi.org/project/pyremto/
The advantages of my package are:
- You can track the job progress on your smartphone
- There is a setting for automatically re-distributing jobs, which is beneficial if a worker dies
- You can send push notifications and log values directly to your smartphone (to stay up-to-date during the job execution)
- You can even send commands back to your clients, if they need some inputs during execution
Here is an example explaining how to set up the jobs, and how to query jobs from your workers:
https://github.com/MatthiasKi/pyremto/tree/master/examples/JobScheduling/GeneralJobScheduling
(Note that there is another example showing how the package can be used for managing a distributed hyperparameter search: https://github.com/MatthiasKi/pyremto/tree/master/examples/JobScheduling/DistributedHyperparameterSearch)
The pipeline:
- You set up your jobs (each job is a JSON-serializable dict containing the instructions for one job executed on a worker)
- After setting up the jobs, you can scan the QR code from the pyremto app (if you want to track progress in the app, or use the app to communicate with the clients, e.g. to log values to your smartphone or send commands from your smartphone to the clients). You can then stop execution on the client you used to set up the jobs (you can even shut down your computer).
- Workers query for jobs, execute them, and report the results back to the server.
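The pipeline above can be sketched generically as follows. This is a hypothetical illustration, NOT the actual pyremto API: `get_next_job` and `report_result` are made-up stand-ins for the server round-trips, and the job dicts are just examples of JSON-serializable payloads.

```python
import json

# 1. Set up jobs as JSON-serializable dicts
jobs = [{"task": "train", "learning_rate": lr} for lr in (0.1, 0.01)]
payload = json.dumps(jobs)        # must survive a round-trip to the server

# 2. Server side (in-memory stand-in): hand out jobs one at a time
pending = json.loads(payload)
results = []

def get_next_job():
    """Worker asks the server for the next job; None means all jobs are taken."""
    return pending.pop(0) if pending else None

def report_result(job, result):
    results.append({"job": job, "result": result})

# 3. Worker loop: query for a job, execute it, report the result back
while (job := get_next_job()) is not None:
    outcome = f"trained with lr={job['learning_rate']}"  # stand-in for real work
    report_result(job, outcome)

print(len(results))  # 2
```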