Distributing jobs over multiple servers using Python
Question:
I currently have an executable that uses all the cores on my server when it runs. I want to add a second server and split the jobs between the two machines, with each job still using all the cores on the machine it runs on. If both machines are busy, the next job should wait in a queue until one of them becomes free.
I thought this could be controlled from Python, but I am a novice and not sure which Python package would suit this problem best.
I liked the “heapq” module for queuing the jobs, but it looks like it is designed for use on a single server. I then looked into IPython.parallel, but it seems designed more for creating a separate, smaller job for every core (on one or more servers).
I saw a huge list of options here (https://wiki.python.org/moin/ParallelProcessing), but I could do with some guidance on which way to go for a problem like this.
Can anyone suggest a package that may help with this problem, or a different way of approaching it?
Answers:
Celery does exactly what you want: it makes it easy to distribute a task queue across multiple (many) machines.
See the Celery tutorial to get started.
Alternatively, IPython has its own parallel computing library, IPython.parallel (now maintained separately as ipyparallel), which is based on ZeroMQ; see its introduction. I have not used it myself, but it looks pretty straightforward.
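For just two machines, the behaviour asked for in the question (one job per machine, everything else waiting) can also be sketched with the standard library alone. This is neither Celery nor IPython.parallel, just an illustration of the queueing idea: one thread per server pulls the next job from a shared queue only when its server is free. The names run_jobs and run_over_ssh are made up for this example, and it assumes password-less ssh access to each host.

```python
import queue
import subprocess
import threading


def run_over_ssh(host, command):
    """Run command (a list of strings) on host via ssh and return the exit code.

    Assumption: key-based ssh login works and the arguments need no shell quoting.
    """
    return subprocess.run(["ssh", host, *command]).returncode


def run_jobs(hosts, jobs, runner=run_over_ssh):
    """Run every job on whichever host is free; a host never runs two jobs at once."""
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    results = []  # (host, job, exit_code) tuples in completion order
    lock = threading.Lock()

    def worker(host):
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return  # no jobs left for this host
            code = runner(host, job)  # blocks until the job has finished
            with lock:
                results.append((host, job, code))

    # One thread per host, so each host has exactly one job slot.
    threads = [threading.Thread(target=worker, args=(host,)) for host in hosts]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

A call such as `run_jobs(["server1", "server2"], [["./my_executable", "input1"], ["./my_executable", "input2"], ["./my_executable", "input3"]])` (host and file names are placeholders) would start two jobs immediately and hold the third until one server finishes. Unlike Celery, this sketch has no persistence or retries: if the controlling script dies, the queue is lost.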
You can use the pyremto package (which I developed) to schedule jobs across multiple clients / servers:
https://pypi.org/project/pyremto/
You just set up the job descriptions (as JSON-serializable dicts):
remote = RemoteControl(show_qr=True)
jobs = [{"x": 0.25}, {"x": 1.65}, ...]  # one dict per job
remote.setup_jobs(jobs, max_redistribution_attempts=1)  # jobs is a list
Then, on your clients, you can query for the next job and mark it as done:
remote.get_next_job()
remote.set_job_done(job_id, result)  # job_id: int, result: dict
The details are explained in this example:
https://github.com/MatthiasKi/pyremto/tree/master/examples/JobScheduling/GeneralJobScheduling
Note that you can track the progress of the job execution on your smartphone, log values (and send push notifications if required) from your clients to the smartphone, and send commands / inputs back to your clients from the smartphone app.