Threads vs. Async

Question:

I have been reading up on the threaded model of programming versus the asynchronous model from this really good article. http://krondo.com/blog/?p=1209

However, the article mentions the following points.

An async program will simply outperform a sync program by switching between tasks whenever there is a I/O.
Threads are managed by the operating system.

I remember reading that threads are managed by the operating system by moving around TCBs between the Ready-Queue and the Waiting-Queue(amongst other queues). In this case, threads don’t waste time on waiting either do they?

In light of the above mentioned, what are the advantages of async programs over threaded programs?

Asked By: user277465

Source

Answers:

First of all, note that a lot of the detail of how threads are implemented and scheduled are very OS-specific. In general, you shouldn’t need to worry about threads waiting on each other, since the OS and the hardware will attempt to arrange for them to run efficiently, whether asynchronously on a single-processor system or in parallel on multi-processors.

Once a thread has finished waiting for something, say I/O, it can be thought of as runnable. Threads that are runnable will be scheduled for execution at some point soon. Whether this is implemented as a simple queue or something more sophisticated is, again, OS- and hardware-specific. You can think of the set of blocked threads as a set rather than as a strictly ordered queue.

Note that on a single-processor system, asynchronous programs as defined here are equivalent to threaded programs.

Answered By: Joe Kearney

It is very difficult to write code that is thread safe. With asyncronous code, you know exactly where the code will shift from one task to the next and race conditions are therefore much harder to come by.
Threads consume a fair amount of data since each thread needs to have its own stack. With async code, all the code shares the same stack and the stack is kept small due to continuously unwinding the stack between tasks.
Threads are OS structures and are therefore more memory for the platform to support. There is no such problem with asynchronous tasks.

Answered By: doron

Async I/O means there is already a thread in the driver that does the job, so you are duplicating functionality and incurring some overhead. On the other hand, often it is not documented how exactly the driver thread behaves, and in complex scenarios, when you want to control timeout/cancellation/start/stop behaviour, synchronization with other threads, it makes sense to implement your own thread. It is also sometimes easier to reason in sync terms.

Answered By: Boris Geller

see http://en.wikipedia.org/wiki/Thread_(computing)#I.2FO_and_scheduling

However, the use of blocking system calls in user threads (as opposed to kernel threads) or fibers can be problematic. If a user thread or a fiber performs a system call that blocks, the other user threads and fibers in the process are unable to run until the system call returns. A typical example of this problem is when performing I/O: most programs are written to perform I/O synchronously. When an I/O operation is initiated, a system call is made, and does not return until the I/O operation has been completed. In the intervening period, the entire process is “blocked” by the kernel and cannot run, which starves other user threads and fibers in the same process from executing.

According to this, your whole process might be blocked, and no thread will be scheduled when one thread is blocked in IO. I think this is os-specific, and will not always be hold.

Answered By: gleery

There are two ways to create threads:

synchronous threading – the parent creates one (or more) child threads and then must wait for each child to terminate. Synchronous threading is often referred to as the fork-join model.

asynchronous threading – the parent and child run concurrently/independently of one another. Multithreaded servers typically follow this model.

resource – http://www.amazon.com/Operating-System-Concepts-Abraham-Silberschatz/dp/0470128720

Answered By: aidanmelen

Assume you have 2 tasks, which does not involve any IO (on multiprocessor machine).
In this case threads outperform Async. Because Async like a
single threaded program executes your tasks in order. But threads can
execute both the tasks simultaneously.
Assume you have 2 tasks, which involve IO (on multiprocessor machine).
In this case both Async and Threads performs more or less same (performance
might vary based on number of cores, scheduling, how much process intensive
the task etc.). Also Async takes less amount of resources, low overhead and
less complex to program over multi threaded program.

How it works?
Thread 1 executes Task 1, since it is waiting for IO, it is moved to IO
waiting Queue. Similarly Thread 2 executes Task 2, since it is also involves
IO, it is moved to IO waiting Queue. As soon as it’s IO request is resolved
it is moved to ready queue so the scheduler can schedule the thread for
execution.

Async executes Task 1 and without waiting for it’s IO to complete it
continues with Task 2 then it waits for IO of both the task to complete. It
completes the tasks in the order of IO completion.

Async best suited for tasks which involve Web service calls, Database query
calls etc.,
Threads for process intensive tasks.

The below video explains aboutAsync vs Threaded model and also when to use etc.,
https://www.youtube.com/watch?v=kdzL3r-yJZY

Hope this is helpful.

Answered By: Lakshmipathi

To be fair, let’s point out the benefit of Threads under CPython GIL compared to async approach:

it’s easier first to write typical code that has one flow of events (no parallel execution) and then to run multiple copies of it in separate threads: it will keep each copy responsive, while the benefit of executing all I/O operations in parallel will be achieved automatically;
many time-proven libraries are sync and therefore easy to be included in the threaded version, and not in async one;
some sync libraries actually let GIL go at C level that allows parallel execution for tasks beyond I/O-bound ones: e.g. NumPy;
it’s harder to write async code in general: the inclusion of a heavy CPU-bound section will make concurrent tasks not responsive, or one may forget to await the result and finish execution earlier.

So if there are no immediate plans to scale your services beyond ~100 concurrent connections it may be easier to start with a threaded version and then rewrite it… using some other more performant language like Go.

Answered By: MjH