What is faster: multiple `send`s or using buffering?


I’m playing around with sockets in C/Python, and I wonder what the most efficient way is to send headers from a Python dictionary to the client socket.

My ideas:

  1. Use a separate send call for every header. Pros: no memory allocation needed. Cons: many send calls, which is probably error-prone, and the error handling would be rather complicated.
  2. Use a buffer. Pros: one send call, and error checking is a lot easier. Cons: I need a buffer 🙂 malloc/realloc is probably rather slow, and using an oversized buffer to avoid realloc calls wastes memory.

Any tips for me? Thanks 🙂

Asked By: Jonas H.



Unless you’re sending a truly huge amount of data, you’re probably better off using one buffer. If you grow the buffer size in a geometric progression (for example, doubling it each time it fills up), the number of allocations grows only logarithmically with the final size, so the amortized allocation cost per byte is constant, and the time spent allocating the buffer will generally follow.
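
As a rough illustration of geometric growth, here is a minimal Python sketch. The class `GrowableBuffer` and its `reallocations` counter are hypothetical names invented for this example; in C the same idea would be a `realloc` to twice the current capacity. In everyday Python you would normally just append to a `bytearray`, which manages its own growth.

```python
class GrowableBuffer:
    """Append-only byte buffer that doubles its capacity when full.

    Hypothetical helper for illustration only.
    """

    def __init__(self, initial_capacity=64):
        if initial_capacity < 1:
            raise ValueError("initial_capacity must be at least 1")
        self._data = bytearray(initial_capacity)  # preallocated storage
        self._size = 0                            # bytes actually in use
        self.reallocations = 0                    # how often we had to grow

    def append(self, chunk):
        needed = self._size + len(chunk)
        if needed > len(self._data):
            new_capacity = len(self._data)
            while new_capacity < needed:
                new_capacity *= 2                 # geometric growth
            new_data = bytearray(new_capacity)
            new_data[:self._size] = self._data[:self._size]
            self._data = new_data
            self.reallocations += 1
        self._data[self._size:needed] = chunk
        self._size = needed

    def getvalue(self):
        return bytes(self._data[:self._size])


buf = GrowableBuffer(initial_capacity=64)
for _ in range(1000):
    buf.append(b"X-Header: value\r\n")  # 17 bytes per append
print(buf.reallocations)
```

With these numbers, 1000 appends (17,000 bytes in total) trigger only 9 reallocations, because the capacity doubles from 64 up to 32,768 bytes.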

Answered By: Jerry Coffin

A send() call implies a round-trip to the kernel (the part of the OS which deals with the hardware directly). It has a unit cost of about a few hundred clock cycles. This is harmless unless you are trying to call send() millions of times.

Usually, buffering is about calling send() only once in a while, when “enough data” has been gathered. “Enough” does not mean “the whole message” but something like “enough bytes so that the unit cost of the kernel round-trip is dwarfed”. As a rule of thumb, an 8-kB buffer (8192 bytes) is traditionally considered a good size.
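
The threshold idea could be sketched in Python roughly as follows. `BufferedSender`, `FLUSH_THRESHOLD`, and the `send_calls` counter are made-up names for this example, and `socket.socketpair()` only stands in for a real client connection.

```python
import socket

FLUSH_THRESHOLD = 8192  # the 8 kB rule of thumb mentioned above


class BufferedSender:
    """Collects small writes and hands them to the kernel in larger batches.

    Hypothetical helper for illustration only.
    """

    def __init__(self, sock, threshold=FLUSH_THRESHOLD):
        self._sock = sock
        self._threshold = threshold
        self._pending = bytearray()
        self.send_calls = 0  # counts sendall() calls, for illustration

    def write(self, data):
        self._pending += data
        if len(self._pending) >= self._threshold:
            self.flush()

    def flush(self):
        if self._pending:
            self._sock.sendall(self._pending)
            self.send_calls += 1
            self._pending.clear()


left, right = socket.socketpair()  # local stand-in for a client connection
sender = BufferedSender(left)
for i in range(100):
    sender.write(b"X-Header-%d: value\r\n" % i)
sender.flush()  # push out whatever is still below the threshold
```

Here the 100 small writes add up to less than 8192 bytes, so everything goes out in the single `sendall()` made by the final `flush()`. Forgetting that final flush is the classic mistake with this pattern.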

Anyway, for all performance-related questions, nothing beats an actual measurement. Try it. Most of the time, there is not any actual performance problem worth worrying about.
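
One way to try it is a small timing harness like the sketch below. It is only a starting point: the header contents and round count are arbitrary, the helper names are invented here, and a local `socket.socketpair()` does not go through TCP, so the numbers say something about per-call overhead but nothing about network behaviour.

```python
import socket
import time

HEADERS = [b"X-Header-%d: value\r\n" % i for i in range(20)]
TOTAL = sum(len(h) for h in HEADERS)


def drain(sock, nbytes):
    """Read exactly nbytes so the socket buffer never fills up."""
    while nbytes > 0:
        nbytes -= len(sock.recv(min(nbytes, 65536)))


def time_many_sends(tx, rx, rounds):
    """One sendall() per header."""
    start = time.perf_counter()
    for _ in range(rounds):
        for header in HEADERS:
            tx.sendall(header)
        drain(rx, TOTAL)
    return time.perf_counter() - start


def time_single_send(tx, rx, rounds):
    """Join the headers first, then one sendall() per message."""
    start = time.perf_counter()
    for _ in range(rounds):
        tx.sendall(b"".join(HEADERS))
        drain(rx, TOTAL)
    return time.perf_counter() - start


tx, rx = socket.socketpair()
many = time_many_sends(tx, rx, rounds=1000)
single = time_single_send(tx, rx, rounds=1000)
print(f"20 sends per message: {many:.4f} s")
print(f"1 send per message:   {single:.4f} s")
tx.close()
rx.close()
```

Both timings include the cost of reading the data back on the other end, so only the difference between them is meaningful.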

Answered By: Thomas Pornin

Because of the way TCP congestion control works, it’s more efficient to send data all at once. TCP maintains a window of how much data it will allow to be “in the air” (sent but not yet acknowledged). TCP measures the acknowledgments coming back to figure out how much data it can have “in the air” without causing congestion (i.e., packet loss). If there isn’t enough data coming from the application to fill the window, TCP can’t make accurate measurements so it will conservatively shrink the window.

If you only have a few, small headers and your calls to send are in rapid succession, the operating system will typically buffer the data for you and send it all in one packet. In that case, TCP congestion control isn’t really an issue. However, each call to send involves a context switch from user mode to kernel mode, which incurs CPU overhead. In other words, you’re still better off buffering in your application.
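
For the dictionary-of-headers case from the question, application-side buffering can be as simple as the Python sketch below. It assumes HTTP-style `Name: value` lines terminated by a blank line and Latin-1 encoding; neither is stated in the question, and `serialize_headers` and `send_headers` are names invented for this example.

```python
import socket


def serialize_headers(headers):
    """Join all header lines into one bytes object, ending with a blank line.

    Assumes HTTP-style "Name: value" lines; adjust for other protocols.
    """
    lines = [f"{name}: {value}\r\n" for name, value in headers.items()]
    lines.append("\r\n")
    return "".join(lines).encode("latin-1")


def send_headers(sock, headers):
    # A single sendall() call means a single place to handle errors.
    sock.sendall(serialize_headers(headers))


headers = {"Content-Type": "text/plain", "Content-Length": "5"}
server_side, client_side = socket.socketpair()  # stand-in for a real client
send_headers(server_side, headers)
```

Building the list and joining it once keeps the copying inside Python's built-in string machinery, so there is no hand-written buffer logic to maintain.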

There is (at least) one case where you’re better off without buffering: when your buffer is slower than the context switching overhead. If you write a complicated buffer in Python, that might very well be the case. A buffer written in pure Python (running on CPython) is going to be quite a bit slower than the finely optimized buffer in the kernel. It’s quite possible that buffering would cost you more than it buys you.

When in doubt, measure.

One word of caution though: premature optimization is the root of all evil. The difference in efficiency here is pretty small. If you haven’t already established that this is a bottleneck for your application, go with whatever makes your life easier. You can always change it later.

Answered By: Daniel Stutzbach

I’m running a Linux WebSocket server on Google Compute. Two calls to send take my ping from 80 ms all the way up to 200 ms! WTF!?

send(socket, header);
send(socket, payload);

This is completely ridiculous! Make sure you test. (The header is 2 bytes and the payload is 4 bytes. How 200 ms is even possible, I have no idea.)
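
A likely cause, though this is an assumption and not something diagnosed in this answer, is the interaction between Nagle's algorithm on the sender and delayed ACKs on the receiver: the second small write can be held back until the first one is acknowledged. The Python sketch below shows the two usual workarounds, sending header and payload in one call or setting `TCP_NODELAY`. The function names and the frame bytes are illustrative only.

```python
import socket


def send_frame(sock, header, payload):
    # One write instead of two: the kernel sees the whole frame at once.
    sock.sendall(header + payload)


def disable_nagle(sock):
    # Alternative: ask TCP not to hold back small segments.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


# TCP_NODELAY only applies to TCP sockets.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
disable_nagle(tcp_sock)

# Local stand-in for a real connection: 2-byte header, 4-byte payload.
conn, peer = socket.socketpair()
send_frame(conn, b"\x81\x04", b"ping")
```

Combining the writes is usually the simpler fix, since it also saves a system call. `TCP_NODELAY` is worth testing when the two pieces cannot easily be joined.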

Answered By: Farzher