Monday November 25, 2013
I’m used to reading technical inaccuracies when it comes to system programming, and writings about the Nagle algorithm rank very high on the list.
The problem is that when it comes to system programming, people just blindly repeat what they have read or heard (or believed they have read or heard) without understanding what it really is about. More often than not you just need to write a couple of lines and cannot invest the required time to understand the reasons behind otherwise obscure rules.
For the Nagle algorithm you can find these two assertions on the Internet:
So which myth do I want to debunk? Both of course!
The Nagle algorithm is a network optimization technique for the following problem: assume you have an interactive terminal (a telnet) connected to a remote host via TCP. Assume your application does not buffer and sends the characters as you type them. That means that if you type ten characters, it will send 10 TCP packets. Every TCP packet incurs a 40-byte overhead because of the IP and TCP headers. That means that you will transmit 410 bytes on the network for 10 bytes of useful information, an efficiency of less than 3%.
What the Nagle algorithm does is cleverly buffer your outgoing messages. Instead of sending 10 TCP packets, it may send only 1, which means only 50 bytes will be transmitted on the network. That is 20% efficiency, roughly an eightfold improvement!
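The arithmetic above can be checked with a few lines of C++. This is only a sketch: the 40-byte header size is the figure used in this post (IP and TCP headers without options), and the names `kHeaderBytes`, `wire_bytes` and `efficiency` are made up for the illustration.

```cpp
#include <cstddef>

// Assumption from the text: 40 bytes of IP + TCP headers per packet, no options.
constexpr std::size_t kHeaderBytes = 40;

// Total bytes on the wire when payload_bytes are spread over `packets` packets.
constexpr std::size_t wire_bytes(std::size_t payload_bytes, std::size_t packets) {
    return payload_bytes + packets * kHeaderBytes;
}

// Fraction of the transmitted bytes that is useful payload.
constexpr double efficiency(std::size_t payload_bytes, std::size_t packets) {
    return static_cast<double>(payload_bytes) /
           static_cast<double>(wire_bytes(payload_bytes, packets));
}
```

With these definitions, `wire_bytes(10, 10)` is 410 and `wire_bytes(10, 1)` is 50, so the efficiency goes from about 2.4% to 20%.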
Not only will the algorithm send fewer bytes over the network, which is always good, but the fewer TCP packets you have, the less work the TCP/IP stack does, which will also increase performance!
But wait, there’s more! Nothing guarantees that ten separate packets will all follow the same route and arrive in the right order (reordering is not frequent, but it is possible). Since you are sending only one packet instead of ten, you are not subject to that problem, and this might even reduce latency.
The algorithm is even cleverer than that, because it will buffer your packet only if all three of the following hold: the data is too small, there are unacknowledged packets on the wire, and your packet doesn’t have the FIN flag. In other words, it will not add any latency if the remote peer acknowledges packets fast enough or if the data was large enough in the first place.
It also means that it can be safely enabled by default (that’s actually the case on most operating systems).
You can turn the Nagle algorithm off when you understand what it does and you have verified it is negatively impacting your application.
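The three conditions described above can be written down as a small predicate. This is a simplified model and not what any real TCP/IP stack runs: the `SendState` structure and its field names are hypothetical, and "too small" is taken here to mean "smaller than one maximum segment", which is an assumption of this sketch.

```cpp
#include <cstddef>

// Hypothetical snapshot of what a sender looks at before transmitting.
struct SendState {
    std::size_t pending_bytes;  // data the application has queued
    std::size_t max_segment;    // maximum segment size (assumed threshold for "too small")
    bool unacked_in_flight;     // earlier data has been sent but not yet acknowledged
    bool fin;                   // the segment carries the FIN flag
};

// True when the data would be held back: all three conditions must hold.
bool nagle_would_buffer(const SendState& s) {
    const bool too_small = s.pending_bytes < s.max_segment;
    return too_small && s.unacked_in_flight && !s.fin;
}
```

Note that a one-byte write on an idle connection (nothing unacknowledged) is not delayed in this model, which is why the algorithm costs nothing when the peer acknowledges quickly.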
As stated above, the impact of the Nagle algorithm on your application is generally positive or neutral. For neutral cases it could be argued that the algorithm could be turned off since it doesn’t bring anything positive, but I would retort that it’s best to leave the system’s default alone unless needed.
Turning the Nagle algorithm off means setting the TCP_NODELAY flag to true, which can be confusing!
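To make the inverted logic explicit, here is a minimal sketch using the POSIX socket API directly. The helper names `set_nagle_enabled` and `nagle_enabled` are invented for this example; only `setsockopt`, `getsockopt` and `TCP_NODELAY` come from the system headers.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Enables or disables the Nagle algorithm on a TCP socket descriptor.
// Returns true on success.
bool set_nagle_enabled(int fd, bool enabled) {
    // The flag is inverted: TCP_NODELAY = 1 means the Nagle algorithm is OFF.
    const int no_delay = enabled ? 0 : 1;
    return ::setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
                        &no_delay, sizeof(no_delay)) == 0;
}

// Reads the flag back. Returns true if the Nagle algorithm is active
// (also returns true if the query itself fails, since no_delay stays 0).
bool nagle_enabled(int fd) {
    int no_delay = 0;
    socklen_t len = sizeof(no_delay);
    ::getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &no_delay, &len);
    return no_delay == 0;
}
```

Wrapping the option behind a positively named helper is one way to avoid the "true means off" confusion at call sites.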
It is often said that you should disable the Nagle algorithm if you care about latency. This statement is inaccurate. You should disable the Nagle algorithm if your application writes its messages in a way that makes buffering irrelevant. Such cases include applications that implement their own buffering mechanism and applications that send their messages in a single system call.
Let me expand a little on what I mean by the latter.
If your program writes its message in a single call, the Nagle algorithm will not bring anything useful. It will at best do nothing and at worst add a delay by waiting for either the packet to be large enough or the remote host to acknowledge packets in flight, if any. This can severely impact performance by adding a delay equal to the round trip between the two hosts, while not improving bandwidth usage, since your message is complete and no further data is coming.
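As an illustration of "writing the message in a single call", the sketch below assembles a header and a payload into one contiguous buffer and hands it to the kernel with one `send()`. The 4-byte length-prefix framing and the names `build_frame` and `send_frame` are assumptions made up for this example; they are not the quasardb protocol.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <cstdint>
#include <string>
#include <vector>

// Builds one contiguous buffer: a 4-byte big-endian length, then the payload.
std::vector<char> build_frame(const std::string& payload) {
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::vector<char> frame;
    frame.reserve(4 + payload.size());
    for (int shift = 24; shift >= 0; shift -= 8) {
        frame.push_back(static_cast<char>((n >> shift) & 0xFF));
    }
    frame.insert(frame.end(), payload.begin(), payload.end());
    return frame;
}

// Hands the whole frame to the kernel with a single send() call.
// send() may still accept fewer bytes than requested; a real program
// would loop on the remainder.
ssize_t send_frame(int fd, const std::string& payload) {
    const std::vector<char> frame = build_frame(payload);
    return ::send(fd, frame.data(), frame.size(), 0);
}
```

The alternative of calling `send()` once for the header and once for the payload is exactly the pattern where the second write can end up waiting for an acknowledgement of the first.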
What if you know that your application isn’t behaving optimally?
It might not always be possible to have your messages written to the socket in one single call. In that case you may want to consider the TCP_NOPUSH (TCP_CORK on Linux) option. This option tells the TCP/IP stack not to send any message until the buffer is filled or the socket is flushed or closed. That way, you can scatter several write calls through your code and have the data actually sent only when you flush the socket.
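A possible way to use this option from C++ is a small scope guard that holds data back while it is alive and releases it on destruction. This is a sketch under assumptions: `kHoldOption`, `set_hold` and `HoldGuard` are hypothetical names, and it relies on the behaviour documented for Linux, where clearing TCP_CORK sends the queued data; the exact flush semantics of TCP_NOPUSH on other systems may differ.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Pick whichever option the platform offers.
#if defined(TCP_CORK)
constexpr int kHoldOption = TCP_CORK;    // Linux
#elif defined(TCP_NOPUSH)
constexpr int kHoldOption = TCP_NOPUSH;  // BSD-derived systems
#else
#error "neither TCP_CORK nor TCP_NOPUSH is available on this platform"
#endif

// Turns the hold option on or off. Returns true on success.
bool set_hold(int fd, bool on) {
    const int value = on ? 1 : 0;
    return ::setsockopt(fd, IPPROTO_TCP, kHoldOption,
                        &value, sizeof(value)) == 0;
}

// Holds outgoing data for the lifetime of the guard, then releases it.
class HoldGuard {
public:
    explicit HoldGuard(int fd) : fd_(fd) { set_hold(fd_, true); }
    ~HoldGuard() { set_hold(fd_, false); }
    HoldGuard(const HoldGuard&) = delete;
    HoldGuard& operator=(const HoldGuard&) = delete;
private:
    int fd_;
};
```

Typical usage would be to create a `HoldGuard` on the socket, issue the scattered writes, and let the guard go out of scope once the message is complete.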
In quasardb we disabled the Nagle algorithm, but didn’t enable the TCP_NOPUSH flag.
We don’t need to, because packets are written in a single system call and our protocol is designed in such a way that requests and replies always consist of a single message. In our benchmarks, we’ve noticed a dramatic loss of performance for small data exchanges when the Nagle algorithm was active.
For the record, when using Boost.Asio, disabling the Nagle algorithm is done like this:
// assuming a boost::asio::ip::tcp::socket object named s
s.set_option(boost::asio::ip::tcp::no_delay(true));
Applications like quasardb are the exception, not the norm, and we took the decision to disable the algorithm only after careful consideration and validation.
I invite you to do the same for your application: make sure it is relevant and measure its impact.