HAProxy Technologies is proud to announce the release of HAProxy 1.9. This release brings a native HTTP representation (HTX) powering end-to-end HTTP/2 support and paving the way for future innovations such as HTTP/3 (QUIC). It also contains improvements to buffers and connection management including connection pooling to backends, threading optimizations, updates to the Runtime API, and much more.

UPDATE: HAProxy 1.9.2 Adds gRPC Support

HAProxy, the world’s fastest and most widely used software load balancer, was first released in December 2001. The load balancer landscape has changed significantly since then. Yet HAProxy, with 17 years of active development under its belt, has continued to evolve and innovate. Today, we’re announcing the release of HAProxy 1.9.

This release focuses on laying the foundation that will allow us to continue to provide best-in-class performance while accelerating cutting edge feature delivery for modern environments. Exciting near-term features on the roadmap, thanks to the core improvements in 1.9, include Layer 7 retries, circuit breaking, gRPC, the new Data Plane API, and much more.

The advancements in this version are thanks to a strong community of contributors. They submit code for new functionality and bug fixes, test each feature for quality assurance, and correct documentation typos. Everyone has done their part to make this release possible!

We also saw a need to release updates more often. Going forward, HAProxy will be moving from an annual release cycle to a biannual release cycle. While previous major releases would happen each year around November/December, starting today we will begin releasing twice per year. Note that this version is backwards compatible with older configurations.

HAProxy 1.9 can be broken down into the following categories:

Buffer Improvements

Connection Management

Native HTTP Representation (HTX)

Improved Threading

Cache Improvements

Early Hints (HTTP 103)

Runtime API Improvements

Server Queue Priority Control

Random Load Balancing Algorithm

Cloud-Native Logging

New Fetches

New Converters

Miscellaneous Improvements

Regression Test Suite

In the following sections, we’ll dig into these categories and share the improvements you can expect to see.

Buffer Improvements

HAProxy already supports HTTP/2 to the client. A major goal of this release was to support end-to-end HTTP/2, including to the backend server. We also wanted to support any future version of HTTP, such as HTTP/3 (QUIC).

Our R&D team put a tremendous amount of effort into establishing what changes would be required to make this happen. A very important one that was discovered involves the way in which HAProxy handles buffers.

The buffer is an area of storage cut into two parts: input data that has not been analyzed and output data. The buffer can start and end anywhere. In previous versions of HAProxy there were 22 possible buffer cases (times two versions of the code: input and output), as the graphic below illustrates:

This shows a breakdown of the various types of buffers and how they are allocated, pre-version 1.9. A decision was made to rewrite the buffer handling and simplify buffer allocation. Below is a diagram showing the latest buffer changes:

The latest changes reduce the number of buffer cases to seven, with only one version of the code to maintain.

In addition to this refactoring, the buffer’s header, which describes the buffer state, has been split from the storage area. This means that it is not mandatory anymore to have a single representation for the same data and that multiple actors may use the same storage in a different state. This is typically used in the lower layer muxes during data transfers to avoid memory copies (“zero-copy”) by aliasing the same data block by the reader and the writer. This has resulted in a performance increase for HTTP/2.

This was not an easy task but will bring a lot of benefits. Namely, as mentioned, it paves the way for easier implementation of end-to-end HTTP/2. It also simplifies several other things including the internal API, handling of error messages, and rendering the Stats page.

Connection Management

The connection management in HAProxy 1.9 received some big improvements. The new implementation has moved from a callback-oriented model to an async events model with completion callbacks. This new design will be extremely beneficial and will reduce the number of bugs that can appear within the connection layer.

Some of the benefits of the new design include: lower send() latency (it almost never polls), fewer round-trips between layers (better I-cache efficiency), straightforward usage within the upper layers, and less code duplication and more granular error reporting within the lower layers. It will also provide the ability to retry failed connections using a different protocol (e.g. switch between HTTP/2 and HTTP/1 if ALPN indicates support for both and a failure happens on one of them).

The http-reuse directive now defaults to safe if not set. This means that the first request of a session to a backend server is always sent over its own connection. Subsequent requests may reuse other existing, idle connections. This has been the recommended setting for several years and it was decided that it was time to make it the default.
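As a minimal sketch, the new default is equivalent to setting the directive explicitly in a backend. The backend name, server name, and address below are placeholders, not values from this release:

```haproxy
backend be_app
    mode http
    # Same behavior as the 1.9 default: the first request of a session
    # gets its own connection, later requests may reuse idle ones.
    http-reuse safe
    server app1 192.0.2.10:8080
```

Stating it explicitly can be useful when the same configuration must behave identically on older versions, where the default differed.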

In addition, HAProxy now provides connection pooling. Idle connections between HAProxy and the server are no longer closed immediately if the frontend connection vanishes. They remain open on the server to be used by other requests.
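The pool can be tuned per server. The sketch below assumes the pool-max-conn and pool-purge-delay server settings; the values, names, and address are illustrative only and should be checked against the configuration manual for your exact version:

```haproxy
backend be_app
    mode http
    http-reuse safe
    # Keep at most 50 idle connections to this server (illustrative value)
    # and start purging idle ones after 5 seconds (illustrative value).
    server app1 192.0.2.10:8080 pool-max-conn 50 pool-purge-delay 5s
```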

Native HTTP Representation (HTX)

While researching the route needed to support future versions of HTTP, it was decided that the internal handling of HTTP messages required a redesign. Previously, HTTP messages were kept as a byte stream as it appears on the wire, and the disparate tasks of analyzing that data and processing it were mixed into a single phase.

The HTTP information was collected as a stream of bytes and manipulated using offsets. The main structure had two channels: the request/response and the HTTP transaction. The first channel buffered the request and response messages as strings. The HTTP transaction channel had two states: response/request and another with all of the offsets for the headers.

With everything stored as offsets, when it came to adding, removing and rewriting HTTP data, things became quite painful, constantly requiring the movement of the end of the headers and even, possibly, the HTTP body in the case of responses. Over time, the need for header manipulation has increased with cookies, keep-alive, compression and cache, making this task expensive.

The new design, which we call HTX, creates an internal, native representation of the HTTP protocol(s). It creates a list of strongly typed, well-delineated header fields that support gaps and out-of-order fields. Modifying a header now simply consists of marking the old one as deleted and appending the new one at the end.

This provides easy manipulation of any representation of the HTTP protocol, allows us to maintain HTTP transport and semantics from end-to-end, and provides higher performance when translating HTTP/2 to HTTP/1.1 or HTTP/1.1 to HTTP/2. It splits analyzing from processing so that, now, the analysis and formatting happen in the connection layer and the processing happens in the application layer.

Since we’re performing additional testing, HTX is not yet enabled by default. Enable it by using the following option in a defaults, frontend, backend or listen section:
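A minimal sketch of enabling HTX for every proxy through a defaults section; placing it there is one choice among the sections listed above:

```haproxy
defaults
    mode http
    # Switch HTTP proxies to the native HTTP representation (HTX).
    option http-use-htx
```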

Once turned on, you can use HTTP/2 to your backend servers. Add alpn h2 to a server line (or alpn h2,http/1.1 if you prefer to let HAProxy negotiate the protocol with the server).
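For instance, a server line that lets HAProxy negotiate the protocol might look like the sketch below. The server name and address are placeholders, and verify none is used only to keep the example short:

```haproxy
backend be_main
    mode http
    option http-use-htx
    # Offer HTTP/2 first and fall back to HTTP/1.1 via ALPN.
    server server1 192.0.2.13:443 ssl verify none alpn h2,http/1.1
```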

Here is a full frontend + backend displaying end-to-end HTTP/2:
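The following sketch shows one way such a pair could be written. Proxy names, the certificate path, and the server address are assumptions for illustration; a production setup would normally verify the server certificate instead of using verify none:

```haproxy
frontend fe_main
    mode http
    option http-use-htx
    bind :80
    # HTTP/2 to the client, negotiated over TLS with ALPN.
    bind :443 ssl crt /etc/haproxy/certs/www.example.com.pem alpn h2,http/1.1
    default_backend be_main

backend be_main
    mode http
    option http-use-htx
    # HTTP/2 to the server as well, which requires HTX on both sides.
    server server1 192.0.2.13:443 ssl verify none alpn h2
```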

HAProxy 1.9 also supports the proto h2 directive which allows HAProxy to communicate using HTTP/2 without TLS, such as to HTTP/2-enabled backends like Varnish and H2O. You can enable this with the following server configuration:
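A sketch of a clear-text HTTP/2 server line, assuming a backend that speaks HTTP/2 without TLS on port 8080 (name, address, and port are placeholders):

```haproxy
backend be_h2c
    mode http
    option http-use-htx
    # No ssl keyword: HTTP/2 is spoken directly over TCP.
    server server1 192.0.2.13:8080 proto h2
```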

Improved Threading

Significant improvements were made to threading in 1.9, centered on a rework of the task scheduler that lets HAProxy scale better across threads. The scheduler now divides its work into three levels:

a priority-aware level, shared between all threads

a lockless, priority-aware level; one per thread

a per-thread list of already started tasks that can be used for I/O

This results in most of the scheduling work being performed without any locks, which scales much better. Also, an optimization was made in the scheduler regarding its wait queues. They are now mostly lock free. The memory allocator became lockless and uses a per-thread cache of recently used objects that are still hot in the CPU cache, resulting in much faster structure initialization. The file descriptor event cache became mostly lockless as well, allowing much faster concurrent I/O operations. Last, the file descriptor (FD) lock has been updated so that it’s used less frequently. Overall, you should expect to see about a 60% performance gain when using HAProxy 1.9 with threading enabled.
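To benefit from these changes, threading has to be enabled in the global section. The sketch below assumes a build with thread support; the thread count of 4 is an arbitrary example and should match the CPU cores you want to dedicate:

```haproxy
global
    # Run a single process with 4 threads (illustrative value).
    nbthread 4
```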

Cache Improvements

We introduced the Small Object Cache in HAProxy 1.8. At the time, we knew it was only the beginning of a feature many have asked for: caching within the proxy layer. Internally, we referred to it as the favicon cache because it was limited to caching objects smaller than tune.bufsize, which defaults to 16KB. Also, during that first version, it could only cache objects that returned a response code of HTTP 200 OK.

We’re happy to announce that, in HAProxy 1.9, you can now cache objects up to 2GB in size, set with max-object-size. The total-max-size setting determines the total size of the cache and can be increased up to 4095MB. We’re very excited about these changes and look forward to improving the cache even further in the future!
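A minimal sketch of a cache using both settings. The cache name, sizes, max-age, and server address are illustrative assumptions; total-max-size is expressed in megabytes and max-object-size in bytes:

```haproxy
cache mycache
    # Total cache size in MB (4095 is the maximum mentioned above).
    total-max-size 4095
    # Largest cacheable object in bytes (here 1 MB, illustrative).
    max-object-size 1048576
    # Seconds an object stays in the cache (illustrative).
    max-age 60

backend be_main
    mode http
    # Serve from the cache when possible, store eligible responses.
    http-request cache-use mycache
    http-response cache-store mycache
    server server1 192.0.2.13:8080
```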

Early Hints (HTTP 103)

HAProxy now supports HTTP status code 103, also known as Early Hints (RFC 8297), which allows you to send the client a list of links to objects to preload before the server even starts to respond. Still in early adoption, Early Hints looks like it may replace HTTP/2 Server Push.

A few elements make Early Hints an improvement over Server Push. They are as follows:

Server Push can accelerate the delivery of resources, but only resources for which the server is authoritative. In other words, it must follow the same-origin policy, which in some cases hinders the usage of a CDN. Early Hints can point directly to a CDN-hosted object.

Early Hints can give the browser the opportunity to use a locally-cached version of the object. Server Push requires that the request be transmitted to the origin regardless of whether the client has the response cached.

To enable the use of Early Hints you would add something similar to the following to your HAProxy configuration file:
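A sketch using the http-request early-hint action. The frontend name and the two asset paths are hypothetical; each rule adds one Link header to the 103 response:

```haproxy
frontend fe_main
    mode http
    bind :80
    # Hint the browser to preload a stylesheet and a script
    # while the server is still preparing the final response.
    http-request early-hint Link "</style.css>; rel=preload; as=style"
    http-request early-hint Link "</script.js>; rel=preload; as=script"
```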

While many browsers are still working to support this new feature, you can be sure that HAProxy will be at the forefront when it comes to providing enhancements that improve the browsing experience of your site.

Runtime API Improvements

We’ve updated the Runtime API. The first change modifies the master/worker model to support easier interaction with the workers and better observability into the processes. The master now has its own socket that can be used to communicate with it directly. This socket can then manage communication with each individual worker, even those that are exiting.

To begin using this new feature, HAProxy should be launched with the -W and -S options.

Then connect to the Runtime API via the master socket, like so:

The new show proc command displays the uptime of each process.

The new reload command reloads HAProxy and loads a new configuration file. It is exactly the same as sending a SIGUSR2 signal to the master process, except that it can be triggered by an external program after a new configuration file has been uploaded.

From the master socket, commands can be sent to each individual worker process by prefixing the command with an @ sign and the worker’s number. Here’s an example of how you would issue show info to the first worker process:

We’ve also added payload support, which allows you to insert multi-line values using the Runtime API. This is useful for updating map files, for example. At the moment, TLS certificate updating through the Runtime API is not supported, but stay tuned for HAProxy 2.0!

To update a map file using a payload, you would get the ID of the map that you want to update and then use add map to append new lines, separating lines with \n:

You can also append the contents of a file, like so:

HAProxy can already do OCSP stapling, in which the revocation status and expiration date of a certificate are attached to the TLS certificate. This saves the browser from having to contact the certificate vendor itself to verify the certificate. The new payload support allows you to more easily update OCSP files without reloading HAProxy. First, you’d generate an .ocsp file for the certificate using the openssl ocsp command. Once you have the .ocsp file, you can issue the following command, which uses the Runtime API with payload support to update it within the running process:

The script below shows a complete example for automating this process:

A new show activity command has also been added to the Runtime API. It shows for each thread the total CPU time that was detected as stolen by the system, possibly in other processes running on the same processor, or by another VM shared by the same hypervisor. It also indicates the average processing latency experienced by all tasks, which may indicate that some heavy operations are in progress, such as very high usage of asymmetric cryptography, or extremely large ACLs involving thousands of regular expressions.

Similarly, CPU time and latency values can be reported in logs when profiling is enabled in the global section or enabled using the Runtime API. This helps indicate which TCP/HTTP requests cost a lot to process and which ones suffer from the other ones. To enable profiling within the global section, you would add:
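A minimal sketch of the global setting, assuming the profiling.tasks keyword:

```haproxy
global
    # Collect per-task CPU time and latency measurements.
    profiling.tasks on
```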

Optionally, to set it using the Runtime API:

To verify that it’s been enabled:

Profiling exposes the following fetches which can be captured within the HAProxy log:

Fetch method: Description

date_us: The microseconds part of the date.

cpu_calls: The number of calls to the task processing the stream or current request since it was allocated. It is reset for each new request on the same connection.

cpu_ns_avg: The average number of nanoseconds spent in each call to the task processing the stream or current request.

cpu_ns_tot: The total number of nanoseconds spent in each call to the task processing the stream or current request.

lat_ns_avg: The average number of nanoseconds spent between the moment the task handling the stream is woken up and the moment it is effectively called.

lat_ns_tot: The total number of nanoseconds between the moment the task handling the stream is woken up and the moment it is effectively called.

To use these in the logs, you would either extend the default HTTP log-format, like so:
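One possible form is sketched below: the default HTTP log format followed by the profiling fetches. The key names before each colon (cpu_calls:, cpu_ns_tot:, and so on) are arbitrary labels chosen for this example:

```haproxy
defaults
    mode http
    # Default HTTP log format, extended with task profiling values.
    log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r cpu_calls:%[cpu_calls] cpu_ns_tot:%[cpu_ns_tot] cpu_ns_avg:%[cpu_ns_avg] lat_ns_tot:%[lat_ns_tot] lat_ns_avg:%[lat_ns_avg]"
```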

Or, extend the default TCP log-format:
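A comparable sketch for TCP mode, again with arbitrary labels in front of each fetch:

```haproxy
defaults
    mode tcp
    # Default TCP log format, extended with task profiling values.
    log-format "%ci:%cp [%t] %ft %b/%s %Tw/%Tc/%Tt %B %ts %ac/%fc/%bc/%sc/%rc %sq/%bq cpu_calls:%[cpu_calls] cpu_ns_tot:%[cpu_ns_tot] cpu_ns_avg:%[cpu_ns_avg] lat_ns_tot:%[lat_ns_tot] lat_ns_avg:%[lat_ns_avg]"
```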

Server Queue Priority Control

HAProxy 1.9 allows you to prioritize some queued connections over others. This can be helpful to, for example, deliver JavaScript or CSS files before images. Or, you might use it to improve loading times for premium-level customers. Another way to use it is to give a lower priority to bots.

Set a higher server queue priority for JS or CSS files over images by adding an http-request set-priority-class directive that specifies the level of importance to assign to a request. In order to avoid starvation caused by a contiguous stream of high-priority requests, there is also the set-priority-offset directive which sets an upper bound to the extra wait time that certain requests should experience compared to others. When you combine this with ACL rules, you gain the flexibility to decide when and how to prioritize connections.
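A sketch of such rules, with three priority classes. The ACL names, file extensions, class values, and server settings are assumptions chosen for illustration; maxconn is set so that requests actually queue:

```haproxy
backend be_main
    mode http
    # Hypothetical ACLs that classify requests by file extension.
    acl is_jscss path_end .js .css
    acl is_image path_end .png .jpg .jpeg
    # Assign a priority class to each kind of request.
    http-request set-priority-class int(1) if is_jscss
    http-request set-priority-class int(10) if is_image
    http-request set-priority-class int(100) if !is_jscss !is_image
    server server1 192.0.2.13:8080 maxconn 10
```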

Lower numbers are given a higher priority. So, in this case, JavaScript and CSS files are given the utmost priority, followed by images, and then by everything else.

Random Load Balancing Algorithm

We’ve added a new random load-balancing algorithm. When used, a random number will be chosen as the key for the consistent hashing function. In this mode, server weights are respected. Dynamic weight changes take effect immediately, as do new server additions. Random load balancing is extremely powerful with large server fleets or when servers are frequently added and removed. When many load balancers are used, it lowers the risk that all of them will point to the same server, such as can happen with leastconn.

The hash-balance-factor directive can be used to further improve fairness of the load balancing by keeping the load assigned to a server close to the average, which is especially useful in situations where servers show highly variable response times.

To enable the random load-balancing algorithm, set balance to random in a backend.
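A minimal sketch; server names, addresses, and weights are placeholders:

```haproxy
backend be_app
    mode http
    # Pick a server at random, honoring the configured weights.
    balance random
    server app1 192.0.2.10:8080 weight 100
    server app2 192.0.2.11:8080 weight 100
    server app3 192.0.2.12:8080 weight 50
```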

We’re constantly looking to improve our load-balancing algorithms and hope to unveil even more options soon!

Cloud-Native Logging

HAProxy has long been able to log to a syslog server. However, in microservice architectures that utilize Docker, installing syslog into your containers goes against the paradigm. Users have often asked for alternative methods for sending logs. We’ve spent some time planning the best way to implement this without blocking, and we’re pleased to announce that we’ve found a solution!

When using HAProxy 1.9, you will now be able to take advantage of three new ways to send logs: send them to a file descriptor, to stdout, or to stderr. These new methods can be added using the standard log statement.

To enable logging to stdout, use the stdout parameter:
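A sketch of the log statement in the global section; the local0 facility is an arbitrary choice:

```haproxy
global
    # Send logs to standard output instead of a syslog server.
    log stdout local0

defaults
    # Make proxies use the global log target.
    log global
```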

The same can be done for stderr. An alternative way to do that is to log to a file descriptor as shown:

The fd@1 parameter is an alias for stdout and fd@2 is an alias for stderr. This change also comes with two new log formats: raw (better for Docker) and short (better for systemd).
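The file descriptor form and the new formats can be combined, as in this sketch. The facility names are arbitrary, and in practice you would likely keep only one of the two lines:

```haproxy
global
    # File descriptor 1 is stdout; raw format suits container log collectors.
    log fd@1 format raw local0
    # File descriptor 2 is stderr; short format suits systemd.
    log fd@2 format short local1
```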

New Fetches

Fetches in HAProxy provide a source of information from either an internal state or from layers 4, 5, 6, and 7. New fetches that you can expect to see in this release include:

Fetch method: Description

date_us: The microseconds part of the date.

cpu_calls: The number of calls to the task processing the stream or current request since it was allocated. It is reset for each new request on the same connection.

cpu_ns_avg: The average number of nanoseconds spent in each call to the task processing the stream or current request.

cpu_ns_tot: The total number of nanoseconds spent in each call to the task processing the stream or current request.

lat_ns_avg: The average number of nanoseconds spent between the moment the task handling the stream is woken up and the moment it is effectively called.

lat_ns_tot: The total number of nanoseconds between the moment the task handling the stream is woken up and the moment it is effectively called.

srv_conn_free / be_conn_free: Determine the number of available connections on server/backend.

ssl_bc_is_resumed: Returns true when the back connection was made over an SSL/TLS transport layer and the newly created SSL session was resumed using a cached session or a TLS ticket.

fe_defbe: Fetch frontend default backend name.

ssl_fc_session_key / ssl_bc_session_key: Return the SSL master key of the front/back connection.

ssl_bc_alpn / ssl_bc_npn: Provides the ALPN and the NPN for an outgoing connection.

prio_class: Returns the priority class of the current session for http mode or the connection for tcp mode.

prio_offset: Returns the priority offset of the current session for http mode or the connection for tcp mode.
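As one illustration of how a new fetch might be used, the sketch below routes traffic to an overflow backend once the main backend has no free connection slots. All proxy names, ACL names, addresses, and the maxconn value are hypothetical:

```haproxy
frontend fe_main
    mode http
    bind :80
    # True while be_app still has available connection slots.
    acl app_has_room be_conn_free(be_app) gt 0
    use_backend be_app if app_has_room
    default_backend be_overflow

backend be_app
    mode http
    server app1 192.0.2.10:8080 maxconn 100

backend be_overflow
    mode http
    server spare1 192.0.2.20:8080
```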

New Converters

Converters allow you to transform data within HAProxy and usually follow a fetch. The following converters have been added to HAProxy 1.9:

Converter: Description

strcmp: Compares the contents of <var> with the input value of type string.

concat: Concatenates up to three fields after the current sample which is then turned into a string.

length: Return the length of a string.

crc32c: Hashes a binary input sample into an unsigned, 32-bit quantity using the CRC32C hash function.

ipv6 added to “ipmask” converter: Apply a mask to an IPv4/IPv6 address and use the result for lookups and storage.

field/word converter extended: Extended so it’s possible to extract field(s)/word(s) counting from the beginning/end and/or extract multiple fields/words (including separators).
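A sketch chaining two of these converters after fetches. The variable name, ACL name, header name, and certificate path are assumptions; strcmp yields 0 when the strings are equal:

```haproxy
frontend fe_main
    mode http
    bind :443 ssl crt /etc/haproxy/certs/www.example.com.pem
    # Store the Host header, then compare it with the TLS SNI value.
    http-request set-var(txn.host) hdr(host)
    acl sni_matches_host ssl_fc_sni,strcmp(txn.host) eq 0
    http-request deny if !sni_matches_host
    # Expose the length of the Host header to the backend.
    http-request set-header X-Host-Length %[req.hdr(host),length]
```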

Miscellaneous Improvements

Other, miscellaneous improvements were added to this version of HAProxy. They include:

New stick table counters, gpc1 and gpc1_rate, are available.

The resolvers section now supports resolv.conf.

busy-polling: allows reduction of request processing latency by 30 to 100 microseconds on machines using frequency scaling or supporting deep idle states.

The following updates were made to the Lua engine within HAProxy:

The Server class gained the ability to change a server’s maxconn value.

The TXN class gained the ability to adjust a connection’s priority within the server queue.

There is a new StickTable class that allows access to the content of a stick-table by key and allows dumping of the content.
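Several of these items are configuration keywords. The sketch below assumes the busy-polling global keyword, the parse-resolv-conf resolvers keyword, and the sc-inc-gpc1 action; section names, table sizes, and periods are illustrative only:

```haproxy
global
    # Trade CPU usage for lower request processing latency.
    busy-polling

resolvers mydns
    # Reuse the nameservers listed in /etc/resolv.conf.
    parse-resolv-conf

frontend fe_main
    mode http
    bind :80
    # Track clients by source address and count events in gpc1.
    stick-table type ip size 100k expire 10m store gpc1,gpc1_rate(10s)
    http-request track-sc0 src
    http-request sc-inc-gpc1(0)
```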



Regression Test Suite

Regression testing is an extremely important part of releasing quality code. Tests that cover a wide range of code not only prevent past bugs from being reintroduced but also help detect new ones.

Varnish ships with a tool named varnishtest that’s used to help do regression testing across the Varnish codebase. After reviewing this tool we found it to be the perfect candidate for HAProxy-specific tests. We worked with the Varnish team and contributed patches to varnishtest that allow it to be extended and used with HAProxy.

We’ve also begun creating and shipping tests with the code that can be run within your environment today. The tests are quite easy to create once you have an understanding of them. So, if you are interested in contributing to HAProxy but don’t know where to start, you might want to check them out and try creating your own tests!

To begin using the regression testing suite, you will want to install varnishtest, which is provided with the Varnish package. Once that has been installed, you will want to create a test vtc file. Here is a sample:

To run this test, you would set the HAPROXY_PROGRAM environment variable to the path to the binary you’d like to test. Then call varnishtest.

HAProxy 2.0 Preview

HAProxy 1.9 will allow us to support the latest protocols and features that are becoming a necessity in the rapidly evolving technology landscape. You can expect to see the following features in HAProxy 2.0, which is scheduled to be released in May 2019:

HAProxy Data Plane API

gRPC

Layer 7 Retries

Stay tuned, as we will continue to provide updates the closer we get to our next release!

Conclusion

HAProxy remains at the forefront of performance and innovation because of the commitment of the open-source community and the staff at HAProxy Technologies. We’re excited to bring you this news of the 1.9 release!

It paves the way for many exciting features and begins a new chapter in which you’ll see more frequent releases. It immediately brings support for end-to-end HTTP/2, improved buffering and connection management, updates to the Runtime API and Small Object Cache, a new random load balancing algorithm, and even better observability via the Runtime API and new fetch methods.

You will quickly see many of these advancements in HAProxy Enterprise as we backport them to the pending HAProxy Enterprise 1.8r2 release. Our philosophy is to always provide value to the open-source community first and then rapidly integrate features into the Enterprise suite, which has a focus on stability. You can compare versions on the Community vs Enterprise page.
