Boost.Http provides some abstractions to type-erase the HTTP backends (playing a similar role to std::function). The starting point to learn about it is the server_socket_adaptor page.

If you want to create a function that will handle requests originating from different HTTP backends, you have three choices:

If chunked messages are available, you can use the following sequence of actions to respond to the request:

In the previous example, we used atomic messages to respond to the request. But this is limiting when we're trying to achieve certain kinds of tasks, like serving a live video stream. A second option is to use chunked messages to respond to the request.

And, as promised, here is the complete code (with a lot of print statements and a few lines to demonstrate more usage):

HTTP doesn't require you to handle protocol negotiation separately from the rest of the protocol, nor any special handshaking flow. Therefore, we use a "naked" acceptor to fuel a usable HTTP socket. Other HTTP backends may have different usage.

Given we're consuming the body, it's a good idea to free unused resources. We don't use C++11's shrink_to_fit because it could trigger another reallocation when the next piece of the body is received. The idea is to reuse a small allocated buffer instead. You could also create your own message type to discard fed bytes.

If we're going to halt the server and want to gracefully close current connections, this code can be used to close the HTTP pipeline after the current response ends. You should pay attention to safe and idempotent methods if you want to learn more about HTTP pipelining and HTTP in general.

There is a gotcha here. If you use a pure asio::ip::tcp::socket, you're subject to Asio's composed-operation restrictions and you should not schedule any operation while the previous one is in progress. You can wrap asio::ip::tcp::socket into some queuing socket to work around this restriction, and Boost.Http gives you the required customization points to allow you to use it. We don't worry about this problem here, because with coroutines the operations are naturally serialized.

And then we send the response. Our application is complete.

By this time, we have already fully received the request and we can do something with it. Pretty easy, see?

This is how we check if we should send 100-continue . It must be done after async_read_request and before async_read_some .

But there is a little gotcha. You should send a 100-continue to ask for the rest of the body if required. This feature first appeared in HTTP/1.1 as a means to make better use of network traffic, by allowing you to reject requests sooner.

You can use self->socket.read_state() to know which part of the request is still missing and ask for the rest.

So, the first part to actually handle is to ask for the request message. You'll have the full method and path by the time the completion handler is called (in the case of coroutines, this means the next line of code). You must not touch any of these variables while the operation is in progress (hard to get wrong if you're using coroutines). self->request->headers() will also be filled, along with whatever part of the body has already been received.

So, the first thing you should do is loop while the socket is open, so the whole pipelining of requests will be handled. If the connection closes ungracefully, an error code will be generated (converted to exceptions by the coroutine completion token) during one of the operations. If the connection closes gracefully, the loop will eventually stop by is_open() returning false .

You can find the whole code at the end of the section. For now, we build upon the code from the previous section (just replace the "our code goes here" comment with the code we'll build in this section).

We must create a reference to the shared_ptr before calling any asynchronous function to ensure the object stays alive for as long as the coroutine runs.

We spawn a new handler for every connection, so we’ll be able to handle them all concurrently.

The basic_buffered_socket<T>::next_layer() member-function will return the internal T object. The buffered_socket typedef uses asio::ip::tcp::socket as T .

Summarizing, we make use of shared_ptr to allocate the object and ensure it'll stay alive as long as there is some reference to it, and we spawn an acceptor algorithm that will create a shared_ptr for every connection.

First, we're going to write the boilerplate code Asio requires us to write if we want to use stackful coroutines to handle a non-predetermined number of concurrent connections.

The usual setup for an HTTP application is to parse the request URL and choose an appropriate handler based on the path component. Optional middleware handlers can do some ACL work based on authentication, resource usage and so on. An auxiliary database is almost always used.

If you just want to expose a bunch of files from the local filesystem, you may be interested in async_response_transmit_dir , which will take several responsibilities from you (e.g. partial download and a few basic mechanisms to avoid corrupt files).

GET is the most common HTTP verb and it has the simple semantics of retrieving a resource.

The metadata carried in every HTTP message is known as the HTTP headers. The HTTP headers carry information on how to handle the data payload, such as the MIME type, language and more. HTTP request messages can contain a token which is used to associate multiple different requests with the same client, so the user doesn't need to type a username and password for every page request.

HTTP is a protocol oriented around the exchange of request and reply messages. Each request is independent, hence the stateless nature of the protocol. Each request has an associated verb and path, used to tell what to do and to whom. Every HTTP message has a body (a payload of binary data) and associated metadata (i.e. a multimap<string, string>). The HTTP response also has an associated status code which is used to indicate the success of the requested action.

HTTP also provides means to upgrade a protocol for a running connection, and the WebSocket protocol is gaining importance where HTTP is used the most: on the web.

HTTP is a protocol that shines in extensibility. Its 1.1 version has been used unchanged since 1997 and has been able to power very creative applications to this date. An HTTP/2.0 standard has been released, but most of the differences are related to efficient connection management, and the only feature that can affect higher-level layers of an application making use of HTTP is HTTP push, which is used to speculatively send data to a client that the server anticipates the client will need.

A second step would be learning how to organize your code into a useful application, but that step is left for the user to tackle alone.

In this tutorial, you'll get a small and fast introduction to HTTP (if you're already familiar with HTTP, you can skip the associated section, and you're likely to deeply understand Boost.Http faster). You'll also learn, incrementally, how to build proper handling of HTTP requests using Boost.Http.

In the example, we associated the MyHandler object with r1 just to wrap it under poly_handler. Then we asked for the associated allocator a, passing r2 as the fallback in case there was no association. The expected output is for a to be the same as r1, and that's just what you'll see if you run the example. If you want the example to fail (i.e. discard the associated allocator and show a different output), just replace poly_handler with std::function.

poly_handler will also preserve the associated allocator, as can be verified with the following example:

Using poly_handler , we can update the previous my_socket definition and make the DLL boundary work properly:

If you wrap the handler within a std::function object, the associated executor will be discarded and you should expect the handler to be called again:

If you also change the code to call ctx2.run() after ctx1.run(), you should expect the handler to be called, but that is left as an exercise for the reader. We're sticking with the previous code, where no handler is called, as the desired behaviour. To be more specific, we're interested in making sure we can observe that execution scheduling is done through a specific executor (i.e. associated executors are respected). And, for this executor in particular, the observed output is that no handler is called.

One way to verify this assertion is with a simple test case. Create two execution contexts, ctx1 and ctx2 . Post a handler to ctx1 and then call ctx1.run() . You should expect the handler to be called. Now modify handler to be wrapped and associated with an executor from ctx2 . If only this change is done, you should expect the program to exit without ever calling the handler. The following code illustrates this situation:

If I try the following, the propagation will stop at the my_socket boundary as std::function discards the associated executor:

If there is an associated executor with a completion handler, this associated executor should be used and propagated by every intermediate handler in the chain.

However, this approach is wrong. std::function type-erases functors, but it won't preserve the associated allocators and associated executors, which are a core part of Boost.Asio. This extra information is discarded.

If you intend to wrap the template heavy usage of Boost.Asio behind a stable interface that can be accessed through dynamically loaded plug-ins, you may be tempted to use std::function :

Boost.Asio uses handlers with signatures similar to the following to notify about the completion of tasks:

2.3. Parsing (beginner)

In this tutorial, you’ll learn how to use this library to parse HTTP streams easily.

Note We assume the reader has basic understanding of C++ and Boost.Asio.

We start with the code that should resemble the structure of the program you’re about to code. And this structure is as follows:

#include <boost/http/reader/request.hpp>
#include <string>
#include <map>

namespace http = boost::http;
namespace asio = boost::asio;

struct my_socket_consumer
{
private:
    http::reader::request request_reader;
    std::string buffer;
    std::string last_header;

public:
    std::string method;
    std::string request_target;
    int version;
    std::multimap<std::string, std::string> headers;

    void on_socket_callback(asio::buffer data)
    {
        using namespace http::token;
        using token::code;

        buffer.push_back(data);
        request_reader.set_buffer(buffer);

        while (request_reader.code() != code::end_of_message) {
            switch (request_reader.code()) {
            case code::skip:
                // do nothing
                break;
            case code::method:
                method = request_reader.value<token::method>();
                break;
            case code::request_target:
                request_target = request_reader.value<token::request_target>();
                break;
            case code::version:
                version = request_reader.value<token::version>();
                break;
            case code::field_name:
            case code::trailer_name:
                last_header = request_reader.value<token::field_name>();
            }
            request_reader.next();
        }
        request_reader.next();
        ready();
    }

protected:
    virtual void ready() = 0;
};

You’re building a piece of code that consumes HTTP from somewhere — the in — and spits it out in the form of C++ structured data to somewhere else — the out.

The in of your program is connected to the above piece of code through the on_socket_callback member function. The out of your program is connected to it through the ready overridable member function.

By now, I shouldn't need to worry about your understanding of how you'll connect the network I/O with the in of the program; the connection point is obvious. However, I'll briefly explain the out connection point and then we can proceed to delve into the inout-inbetween part of the program.

Once the ready member function is called, the data for your request will be available in method, request_target and the other public variables. From now on, I'll focus my attention on the implementation of my_socket_consumer::on_socket_callback alone.

The awaited prize:

void my_socket_consumer::on_socket_callback(asio::buffer data)
{
    //http::reader::request request_reader;
    //std::string buffer;
    //std::string last_header;
    using namespace http::token;
    using token::code;

    buffer.push_back(data);
    request_reader.set_buffer(buffer);

    while (request_reader.code() != code::end_of_message) {
        switch (request_reader.code()) {
        case code::skip:
            // do nothing
            break;
        case code::method:
            method = request_reader.value<token::method>();
            break;
        case code::request_target:
            request_target = request_reader.value<token::request_target>();
            break;
        case code::version:
            version = request_reader.value<token::version>();
            break;
        case code::field_name:
        case code::trailer_name:
            last_header = request_reader.value<token::field_name>();
        }
        request_reader.next();
    }
    request_reader.next();
    ready();
}

Try to keep in mind the three variables that will really orchestrate the flow: request_reader , buffer and last_header .

The whole work is about managing the buffer and managing the tokens.

Token access is very easy. As the parser is incremental, there is only one token at a time. I don't need to explain Boost.Http's control flow, because the control flow will be coded by you (it's a library, not a framework). You only have to use code() to check the current token and value<T>() to extract its value. Use next() to advance to the next token.

Warning There is only one caveat. The parser doesn't buffer data and will decode a token into a value (the value<T>() member function) directly from the buffer data. This means you cannot extract the current value once you drop the current buffer data. As a nice side effect, you spare CPU time on the tokens you don't need to decode (matching and decoding are separate steps).

The parser doesn’t buffer data, which means when we use the set_buffer member function, request_reader only maintains a view to the passed buffer, which we’ll refer to as the virtual buffer from now on.

In the virtual buffer, there is a head/current part and a remaining/tail part. request_reader doesn't store a pointer/address/index into the real buffer. Once a token is consumed, its bytes (the head) are discarded from the virtual buffer. When you mutate the real buffer, the virtual buffer is invalidated and you must inform the parser using set_buffer. However, the bytes discarded from the virtual buffer shouldn't appear again, so you must keep track of the number of discarded bytes to prepare the buffer for the next call to set_buffer. The previous code doesn't handle that.

The next tool you should be presented with is token_size(), which returns the size in bytes of the current token (the head).

Warning There is no guarantee that token_size() returns the same size as string_length(request_reader.value<T>()). You need to use token_size() to compute the number of discarded bytes.

void my_socket_consumer::on_socket_callback(asio::buffer data)
{
    using namespace http::token;
    using token::code;

    buffer.push_back(data);
    request_reader.set_buffer(buffer);

    std::size_t nparsed = 0; //< NEW

    while (request_reader.code() != code::end_of_message) {
        switch (request_reader.code()) {
        case code::skip:
            // do nothing
            break;
        case code::method:
            method = request_reader.value<token::method>();
            break;
        case code::request_target:
            request_target = request_reader.value<token::request_target>();
            break;
        case code::version:
            version = request_reader.value<token::version>();
            break;
        case code::field_name:
        case code::trailer_name:
            last_header = request_reader.value<token::field_name>();
        }
        nparsed += request_reader.token_size(); //< NEW
        request_reader.next();
    }
    nparsed += request_reader.token_size(); //< NEW
    request_reader.next();
    buffer.erase(0, nparsed); //< NEW
    ready();
}

nparsed was easy. However, the while(request_reader.code() != code::end_of_message) condition doesn't seem right. It's very error-prone to assume the full HTTP message will be ready in a single call to on_socket_callback. Error handling must be introduced into the code.

void my_socket_consumer::on_socket_callback(asio::buffer data)
{
    using namespace http::token;
    using token::code;

    buffer.push_back(data);
    request_reader.set_buffer(buffer);

    std::size_t nparsed = 0;

    while (request_reader.code() != code::error_insufficient_data //< NEW
           && request_reader.code() != code::end_of_message) { //< NEW
        switch (request_reader.code()) {
        case code::skip:
            // do nothing
            break;
        case code::method:
            method = request_reader.value<token::method>();
            break;
        case code::request_target:
            request_target = request_reader.value<token::request_target>();
            break;
        case code::version:
            version = request_reader.value<token::version>();
            break;
        case code::field_name:
        case code::trailer_name:
            last_header = request_reader.value<token::field_name>();
        }
        nparsed += request_reader.token_size();
        request_reader.next();
    }
    nparsed += request_reader.token_size();
    request_reader.next();
    buffer.erase(0, nparsed);

    if (request_reader.code() == code::error_insufficient_data) //< NEW
        return; //< NEW

    ready();
}

Note Don't worry about token_size(code::error_insufficient_data) being added to nparsed. This (error) "token" is defined to be 0-sized (it fits perfectly with the other rules).

Just because it’s easy and we’re already at it, let’s handle the other errors as well:

void my_socket_consumer::on_socket_callback(asio::buffer data)
{
    using namespace http::token;
    using token::code;

    buffer.push_back(data);
    request_reader.set_buffer(buffer);

    std::size_t nparsed = 0;

    while (request_reader.code() != code::error_insufficient_data
           && request_reader.code() != code::end_of_message) {
        switch (request_reader.code()) {
        case code::error_set_method: //< NEW
        case code::error_use_another_connection: //< NEW
            // Can only happen in response parsing code.
            assert(false); //< NEW
        case code::error_invalid_data: //< NEW
        case code::error_no_host: //< NEW
        case code::error_invalid_content_length: //< NEW
        case code::error_content_length_overflow: //< NEW
        case code::error_invalid_transfer_encoding: //< NEW
        case code::error_chunk_size_overflow: //< NEW
            throw "invalid HTTP data"; //< NEW
        case code::skip:
            // do nothing
            break;
        case code::method:
            method = request_reader.value<token::method>();
            break;
        case code::request_target:
            request_target = request_reader.value<token::request_target>();
            break;
        case code::version:
            version = request_reader.value<token::version>();
            break;
        case code::field_name:
        case code::trailer_name:
            last_header = request_reader.value<token::field_name>();
        }
        nparsed += request_reader.token_size();
        request_reader.next();
    }
    nparsed += request_reader.token_size();
    request_reader.next();
    buffer.erase(0, nparsed);

    if (request_reader.code() == code::error_insufficient_data)
        return;

    ready();
}

And buffer management is complete. However, the code only demonstrated how to extract simple tokens. Field name and field value are simple tokens, but they are usually tied together into a complex structure.

void my_socket_consumer::on_socket_callback(asio::buffer data)
{
    using namespace http::token;
    using token::code;

    buffer.push_back(data);
    request_reader.set_buffer(buffer);

    std::size_t nparsed = 0;

    while (request_reader.code() != code::error_insufficient_data
           && request_reader.code() != code::end_of_message) {
        switch (request_reader.code()) {
        // ...
        case code::skip:
            break;
        case code::method:
            method = request_reader.value<token::method>();
            break;
        case code::request_target:
            request_target = request_reader.value<token::request_target>();
            break;
        case code::version:
            version = request_reader.value<token::version>();
            break;
        case code::field_name:
        case code::trailer_name:
            last_header = request_reader.value<token::field_name>();
            break;
        case code::field_value: //< NEW
        case code::trailer_value: //< NEW
            // NEW
            headers.emplace(last_header,
                            request_reader.value<token::field_value>());
        }
        nparsed += request_reader.token_size();
        request_reader.next();
    }
    nparsed += request_reader.token_size();
    request_reader.next();
    buffer.erase(0, nparsed);

    if (request_reader.code() == code::error_insufficient_data)
        return;

    ready();
}

last_header did the trick. Easy, but maybe we want to separate headers from trailers (the HTTP headers that are sent after the message body). This task can be accomplished through the use of structural tokens.

void my_socket_consumer::on_socket_callback(asio::buffer data)
{
    // NEW:
    // We have to declare `bool my_socket_consumer::use_trailers = false` and
    // `std::multimap<std::string, std::string> my_socket_consumer::trailers`.
    using namespace http::token;
    using token::code;

    buffer.push_back(data);
    request_reader.set_buffer(buffer);

    std::size_t nparsed = 0;

    while (request_reader.code() != code::error_insufficient_data
           && request_reader.code() != code::end_of_message) {
        switch (request_reader.code()) {
        // ...
        case code::skip:
            break;
        case code::method:
            method = request_reader.value<token::method>();
            break;
        case code::request_target:
            request_target = request_reader.value<token::request_target>();
            break;
        case code::version:
            version = request_reader.value<token::version>();
            break;
        case code::field_name:
        case code::trailer_name:
            last_header = request_reader.value<token::field_name>();
            break;
        case code::field_value:
        case code::trailer_value:
            // NEW
            (use_trailers ? trailers : headers)
                .emplace(last_header,
                         request_reader.value<token::field_value>());
            break;
        case code::end_of_headers: //< NEW
            use_trailers = true; //< NEW
        }
        nparsed += request_reader.token_size();
        request_reader.next();
    }
    nparsed += request_reader.token_size();
    request_reader.next();
    buffer.erase(0, nparsed);

    if (request_reader.code() == code::error_insufficient_data)
        return;

    ready();
}

Note Maybe you had a gut feeling that the previous code was too strange. If trailer_name is a separate token, why don't we use request_reader.value<token::trailer_name>() (likewise for trailer_value) and do away with structural tokens? Yes, I unnecessarily complicated the code here to introduce you to the concept of structural tokens. They are very important and usually you'll end up using them. Maybe this tutorial needs some revamping after the library has evolved a few times. Also notice that here you can use either request_reader.value<token::field_name>() or request_reader.value<token::trailer_name>() to extract this token's value. It is as if trailer_name were "implicitly convertible" to field_name, so to speak. This feature makes life much easier for users who don't need to differentiate headers and trailers (with no drawback to the users who do need to differentiate them).

Some of the structural tokens' properties are:

No value<T>() associated. value<T>() extraction is a property exclusive to the data tokens.

It might be 0-sized.

They are always emitted (e.g. code::end_of_body will be emitted before code::end_of_message even if no code::body_chunk is present).

We have been using the code::end_of_message structural token since the initial version of the code, so structural tokens aren't completely alien. However, we have been ignoring one very important HTTP parsing feature until now. It's the last missing bit before your understanding of this library is complete: our current code lacks the ability to handle HTTP pipelining.

HTTP pipelining is the feature that allows HTTP clients to send HTTP requests "in batch". In other words, they may send several requests at once over the same connection before the server creates a response to them. If the previous code faces this situation, it'll stop parsing at the first request and possibly wait forever until on_socket_callback is called again with more data (yep, networking code can be hard, with so many little details).

void my_socket_consumer::on_socket_callback(asio::buffer data)
{
    using namespace http::token;
    using token::code;

    buffer.push_back(data);
    request_reader.set_buffer(buffer);

    std::size_t nparsed = 0;

    while (request_reader.code() != code::error_insufficient_data
           && request_reader.code() != code::end_of_message) {
        switch (request_reader.code()) {
        // ...
        case code::skip:
            break;
        case code::method:
            use_trailers = false; //< NEW
            headers.clear(); //< NEW
            trailers.clear(); //< NEW
            method = request_reader.value<token::method>();
            break;
        case code::request_target:
            request_target = request_reader.value<token::request_target>();
            break;
        case code::version:
            version = request_reader.value<token::version>();
            break;
        case code::field_name:
        case code::trailer_name:
            last_header = request_reader.value<token::field_name>();
            break;
        case code::field_value:
        case code::trailer_value:
            (use_trailers ? trailers : headers)
                .emplace(last_header,
                         request_reader.value<token::field_value>());
            break;
        case code::end_of_headers:
            use_trailers = true;
        }
        nparsed += request_reader.token_size();
        request_reader.next();
    }
    nparsed += request_reader.token_size();
    request_reader.next();
    buffer.erase(0, nparsed);

    if (request_reader.code() == code::error_insufficient_data)
        return;

    ready();

    if (buffer.size() > 0) //< NEW
        on_socket_callback(); //< NEW
}

Some HTTP libraries could adopt a "synchronous" approach where the user must immediately give an HTTP response once the ready() callback is called. In that case, the parsing code can parse the whole buffer until the end, and we could just put the ready() call into the code::end_of_message case.

Other HTTP libraries follow ASIO's active style, in which the user is expected to call something like async_read_request before the next request can be read. In this case, the solution for HTTP pipelining would be different.

There are also libraries that don't follow ASIO's style but don't force the user to send HTTP responses immediately in the ready() callback either. In such cases, synchronization/coordination between the user's response generation and the library's parse resuming is necessary.

This point can be rather diverse and the code for this tutorial only shows a rather quick’n’dirty solution. Any different solution to keep the parsing train at full-speed is left as an exercise to the reader.

The interesting point of the code here is that we clear the state of the to-be-parsed message before each request-response pair. In the previous code, this was done by binding the "method token arrived" event (the first token in an HTTP request) with such state cleanup.