Forward Proxy

A forward proxy is a server that carries out requests on behalf of a well-defined set of clients to arbitrary servers on the Internet.

For example, if you are using a forward proxy, and you request http://google.com/ from your browser, then your browser doesn’t make this request itself, but forwards it to the forward proxy. The forward proxy performs the request to the google.com webserver, gets the response, and returns this response to your browser.

The webserver at google.com doesn’t know that the request came originally from you. It can only see the IP address of the forward proxy as the source IP address of the request, but not the IP address of your laptop. So, for the webserver, it looks like the request originated from the forward proxy.

Consequently, one common use case for a forward proxy is hiding your IP address from webservers.
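To make this concrete, here is a minimal sketch of how a client could be told to send its requests through a forward proxy, using Python’s standard urllib.request module. The proxy address proxy.example.internal:3128 and the helper name build_proxy_opener are made up for illustration; nothing is sent over the network until the opener is actually used.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that routes HTTP and HTTPS requests via proxy_url."""
    # Passing our own ProxyHandler replaces the default one that
    # urllib would otherwise configure from environment variables.
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(proxy_handler)


# Hypothetical forward proxy address (an assumption, not a real server).
opener = build_proxy_opener("http://proxy.example.internal:3128")

# A call such as opener.open("http://google.com/") would now be sent to the
# proxy, and the webserver would see the proxy's IP address as the source.
```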

Reverse Proxy

A reverse proxy is a server that receives requests from arbitrary clients over the Internet, and forwards them to one of a well-defined set of servers.

For example, if you have a reverse proxy that runs in front of your webserver for your website http://mydomain.com, then any request to http://mydomain.com is actually received by your reverse proxy, and not by your webserver. The reverse proxy then forwards this request to your webserver, the webserver handles it, returns the response to your reverse proxy, and the reverse proxy returns the response back to the client.

The client making the request doesn’t know that it is actually interacting with a reverse proxy rather than directly with your webserver.

A common use case for a reverse proxy is to make it handle certain tasks so that the application servers behind the reverse proxy don’t need to handle them. One example of such a task is handling the TLS network layer (for example, to implement HTTPS). This is what we are going to do in this article.

A reverse proxy can also be used for load balancing. For example, if you have multiple webservers behind your reverse proxy that serve the same website, then the reverse proxy can distribute incoming requests evenly over these webservers, so that the risk of a single webserver getting overloaded is reduced.
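One simple way to distribute requests evenly is round-robin selection: the proxy hands each new request to the next webserver in the list and starts over when it reaches the end. The sketch below shows only the selection logic; the class name and the backend addresses are hypothetical, and real reverse proxies typically offer further strategies.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Pick backends one after another, wrapping around at the end."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._next_backend = cycle(backends)

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._next_backend)


# Hypothetical webservers that all serve the same website.
balancer = RoundRobinBalancer(
    ["http://10.0.0.1:8080", "http://10.0.0.2:8080", "http://10.0.0.3:8080"]
)
picks = [balancer.pick() for _ in range(6)]
# Each of the three backends appears exactly twice in picks.
```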

Getting a TLS Server Certificate

In order to use HTTPS, we need to get a public key certificate for the domain that we want to use for our webserver. But to do this, we first need to understand what HTTPS really is.

What is HTTPS?

HTTPS can be rephrased as “HTTP over TLS”. TLS stands for Transport Layer Security and is a cryptographic network protocol layer that can be inserted between the transport layer (e.g. TCP) and the application layer (e.g. HTTP) of the network protocol stack. The following figure illustrates this:

HTTP (left) vs. HTTPS (right) protocol stacks.

Note: you might often hear the term SSL (Secure Sockets Layer). Indeed, the term SSL is often used interchangeably with TLS, but this is wrong. SSL is the predecessor protocol of TLS and is now deprecated.

So, if HTTP is used alone, we talk about HTTP. If HTTP is used on top of TLS, we talk about HTTPS. In some more detail, if we use HTTP, then the HTTP layer passes messages in plain text down to the TCP layer (and the same way back when receiving messages). On the other hand, if we use HTTPS, then the HTTP layer passes the plain text messages to the TLS layer, TLS encrypts these messages and passes them as ciphertexts to TCP (and the same way back when receiving messages).
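The layering can be made visible in a few lines of Python using the standard socket and ssl modules. In this sketch, the function names are invented for illustration and the host is whatever you pass in; the HTTP request is built as plain text, and it is the TLS layer wrapped around the TCP socket that encrypts it on the way out.

```python
import socket
import ssl


def build_http_request(host, path="/"):
    """Application layer: a plain-text HTTP request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def make_tls_context():
    """Client-side TLS settings: verify the server certificate and hostname."""
    return ssl.create_default_context()


def fetch_over_https(host, path="/"):
    """Send one HTTP request through TLS on top of TCP and return the raw reply."""
    context = make_tls_context()
    # Transport layer: an ordinary TCP connection to the HTTPS port.
    with socket.create_connection((host, 443), timeout=10) as tcp_sock:
        # TLS layer: performs the handshake, then encrypts what we send
        # and decrypts what we receive.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(build_http_request(host, path))
            chunks = []
            while True:
                data = tls_sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Dropping the wrap_socket step and connecting to port 80 instead would give plain HTTP: the same request bytes would then travel over TCP unencrypted.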

TLS uses a symmetric encryption algorithm to encrypt messages. This algorithm, as well as the symmetric key, are negotiated between the client and the server during an initial handshake. The symmetric encryption provides confidentiality of the communication.

In addition to confidentiality, TLS also provides server authentication. This means that a server (e.g. webserver) has to prove its identity to the client (e.g. browser). This is done by public key cryptography and a TLS server certificate. I will explain in the next section how this works.

TLS also defines client authentication. That means that a client (e.g. browser) has to prove its identity to the server (e.g. webserver) by means of a TLS client certificate. However, this feature is rarely used in practice.

TLS Server Authentication

A server that wants to authenticate itself to its clients needs to have a private and a public key, as well as a certificate for the public key. Upon connection, the client sends a so-called challenge, an arbitrary piece of data, to the server, and asks the server to sign it with the server’s private key (conceptually, to encrypt it with the private key).

The server then sends the signed challenge back to the client, along with the server certificate. The client then checks the signature with the server’s public key (which is in the certificate). If the check succeeds, the client knows that the server it is talking to must really be the subject that is listed in the certificate. This is because only this subject is supposed to have the private key that corresponds to the public key in the certificate.
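The exchange can be sketched with textbook RSA and Python’s built-in pow function. Everything here is a simplification made for illustration: the key is built from two small, well-known primes, there is no padding, the certificate is a plain dictionary without a CA signature, and the function names are invented. Real TLS handshakes are considerably more involved.

```python
import secrets

# Toy RSA key pair from two known Mersenne primes (far too small for real use).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)

# Hypothetical certificate: a subject plus its public key.
CERTIFICATE = {"subject": "mydomain.com", "public_key": (N, E)}


def server_respond(challenge):
    """Server side: transform the challenge with the private key."""
    response = pow(challenge, D, N)
    return response, CERTIFICATE


def client_check(challenge, response, certificate):
    """Client side: undo the transformation with the public key and compare."""
    n, e = certificate["public_key"]
    return pow(response, e, n) == challenge


challenge = secrets.randbelow(N - 2) + 2   # arbitrary piece of data
response, cert = server_respond(challenge)
print(client_check(challenge, response, cert))  # True
```

A party without the private exponent cannot produce a response that passes client_check, which is what ties the server to the public key in the certificate.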

What is a Public Key Certificate?

The following is a short excursus into public-key cryptography. It explains how certificates work, including the concepts of chain of trust and root certificate.

If you don’t want to know how certificates work, but just want to get one, you can jump directly to “How To Get a Certificate?”.

A public key certificate is a text document that contains three pieces of information:

A description of a subject (for example, an individual, a company, or a webserver identified by a domain name)

A public key

A digital signature of a trusted third-party that attests that this public key belongs to this subject

In fact, all that a certificate does is confirm that a given public key belongs to a given subject.

For example, if a subject is called Google and has a public key P, then a corresponding certificate would say: “I confirm that the public key P belongs to Google”.

The trustworthiness of a certificate relies on the digital signature of a trusted third-party that confirms the content of the certificate. This digital signature is provided by the issuer of the certificate. An issuer of certificates is called a certificate authority (CA).

A chain of trust consisting of three certificates (viewed in Chrome). From bottom to top: 1) a certificate for a domain name issued by Let’s Encrypt certificate authority. 2) Let’s Encrypt’s own certificate issued by a root certificate authority. 3) The root certificate authority’s root certificate.

A digital signature of a document is typically nothing else than a hash of the document encrypted with the signer’s private key. If decrypting this signature with the signer’s public key yields the same value as a freshly computed hash of the document, then we know that the signature must have been made by this signer, because this signer is the only subject that has access to this private key.
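As a rough illustration of this hash-then-sign idea, the following sketch uses SHA-256 from Python’s hashlib together with textbook RSA. It rests on simplifying assumptions: the key is a toy key built from two small known primes, the hash is reduced modulo the key size instead of being padded as real signature schemes do, and the function names are made up for this example.

```python
import hashlib

# Toy RSA key pair (illustration only, far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(document):
    """Hash the document and squeeze the hash into the range of the toy key."""
    digest = hashlib.sha256(document).digest()
    return int.from_bytes(digest, "big") % N


def sign(document):
    """Signer: transform the hash with the private key."""
    return pow(digest_as_int(document), D, N)


def verify(document, signature):
    """Verifier: recover the hash with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(document)


statement = b"I confirm that the public key P belongs to Google"
signature = sign(statement)

print(verify(statement, signature))  # True
# A changed document has a different hash, so the same signature no longer fits.
print(verify(b"I confirm that the public key P belongs to Mallory", signature))  # False
```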

When a client validates the certificate of a server, it also has to validate the digital signature of the CA on this certificate. This is done, as mentioned, by trying to decrypt the signature with the CA’s public key (which is publicly known). But now the client seems to be back at square one: it has a subject (the CA) and a public key that claims to be the CA’s public key, but no proof that this public key really belongs to this CA.

The solution is yet another certificate: one issued by another CA, say CA-2, confirming that the public key claimed by the first CA, say CA-1, really is CA-1’s public key. This certificate bears the digital signature of CA-2. But as a client we still can’t stop here. We also have to verify the signature of CA-2 on this certificate. For this we need CA-2’s public key (which is publicly available). And now again, we have a subject (CA-2) and a public key, but no proof that this public key really belongs to CA-2. The solution is… yet another certificate: one issued by CA-3, confirming that the public key in question really is the public key of CA-2. And this certificate bears the digital signature of CA-3, which we in turn need to verify.
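The walk from one certificate to the next can be written as a small loop. The sketch below only follows the issuer links and leaves out the signature checks themselves. It also assumes that the walk ends at a certificate the client already trusts (the root certificate mentioned earlier); the Certificate class and the function build_chain are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    subject: str   # whose public key this certificate is about
    issuer: str    # which CA signed it


def build_chain(leaf, known_certs, trusted_roots, max_depth=10):
    """Follow issuer links from leaf until a trusted subject is reached.

    known_certs maps a subject name to its certificate; trusted_roots is
    the set of subject names the client trusts without further proof.
    """
    chain = [leaf]
    current = leaf
    for _ in range(max_depth):
        if current.subject in trusted_roots:
            return chain
        issuer_cert = known_certs.get(current.issuer)
        if issuer_cert is None:
            raise ValueError(f"no certificate found for issuer {current.issuer!r}")
        # In a real client, the issuer's signature on `current` would be
        # verified here before moving up the chain.
        chain.append(issuer_cert)
        current = issuer_cert
    raise ValueError("chain is longer than max_depth")


known = {
    "CA-1": Certificate("CA-1", "CA-2"),
    "CA-2": Certificate("CA-2", "CA-3"),
    "CA-3": Certificate("CA-3", "CA-3"),
}
leaf = Certificate("mydomain.com", "CA-1")
chain = build_chain(leaf, known, trusted_roots={"CA-3"})
print([cert.subject for cert in chain])
# ['mydomain.com', 'CA-1', 'CA-2', 'CA-3']
```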

The following figure illustrates this process: