A container registry is best thought of as an artifact store for Docker images. Once your CI system has validated the container image for behavior, performance, content and security, it needs a place to put the known-good image. A registry is the place to store it.

This article will take a look at the private Docker registry. Please add other solutions and patterns in the comments if your tool or topology is not included. Bonus points if you have a technical write-up on how to use it that can be linked to.

Your Very Own Private Registry

Many organizations cannot use a SaaS-based private artifact store for reasons of culture, policy or performance. In these cases, they must run their own registry, and the attention shifts to how this might be done.

The emerging best practice today appears to be running a registry process on each host in a compute pool while storing image data in a global storage provider such as S3 (public cloud) or Swift (private cloud).

Let’s take a look at the more common registry deployment patterns along with their pros and cons.

Single Container, Local Storage

This is not a recommended way to run a registry for any serious workload. It is, however, handy for standing one up quickly to do the research around build pipelines and workflow implications.

Pros

Trivially easy to stand up when you are first exploring the broader problem

Cons

It is in one place, thus subject to scaling issues

It is in one place, thus a single point of failure

If using a dedicated data volume, the registry and data containers must have host affinity

Must be secured with TLS. This is often done with host-level (rather than container-level) certs, which creates a host-level configuration management issue (Puppet, Chef, et al.)

Skipping TLS means each Docker daemon in the host pool must be run with the --insecure-registry flag. This is also a configuration management issue. Any flag with the word insecure in it should be viewed with a healthy amount of skepticism!

An orchestration engine must be told where the registry is. To solve the general case this is usually done with service discovery, thus adding complexity.

Upgrading the registry means a loss of data or the use of a dedicated data volume (see below)

You can find simple directions for running a registry this way here.
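For reference, this quick-and-dirty pattern boils down to a couple of Docker commands (the port, container name and image tag below are the conventional ones, but treat them as illustrative):

```shell
# Start a registry container with image data stored inside the container
# itself. Anything pushed here is lost when the container is removed.
docker run -d -p 5000:5000 --name registry registry:2

# Tag and push an image to it (assumes a local "alpine" image already exists).
docker tag alpine localhost:5000/alpine
docker push localhost:5000/alpine
```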

Single Container with Dedicated Data Volume

It is generally a good idea to separate an application from its data. If the reasoning is not clear, check out the 12 Factor methodology for more detail. Basically it boils down to the ease with which the application can be upgraded and the cattle-like nature of application containers.

Pros

Easy to deploy

Easy to upgrade the registry process with a new container (e.g. when version 2.3 comes out)

Cons

The container is a single point of failure.

This method only makes sense if both the registry and the volume container are on the same host

Affinity between the registry and the data container adds orchestration complexity

Backing up the data is kludgy

All the same TLS and --insecure-registry issues apply

Must find the registry via service discovery or hard code its location in the orchestration layer.

A typical registry will take up many gigabytes of storage. Containers of such size are not easy to manage.

The simple instructions above can be combined with this documentation on data volume containers to try this deployment pattern.
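A sketch of this pattern using a data-only container (the container names are illustrative; the registry:2 image stores its data under /var/lib/registry):

```shell
# Create a data-only container that owns the registry's storage directory.
docker create -v /var/lib/registry --name registry-data registry:2 /bin/true

# Run the registry using the data container's volume.
docker run -d -p 5000:5000 --volumes-from registry-data --name registry registry:2

# Upgrading the registry is now just a matter of replacing the registry
# container; the data container (and the images in it) stays put.
docker rm -f registry
docker run -d -p 5000:5000 --volumes-from registry-data --name registry registry:2
```

Note that both containers must land on the same host, which is exactly the affinity problem called out in the cons above.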

Single Container, Host Storage

Writing data to a host volume is another way to think about running a registry. This has many of the same pitfalls as using a dedicated data container (see above).

Pros

Trivial to deploy

Data backups can be done via common practices

Cons

The container is a single point of failure.

The registry container must be on the same host as the data, thus causing orchestration complexity

All the same TLS and --insecure-registry issues apply

Must find the registry via service discovery or hard code its location in the orchestration layer.

If global storage such as S3 is not available, this is probably the best option, in terms of durability, for your registry deployment.

The simple instructions linked above, along with a simple `--volume` switch, make this possible.
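For example, a host-storage registry might be started like this (the host path is a placeholder; /var/lib/registry is where the registry:2 image keeps its data):

```shell
# Bind-mount a host directory over the registry's storage path so image
# data survives container removal and can be backed up with normal tools.
docker run -d -p 5000:5000 \
  -v /srv/registry-data:/var/lib/registry \
  --name registry registry:2
```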

Single Container, Global storage

Run the registry on a single — possibly dedicated — host, but use an S3, Swift, Azure, Ceph or similar back end.

This addresses concerns about treating containers — and hosts — as pets (rather than cattle) for the storage aspect. However, the host running the registry is something of a snowflake (pet) itself.

Pros

Data is handled in a known good way in terms of availability and backups. There is a great deal of lore around backing up S3, Ceph et al.

No host affinity necessary for the registry container.

Cons

The container is a single point of failure.

All the same TLS and --insecure-registry issues apply

Must find the registry via service discovery or hard code its location in the orchestration layer.

To really gain IO performance there is a need to run a blob cache. This creates a host affinity between the registry and blob cache containers.
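The registry's back end is selected through its configuration, and with the official registry:2 image the config file can be overridden with environment variables. A minimal S3-backed sketch — the bucket name, region and credentials below are placeholders:

```shell
# Point the registry at an S3 bucket instead of local disk. The
# REGISTRY_STORAGE_* variables override the storage section of the
# registry's configuration file.
docker run -d -p 5000:5000 --name registry \
  -e REGISTRY_STORAGE=s3 \
  -e REGISTRY_STORAGE_S3_REGION=us-east-1 \
  -e REGISTRY_STORAGE_S3_BUCKET=my-registry-bucket \
  -e REGISTRY_STORAGE_S3_ACCESSKEY=my-access-key \
  -e REGISTRY_STORAGE_S3_SECRETKEY=my-secret-key \
  registry:2
```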

Registry Container per Host

This will likely become a best practice for running a private registry over time. In this scenario, a registry container is run on each host in a compute pool. Data is stored in global storage such as S3, Ceph etc.

In fully realized environments of disposable compute, a host could even be killed and replaced if its registry container dies and fails to restart.

Pros

No need to set up TLS or use --insecure-registry

The registry is always available at localhost, thus simplifying the orchestration of starting and upgrading services in an application.

Ease of deployment – a good orchestration tool will start a registry on each new host.

The blob cache can be deployed on a per-host basis as well.

Cons

It is unclear from the documentation whether any ACID concerns are in play when multiple registry processes write to a single back end. (Deployments using this pattern have yet to surface a problem.)

For an example of how to run a registry in this mode, check out this article from StackEngine.
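Because every host runs its own registry against the same shared back end, clients always talk to localhost — which Docker daemons treat as trusted without TLS or --insecure-registry. A sketch of what each host in the pool would run, with placeholder back-end settings:

```shell
# On every host in the pool: run a registry bound to the local interface,
# all instances sharing one S3 bucket (placeholder values).
docker run -d -p 127.0.0.1:5000:5000 --name registry \
  -e REGISTRY_STORAGE=s3 \
  -e REGISTRY_STORAGE_S3_REGION=us-east-1 \
  -e REGISTRY_STORAGE_S3_BUCKET=shared-registry-bucket \
  registry:2

# Any process on the host can now push and pull via localhost.
docker pull localhost:5000/myapp:latest
```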

Conclusion

Many organizations want to run their own private image registry for various reasons. While there are many patterns to follow, the one with the most promise at this time appears to be running the registry process on each host in a compute pool while connecting to shared backend storage.