Background

Here at Kloudless, we provide a Docker container for Kloudless Enterprise that makes it easy to manage a Kloudless Enterprise cluster using industry-standard tools like Docker Swarm or Kubernetes.

For some background, Kloudless provides a unified API that enables engineering teams to quickly integrate several software apps with a single implementation. This means our primary users are other developers and engineering teams.

However, downloading the container from our Kloudless Enterprise web portal was inconvenient. Users had to download the archived image and manually load it into their Docker daemon to use it. There also wasn’t a way to check which images were available without visiting the portal through a browser. To improve the experience, we decided to provide a private Docker Registry that would allow our users to not only pull images, but also query tags and take advantage of other useful features that the Docker Registry provides.

Private Docker Registry Architecture

To reduce our operational load, we use the Elastic Container Registry (ECR) that AWS provides as a managed Docker Registry. This allows us to work with Docker images without having to worry about maintaining the registry service or the underlying storage.

The primary concern is authenticating end-user access to this registry. ECR relies on short-lived auth tokens that are valid for 12 hours. This is problematic since we would either have to provision an IAM account for each user accessing our registry, or repeatedly hand out an auth token that our app generates from our own IAM credentials. Neither option is desirable, so we thought of an alternative.

Fortunately, our own application’s API provides tokens that authorize our users to access and manage different aspects of our platform. One way to leverage this is to have nginx accept API requests to our Docker Registry from clients that authenticate using our API’s tokens, and then replace the Kloudless tokens with the ECR auth token before forwarding each request. In this setup, the Docker client talks to an nginx proxy, which validates the Kloudless token against the Kloudless API and then proxies the request to ECR.

ECR Authentication

It is straightforward to manage the proxy’s access to ECR. Since we are running the server in EC2, we can create an IAM role to read the relevant repository, describe repositories, and provision authorization tokens for ECR:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "VisualEditor0",
          "Effect": "Allow",
          "Action": [
            "ecr:GetDownloadUrlForLayer",
            "ecr:BatchGetImage",
            "ecr:CompleteLayerUpload",
            "ecr:DescribeImages",
            "ecr:ListImages",
            "ecr:InitiateLayerUpload",
            "ecr:BatchCheckLayerAvailability",
            "ecr:GetRepositoryPolicy"
          ],
          "Resource": [
            "arn:aws:ecr:YOUR_REGION:123456789012:repository/YOUR_REPO",
            "arn:aws:ecr:YOUR_REGION:123456789012:repository/YOUR_REPO/*"
          ]
        },
        {
          "Sid": "VisualEditor1",
          "Effect": "Allow",
          "Action": [
            "ecr:GetAuthorizationToken",
            "ecr:DescribeRepositories"
          ],
          "Resource": "*"
        }
      ]
    }

The server needs to store and refresh ECR authorization tokens to allow nginx to perform requests to ECR. A simple cron job that executes every 8 hours handles this process:

    #!/bin/bash
    : ${AWS_REGION:="YOUR_REGION"}
    : ${AWS_ECR_ID:="YOUR_ECR_ID"}

    # Log output to syslog instead of spamming cron emails
    : ${LOGGER_TAG:="refresh_ecr_token"}
    exec 1> >(logger -t "$LOGGER_TAG" -p cron.info)
    exec 2> >(logger -t "$LOGGER_TAG" -p cron.err)

    # Generate the token and store it where nginx can read it
    TOKEN=$(aws ecr get-authorization-token --region "${AWS_REGION}" --registry-ids "${AWS_ECR_ID}" --output text --query 'authorizationData[].authorizationToken')
    FILE="/etc/nginx/conf.d/ecr_token"
    echo "${TOKEN}" > "${FILE}"
    chmod go+r "${FILE}"
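One possible refinement, sketched here rather than taken from our production script, is to refuse empty tokens and replace the file atomically, so that a failed AWS call never overwrites a working token and nginx never sees a half-written file. The helper name `install_token` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script):
#   install_token TOKEN FILE
# Writes TOKEN to a temporary file next to FILE, then renames it into place.
install_token() {
    local token="$1" file="$2" tmp

    # Keep the previous token if the new one is empty (for example, the AWS call failed)
    if [ -z "$token" ]; then
        echo "refusing to install an empty token" >&2
        return 1
    fi

    # Create the temp file in the same directory so the rename stays on one filesystem
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    printf '%s' "$token" > "$tmp"
    chmod 644 "$tmp"
    mv -f "$tmp" "$file"
}
```

In the refresh script, the final `echo` and `chmod` lines could then be replaced with `install_token "${TOKEN}" "${FILE}"`.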

Proxying the requests

Since our customers only require read access, we can directly proxy the Docker Registry API requests and replace the authentication, after validating the token of course. We take advantage of the ngx_lua module to handle this within nginx itself. The OpenResty framework includes this module by default, but it is possible to install it separately as well. The following configuration snippet demonstrates how to safely proxy the Docker Registry API requests:

    lua_package_path "/usr/local/lib/lua/?.lua;;";

    map $upstream_http_docker_distribution_api_version $docker_distribution_api_version {
      '' 'registry/2.0';
    }

    init_by_lua '
      -- Preload the external library for JSON parsing.
      local json = require("JSON")
      -- Preload the external lib for loading the token and read the token once.
      local aws = require("aws")
      aws.get_ecr_token("/etc/nginx/conf.d/ecr_token")
    ';

    server {
      listen 80;
      server_name _;

      # AWS internal resolver
      resolver 169.254.169.253;

      # 0 disables the request body size check; write methods are rejected below
      client_max_body_size 0;

      location /health {
        return 200;
      }

      # Kloudless API Endpoint for later validation
      location /v1/meta/licenses/ {
        internal;
        set $server 'api.kloudless.com:443';
        proxy_pass https://$server;
      }

      # Docker Registry API Endpoints
      location ~* ^/v2/(?<channel>[a-z0-9_-]*)?(/.*)?$ {
        if ($http_user_agent ~ "^(docker\/1\.(3|4|5(?!\.[0-9]-dev))|Go ).*$") {
          return 404;
        }

        access_by_lua '
          -- Making sure that no modification requests can take place
          local method_blacklist = {POST = 1, PUT = 1, PATCH = 1, DELETE = 1}
          if method_blacklist[ngx.var.request_method] then
            ngx.exit(403)
          end

          -- Handle login process. Returning 401 causes docker CLI to prompt user.
          if ngx.var.http_authorization == nil then
            ngx.header["WWW-Authenticate"] = "Basic realm=kloudless"
            ngx.exit(401)
          end

          -- !!! TODO: See the next blog section for Kloudless Auth.
          -- ...

          -- Get the AWS ECR HTTP API token and modify the Authorization header
          -- again using this token, so that upstream requests to the ECR succeed.
          -- The token expires every 12 hours, thus other means are required to
          -- update the token in the file.
          local aws = require("aws")
          local ecr_token = aws.get_ecr_token("/etc/nginx/conf.d/ecr_token")
          ngx.req.set_header("Authorization", string.format("Basic %s", ecr_token))
        ';

        add_header 'Docker-Distribution-Api-Version' $docker_distribution_api_version;

        proxy_pass https://[YOUR_ECR_ID].dkr.ecr.[YOUR_REGION].amazonaws.com;
        proxy_set_header Host "[YOUR_ECR_ID].dkr.ecr.[YOUR_REGION].amazonaws.com";
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-User $http_authorization;
        proxy_set_header X-Forwarded-Proto "https";
        proxy_pass_header Server;
        proxy_read_timeout 900;
      }
    }
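A quick way to sanity-check the configuration is a small smoke test with curl. The sketch below checks the behavior the configuration is meant to produce: 200 from the health endpoint, 401 for an unauthenticated read, and 403 for a write method. The function names and the `PROXY_URL` variable are hypothetical, and the default URL assumes the proxy is listening locally on port 80.

```shell
#!/bin/bash
# Print only the HTTP status code returned for METHOD URL.
registry_status() {
    local method="$1" url="$2"
    curl -s -o /dev/null -w '%{http_code}' -X "$method" "$url"
}

# Compare the actual status code with the expected one and report the result.
expect_status() {
    local expected="$1" method="$2" url="$3" actual
    actual=$(registry_status "$method" "$url")
    if [ "$actual" = "$expected" ]; then
        echo "ok   $method $url -> $actual"
    else
        echo "FAIL $method $url -> $actual (expected $expected)"
        return 1
    fi
}

# Run the three checks against the proxy; PROXY_URL is a placeholder.
smoke_test() {
    local base="${PROXY_URL:-http://localhost}"
    expect_status 200 GET "$base/health" &&
    expect_status 401 GET "$base/v2/" &&
    expect_status 403 POST "$base/v2/"
}
```

Running `smoke_test` on the proxy host after reloading nginx prints one line per check and returns a non-zero status on the first mismatch.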

To keep request handling efficient, nginx doesn’t read the file containing the ECR token on each request. Instead, it checks the file’s modification time and reloads the token only when the file has changed. The following Lua file, referenced above by require("aws"), handles this:

    local _M = {}

    local ecr_token = ""
    local last_mtime = 0

    -- Read the token file. Trailing whitespace is stripped because the refresh
    -- script writes the token followed by a newline.
    function _M.read_ecr_token(path)
      local f = io.open(path)
      if not f then
        return nil
      end
      local token = f:read("*all")
      f:close()
      if not token then
        return nil
      end
      return (token:gsub("%s+$", ""))
    end

    -- Return the cached token, reloading it only if the file has changed.
    function _M.get_ecr_token(path)
      -- stat prints the modification time in seconds since the epoch
      local f = io.popen("stat -c %Y " .. path)
      local mtime = f and tonumber(f:read())
      if f then
        f:close()
      end
      if mtime and mtime > last_mtime then
        local token = _M.read_ecr_token(path)
        if token then
          ecr_token = token
          last_mtime = mtime
        end
      end
      return ecr_token
    end

    return _M
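The same modification-time check can be tried from a shell, which is handy when debugging why nginx has not picked up a new token. This is a minimal sketch using the same `stat -c %Y` call as the Lua module; the function name `token_is_newer` is hypothetical.

```shell
#!/bin/bash
# Succeeds when FILE was modified after LAST_MTIME (seconds since the epoch).
# Returns 2 if the file cannot be inspected.
token_is_newer() {
    local file="$1" last_mtime="$2" mtime
    mtime=$(stat -c %Y "$file") || return 2
    [ "$mtime" -gt "$last_mtime" ]
}
```

Because the timestamp has one-second resolution, two writes within the same second look like a single change. With a refresh every 8 hours, that is not a concern here.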

Custom Kloudless Authentication

The nginx configuration displayed earlier uses HTTP Basic Authentication to ensure compatibility with Docker command line tools. The developer’s email is the username, while their account’s API token is the password. In the access_by_lua block, nginx decodes the Basic Auth header, reads the Kloudless token, and uses that to perform a request to the Kloudless Meta API to list Licenses. This validates the token and also provides information on which Docker releases the developer has access to. The following snippet takes the place of the TODO comment in the earlier nginx config and authorizes the requests using nginx sub-requests:

    local json = require("JSON")

    -- Return true if the license list grants access to the given channel
    local function has_access(lks_string, channel)
      local lks = json:decode(lks_string)
      if lks == nil or lks["objects"] == nil then
        return false
      end
      for _, obj in pairs(lks["objects"]) do
        if obj["release"] == channel then
          return true
        end
      end
      return false
    end

    -- Get the login credentials and decode them
    local encoded_creds = string.sub(ngx.var.http_authorization, 7)
    local decoded_creds = ngx.decode_base64(encoded_creds)
    if decoded_creds == nil then
      ngx.exit(400)
    end

    -- The password (everything after the first colon) is the Kloudless token
    local token = string.match(decoded_creds, "^[^:]*:(.+)$")
    if token == nil then
      ngx.exit(400)
    end

    -- The license endpoint rejects the Accept header sent by docker
    local org_accept = ngx.req.get_headers()["Accept"]
    ngx.req.set_header("Accept", "application/json")

    -- Check if the token is valid
    ngx.req.set_header("Authorization", string.format("Bearer %s", token))
    local res = ngx.location.capture("/v1/meta/licenses/")
    if not (res.status == 200) then
      ngx.status = res.status
      ngx.say(res.body)
      ngx.exit(res.status)
    end

    -- Reset the Accept header to its original one
    ngx.req.set_header("Accept", org_accept)

    if string.len(ngx.var["channel"]) > 0 and not has_access(res.body, ngx.var["channel"]) then
      ngx.exit(403)
    end
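From the client’s side, the scheme described above is plain HTTP Basic Authentication. The sketch below builds the header by hand and lists the tags for a release channel through the Docker Registry v2 `tags/list` endpoint. The function names, the `REGISTRY_HOST` variable, and the channel name passed in are placeholders, and the default host is the example host name used later in this post.

```shell
#!/bin/bash
# Build the Basic Authorization header value:
# the email is the username and the Kloudless API token is the password.
basic_auth_header() {
    local email="$1" token="$2"
    printf 'Basic %s' "$(printf '%s:%s' "$email" "$token" | base64 | tr -d '\n')"
}

# List the tags available for a release channel.
# REGISTRY_HOST is a placeholder for the proxy host name.
list_tags() {
    local channel="$1" email="$2" token="$3"
    curl -s -H "Authorization: $(basic_auth_header "$email" "$token")" \
        "https://${REGISTRY_HOST:-docker.kloudless.com}/v2/${channel}/tags/list"
}
```

The Docker CLI sends the same header after `docker login docker.kloudless.com -u <email>`, which prompts for the API token as the password.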

Load-balancing

Kloudless’ setup uses an ELB in front of the proxy server for high availability and easier SSL/TLS termination. This allows Kloudless to easily scale up the proxies using auto-scaling groups and handle any individual instance’s failure. It also allows Kloudless to provide a much more user-friendly host name over HTTPS such as docker.kloudless.com rather than the long ECR domain name.

Conclusion

That’s it! The configuration above is modular enough that you can substitute any authentication or authorization method you want, including a different web service or a generic database. This private, read-only registry greatly simplifies how our customers get started with Kloudless Docker containers and allows them to better utilize industry standard tools with our appliance.