S3 through nginx ingress

Proxy an S3 bucket with an ingress and external service

By vicjicama

Introduction

This post is about how to proxy an S3 bucket using a Kubernetes ingress and an external service. I have a bucket named resources.repoflow.com with all the CSS and JS files that the blog and the linker services need. Instead of accessing the resources directly through the bucket URL like this:

http://s3.amazonaws.com/resources.repoflow.com/resources/react-toastify/dist/ReactToastify.css

I wanted to access the resources like this:

https://blog.repoflow.com/resources/react-toastify/dist/ReactToastify.css

If I ever need to change the source of the resources, I only need to reconfigure the ingress, and none of the other services should be affected.

At first I used a node server that was serving static routes...

Then I had a container with an nginx configured to upstream the s3 bucket to the needed route (without rewrite)...

Now, as mentioned on this post I am using an external service...

Kubernetes entities

We need to set the upstream-vhost and rewrite-target annotations on the ingress.

The upstream-vhost annotation sets the Host header that nginx sends to the upstream, so a request that arrives for the ingress host, in this case s3-proxy.repoflow.com, reaches S3 with the host s3.amazonaws.com.

The rewrite-target annotation builds the upstream path: the $2 in the annotation is replaced with the second captured group of the path, (.*). The first group is (/|$); check the troubleshooting section for more details on that part of the path.

# serviceName/servicePort belong to the v1beta1 Ingress schema
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: proxy-ingress
  namespace: repoflow-s3-proxy-demo
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /resources.repoflow.com/resources/$2
    nginx.ingress.kubernetes.io/upstream-vhost: s3.amazonaws.com
spec:
  rules:
    - host: s3-proxy.repoflow.com
      http:
        paths:
          - path: /resources(/|$)(.*)
            backend:
              serviceName: resources-external-service
              servicePort: 80
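To make the effect of the two annotations concrete, here is a minimal Python sketch that mimics what the ingress does to a request path. It is only an approximation of nginx's behaviour, not the controller's actual code; the helper name upstream_url is mine, and the values are copied from the ingress above.

```python
import re

# Values copied from the ingress definition.
INGRESS_PATH = re.compile(r"/resources(/|$)(.*)", re.IGNORECASE)
UPSTREAM_VHOST = "s3.amazonaws.com"                 # upstream-vhost annotation
BUCKET_PREFIX = "/resources.repoflow.com/resources/"  # rewrite-target without $2


def upstream_url(request_path):
    """Return the S3 URL a request path ends up at, or None if the rule does not match."""
    match = INGRESS_PATH.match(request_path)
    if match is None:
        return None
    # $2 in rewrite-target is the second capture group:
    # everything after "/resources/" in the original request.
    return "http://" + UPSTREAM_VHOST + BUCKET_PREFIX + match.group(2)


print(upstream_url("/resources/react-toastify/dist/ReactToastify.css"))
```

For the example request, the sketch prints the same direct bucket URL shown in the introduction.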

For the external service you need to set the externalName to a DNS name, in this case s3.amazonaws.com. An external service is also useful when you want to access a service from another namespace: you can use the cluster DNS name, like service-name.namespace.svc.cluster.local, as the externalName and use the service port as the targetPort.

apiVersion: v1
kind: Service
metadata:
  name: resources-external-service
  namespace: repoflow-s3-proxy-demo
spec:
  type: ExternalName
  externalName: s3.amazonaws.com
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
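The cross-namespace case mentioned above can be sketched in Python as a small manifest builder. This is an illustration under assumptions, not part of the repository: the function names are mine, and blog-alias, blog and production are hypothetical names. The output is JSON, which kubectl apply accepts as well as YAML.

```python
import json


def cluster_dns_name(service, namespace):
    """Cluster DNS name of a service, in the form used in the text above."""
    return f"{service}.{namespace}.svc.cluster.local"


def external_name_service(name, namespace, external_name, port=80):
    """Build an ExternalName Service manifest shaped like the one above."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "ExternalName",
            "externalName": external_name,
            "ports": [
                {"name": "http", "protocol": "TCP", "port": port, "targetPort": port}
            ],
        },
    }


# Hypothetical example: alias a "blog" service living in a "production" namespace.
alias = external_name_service(
    "blog-alias",
    "repoflow-s3-proxy-demo",
    cluster_dns_name("blog", "production"),
)
print(json.dumps(alias, indent=2))
```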

Check the repository

Troubleshooting

This did not work when I first tested it: I saw that I was getting an application/x-directory response in the browser... I had an issue with the path rewrite... To check the result of the rewrite, I added the following configuration-snippet to the ingress:

# serviceName/servicePort belong to the v1beta1 Ingress schema
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: proxy-ingress
  namespace: repoflow-s3-proxy-demo
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /resources.repoflow.com/resources/$2
    nginx.ingress.kubernetes.io/upstream-vhost: s3.amazonaws.com
    nginx.ingress.kubernetes.io/configuration-snippet: |
      rewrite_log on;
spec:
  rules:
    - host: s3-proxy.repoflow.com
      http:
        paths:
          - path: /resources(/|$)(.*)
            backend:
              serviceName: resources-external-service
              servicePort: 80

This is part of the nginx-ingress log. You can see that the rewritten data was not correct... the whole resource path was missing.

[notice] 333#333: *17610 "(?i)/resources/(.*)" matches "/resources/react-toastify/dist/ReactToastify.css", client: 10.244.0.28, server: blog-stage.repoflow.com, request: "GET /resources/react-toastify/dist/ReactToastify.css HTTP/2.0", host: "blog-stage.repoflow.com"
2020/01/07 22:37:54 [notice] 333#333: *17610 rewritten data: "/resources.repoflow.com/resources/", args: "", client: 10.244.0.28, server: blog-stage.repoflow.com, request: "GET /resources/react-toastify/dist/ReactToastify.css HTTP/2.0"

I found out how to solve this issue in this Stack Overflow Q/A:

ingress-nginx docs: Starting in Version 0.22.0, ingress definitions using the annotation nginx.ingress.kubernetes.io/rewrite-target are not backwards compatible with previous versions. In Version 0.22.0 and beyond, any substrings within the request URI that need to be passed to the rewritten path must explicitly be defined in a capture group.

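I needed to change the ingress path from /resources/(.*) to /resources(/|$)(.*). With the old path there was only one capture group, so the $2 in the rewrite target was empty. After the change the rewritten data showed the right value and the requests returned the expected resource.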

[notice] 189#189: *16440 "(?i)/resources(/|$)(.*)" matches "/resources/react-toastify/dist/ReactToastify.css", client: 10.244.0.28, server: blog-stage.repoflow.com, request: "GET /resources/react-toastify/dist/ReactToastify.css HTTP/2.0", host: "blog-stage.repoflow.com"
2020/01/07 22:36:26 [notice] 189#189: *16440 rewritten data: "/resources.repoflow.com/resources/react-toastify/dist/ReactToastify.css", args: "", client: 10.244.0.28, server: blog-stage.repoflow.com, request: "GET /resources/react-toastify/dist/ReactToastify.css HTTP/2.0"
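The difference between the two paths can be reproduced outside the cluster. The following Python sketch is a rough emulation, not nginx itself: expand_target is a helper name I made up, and it assumes that a $N reference to a capture group that does not exist expands to an empty string, which is consistent with the first log above.

```python
import re


def expand_target(path_regex, target, request_path):
    """Expand $N references in a rewrite target using the groups of path_regex."""
    match = re.match(path_regex, request_path, re.IGNORECASE)
    if match is None:
        return None

    def group_or_empty(ref):
        index = int(ref.group(1))
        # A reference to a group the pattern does not define becomes "".
        if index > match.re.groups:
            return ""
        return match.group(index) or ""

    return re.sub(r"\$(\d+)", group_or_empty, target)


TARGET = "/resources.repoflow.com/resources/$2"
REQUEST = "/resources/react-toastify/dist/ReactToastify.css"

# Old path: a single capture group, so $2 has nothing to refer to.
old = expand_target(r"/resources/(.*)", TARGET, REQUEST)
# New path: (/|$) is group 1 and (.*) is group 2.
new = expand_target(r"/resources(/|$)(.*)", TARGET, REQUEST)

print(old)
print(new)
```

The two printed values correspond to the "rewritten data" entries of the first and second log: the old path loses the resource path, the new one keeps it.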

Testing

I used a minikube instance that lives on an old PC to test this concept. To get this example working, all you need to do is apply the entities.yaml file. This will create the namespace, the external service and the ingress described above.

kubectl apply -f entities.yaml

Since the minikube instance is on another machine, you need to forward the minikube ingress to your local machine and then modify the /etc/hosts file to add the ingress host. I do this kind of forwarding to my local machine using the linker tool.

Forwarding the ingress this way is very useful, for example, if you want to test something using the same URL as the production cluster but on a minikube instance.
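For a quick check of the forwarded ingress, a request can also carry the ingress host in its Host header, which avoids touching /etc/hosts while testing. This is a minimal Python sketch under assumptions: the local address 127.0.0.1:8080 is hypothetical and depends on how you forward the ingress, and the function names are mine.

```python
import urllib.request

INGRESS_HOST = "s3-proxy.repoflow.com"   # host from the ingress rule
LOCAL_INGRESS = "http://127.0.0.1:8080"  # assumed local end of the forward


def build_request(path):
    """Build a request for the forwarded ingress with the ingress Host header."""
    return urllib.request.Request(LOCAL_INGRESS + path, headers={"Host": INGRESS_HOST})


def fetch_status(path):
    """Send the request and return the HTTP status (needs the forward to be running)."""
    with urllib.request.urlopen(build_request(path), timeout=5) as response:
        return response.status


request = build_request("/resources/react-toastify/dist/ReactToastify.css")
print(request.full_url, request.get_header("Host"))
```

Calling fetch_status with the same path while the forward is running should tell you whether the proxied resource is reachable through the ingress.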

Conclusion

Using ingress definitions to proxy external services opens up a lot of possibilities. I will keep improving the way those resources are shared across multiple services and projects, and I will keep sharing the findings.

If you are interested, you can find the code for this blog, which is running in Kubernetes, with all the entities, the containers, and the GraphQL and Node services here: microservices

If you have suggestions for a change or a post, have any feedback, or want to reach out, don't hesitate to contact me. My email is vic@repoflow.com