A few years back I blogged about creating your own 3rd Party vSphere Content Library, enabling customers to take advantage of storage backings other than just vSphere Datastores. The primary requirement was that the content endpoint be accessible over HTTP(S), which meant a number of solutions could be used, from a simple web server like Nginx to an advanced distributed object store like Amazon S3.

The workflow to create a 3rd Party vSphere Content Library on S3 is fairly straightforward; here is a high-level summary:

1. Organize the content on a local system (desktop)
2. Run a Python script to index and generate the Content Library metadata
3. Upload the Content Library to S3



A disadvantage of the above solution is that each time you need to update or remove content, the entire process has to be repeated, including re-uploading the changes. Not only is this time consuming from an operational standpoint, but you also need to keep a full copy of all the content locally, which can be several hundred gigabytes, if not more.

This topic was recently brought up again by Gilles Chekroun, an SE in our Networking and Security Business Unit, who reached out to see if there was a solution to help his customer, who was running into this challenge. Over the last couple of weeks, I had been working with both Gilles and Eric Cao (Content Library Engineer) on how we could enhance the existing Python script, which indexes and generates the Content Library metadata, to also support running directly against an Amazon S3 bucket.

A huge thanks to Eric for the script enhancements. The new version of the script, which you can download here, can now index content both locally and in a remote S3 bucket. There is a really neat side effect of this enhancement for our VMware Cloud on AWS (VMC) customers which I think is quite interesting. In case you did not know, S3 usage (ingress/egress) from a customer's SDDC is 100% free for VMC customers when using a linked S3 endpoint to your VPC. This means you can take advantage of S3 to store your templates, ISOs, and other static files, which can also be shared by other SDDCs. You are not consuming any of your primary storage for static content, leaving it free for what it was meant for: your workloads. This is pretty cool if you ask me!

The new workflow to create a 3rd Party vSphere Content Library on S3 is as follows:

1. Upload and organize the content on S3
2. Run a Python script to index and generate the Content Library metadata
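To give a rough idea of what remote indexing involves, here is a minimal sketch that lists a bucket with boto3, groups objects into per-folder items, and writes a JSON index back to the bucket. This is purely illustrative: the function name, the `index.json` output, and its field layout are my own assumptions, not the actual metadata format produced by the official script, and `my-content-library` is a placeholder bucket name.

```python
import json


def build_index(objects):
    """Group S3 object keys into per-folder library items.

    `objects` is a list of (key, size) tuples, e.g. from a bucket listing.
    Each top-level folder becomes an item; the files inside it become
    that item's file list. (Illustrative structure, not the real format.)
    """
    items = {}
    for key, size in objects:
        if "/" not in key:
            continue  # skip files not organized under an item folder
        item, filename = key.split("/", 1)
        items.setdefault(item, []).append({"name": filename, "size": size})
    return {"items": [{"name": n, "files": f} for n, f in sorted(items.items())]}


if __name__ == "__main__":
    # Remote listing via boto3 (assumes AWS credentials are configured).
    import boto3

    bucket = "my-content-library"  # hypothetical bucket name
    s3 = boto3.client("s3")
    objects = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            objects.append((obj["Key"], obj["Size"]))

    index = build_index(objects)
    # Write the generated index back to the bucket alongside the content.
    s3.put_object(Bucket=bucket, Key="index.json",
                  Body=json.dumps(index, indent=2).encode("utf-8"))
```

The point is that everything happens against the bucket itself; no local copy of the content is ever needed.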



With the ability to remotely index and generate the Content Library metadata files, you no longer have to keep a local copy of all your content. All changes can be made directly on the S3 bucket; you then simply re-run the script to generate the updated metadata, which can even be scheduled as a simple cron job. Gilles also did a nice write-up here which walks you step by step from S3 bucket creation, including permissions, to running the script and then consuming the 3rd Party vSphere Content Library in VMC. I definitely recommend a read if you are using this for the first time, whether you are a VMC customer or not. I think this is just the first step in some really interesting innovations that we can drive with vSphere Content Library by taking advantage of solutions like Amazon S3. In fact, this is quite timely, as Jon Kensy, a VMware customer, recently published an article here sharing his own thoughts on what native S3 support in vSphere Content Library could look like and its benefits to both on-premises and VMC customers. What do you think?
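For those who want to automate the re-index, a crontab entry along these lines would do it. The script path, script name, flag, and bucket name below are all placeholders; substitute wherever you downloaded the script and whatever arguments it actually takes.

```
# Hypothetical crontab entry: re-index the S3-backed library nightly at 2am.
# Script location, name, arguments, and bucket are placeholders.
0 2 * * * /usr/bin/python3 /opt/scripts/s3-index.py --bucket my-content-library >> /var/log/cl-index.log 2>&1
```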