We recently needed to demonstrate AWS RDS to a customer by migrating their existing Oracle database, which runs in their colo datacenter. Their Oracle DB dump was about 200 GB in size and had to be moved to an AWS account securely.

Let’s first discuss the options we considered and why they weren’t right for our situation, and then we’ll explain how we solved it with direct S3 uploads using Cognito authentication.

Since we were dealing with large files, we wanted our customer to upload them directly to Amazon S3. Unfortunately, the customer is relatively new to AWS, and training them to upload using the AWS CLI or the Management Console would have delayed the project, so we started looking for alternative options.

Problem statement: A customer needed to transfer an Oracle database dump of 200 GB securely to an AWS account.

We considered Cyberduck as our second option. Cyberduck is an open-source client for FTP, SFTP, WebDAV, and cloud storage, available for macOS and Windows, and it supports uploading directly to S3 using AWS credentials. We could create a new IAM user with limited permissions and share the credentials with the customer, along with the S3 bucket and folder names. But this option still required the customer to install external software and follow a set of steps to upload the files. Installing software meant seeking approvals, which added to the delay. It was slightly easier than the first option but still introduced a lot of friction.
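For reference, the limited-permission policy for such an IAM user could look something like the sketch below. The bucket name and `uploads/` prefix are illustrative placeholders, not values from our setup:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListUploadPrefix",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::customer-db-dumps",
      "Condition": { "StringLike": { "s3:prefix": "uploads/*" } }
    },
    {
      "Sid": "UploadObjects",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": "arn:aws:s3:::customer-db-dumps/uploads/*"
    }
  ]
}
```

The multipart-related actions matter for large files: without them, a failed 200 GB upload could not be cleanly aborted and retried.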

While investigating further for a friction-free solution, we discovered that we could upload files directly into S3 from the browser using multipart upload. Initially, we were doubtful this would work for large files, as browsers usually have limits on the size of file that can be uploaded. But we would never know unless we tried it, so we decided to give it a shot.
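A quick sanity check against S3’s multipart limits (at most 10,000 parts per object, a 5 MiB minimum part size except for the last part, and a 5 TiB maximum object size) shows a 200 GB file is well within reach. The 64 MiB part size below is just an example choice, not a prescribed value:

```typescript
// S3 multipart upload limits (per AWS documentation).
const MAX_PARTS = 10_000;
const MIN_PART_SIZE = 5 * 1024 ** 2; // 5 MiB (except the final part)

// Number of parts needed to upload `size` bytes with a given part size.
export function partsFor(size: number, partSize: number): number {
  return Math.ceil(size / partSize);
}

// Smallest part size that keeps a `size`-byte object under the 10,000-part cap.
export function minPartSizeFor(size: number): number {
  return Math.max(MIN_PART_SIZE, Math.ceil(size / MAX_PARTS));
}

const dump = 200 * 1024 ** 3; // our ~200 GB dump

console.log(partsFor(dump, 64 * 1024 ** 2)); // 64 MiB parts → 3200 parts
console.log(minPartSizeFor(dump));           // 21474837 bytes (~20.5 MiB)
```

So even at the 10,000-part ceiling, each part would only need to be about 20.5 MiB, comfortably above S3’s 5 MiB minimum.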

We can upload files directly from the browser to S3, but how do we make it secure?

Browsers expose the source code, so obviously we can’t put credentials in it. Our first thought was S3 pre-signed URLs, but we soon realized that the object key (filename) must be fixed at the moment the pre-signed URL is generated, which was not desirable for us. To make the process dynamic on our serverless website, we would need an AWS Lambda function that generates a pre-signed URL for whatever file name the user provides, exposed through API Gateway. While this is a possible solution, we found a better one using Amazon Cognito.

Cognito has user pools and identity pools. User pools maintain users; identity pools generate temporary AWS credentials from several web identities, including Cognito user pool identities. We created a user pool in Cognito and associated it with an identity pool. The identity pool provides credentials to both authenticated and unauthenticated users, based on the IAM roles and policies associated with each. Now any valid user in our Cognito user pool can obtain temporary AWS credentials through the identity pool and use them to upload files directly to S3.
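The browser side of that flow could be sketched as below. It assumes the AWS SDK for JavaScript (v2) is loaded via a script tag, and that the user has already signed in to the user pool (e.g. with amazon-cognito-identity-js) and holds an ID token. The region, pool IDs, and bucket name are illustrative placeholders:

```typescript
// Assumed to be provided by the AWS SDK v2 <script> tag in the page.
declare const AWS: any;

// Pure helper: the Logins map key for a Cognito user pool provider.
export function loginsKeyFor(region: string, userPoolId: string): string {
  return `cognito-idp.${region}.amazonaws.com/${userPoolId}`;
}

// `file` is the browser File object picked by the customer.
export function uploadDump(file: { name: string }, idToken: string): Promise<void> {
  const region = "us-east-1";
  AWS.config.region = region;

  // Exchange the user-pool ID token for temporary AWS credentials
  // through the identity pool.
  AWS.config.credentials = new AWS.CognitoIdentityCredentials({
    IdentityPoolId: "us-east-1:00000000-0000-0000-0000-000000000000",
    Logins: { [loginsKeyFor(region, "us-east-1_EXAMPLE")]: idToken },
  });

  // s3.upload() transparently switches to multipart upload for large bodies,
  // so the 200 GB dump goes up in parallel chunks.
  const s3 = new AWS.S3();
  return new Promise((resolve, reject) => {
    s3.upload(
      { Bucket: "customer-db-dumps", Key: `uploads/${file.name}`, Body: file },
      { partSize: 64 * 1024 * 1024, queueSize: 4 }, // 64 MiB parts, 4 in flight
      (err: Error | null) => (err ? reject(err) : resolve())
    );
  });
}
```

With this, all the customer sees is a login form and a file picker; the temporary credentials, multipart chunking, and retries happen behind the scenes.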