
Direct, Secure Rails Client-side File Uploads to AWS S3 Buckets

Updated May 31, 2020


Many Ruby on Rails apps use Amazon AWS S3 buckets for storing assets. When dealing with files uploaded by front-end web or mobile clients, there are many factors you should consider to make the whole process secure and performant. In this tutorial, I will describe common pitfalls and an optimal solution for handling client-side file uploads.

I will be using the Fog gem (version 2.0.0 at the time of writing) in the examples. It is a dependency of Carrierwave and Paperclip, two popular file uploader gems. If you use either of them, you should already have it included in your app. If not, just add:

gem 'fog'

to your Gemfile.

Fog offers a lower-level API than the Carrierwave, Paperclip, or Shrine gems. Getting familiar with it will help you understand how AWS S3 buckets work without all the high-level magic these gems provide. I also find it more pleasant to work with than the official Ruby AWS SDK.

AWS S3 setup

You will need Amazon AWS credentials to start working with file uploads. One common mistake is to use your primary user credentials instead of creating an IAM user with limited permissions.

If your primary credentials are compromised you could wake up with a huge bill because of Bitcoin mining bots. Losing credentials with S3 permissions only is much less severe.

Add an IAM user

You have to start by adding an IAM user and giving it a correct access policy. You can read in more detail how to do it in my other blog post. Long story short: to follow the rest of this tutorial, you should grant your user the following policy:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::test-bucket-123"]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObjectAcl"
      ],
      "Resource": ["arn:aws:s3:::test-bucket-123/*"]
    }
  ]
}
```

Make sure to change the bucket name in the policy, because bucket names are globally unique.
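If you prefer to keep the policy in code, the same document can be built as a plain Ruby hash and serialized with the standard library's JSON module. This is just an illustration of the policy's structure; note the two different Resource ARNs: bucket-level actions like s3:ListBucket target the bucket itself, while object-level actions target every object inside it.

```ruby
require "json"

bucket = "test-bucket-123" # replace with your own, globally unique name

policy = {
  "Version" => "2012-10-17",
  "Statement" => [
    {
      # Bucket-level action: targets the bucket ARN itself
      "Effect" => "Allow",
      "Action" => ["s3:ListBucket"],
      "Resource" => ["arn:aws:s3:::#{bucket}"]
    },
    {
      # Object-level actions: target every object inside the bucket (note the /*)
      "Effect" => "Allow",
      "Action" => ["s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:PutObjectAcl"],
      "Resource" => ["arn:aws:s3:::#{bucket}/*"]
    }
  ]
}

puts JSON.pretty_generate(policy)
```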

AWS credentials on the client side

You should never store your AWS credentials in a client-side app. One misconception is that credentials embedded in native mobile apps are safe because the apps are compiled. Remember that there are tools that can decompile an app and retrieve plain-text strings from its binary.

In the following examples, I will explain how to provide secure access to private Amazon S3 buckets without exposing your credentials to the client side.

If you accidentally commit your AWS credentials to a GitHub repo, make sure to remove them.

Set up a test bucket

You will need an S3 bucket to follow this tutorial. Let’s configure an API client and create an empty bucket to work with:

```ruby
require 'fog'

credentials = {
  aws_access_key_id: ENV.fetch("AWS_ACCESS_KEY_ID"),
  aws_secret_access_key: ENV.fetch("AWS_SECRET_ACCESS_KEY"),
  region: ENV.fetch("AWS_REGION")
}

client = Fog::Storage::AWS.new(credentials)

client.directories.create(key: "test-bucket-123", public: false)
client.directories.map(&:key)
# => ["test-bucket-123"]
```

Bucket names must be unique, so make sure to name yours differently.

BTW, it is a good practice to use fetch for reading ENV variables. It is always better to fail fast when you forget to configure something in a given environment than to track down bugs caused by unexpected nil values.
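To make the difference concrete, here is a minimal sketch of the two lookup styles (the variable names are just examples):

```ruby
ENV["AWS_REGION"] = "us-east-1" # pretend this was configured in the environment

# Bracket lookup silently returns nil for missing keys,
# so a typo in the key name goes unnoticed:
ENV["AWS_REGOIN"] # => nil

# fetch fails fast with a KeyError instead:
begin
  ENV.fetch("AWS_REGOIN")
rescue KeyError => e
  puts e.class # KeyError
end

# An explicit fallback is still possible when a sane default genuinely exists:
ENV.fetch("AWS_REGION", "eu-west-1") # => "us-east-1"
```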

Now that we have confirmed that our credentials are correct and our test bucket is ready let me describe incorrect ways to handle file uploads.

Insecure uploads to a public bucket

I will not elaborate on the client side in this tutorial. I will simulate client-side uploads with simple cURL HTTP requests; the same can be done with any front-end technology.

JavaScript is one notable exception here: if you decide to upload files from a JavaScript client, remember to set the correct CORS policy on your bucket.

Here’s a sample CORS setting you might want to use:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration>
  <CORSRule>
    <AllowedOrigin>https://example.com</AllowedOrigin>
    <AllowedMethod>PUT</AllowedMethod>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedHeader>Content-Type</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
```

Configure bucket for public uploads

ACL stands for Access Control List. The bucket ACL can easily be configured with the Fog client:

client.put_bucket_acl("test-bucket-123", "public-read-write")

Now everyone can upload a file directly to the bucket without any authentication:

curl -v --upload-file ./test_file --url https://test-bucket-123.s3.amazonaws.com/test_file

It issues an HTTP PUT request with the file's binary data in its body. You should receive a 200 HTTP response. To double-check that the file is actually in the bucket, run the following command:

curl -v https://test-bucket-123.s3.amazonaws.com/test_file
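Under the hood, cURL's --upload-file is nothing more than an HTTP PUT. For reference, here is a minimal Ruby sketch of the same request built with the standard library's Net::HTTP; the request is only constructed here, since actually sending it requires the publicly writable bucket from the previous step:

```ruby
require "net/http"
require "uri"

uri = URI("https://test-bucket-123.s3.amazonaws.com/test_file")

# Build the same request cURL issues with --upload-file
request = Net::HTTP::Put.new(uri)
request.body = "file contents" # in a real client: File.binread("./test_file")

request.method # => "PUT"
request.path   # => "/test_file"

# To actually send it against the public-read-write bucket:
# Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
```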

Leaving your bucket open to the public is a serious security threat. If read and write access to your bucket is publicly available, malicious users can freely read and modify all the data. It is still better than having your AWS credentials compromised, but I highly discourage you from ever using this approach in production apps.

If for whatever strange reason you must use publicly open buckets in your application, make sure to enable access logs for them to have at least minimal insight and control.

Secure uploads via a Rails server

Another way to upload files from the client side is to proxy them through your Rails server. In that case, the authentication mechanism is provided by our own app, probably in the form of a token or a session cookie.

This time our bucket should be set as private:

client.put_bucket_acl("test-bucket-123", "private")

The examples below assume that you have some kind of authentication in place, but for simplicity the cURL requests to the Rails app do not include it.

Now let’s see a sample Rails implementation:

```ruby
# config/routes.rb
resources :files, only: [:show], param: :filename do
  put :upload, on: :collection
end

# app/controllers/files_controller.rb
class FilesController < ApplicationController
  before_action :authenticate!

  BUCKET_NAME = "test-bucket-123"

  def upload
    client.put_object(BUCKET_NAME, params.fetch(:filename), request.body.read)
    head 201
  end

  def show
    file = client.get_object(BUCKET_NAME, params.fetch(:filename))

    send_data file.body,
      type: file.headers["Content-Type"],
      filename: params.fetch(:filename)
  end

  private

  def client
    @client ||= Fog::Storage::AWS.new(...)
  end
end
```

To interact with this controller you can use the following cURL commands:

```shell
curl -v --upload-file ./test_file --url http://localhost:3000/files/upload\?filename\=test_file

curl -v --url http://localhost:3000/files/test_file
```

Objects uploaded this way will be private and readable only by authenticated clients. You can double check it:

client.get_object_acl("test-bucket-123", "test_file")

This solution is secure: AWS credentials are not exposed to the client side, and only authenticated users can interact with bucket resources. In a real-life application, you would probably want to scope the bucket access on a per-user or per-group basis, depending on your business logic.

Problems with proxied file transfers

Although this solution is secure, there are serious performance-related problems with proxying both the download and upload process through Ruby servers.

Imagine that multiple mobile clients with slow internet connections start uploading or downloading large files from your servers. Your main application processes would be blocked for all the other requests. It would be like running database queries that take a dozen seconds each: they could easily bring your whole application down. What's worse, you would have to pay for all the file transfer bandwidth twice.

Luckily AWS S3 offers a neat solution to this problem.

Secure and direct uploads

Now we know how to upload assets directly from clients in an insecure way, and how to do it securely through our servers at the cost of performance.

Let’s combine the best of both.

Direct S3 file uploads with presigned URLs

We will use so-called pre-signed URLs in this example. They are URLs, valid only for a short period of time, which you can use to access assets in private buckets.
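Fog will generate these URLs for us in a moment, but it helps to see what signing actually involves. Below is a simplified, stdlib-only sketch of AWS Signature Version 4 query-string presigning for a GET request. The credentials are made-up placeholders, and real code should rely on Fog or the AWS SDK rather than hand-rolled signing:

```ruby
require "openssl"
require "digest"
require "uri"
require "time"

# Simplified AWS SigV4 query-string presigning for a GET request.
# Placeholder credentials; use Fog or the AWS SDK in real code.
def presigned_get_url(bucket:, key:, access_key_id:, secret_access_key:,
                      region: "us-east-1", expires_in: 30, now: Time.now.utc)
  host = "#{bucket}.s3.amazonaws.com"
  date = now.strftime("%Y%m%d")
  timestamp = now.strftime("%Y%m%dT%H%M%SZ")
  scope = "#{date}/#{region}/s3/aws4_request"

  query = {
    "X-Amz-Algorithm"     => "AWS4-HMAC-SHA256",
    "X-Amz-Credential"    => "#{access_key_id}/#{scope}",
    "X-Amz-Date"          => timestamp,
    "X-Amz-Expires"       => expires_in.to_s,
    "X-Amz-SignedHeaders" => "host"
  }
  canonical_query = query.sort.map { |k, v|
    "#{URI.encode_www_form_component(k)}=#{URI.encode_www_form_component(v)}"
  }.join("&")

  # Canonical request: method, path, query, headers, signed headers, payload
  canonical_request = [
    "GET", "/#{key}", canonical_query,
    "host:#{host}\n", "host", "UNSIGNED-PAYLOAD"
  ].join("\n")

  string_to_sign = [
    "AWS4-HMAC-SHA256", timestamp, scope,
    Digest::SHA256.hexdigest(canonical_request)
  ].join("\n")

  # The signing key is derived with chained HMACs over date, region and service
  signing_key = [date, region, "s3", "aws4_request"]
    .inject("AWS4#{secret_access_key}") { |k, part| OpenSSL::HMAC.digest("sha256", k, part) }
  signature = OpenSSL::HMAC.hexdigest("sha256", signing_key, string_to_sign)

  "https://#{host}/#{key}?#{canonical_query}&X-Amz-Signature=#{signature}"
end

url = presigned_get_url(bucket: "test-bucket-123", key: "test_file",
                        access_key_id: "AKIAEXAMPLE", secret_access_key: "secret")
puts url
```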

Here’s how they look in action:

# config/routes.rb resources :files , only: [ :show , :new ], param: :filename # app/controllers/files_controller.rb class FilesController < ApplicationController before_action :authenticate! BUCKET_NAME = "test-bucket-123" TIME_TO_ACCESS = 30 . seconds def new upload_url = client . put_object_url ( BUCKET_NAME , params . fetch ( :filename ), TIME_TO_ACCESS . from_now . to_i ) render json: { upload_url: upload_url } end def show download_url = client . get_object_url ( BUCKET_NAME , params . fetch ( :filename ), TIME_TO_ACCESS . from_now . to_i ) render json: { download_url: download_url } end private def client @client ||= Fog :: Storage :: AWS . new ( ... ) end end

To upload a file you have to do the following:

curl -v http://localhost:3000/files/new\?filename\=test_file

It will return a URL which you can use to upload a file with the given filename:

curl -v --upload-file ./test_file --url "https://test-bucket-123.s3.amazonaws.com/test_file?X-Amz-Expires=1524413807&X-Amz-Date=20180422T161652Z&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJEVNLXOPYWMLYCXQ/20180422/us-east-1/s3/aws4_request&X-Amz-SignedHeaders=host&X-Amz-Signature=43c6b674ea6f91873593e3e016612e20359648fc68ae1e1943dc4499212e3323"

What’s important is that this upload URL will only be valid for the specified TIME_TO_ACCESS period. The client side must be programmed to access the URL right after receiving it from the server.
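As a side note, the expiry data is carried in plain query-string parameters, so a client can cheaply check whether a URL it received is still fresh before using it. Here is a small sketch; the helper name and the sample URL are made up for illustration:

```ruby
require "cgi"
require "uri"
require "time"

# Check whether a SigV4 presigned URL has already expired.
# X-Amz-Date is the issue time; X-Amz-Expires is the validity period in seconds.
def presigned_url_expired?(url, now: Time.now.utc)
  params = CGI.parse(URI(url).query)
  issued_at = Time.strptime(params.fetch("X-Amz-Date").first, "%Y%m%dT%H%M%S%z")
  issued_at + Integer(params.fetch("X-Amz-Expires").first) < now
end

url = "https://test-bucket-123.s3.amazonaws.com/test_file?" \
      "X-Amz-Date=20180422T161652Z&X-Amz-Expires=30&X-Amz-SignedHeaders=host"

presigned_url_expired?(url, now: Time.utc(2018, 4, 22, 16, 17, 0)) # => false (within 30 s)
presigned_url_expired?(url, now: Time.utc(2018, 4, 22, 16, 18, 0)) # => true
```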

Similarly, you can retrieve a download URL from the show action:

curl -v http://localhost:3000/files/test_file

and use it to download the desired asset directly from a bucket:

curl -v "https://test-bucket-123.s3.amazonaws.com/test_file?X-Amz-Expires=1524413827&X-Amz-Date=20180422T161712Z&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJEVNLXOPYWMLYCXQ/20180422/us-east-1/s3/aws4_request&X-Amz-SignedHeaders=host&X-Amz-Signature=6f56782d7f89d3019c90946bcc6e2b150fe491df364b96dff46366e33cf5ed72"

[WARNING] Remember to extract the URL from the JSON response returned by the raw cURL command; otherwise, the URL will not validate correctly against the AWS API.

Advantages of pre-signed URLs

This solution is optimal in terms of security and performance. Files are not transferred through our servers, so neither bandwidth costs nor even the slowest mobile clients are a problem.

Security credentials never leave the server, our bucket is private, and the URLs we share are issued only to authenticated clients and valid for a short period of time.

It is essential to make the validity period as short as possible. Imagine that your app’s user logs into their account on someone else’s computer and accesses their S3 assets there. If your signed URLs were valid for longer, they could be retrieved from the browser’s history even after the user logged out.

The same applies if the user accessed your app via a compromised network. A man in the middle could obtain their access token and asset links, but it would not be a serious security threat: the token gets invalidated after sign-out, and the links no longer grant access to the bucket resources once they expire.

Summary

With the high-level APIs provided by popular S3 gems, it is very easy to overlook some detail and make a mistake that could compromise the security or kill the performance of your app. Getting to know how AWS S3 works at a lower level will allow you to make more thoughtful decisions when designing your app’s infrastructure.