If you’ve come from Part 1, you’ll now have a solid understanding of the security measures you can integrate into S3. If you haven’t, we recommend reading that post first so you’re on solid footing for the topics discussed in this one.

So you have your S3 buckets locked down and encrypted, with CloudTrail and CloudWatch monitoring events and a third-party tool or custom Lambdas dealing with suspicious activity. What happens when you actually need to use data stored in S3? Whether this is to upload profile photos or to download documents to be displayed on a mobile app, your applications need access.

This post will discuss four setups, starting with two commonly used but slightly flawed approaches and moving on to two more secure, flexible methods. Examples will be given using diagrams and CloudFormation infrastructure code.

What’s the scenario?

You’re developing a mobile application for sharing photos. Users can upload photos, and can decide whether they want these photos to be completely private, shared with friends or be accessible publicly. You’re obviously hoping this application will catch on and will have a lot of users, driving a lot of traffic.

Solution 1: Downloading / uploading through backend

This first solution is usually the one that people initially go for. Here we can see that all requests relating to S3 go through the backend, in this case a group of autoscaling EC2 instances behind a load balancer. These could be containers, or Lambda functions.

In this situation the API running on the EC2 instances would use an IAM role whose policy gives access to the otherwise private S3 bucket, for example:

    Statement:
      - Effect: Allow
        Action:
          - s3:ListBucket
          - s3:GetBucketLocation
        Resource: "arn:aws:s3:::bucketname"
      - Effect: Allow
        Action:
          - s3:DeleteObject
          - s3:GetObject
          - s3:GetObjectAcl
          - s3:PutObject
          - s3:PutObjectAcl
        Resource: "arn:aws:s3:::bucketname/*"

To upload a file, the mobile app makes an API request to the backend containing the file contents. The backend then uploads the file to S3.

To download a file, the mobile app makes an API request to the backend, and the backend downloads the file from S3 and serves it to the mobile app.

Pros:

Relatively simple setup and configuration

The backend API can be in charge of complicated access rules determining who has access to which photos

AWS API keys / IAM roles are stored only on the backend

Cons:

Extra stress on your API machines. The API machines have to handle receiving potentially large files from the mobile app, and storing them in RAM or on disk when shuttling them back and forth.

Increased costs. More stress means more machines have to be scaled up. You also incur extra load balancer costs, as you pay per GB of traffic.

Managing temporary files on API machines. Potentially sensitive data will be stored temporarily on the API machines and you will have to write code to manage this.

Not scalable. Every upload and download consumes backend bandwidth, memory and disk, so capacity and cost grow in step with traffic.

Verdict

For a small application this may get you moving but you’ll be hitting roadblocks and hefty bills in no time.

Solution 2: Direct access via front end with Cognito

This next solution gets around the scaling issues by using Cognito Federated Identities to get temporary AWS credentials that the front end mobile app can use to interact with S3 directly. This particular example uses Developer Authenticated Identities (where our own backend is the Identity Provider), but the principles are the same for the External Provider auth flow (authenticating via Facebook, Google etc.).

The functionality would work as such:

1. Mobile application has already authorised with backend (whether it is EC2, ECS, API Gateway etc.). It then requests a Cognito identity and token from the backend.

2. The backend carries out its usual authorisation of the front end request. It then makes a request to Cognito to get a Cognito Identity Id and a temporary OpenId token for the user associated with the mobile application (or a “guest”). The backend needs the necessary IAM permissions to make this request to the Cognito Identity Pool. The backend then relays the response back to the mobile app.

3. The mobile app can then request temporary AWS credentials from Cognito using the Identity Id and OpenId token. Tied to the Identity Id and token is an IAM role, which contains the permissions that the front end will receive.

4. Cognito validates the parameters, and communicates with AWS STS (Security Token Service) to get temporary credentials, which Cognito returns to the mobile app.

5. If the role attached to Cognito was set up correctly, then the mobile app can use the temporary credentials to access S3.

Below are snippets of CloudFormation code covering the Cognito set up and snippets of IAM.

IAM Permissions for backend to get Open ID tokens on behalf of mobile app users

    Statement:
      - Effect: Allow
        Action:
          - cognito-identity:GetOpenIdTokenForDeveloperIdentity
          - cognito-identity:LookupDeveloperIdentity
          - cognito-identity:MergeDeveloperIdentities
          - cognito-identity:UnlinkDeveloperIdentity
        Resource: !Sub "arn:aws:cognito-identity:${AWS::Region}:${AWS::AccountId}:identitypool/${IdentityPoolId}"

Cognito setup, excluding unauthenticated roles for brevity

    CognitoIdentityPool:
      Type: AWS::Cognito::IdentityPool
      Properties:
        DeveloperProviderName: "example.app"

    CognitoAuthorizedRole:
      Type: AWS::IAM::Role
      Properties:
        AssumeRolePolicyDocument:
          Version: '2012-10-17'
          Statement:
            - Effect: Allow
              Principal:
                Federated: cognito-identity.amazonaws.com
              Action:
                - sts:AssumeRoleWithWebIdentity
              Condition:
                StringEquals:
                  cognito-identity.amazonaws.com:aud: !Ref CognitoIdentityPool
                ForAnyValue:StringLike:
                  cognito-identity.amazonaws.com:amr: authenticated
        Policies:
          - PolicyName: !Sub "cognito-auth"
            PolicyDocument:
              Version: "2012-10-17"
              Statement:
                - Effect: Allow
                  Action:
                    - mobileanalytics:PutEvents
                    - cognito-identity:GetCredentialsForIdentity
                  Resource: "*"
                - Effect: Allow
                  Action:
                    - s3:GetObject
                    - s3:PutObject
                  Resource: !Sub "arn:aws:s3:::${S3Bucket}/*"

    IdentityPoolRoleMapping:
      Type: "AWS::Cognito::IdentityPoolRoleAttachment"
      Properties:
        IdentityPoolId: !Ref CognitoIdentityPool
        Roles:
          authenticated: !GetAtt CognitoAuthorizedRole.Arn
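To make the token exchange described above more concrete, here is a minimal Python sketch of the two Cognito calls involved. It assumes the `cognito` argument is a boto3 `cognito-identity` client (or anything exposing the same two methods), and the function names, the returned field names and the 15-minute token lifetime are illustrative choices rather than part of the original setup.

```python
from typing import Any, Dict

# Must match DeveloperProviderName on the identity pool.
DEVELOPER_PROVIDER_NAME = "example.app"


def issue_cognito_token(
    cognito: Any,
    identity_pool_id: str,
    app_user_id: str,
    ttl_seconds: int = 900,
) -> Dict[str, str]:
    """Backend side: exchange an already-authorised app user for a
    Cognito Identity Id and a short-lived OpenId token.

    `cognito` is assumed to behave like boto3.client("cognito-identity").
    """
    response = cognito.get_open_id_token_for_developer_identity(
        IdentityPoolId=identity_pool_id,
        Logins={DEVELOPER_PROVIDER_NAME: app_user_id},
        TokenDuration=ttl_seconds,
    )
    # Only these two values need to be relayed to the mobile app.
    return {"identityId": response["IdentityId"], "token": response["Token"]}


def fetch_temporary_credentials(
    cognito: Any, identity_id: str, token: str
) -> Dict[str, Any]:
    """Client side: swap the Identity Id and token for temporary AWS
    credentials scoped to the role attached to the identity pool."""
    response = cognito.get_credentials_for_identity(
        IdentityId=identity_id,
        # For developer authenticated identities the login key is the
        # Cognito service name, not the developer provider name.
        Logins={"cognito-identity.amazonaws.com": token},
    )
    return response["Credentials"]
```

In a real mobile app the second call would normally be made by the platform's AWS SDK rather than Python; it is shown here only to illustrate which values travel in each direction.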

Pros:

Greatly reduces load on API servers. API servers only need a small Cognito endpoint for mobile apps to retrieve initial Identity Id & OpenId Token, and to refresh when needed.

API servers may no longer need access to S3

Reduced costs as you won’t be spinning up unnecessary API servers

Cons:

Added integration & complexity for front end.

Back end and front end need to carefully manage the handling and storing of the Identity Ids and OpenId Tokens. If they are leaked, an attacker gets the same permissions as an authenticated user.

Developers must be responsible for the security of the Cognito API endpoint used in steps 1 & 2.

Little flexibility in terms of role-based & resource-based access control. Sure, only authenticated users can now retrieve objects from S3, but there isn’t any way to define fine grained access to the objects (for example based on whether the photos were shared with a user or not).

Verdict

For simple scenarios where you want all of your users to have the same access to the same resources in S3, this will work in a scalable fashion. However, when you want to specifically control user access to objects based on relationships between users, roles and resources this will fall short. One benefit of this method is that it can be used to provide access to any AWS service for your front end.

Solution 3: Mediated access by backend with S3 presigned URLs

This solution gets around the inadequacies of the previous solution relating to fine-grained access control. It also enables secure S3 access in a relatively scalable fashion and at low cost, by leveraging S3 presigned URLs.

The solution works as follows:

1. The frontend makes a request to the backend API, for example requesting a list of photos in a particular album.

2. The backend can perform all custom logic in determining whether the mobile app user should have access to particular photos.

3. The backend then creates S3 presigned URLs rather than linking directly to the photos or sending back the file contents.

4. The frontend is then able to use the time-limited presigned URLs to perform the required actions on the specific objects.

In order to upload files, the backend would need to provide an API endpoint that the front end could use to get pre-signed URLs allowing for the uploading of specific files (including necessary checks to prevent the front end overwriting files of other users etc.). Only the backend would need the IAM permissions to PutObject, GetObject etc.
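As a rough illustration of this flow, the Python sketch below separates the access decision from the URL signing. The `Photo` model, the `users/<user>/` key layout, the visibility values and the function names are all assumptions made for the example; the `presign` callable stands in for whatever signs URLs in your backend (with boto3 it could wrap `generate_presigned_url`, as noted in the comment).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass(frozen=True)
class Photo:
    key: str          # S3 object key, e.g. "users/alice/beach.jpg"
    owner: str
    visibility: str   # "private", "friends" or "public"


# (operation, object key, lifetime in seconds) -> presigned URL.
# With boto3 this could be, for a given client `s3` and bucket name:
#   lambda op, key, ttl: s3.generate_presigned_url(
#       op, Params={"Bucket": BUCKET, "Key": key}, ExpiresIn=ttl)
Presigner = Callable[[str, str, int], str]


def can_view(user: str, photo: Photo, friends_of_owner: Set[str]) -> bool:
    """Custom access rule: owners see everything, friends see shared photos."""
    if photo.visibility == "public" or user == photo.owner:
        return True
    return photo.visibility == "friends" and user in friends_of_owner


def download_url(
    user: str,
    photo: Photo,
    friends_of_owner: Set[str],
    presign: Presigner,
    expires_in: int = 300,
) -> Optional[str]:
    """Return a short-lived GET URL, or None if the user has no access."""
    if not can_view(user, photo, friends_of_owner):
        return None
    return presign("get_object", photo.key, expires_in)


def upload_url(
    user: str, filename: str, presign: Presigner, expires_in: int = 300
) -> str:
    """Return a short-lived PUT URL confined to the caller's own prefix."""
    # Reject anything that could escape "users/<user>/" and overwrite
    # another user's objects.
    if filename in ("", ".", "..") or "/" in filename or "\\" in filename:
        raise ValueError("invalid filename")
    return presign("put_object", f"users/{user}/{filename}", expires_in)
```

Because the backend builds the object key itself, the front end can never ask for a URL that points outside its own prefix, which is the kind of check the upload endpoint needs.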

Pros:

Able to perform fully customisable, fine grained access controls in the backend

Front end does not need to manage any AWS credentials or integrate with any AWS SDKs

Backend doesn’t have unnecessary stress of handling file uploads/downloads - generating presigned URLs is a lightweight operation

IAM permissions are relatively simple to implement for backend

Cons:

Backend endpoints for providing signed URLs for upload/deletion functionality must be carefully managed to prevent unintended operations being possible (rewriting files, deleting files not owned by user)

Directly retrieving files from S3 can get costly at high traffic. If your platform becomes popular, or photos go viral, you can be hit with a large AWS bill.

Verdict

This approach provides a relatively simple and secure method of providing access to S3, with the backend API having full control over the type of access granted. Care must be taken, however, to secure the backend API endpoints that hand out presigned URLs. For low to medium traffic, this approach will take you far.

Solution 4: Handling high traffic with CloudFront

This solution builds upon the previous one by adding in CloudFront as a CDN to act as a caching layer for object retrieval to improve speeds for end users and lower costs. Uploading / deleting objects can still be performed via S3 presigned URLs, but retrieving private content is now performed via CloudFront signed URLs.

The solution works as follows:

1. As before, the front end makes requests to get content, or upload/delete content. For retrievals the backend provides signed CloudFront URLs, whereas for upload/delete S3 presigned URLs are used.

2. Retrieval requests are directed to a nearby CloudFront edge location. If the object has already been cached there, the cached version is returned (provided, of course, that the signed URL is valid). If not, CloudFront retrieves the corresponding object from the S3 bucket.

3. Upload/delete requests go directly to S3 as before.

Snippets of some of the relevant CloudFormation code for this setup are included below:

    # A pseudo-user given permission to access S3 on behalf of CloudFront
    CloudFrontOriginAccessIdentity:
      Type: AWS::CloudFront::CloudFrontOriginAccessIdentity

    S3Bucket:
      Type: AWS::S3::Bucket

    # Backend API needs an IAM statement like below for upload/delete
    #   Effect: Allow
    #   Action:
    #     - s3:PutObject
    #     - s3:DeleteObject
    #   Resource: !Sub "arn:aws:s3:::${S3Bucket}/*"

    S3BucketPolicy:
      Type: AWS::S3::BucketPolicy
      Properties:
        Bucket: !Ref S3Bucket
        PolicyDocument:
          Statement:
            - Action:
                - s3:GetObject
              Effect: Allow
              Principal:
                CanonicalUser: !GetAtt CloudFrontOriginAccessIdentity.S3CanonicalUserId
              Resource: !Sub "arn:aws:s3:::${S3Bucket}/*"

    CloudFrontDistribution:
      Type: "AWS::CloudFront::Distribution"
      Properties:
        DistributionConfig:
          DefaultCacheBehavior:
            AllowedMethods: ["GET", "HEAD", "OPTIONS"]
            Compress: true
            ForwardedValues:
              QueryString: true
              Cookies:
                Forward: none
            TargetOriginId: S3Origin
            TrustedSigners:
              - !Ref AWS::AccountId
          Enabled: true
          Origins:
            - DomainName: !GetAtt S3Bucket.DomainName
              Id: S3Origin
              S3OriginConfig:
                OriginAccessIdentity: !Sub "origin-access-identity/cloudfront/${CloudFrontOriginAccessIdentity}"

The gist of the above setup is that there is an S3 bucket, that only allows GetObject requests via a CloudFront Origin Access Identity and Put/DeleteObject requests via the backend API, which would come in the form of S3 presigned URLs. The backend is responsible for creating the CloudFront & S3 signed URLs. To improve latency for upload operations, S3 Transfer Acceleration could be used.

Pros:

Backend retains control of managing fine-grained access permissions for objects in S3

Frontend doesn’t need to manage any AWS access credentials or integrate with any AWS SDKs

Backend doesn’t have unnecessary stress of handling file download/uploads.

Extremely scalable whilst keeping costs low due to CloudFront CDN caching lowering S3 requests and data transfer

Low latency access for end users

CloudFront access can be logged and monitored

Cons:

Interactions between CloudFront, S3 and backend API can be difficult to understand at first.

Backend is responsible for security of endpoints that provide signed URLs

Verdict

This approach builds on the previous solution, adding further scalability and lower costs by using CloudFront for object retrieval from S3. Care must still be taken on the backend to adequately secure endpoints relating to signed URL distribution.

Wrapping up

We’ve covered four different solutions for providing applications secure access to S3 resources, along with some CloudFormation code to get you started if you choose to implement any of these architectures yourself. As you can see, there are slight tradeoffs between the approaches and as always you will need to keep the nuances of your use case in mind.

We’re always looking to improve the security posture of the solutions we’re building, so we would love to hear any suggestions from you on alternate approaches or further security measures!

If you think we've missed anything or if you've got some security tips of your own, leave us a comment below. Alternatively you can tweet us on @hedgehoglab.