16 December 2016

Securing a Serverless Application

Today I'd like to share how I set up Dadario's Learning Academy, a.k.a. my online courses, and how I've secured it. What makes this case peculiar is that it is a serverless application.

What is a Serverless application?

The name suggests server-less, without servers, but wait. It's an application that requires server(s), but you don't manage them. You delegate that responsibility to third parties.

Perhaps it's a new term for an old concept, so why has it gained so much attention now? Blame AWS Lambda. Lambda is a product from Amazon Web Services (AWS) that allows you to write code inside functions, get a URL for each function using another of their products named AWS API Gateway, and pay only for the execution time of that function. And the price is ridiculously low. There are competitors too, such as Google Cloud Functions, IBM Bluemix OpenWhisk and Azure Functions.

And it turns out that developing my learning academy on top of such a concept seemed like a good idea.

Requirements

Like any other application, it has requirements, although not very formal ones:

Web Interface: all access happens on dadario.com.br;

High Availability: students should experience the fastest page load and video streaming ever;

Authentication: students must authenticate before accessing videos, so that I can communicate with them and, as the academy evolves, personalize their experience;

Authorization: in the future there may be paid courses, thus there is a need to authorize access to each course;

Payment Processing: for paid courses, this is mandatory. Somehow we have to identify who paid for which course and then authorize their account to watch it;

Secure: students should not be able to bypass payments, easily access video URLs, send data in clear, etc.;

Cost Effective: oh yeah, I'm not a high roller.

Implementation

Web Interface

For a Serverless application, the web interface should be a static website that talks to APIs through JavaScript only. As my blog already uses static pages generated by Jekyll, I extended it to implement my learning academy instead of having two codebases. All content is served using the existing HTTPS infrastructure, thus all traffic in transit is secure. Amazon takes care of updating OpenSSL when a new 0day appears, so I don't need to worry.

High Availability

To publish the web pages I use AWS Simple Storage Service (S3) and AWS CloudFront, which is a Content Delivery Network (CDN) that reduces latency and enhances the user experience. As CloudFront works great for static files, it works well with videos and images too. Thus I chose S3 and CloudFront to host all my videos, but there's a security problem here by default: all files are public.

It turns out that you can serve private files from CloudFront, but I didn't consider it at that time. Furthermore, adding this security layer takes a significant development effort and I wanted to finish quickly. So I used a commonly misused security concept: obscurity. Oh, yes, it's a valid security layer when applied correctly. My point is to use hard to guess URLs to store the videos AND the video playlist, a file containing the URLs of all videos.
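A minimal Python sketch of what "hard to guess" can mean in practice, not the code used on the site: the path contains a long random token from a cryptographically secure generator. The base URL, the file name and the function name are made up for this example.

```python
import secrets

# Hypothetical CDN base URL; the real bucket layout is not described in this post.
CDN_BASE = "https://cdn.example.com"


def unguessable_url(filename: str, n_bytes: int = 32) -> str:
    """Return a URL whose path contains a random, hard to guess token."""
    # 32 random bytes = 256 bits, encoded as URL-safe base64 text.
    token = secrets.token_urlsafe(n_bytes)
    return f"{CDN_BASE}/{token}/{filename}"


playlist_url = unguessable_url("playlist.json")
print(playlist_url)
```

The token only helps while the URL stays secret, so it must never appear in public pages or in anything a search engine can index.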

Another important argument for not spending much time overprotecting videos is that once a video is played in your browser it's pretty much game over, because the user already has the content on his/her machine. You can try to use Digital Rights Management (DRM), but I need to research it more. My impression is that it's a lot of trouble for me and for those who watch, and bypassing it is still possible, or should I say inevitable?

Authentication

For that I've discovered Auth0, an Authentication as a Service provider. Using their JavaScript SDK I was able to offer social logins using Google and Facebook for my students right from the web interface. At the time of writing it's free for up to 7000 registered users and they offer a nice web interface to manage all users who signed up / logged in.

I turned down registration through forms, where users input their name and email, as it takes more time for them to sign up, they'll have one more account to worry about and they also love to input fake data.

Authorization

There's another cool thing about Auth0. It's possible to attach metadata to each account. That's what was missing to authorize students. I just need to store each authorization to watch videos in the user metadata. And what does this authorization look like?

Whenever a student signs in, he/she gets a JSON Web Token (JWT) that lives only in his/her browser, along with the account information containing its metadata. My web application then reads the metadata, which contains the video playlist URLs (more on that below), and shows the videos.

The secret is to use hard to guess video playlist URLs. If, for example, a user pays for a course, he/she gets a new video playlist URL in their account metadata.

But here's the thing: when a user signs up, the account has no metadata by default, right? Right, but we can use an Auth0 feature called "Rules", where you define a function to be executed after each login. Think of AWS Lambda but for Auth0 only.

Using such an arbitrary function I can add the video playlist URLs for all free courses to the student metadata, except when a user logs in using Facebook and doesn't share the email.

That's a Facebook feature that lets you select what you want to share, but it breaks the requirement to communicate with students, thus without e-mail they shall not pass. In such cases instructions are shown on the screen so the user can authenticate again, sharing his/her email this time.
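The logic of such a rule can be sketched as a plain function. Real Auth0 Rules are written in JavaScript; the Python below only illustrates the decision, and the field names (email, metadata, playlists, needs_email) and the playlist URL are assumptions made up for this example.

```python
# Hypothetical playlist URLs for the free courses (placeholders, not real ones).
FREE_PLAYLISTS = ["https://cdn.example.com/free-token/course-1.json"]


def apply_login_rule(user: dict) -> dict:
    """Mimic a post-login rule: grant free playlists only if an email was shared."""
    if not user.get("email"):
        # No email shared (e.g. Facebook login with the email withheld):
        # grant nothing and flag the account so the UI can show instructions.
        return {**user, "needs_email": True}

    metadata = dict(user.get("metadata", {}))
    playlists = list(metadata.get("playlists", []))
    for url in FREE_PLAYLISTS:
        if url not in playlists:  # keeps the rule idempotent across logins
            playlists.append(url)
    metadata["playlists"] = playlists
    return {**user, "metadata": metadata, "needs_email": False}


student = apply_login_rule({"email": "student@example.com"})
print(student["metadata"]["playlists"])
```

Because the function runs on every login, it only appends URLs that are missing, so playlists granted earlier (for example, paid ones) are left untouched.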

Payment Processing

Time to make use of PayPal and AWS Lambda/API Gateway.

Using PayPal, two things were needed:

PayPal button for each course: a button whose link you can share and that points directly to the checkout, e.g., Course 1 for 10 USD.

PayPal callback URL: after each transaction, PayPal hits a URL of my choosing using what it calls Instant Payment Notification (IPN). The URL I set is the one generated by AWS API Gateway (more on that below).



Using AWS Lambda, I needed to create a function to do the following:

Verify if the request came from PayPal

Verify if the payment succeeded

Select the video playlist URL to add to Auth0 user metadata (it means that the user must have authenticated on Auth0 before)

Perform an API call to Auth0 to add the video playlist URL in the metadata
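The four steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the function running in production: the item-to-playlist mapping is made up, the Auth0 user id is assumed to travel in PayPal's "custom" field, and the Auth0 update is passed in as a callable (grant) instead of being implemented. The verification step posts the message back to PayPal with cmd=_notify-validate prepended and expects the answer VERIFIED.

```python
import urllib.parse
import urllib.request
from typing import Callable

PAYPAL_VERIFY_URL = "https://ipnpb.paypal.com/cgi-bin/webscr"

# Hypothetical mapping from PayPal item numbers to playlist URLs.
COURSE_PLAYLISTS = {"course-1": "https://cdn.example.com/paid-token/course-1.json"}


def verify_with_paypal(raw_body: str) -> bool:
    """Step 1: post the notification back to PayPal and check its answer."""
    data = ("cmd=_notify-validate&" + raw_body).encode("utf-8")
    with urllib.request.urlopen(PAYPAL_VERIFY_URL, data=data, timeout=10) as resp:
        return resp.read().decode("utf-8").strip() == "VERIFIED"


def handle_ipn(
    raw_body: str,
    grant: Callable[[str, str], None],
    verify: Callable[[str], bool] = verify_with_paypal,
) -> dict:
    """Process one IPN message given as the raw form-encoded request body."""
    if not verify(raw_body):
        return {"ok": False, "reason": "not verified by PayPal"}

    fields = {k: v[0] for k, v in urllib.parse.parse_qs(raw_body).items()}

    # Step 2: only completed payments grant access.
    if fields.get("payment_status") != "Completed":
        return {"ok": False, "reason": "payment not completed"}

    # Step 3: pick the playlist for the purchased item.
    playlist_url = COURSE_PLAYLISTS.get(fields.get("item_number", ""))
    user_id = fields.get("custom")  # assumption: Auth0 user id sent in "custom"
    if playlist_url is None or not user_id:
        return {"ok": False, "reason": "unknown course or user"}

    # Step 4: in a real deployment, grant() would call Auth0 to update the
    # user's metadata; that call is deliberately left out of this sketch.
    grant(user_id, playlist_url)
    return {"ok": True, "user": user_id}
```

Passing verify and grant as parameters keeps the decision logic testable without touching PayPal or Auth0. A real handler should also check the amount, the currency and the receiver account before granting access.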

Using AWS API Gateway, I needed to:

Generate a URL for the AWS Lambda function

Update PayPal callback URL to this generated URL

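And the flow is actually simple: the student pays through the PayPal button, PayPal sends an IPN to the API Gateway URL, the Lambda function verifies the notification and the payment, and then it adds the course's video playlist URL to the student's Auth0 metadata.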

Security

As you can see I've discussed security in each topic, but there are a few more highlights:

Serverless or not, code needs review, so the AWS Lambda code should be reviewed for logic flaws and security bugs;

To perform a penetration test, you now need to ask for authorization from each party involved;

As session data remains in the browser, the logout function has to clear the local storage, and that's exactly what it does. The application must also be XSS free, and as there are no user inputs, there's no XSS either.

Security is still important, of course, but it's changing its form. More responsibilities were delegated, and a security misconfiguration becomes more likely than other threats. The most important things to protect are the accounts on Auth0, PayPal and AWS. Protip: use 2 factor authentication and hard to guess passwords, and preferably use a different phone number for 2FA that you don't disclose to anyone, to avoid telecom hacks :)

Cost-Effective

As I've mentioned, AWS Lambda is pretty cost-effective, as are many other AWS services, including AWS API Gateway, AWS S3 and AWS CloudFront. What costs me the most is data transfer. Perhaps it would be interesting to upload the videos to servers offering 'unlimited bandwidth', but that's all.

Auth0 has a free account, so I'm very thankful to them. That's why Gauntlet has a free account too, because it opens the door for people to do a lot of awesome things without a penny in their pocket.

PayPal takes a percentage of each transaction, so no problem here either.

Limitations

For now AWS Lambda doesn't support many languages, and even when coding in one of the allowed languages there are restrictions on which libraries you can use and so on, for the sake of security as you can guess. For complex applications it's hard to manage multiple functions, but there are frameworks like serverless.com to help you do the job.

That's all folks

For the sake of time I didn't include more information such as costs and code examples, but feel free to ask me. I'd love to help. Thanks for reading.