*This is the first article in a series of posts about security in NodeJS. Next time I will release ‘Security in Deployment’, followed by a final checklist for both stages.

Where it comes from, or TL;DR

The question of web security, or web insecurity, can arise daily when we interact with applications. Sometimes it’s not a matter of a ‘hack’, but of system stability and ‘stress resistance’.

On August 6, 1991, Tim Berners-Lee published info.cern.ch, a simple static website without a user interface or dynamic data. Almost 30 years have passed since then.

This is not something you should worry about in terms of security, because ‘hacking’ this kind of website can only mean corrupting its static data (replacing or deleting pages, inserting content, etc.). There is nothing there that could harm computers or reveal users’ personal data.

One step further, and you could download a malicious file that could be executed on your computer. That is much more harmful, but still not that dangerous, because you were the one who accepted the file. At least the process was under control.

Many links in the chain later, modern applications have become huge monsters: complex systems with personal interfaces, stored data, and high-level functionality. From a little pile of informational websites, we’ve come to a massive dump of online stores, social networks, bank services, online auction systems, mailing platforms, etc.

Today users can interact with any piece of transmitted data within an application, submit any parameters, and use tools to access applications and send requests. Consequently, attackers can perform a lot of actions, such as modifying provided parameters, sniffing access tokens, or replacing the user input that a system is about to process.

Since our life has moved to the web, we’re responsible for everything we do there. We’re responsible for all the actions we take when we use applications, as well as when we develop and deliver them.

Security problems and attack types

Okay, now it’s secure enough © no one ever

That’s sad but true. Nothing can be called safe enough, and there is always a way to break in. Here is a short list of ways to defeat your system’s security:

Broken authentication

Broken access controls

Database injection

Cross-site scripting

Information leakage

Cross-site request forgery

You can find out more about web security by reading “The Web Application Hacker’s Handbook” by Dafydd Stuttard and Marcus Pinto.

A small pre-action

Let’s talk about the most frequent attack approaches and the solutions we can follow to reduce the risk of being hacked.

Good developers code; Best developers copy © stack overflow

Just before we go deeper I’d like you to think about the fact that we live at a time when almost everything can be covered by already existing libraries and code snippets. You can even install a package to check if the incoming number is not thirteen (https://github.com/haggy/is-not-thirteen).

In this case, we don’t need to reinvent the wheel; it’s enough to install a suitable package with npm or yarn. But let’s take into account that each newly installed package may pull in many other libraries, and those may require even more. It turns out that the code of a regular application can be presented in the following chart.

In this situation, we can’t fully trust the packages we install, and above all we can’t guarantee that they won’t bring anything malicious to our server.

Okay, it’s time to get down to business and talk about security problems that can be produced externally or by the installed packages.

And action! Problem — Solution

Brute-force attack

The most valid but time-consuming method to crack a password

Imagine that you have a machine that can try 400,000,000,000 passwords per second (a realistic rate). An 8-character password built from 96 possible characters has 96^8 ≈ 7.2 × 10^15 combinations, so trying all of them takes about 5 hours, that is, less than 6. Sounds dangerous. Here is a list of actions you can take to protect an application against this:

Require strong passwords

According to OWASP, passwords with fewer than 10 characters are considered weak. Their complexity should also be taken into account.

Implement rate limiting

Rate limiting helps you cap the number of authentication attempts at an appropriate level. You can play around with the value (limit), but the main idea is to prevent tons of repeated requests from a single IP address or, more reasonably, for a single email address.
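To make the idea more concrete, here is a minimal sketch of a fixed-window limiter kept in memory. The createRateLimiter helper, its options and the chosen limit are my own assumptions for illustration, not the API of node-rate-limiter or of any other package.

```javascript
// Minimal in-memory fixed-window limiter (hypothetical helper, not a library API).
function createRateLimiter({ limit, windowMs }) {
  const hits = new Map(); // key (IP or email) -> { count, windowStart }

  return function isAllowed(key, now = Date.now()) {
    const entry = hits.get(key);

    // First request for this key, or the previous window has expired: start a new window.
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: now });
      return true;
    }

    entry.count += 1;
    return entry.count <= limit;
  };
}

// Assumed policy: at most 5 login attempts per email address every 15 minutes.
const allowLogin = createRateLimiter({ limit: 5, windowMs: 15 * 60 * 1000 });
```

In a login handler you would call allowLogin(email) before checking the password and reject the request when it returns false. Keep in mind that the Map in this sketch is never cleaned up and lives in a single process, so a real deployment would probably need an expiring, shared store.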

Lock out sources of invalid login attempts

This rule builds on the previous one. A good practice is to block the source of invalid authentication attempts.

You can set a block for a particular period of time or request external verification. One of the most common approaches is the use of email or SMS verification.
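A temporary block can be sketched like this. The createLoginLocker helper and its thresholds are hypothetical; the sketch only shows the bookkeeping, and the email or SMS verification step is left out.

```javascript
// Hypothetical helper: locks an account for lockMs after maxFailures failed logins.
function createLoginLocker({ maxFailures, lockMs }) {
  const state = new Map(); // account -> { failures, lockedUntil }

  return {
    isLocked(account, now = Date.now()) {
      const entry = state.get(account);
      return Boolean(entry) && entry.lockedUntil > now;
    },

    recordFailure(account, now = Date.now()) {
      const entry = state.get(account) || { failures: 0, lockedUntil: 0 };
      entry.failures += 1;
      if (entry.failures >= maxFailures) {
        entry.lockedUntil = now + lockMs; // start the lock period
        entry.failures = 0;               // count again once the lock expires
      }
      state.set(account, entry);
    },

    recordSuccess(account) {
      state.delete(account); // a successful login clears the history
    },
  };
}
```

The login flow would check isLocked() first, then call recordFailure() or recordSuccess() depending on the result of the password check.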

Set up logging

Set up a logging layer so that you can always detect anomalies like frequent authentication activity for a particular account. At least you’ll be notified and ready to take action.

Follow a secure password recovery approach

Require the old password

Ask a secret question

Ensure that the forgotten-password process doesn’t reveal current authentication details

Useful Tools

node-rate-limiter

Database injections

DB injection has been around for almost 20 years and is still a big issue

Even though this issue has been well known for such a long time, the world isn’t safe from these attacks. Below you will find a small checklist:

Validate incoming body and query parameters

Back-end validation is a must-have in the whole data handling process. The system should check the incoming body, query parameters, and headers in order to protect the whole ecosystem from crashing.

Note that it’s highly important not only to check the data you’re expecting, but also to exclude fields that shouldn’t be included in a request.

For example, imagine that you have an entity in your DB with the following fields:

The ‘deleted’ field comes from the soft-delete flow, where we don’t delete entities from the database but mark them as ‘deleted’. We have a special endpoint to perform the deletion, but what if someone uses the ‘update’ endpoint to switch the ‘deleted’ field from ‘false’ to ‘true’? This is logically incorrect and will break our app.

In order to prevent the problem, we need to accept only those fields that can be processed and the rest of incoming data should be omitted.
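A minimal sketch of such filtering could look like the following. The field names in UPDATABLE_FIELDS and the pickAllowed helper are made up for illustration; the real list depends on your entity.

```javascript
// Hypothetical list of fields that the 'update' endpoint is allowed to change.
const UPDATABLE_FIELDS = ['firstName', 'lastName', 'email'];

// Copies only the allowed fields from the request body; everything else is dropped.
function pickAllowed(body, allowed = UPDATABLE_FIELDS) {
  const result = {};
  for (const field of allowed) {
    if (Object.prototype.hasOwnProperty.call(body, field)) {
      result[field] = body[field];
    }
  }
  return result;
}
```

With this in place, a request body like { firstName: 'Ann', deleted: true } is reduced to { firstName: 'Ann' } before it reaches the database. Validation libraries such as joi can do the same thing with a schema.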

Be careful with the executable incoming data

Some of the data passed to your database can be recognized as an internal command and executed by the database engine. For example, if you use MongoDB as storage, an attacker can provide ‘{ $exists: true }’ as the user ID for a user deletion endpoint. With weak access control, the system will probably remove all users from your database.

Explanation:

We have a route to perform user deletion by ID:

When we pass a user ID like ‘/users/12345’, the system will run a query to delete this user

In our case it’s

But if we pass { $exists: true } as user ID, the executed query will look like

which in MongoDB syntax means: remove all users that have an _id field, that is, every user.

That’s why you need to validate all the data that comes to your server.
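As a rough sketch of that validation, the function below refuses anything that is not a plain string before a query filter is built. The toDeleteFilter name and the ID pattern are my assumptions; adjust the pattern to the ID format you actually use (with MongoDB ObjectIds you would also convert the string before querying).

```javascript
// Assumed ID format: 1 to 64 letters or digits. Adjust to your own IDs.
const USER_ID_PATTERN = /^[A-Za-z0-9]{1,64}$/;

// Builds the filter for the delete query, or throws if the ID is suspicious.
function toDeleteFilter(rawId) {
  // An object such as { $exists: true } fails the typeof check right away.
  if (typeof rawId !== 'string' || !USER_ID_PATTERN.test(rawId)) {
    throw new TypeError('Invalid user id');
  }
  return { _id: rawId };
}
```

Called with '12345' it returns { _id: '12345' }; called with { $exists: true } it throws, so the operator never reaches the database.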

Use ORM/ODM for database queries

Using 3rd party libraries is always a security trade-off, but sometimes it’s a good choice to prevent bigger problems. I recommend using DB ORM/ODM packages because they usually have a rich toolkit for data validation, sanitization, and mapping.

Useful Tools

validator, express-validator, joi

Command Injection

NodeJS allows you to execute JavaScript and Shell Commands directly within the system using core API. Be careful with the data that comes from the client if it’s about to be executed in the system.

As an example, imagine that you’re expecting a file name in order to execute a command like exec(`touch ${fileName}`). The command takes a name (fileName) and creates a file with that name. But I can make the system remove everything from the folder by sending ‘test.txt && rm -rf .*’ as the file name.

That’s why you have to be careful with everything that comes from a client. You have to validate every piece of incoming data, including the parameters.
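One possible way to harden the ‘touch’ example is sketched below. It validates the name against a strict pattern and uses execFile from the core child_process module, which does not spawn a shell by default. The pattern and the createFile and isSafeFileName helpers are assumptions for illustration.

```javascript
const { execFile } = require('child_process');

// Assumed policy: plain file names only, no paths, spaces or shell metacharacters.
const SAFE_FILE_NAME = /^[A-Za-z0-9_][A-Za-z0-9._-]{0,99}$/;

function isSafeFileName(fileName) {
  return typeof fileName === 'string' && SAFE_FILE_NAME.test(fileName);
}

function createFile(fileName, callback) {
  if (!isSafeFileName(fileName)) {
    callback(new Error('Invalid file name'));
    return;
  }
  // The name is passed as a single argument and no shell is involved,
  // so '&&' would not be interpreted as a command separator.
  execFile('touch', [fileName], callback);
}
```

Here ‘test.txt’ passes the check, while ‘test.txt && rm -rf .*’ is rejected before any command runs.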

Useful Tools

validator, express-validator, joi

Regular expressions DoS

Regular expression implementations may reach extreme situations that cause them to work very slowly

Problem itself

This attack becomes a huge problem when regular expressions are used inappropriately. The state machine and NFA give us a good set of tools, but at the same time, they may cause problems.

I’ll try to explain the issue by showing a popular example.

Here is a simple regular expression ‘^(a+)+$’ which means that we’re searching for something that starts with ‘a’-group and this group can be repeated as many times as we need.

But what does this ‘a’-group mean? Let’s say that we have the ‘aaaa’ string. Can we group all the ‘a’ letters just by a single letter in one group? — Yes, we can. And the result will be ‘(a)(a)(a)(a)’. Can we group them by 2 letters? And the answer is also ‘yes’. We will have ‘(aa)(aa)’. And guess what? — We can also do this ‘(aaaa)’ — to have just one group.

And what does the state machine do? The thing is that actually the system is set to find a match in a string. So it will search for matching elements while there is something to iterate through and while there are variants of combining elements.

In this case, the machine will match the string on the first try, because we only have ‘a’ letters in the string. But what if we have something that doesn’t match the regular expression? Then it will make as many tries as there are possible ways to match the regular expression.

And if we have the ‘aaaaX’ string, the engine will try 16 paths before it concludes that the string doesn’t match the regular expression (‘^(a+)+$’), because it backtracks through every way of grouping the letters, for example:

(a)(a)(a)(a)X

(a)(a)(aa)X

(a)(aa)(a)X

(a)(aaa)X

(aa)(aa)X

(aaa)(a)X

(aaaa)X

What if we have a longer string? Each additional ‘a’ roughly doubles the number of tries, and because these checks are handled synchronously, all other operations will wait until they finish.

A good example is shown here.

More about ReDoS is here.

Solution

Avoid writing your own regular expressions when possible and use ready-made solutions

Use validation libraries instead of checking data with your own regular expressions

Avoid ‘evil’ expressions such as grouping with repetition or overlapping alternation inside a repeated group
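For the ‘^(a+)+$’ example, the last point can be sketched as follows: the nested quantifier is replaced by an equivalent expression with a single quantifier, and the input length is capped before matching. MAX_INPUT_LENGTH and isOnlyAs are hypothetical names and values.

```javascript
const EVIL = /^(a+)+$/; // grouping with repetition: backtracks heavily on a near-miss
const SAFE = /^a+$/;    // accepts the same strings with a single quantifier

const MAX_INPUT_LENGTH = 256; // assumed cap; pick one that fits your data

function isOnlyAs(input) {
  // Reject non-strings and oversized input before the regex engine sees it.
  if (typeof input !== 'string' || input.length > MAX_INPUT_LENGTH) {
    return false;
  }
  return SAFE.test(input);
}
```

Both expressions accept and reject the same inputs, but the second one gives the engine only one way to group the letters, so a failing string like ‘aaaaX’ is rejected without the combinatorial retries.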

Useful Tools

safe-regex, regex-dos, validator, express-validator, joi

Memory leaks

Tracking down memory leaks has always been a challenge

I won’t talk too much about memory leaks because it is a huge area. I will just mention the main list of reasons in our case:

forgotten timers or callbacks

weak closures

insecure dependencies

buggy or deprecated APIs, e.g. new Buffer(size) (deprecated, but still in use)

All these points may lead to application failure. There are no strict rules that protect us from memory leaks because, as mentioned above, we use too many third-party packages, and this is not the only reason.
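Two of these points can still be illustrated with a small sketch. The createPoller helper is hypothetical; it simply keeps a handle to its timer so the timer can be cleared. Buffer.alloc is the documented replacement for new Buffer(size) and always returns zero-filled memory, whereas in older NodeJS versions new Buffer(size) could return uninitialized memory.

```javascript
// Hypothetical poller that owns its timer, so it can always be stopped.
function createPoller(task, intervalMs) {
  let timer = null;

  return {
    start() {
      if (timer === null) {
        timer = setInterval(task, intervalMs);
      }
    },
    stop() {
      if (timer !== null) {
        clearInterval(timer); // releases the timer and the closure it keeps alive
        timer = null;
      }
    },
    isRunning() {
      return timer !== null;
    },
  };
}

// Zero-filled allocation instead of the deprecated new Buffer(size).
const zeroed = Buffer.alloc(16);
```

Calling stop() when the owner of the poller goes away (for example, when a connection closes) is what prevents the ‘forgotten timer’ case.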

However, we still can track memory leaks with these tools:

heapdump, memwatch, node-inspector

Hijacking the require chain

Hooking all asynchronous core methods is definitely possible

NodeJS handles modules in such a way that if a particular package has already been required and compiled somewhere, the next time it is taken from the cache. There is no need to read the file and compile the module each time it’s required, which reduces the time needed to load the module’s functionality.

More about caching: https://nodejs.org/api/modules.html#modules_caching

As a result, it becomes easier to corrupt a module. But how? Let’s imagine that I have developed a super useful package, and in my package I’ve required a popular library for authentication. Imagine that you also use this ‘authentication’ library in your project, and you have installed my package as well. Then, if you require my library before the ‘authentication’ one, the latter will be pulled from the cache (which seems legit).

Read more about the issue…

But what if I modify the authentication module? What if I patch a module and each time when you authenticate a user, I will receive an email with user credentials?
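Here is a harmless sketch of the mechanism. Instead of an authentication library it uses the core os module and only records that a call happened; the variable names are made up for illustration.

```javascript
const os = require('os');

const originalHostname = os.hostname;
const calls = [];

// What a dependency could do once, at load time: wrap a function of a shared module.
os.hostname = function patchedHostname(...args) {
  calls.push('hostname() was called');
  return originalHostname.apply(this, args);
};

// Elsewhere in the application: require() hands back the very same object,
// so this code unknowingly runs the patched function.
const sameOs = require('os');
const name = sameOs.hostname();

// Undo the patch so the sketch leaves no side effects behind.
os.hostname = originalHostname;
```

After this runs, calls contains one entry even though the ‘application’ code only required os and called hostname() as usual. A hostile package could do the same with a module that handles credentials.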

There is no silver bullet for this, but what I’d recommend is checking the modules that you use within your application. Refer to their GitHub issues and stars.

There are also 2 tools (snyk, npm audit) for checking the installed modules for possible vulnerabilities. They will go through the package.json file and check if you have modules which may affect your application. Both of them will notify you about weak spots and provide solutions to the issues.

Useful Tools

snyk, npm audit

Rainbow table attack

Rain, Bow and Table attack must be weird enough…

A rainbow table is a precomputed table of <password>/<hash> pairs (e.g. for the most common passwords) used to reverse cryptographic hash functions. It is generated for a particular hashing method.

I’ll share one simple example of use. Imagine that I have stolen your database, where password hashes and emails are stored. These pairs alone are not enough to access an account, because you need the plaintext password. But what if I have a table of the most popular passwords and, even more, lots of passwords generated from personal data (first name, last name, birth date, etc.) found in the stolen database? What if I have a table of hashed/encrypted values for these passwords, computed with different types of hashing/encryption? I can just go through these values and compare them with your hash. If I find a matching value, I will use the plaintext password to access your account and operate there.

To prevent the above-mentioned issues you should:

not use Math.random() to generate a random password

use a personal (per-user) salt

use well-known cryptographic modules (crypto)

The first point is related to the fact that in some cases Math.random() values can be predicted. You can refer to this post to get more details.
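The other two points can be sketched with the core crypto module. The hashPassword and verifyPassword helpers and the ‘salt:hash’ storage format are my own choices for illustration, and scrypt is just one of the suitable algorithms.

```javascript
const crypto = require('crypto');

// Hashes a password with a fresh random salt; returns 'saltHex:hashHex'.
function hashPassword(password) {
  const salt = crypto.randomBytes(16); // personal salt, different for every call
  const hash = crypto.scryptSync(password, salt, 64);
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
}

// Recomputes the hash with the stored salt and compares in constant time.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = crypto.scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length);
  return crypto.timingSafeEqual(actual, expected);
}
```

Because every user gets a different salt, the same password produces a different stored value each time, so one precomputed table no longer covers the whole database.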

Information leakage

— “Hello, can you tell me where’s the store? ”

— ” Hello officer, I’m not stoned…” *failed*

This security issue brings its own surprises, and they can be both small and big. A common case is revealing the internal application architecture through inappropriate error handling, when you send the whole error stack to a client. The stack may contain the file structure, function details, and even the libraries that you use, and an attacker can use this information. Knowing that you use a library with well-known vulnerabilities will help them perform malicious actions.

There is a single solution: think. Don’t display full descriptions of system/syntax errors in the production environment; log them and show a default, friendly error instead. It’s a good practice to throw errors in your controllers, services, and modules and catch them in a single place, so you can adjust the output depending on the server environment (code example). Don’t provide too much information when you perform actions related to access details (password reset).
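A single catching point could be sketched as an Express-style error middleware. The toClientError helper, the generic message and the use of NODE_ENV to detect production are assumptions for illustration.

```javascript
// Decides what the client is allowed to see, depending on the environment.
function toClientError(err, isProduction) {
  const status = Number.isInteger(err.status) ? err.status : 500;
  const body = isProduction
    ? { message: 'Something went wrong' }          // no internals in production
    : { message: err.message, stack: err.stack };  // full details while developing
  return { status, body };
}

// Express recognizes error middleware by its four parameters, so 'next' stays in the signature.
function errorHandler(err, req, res, next) {
  console.error(err); // always log the complete error on the server side
  const { status, body } = toClientError(err, process.env.NODE_ENV === 'production');
  res.status(status).json(body);
}
```

Registered after all routes with app.use(errorHandler), this would be the only place where an error is turned into a response.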

Unencrypted connection

Nothing to quote

There is nothing to explain, just a reminder: implement SSL. Even a free or self-signed SSL certificate will be useful for you. https://letsencrypt.org/ can provide you with one.

Don’t forget to configure an HTTP to HTTPS redirect using NGINX or directly in the NodeJS server.
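Done directly in NodeJS, the redirect can be sketched with the core http module. The httpsLocation and createRedirectServer helpers are hypothetical, and the sketch assumes the HTTPS site listens on the default port 443.

```javascript
const http = require('http');

// Builds the HTTPS address for an incoming plain-HTTP request.
function httpsLocation(hostHeader, url) {
  const host = (hostHeader || '').split(':')[0]; // drop the port, e.g. ':80'
  return `https://${host}${url}`;
}

// Plain HTTP server whose only job is to send visitors to HTTPS.
function createRedirectServer() {
  return http.createServer((req, res) => {
    res.writeHead(301, { Location: httpsLocation(req.headers.host, req.url) });
    res.end();
  });
}

// Usage (commented out so the sketch does not bind a port by itself):
// createRedirectServer().listen(80);
```

Since the Host header comes from the client, in production it is probably safer to redirect to a host name taken from your own configuration.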

Best Practices

Limit requests’ frequency (node-rate-limiter)

Use database ORM/ODM (mongoose, sequelize etc.)

Don’t use query parameters as database query items, or do so very carefully

Validate incoming body, query parameters (validator, express-validator, joi)

Validate incoming body schema

Hide ‘X-Powered-By’ header

Set up a strong access control system

Require strong passwords

Use password hashing with salt

Use SSL

Take care of regular expressions

Use Security check tools (snyk, npm audit)

Set up CORS rules (cors)

Set up XSS protection (helmet)

OWASP top 10 (http://nodegoat.herokuapp.com/tutorial)

Helpful modules

Protection: helmet, lusca, csurf

Validation: validator, express-validator, joi

Package security check: snyk, npm audit

Rate limiting: node-rate-limiter

Memory check: heapdump, memwatch, node-inspector

Other: cors, safe-regex, regex-dos