Machine learning and AI can be pretty complex subjects to understand, but Brain.js makes things a bit simpler. Let’s have a crack at writing a small program that will tell us if a phrase is written in a positive or a negative way.

Note: The full example is available on GitHub; the link is at the end of this post. Note two: This is a simplified example and by no means the best way to perform sentiment analysis. However, it does explain some of the key concepts involved in data classification.

What is Brain.js?

Brain.js is an amazing library that exposes easy-to-use neural networks for JavaScript projects. I highly recommend checking out its documentation on GitHub and seeing what you can come up with.

Natural Language Processing.

To build something that can identify the following phrase as positive or negative, we are going to need some form of Natural Language Processing. For this example, we are going to implement a basic classifier that will allow us to do just that.

“I love lasagna, it makes me happy with its cheesy goodness.”

There are a number of steps to classifying a phrase like this. We will go through each of them and look at how it could be implemented in JavaScript.

A training dataset.

Before we start writing some code, we need to build a base training set. Essentially, what we need is a bunch of phrases that we have already classified by hand. The more data we have, the more accurate our system will generally be; however, for something simple we can get away with a small handful of phrases.
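As an illustration, a small hand-labelled training set could look like the sketch below. The phrases and the `trainingPhrases` name are invented here and are not the data from the full example; the `good` and `bad` labels simply mirror the output shown later in this post.

```javascript
// Hypothetical training phrases, invented for this sketch.
// Each entry pairs a phrase with the label we have assigned to it by hand.
const trainingPhrases = [
  { phrase: 'I love this, it is great', result: { good: 1 } },
  { phrase: 'What a wonderful happy day', result: { good: 1 } },
  { phrase: 'This cake tastes amazing', result: { good: 1 } },
  { phrase: 'I hate this, it is awful', result: { bad: 1 } },
  { phrase: 'What a terrible sad day', result: { bad: 1 } },
  { phrase: 'This soup tastes horrible', result: { bad: 1 } },
];

console.log(trainingPhrases.length); // 6
```

A set this small is only enough to see the mechanics working; it will not generalise well to phrases that use different words.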

Building Our Word Dictionary

The first thing we will want to do is build our “word dictionary”. Essentially, this is going to be a unique, normalised array of all the words from our training set. Each of these words will be run through a word stemming algorithm:

A stemmer for English operating on the stem cat should identify such strings as cats, catlike, and catty. A stemming algorithm might also reduce the words fishing, fished, and fisher to the stem fish. The stem need not be a word, however. In the Porter algorithm, argue, argued, argues, arguing, and argus reduce to the stem argu. (Source: Wikipedia, https://en.wikipedia.org/wiki/Stemming#Examples)

Thankfully, some people much cleverer than myself (NodeNatural) have created a bunch of these for us in their Natural library.

To do this in JavaScript, we can write a simple function that takes in our training phrases (see the training data above) and returns a single dictionary array.
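A minimal sketch of such a function is shown below. To keep it free of dependencies it uses a stand-in tokenizer that only lower-cases and splits the phrase; it does not stem anything. The names `simpleTokenize` and `buildWordDictionary` are made up for this sketch.

```javascript
// Stand-in for a real stemmer: lower-cases the phrase and splits it into words.
// It does NOT stem; a stemming tokenizer can be passed in instead (see the note below).
function simpleTokenize(phrase) {
  return phrase.toLowerCase().match(/[a-z']+/g) || [];
}

// Build a unique array of every token that appears in the training phrases.
function buildWordDictionary(phrases, tokenize = simpleTokenize) {
  const words = phrases.flatMap((phrase) => tokenize(phrase));
  return [...new Set(words)]; // a Set drops duplicates and keeps first-seen order
}

console.log(buildWordDictionary(['Hello World', 'Hello my name is tom']));
// [ 'hello', 'world', 'my', 'name', 'is', 'tom' ]
```

The tokenizer is passed in as a parameter so that a real stemmer can be swapped in. Assuming the Natural library is installed, something like `(phrase) => natural.PorterStemmer.tokenizeAndStem(phrase)` should work as the second argument.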

Encoding our data set.

We now have a simplified representation of our training data. What we need to do next is turn our data into something we can model. We do this by creating an encoding function that will convert our phrases into binary strings (arrays of 1s and 0s). Hopefully, all of that sounds simple. (If not, don’t worry: I was confused at this point too.)

The easiest way I found to imagine this step is to create a matrix. We can then plot our data onto this matrix, which will result in our binary strings. For example, if we had a smaller training set with the phrases:

"Hello World",

"Hello my name is tom"

Our resulting matrix would look like this. Essentially, we plot each token along the top and place a 1 under it if the phrase contains it, or a 0 if it does not:

phrase                    "hello"  "world"  "my"  "name"  "is"  "tom"
"Hello World"                1        1      0      0      0      0
"Hello my name is tom"       1        0      1      1      1      1

Doing this in JavaScript is actually pretty simple now that we have created our dictionary. For each phrase, we loop over the dictionary and record a 1 if the phrase contains that word and a 0 if it does not.

For example:

const dictionary = ['hello', 'world', 'my', 'name', 'is', 'tom']

console.log(encode('hello tom')) // [1, 0, 0, 0, 0, 1]
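The `encode` function used above could be sketched as follows. This is one possible implementation, not necessarily the one in the full example. It tokenizes with a plain lower-case split, whereas in practice the phrase should go through the same stemming step that was used to build the dictionary.

```javascript
const dictionary = ['hello', 'world', 'my', 'name', 'is', 'tom'];

// Turn a phrase into an array of 1s and 0s, one slot per dictionary word.
function encode(phrase) {
  // Assumption: simple lower-case tokenizing; swap in your stemmer here.
  const tokens = new Set(phrase.toLowerCase().match(/[a-z']+/g) || []);
  return dictionary.map((word) => (tokens.has(word) ? 1 : 0));
}

console.log(encode('hello tom')); // [ 1, 0, 0, 0, 0, 1 ]
console.log(encode('Hello my name is tom')); // [ 1, 0, 1, 1, 1, 1 ]
```

Every encoded phrase has the same length as the dictionary, which matters because a neural network expects a fixed number of inputs. Words that are not in the dictionary are simply ignored.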

Create and Train a Neural Network.

Now that we have a way to build our encoded strings, we can begin to create our Neural Network. This is where we can harness the power of Brain.js: it takes all of these complexities away from us and exposes a simple-to-use API. We create a network, hand it our encoded phrases together with their labels, and tell it to train.
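A rough sketch of that step is shown below. `toTrainingData` and `trainNetwork` are hypothetical helper names, and the Brain.js module is passed in as a parameter so the snippet has no hard dependency. The `NeuralNetwork` constructor and its `train` method are the Brain.js calls being relied on, with training items in the `{ input, output }` shape.

```javascript
// Convert labelled phrases into the { input, output } objects Brain.js trains on.
// `encode` is the encoding function from the previous section.
function toTrainingData(labelledPhrases, encode) {
  return labelledPhrases.map(({ phrase, result }) => ({
    input: encode(phrase), // array of 1s and 0s
    output: result, // e.g. { good: 1 } or { bad: 1 }
  }));
}

// `brain` is the Brain.js module, e.g. require('brain.js').
function trainNetwork(brain, trainingData) {
  const network = new brain.NeuralNetwork();
  network.train(trainingData);
  return network;
}
```

With Brain.js installed, usage would look something like `const network = trainNetwork(require('brain.js'), toTrainingData(trainingPhrases, encode))`, where `trainingPhrases` is your labelled training set.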

Yup, that’s really all it takes. We have now trained a Neural Network with our training data and it’s now ready to be tested.

Testing Our Network.

The best way to test this is to take another phrase with a known result, for example:

Im so happy to have cake // good: 1

We can then just run this through the encoding process and then run it through our network:

const encoded = encode("Im so happy to have cake")

console.log(network.run(encoded))

// { good: 0.8156641125679016, bad: 0.17976993322372437 }

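The result contains a score between 0 and 1 for each label, showing how confident our Neural Network is in that label. Here it leans strongly towards good, which matches what we expected.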
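If you only want the winning label rather than the raw scores, a small helper can pick it out. `topLabel` is a made-up name for this sketch and is not part of Brain.js.

```javascript
// Pick the label with the highest score from a result object
// such as { good: 0.81, bad: 0.18 }.
function topLabel(result) {
  const [label, score] = Object.entries(result).sort((a, b) => b[1] - a[1])[0];
  return { label, score };
}

console.log(topLabel({ good: 0.8156641125679016, bad: 0.17976993322372437 }));
// { label: 'good', score: 0.8156641125679016 }
```

Keeping the score alongside the label lets you treat low-confidence answers differently, for example by ignoring anything below a threshold of your choosing.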

And that's it!

We have a working neural network that is able to identify positive and negative phrases.

Now, of course, this is a very simplified solution to a difficult problem. But the idea here is to show how simple Brain.js makes common classification problems. It also demonstrates the encoding process: you can imagine taking other complex data sources (images, videos, etc.), running them through an encoding function and following a similar process to classify them.

I have put the full example on GitHub. Feel free to explore the code and raise any questions you may have as issues :)