Easy enough so far.

Now go here to get the source code for Brain.js. Copy & paste the whole thing into your empty brain.js file, hit save and bam: 2 out of 4 files are finished.

2 — “What is my purpose?”

Next is the fun part: deciding what your machine will learn. There are countless practical problems that you can solve with something like this; sentiment analysis or image classification for example. I happen to think that applications of M.L. that process text as input are particularly interesting because you can find training data virtually everywhere and they have a huge variety of potential use cases, so the example that we’ll be using here will be one that deals with classifying text:

We’ll be determining whether a tweet was written by Donald Trump or Kim Kardashian.

Ok, so this might not be the most useful application. But Twitter is a treasure trove of machine learning fodder and, useless though it may be, our tweet-author-identifier will nevertheless illustrate a pretty powerful point. Once it’s been trained, our neural network will be able to look at a tweet that it has never seen before and then be able to determine whether it was written by Donald Trump or by Kim Kardashian just by recognizing patterns in the things they write. In order to do that, we’ll need to feed it as much training data as we can bear to copy / paste into our training-data.js file and then we can see if we can identify ourselves some tweet authors.

3 — Setup & Data-handling

Now all that’s left to do is set up Brain.js in our scripts.js file and feed it some training data in our training-data.js file. But before we do any of that, let’s start with a 30,000-foot view of how all of this will work.

Setting up Brain.js is extremely easy so we won’t spend too much time on that but there are a few details about how it’s going to expect its input data to be formatted that we should go over first. Let’s start by looking at the setup example that’s included in the documentation (which I’ve slightly modified here) that illustrates all this pretty well:

First of all, the example above is actually a working A.I (it looks at a given color and tells you whether black text or white text would be more legible on it). Which hopefully illustrates how easy Brain.js is to use. Just instantiate it, train it, and run it. That’s it. I mean, if you inlined the training data that would be 3 lines of code. Pretty cool.

Now let’s talk about training data for a minute. There are two important things to note in the above example other than the overall input: {}, output: {} format of the training data.

First, the data do not need to be all the same length. As you can see on line 11 above, only an R and a B value get passed whereas the other two inputs pass an R, G, and B value. Also, even though the example above shows the input as objects, it’s worth mentioning that you could also use arrays. I mention this largely because we’ll be passing arrays of varying length in our project.

Second, those are not valid RGB values. Every one of them would come out as black if you were to actually use it. That’s because input values have to be between 0 and 1 in order for Brain.js to work with them. So, in the above example, each color had to be processed (probably just fed through a function that divides it by 255 — the max value for RGB) in order to make it work. And we’ll be doing the same thing.

3.1 — encode()

So if we want out neural network to accept tweets (i.e. strings) as an input, we’ll need to run them through an similar function (called encode() below) that will turn every character in a string into a value between 0 and 1 and store it in an array. Fortunately, Javascript has a native method for converting any character into ASCII code called charCodeAt() . So we’ll use that and divide the outcome by the max value for Extended ASCII characters: 255 (we’re using extended ASCII just in case we encounter any fringe cases like é or ½) which will ensure that we get a value <1.

3.2 — processTrainingData()

Also, we’ll be storing our training data as plain text, not as the encoded data that we’ll ultimately be feeding into our A.I. - you’ll thank me for this later. So we’ll need another function (called processTrainingData() below) that will apply the previously mentioned encoding function to our training data, selectively converting the text into encoded characters, and returning an array of training data that will play nicely with Brain.js

So here’s what all of that code will look like (this goes into your ‘scripts.js’ file):

31 lines. That’s pretty much it.

Something that you’ll notice here that wasn’t present in the example from the documentation shown earlier (other than the two helper functions that we’ve already gone over) is on line 20 in the train() function, which saves the trained neural network to a global variable called trainedNet . This prevents us from having to re-train our neural network every time we use it. Once the network is trained and saved to the variable, we can just call it like a function and pass in our encoded input (as shown on line 25 in the execute() function) to use our A.I.

Alright, so now your index.html, brain.js, and scripts.js files are finished. Now all we need is to put something into training-data.js and we’ll be ready to go.

4 — Train

Last but not least, our training data. Like I mentioned, we’re storing all our tweets as text and encoding them into numeric values on the fly, which will make your life a whole lot easier when you actually need to copy / paste training data. No formatting necessary. Just paste in the text and add a new row.

Add that to your ‘training-data.js’ file and you’re done!

Note: although the above example only shows 3 samples from each person, I used 10 of each; I just didn’t want this sample to take up too much space. Of course, your neural network’s accuracy will increase proportionally to the amount of training data that you give it, so feel free to use more or less than me and see how it affects your outcomes

5 — Execute

Now, to run your newly-trained neural network just throw an extra line at the bottom of your ‘script.js’ file that calls the execute() function and passes in a tweet from Trump or Kardashian; make sure to console.log it because we haven’t built a UI. Here’s a tweet from Kim Kardashian that was not in my training data (i.e. the network has never encountered this tweet before):

console.log(execute("These aren't real. Kanye would never write Yeezy on the side"));

Then pull up your index.html page on localhost, check the console, aaand…

There it is! The network correctly identified a tweet that it had never seen before as originating from Kim Kardashian, with a certainty of 86%.

Now let’s try it again with a Trump tweet:

console.log(execute("Whether we are Republican or Democrat, we must now focus on strengthening Background Checks!"));

And the result…

Again, a never-before-seen tweet. And again, correctly identified! This time with 97% certainty.

6 — Profit?

Now you have a neural network that can be trained on any text that you want! You could easily adapt this to identify the sentiment of an email or your company’s online reviews, identify spam, classify blog posts, determine whether a message is urgent or not, or any of a thousand different applications. And as useless as our tweet identifier is, it still illustrates a really interesting point: that a neural network like this can perform tasks as nuanced as identifying someone based on the way they write.

So even if you don’t go out and create an innovative or useful tool that’s powered by machine learning, this is still a great bit of experience to have in your developer tool belt. You never know when it might come in handy or even open up new opportunities down the road.

Once again, all of this is available in a GitHub repo here: