The Code and final build

You can try out the final build of the simple linear regression calculator at lettier.com/simple-linear-regression. All of the code for the project is hosted on GitHub.

Who this is for

Haskell programmers looking to solve The JavaScript Problem

Programmers interested in functional programming

JavaScript programmers looking to try out PureScript

Web developers searching for a new approach

Data scientists brushing up on the basics

Machine learning engineers curious about the algorithms they use

What we’ll cover

Using Yarn and NPM scripts

Building Sass

Setting up PureScript

Linear Regression

Gradient Descent

Machine Learning Parameters

PRESS Statistic

Functional Programming

Using the Halogen library to build our UI

Project setup

Before we begin developing, we need to set up our project with all of its files and dependencies.

File structure

simple-linear-regression/ dist/ app.js index.css index.html src/ LinearRegression.purs Main.purs Matrix.purs Plot.js Plot.purs UI.purs Utils.purs static/ html/ index.html scss/ index.scss .nvmrc bower.json package.json

This will be layout for our project.

The .purs extension is short for PureScript. If you have ever used Haskell, the syntax is very similar. Here is a quick guide to get a feel for the language.

PureScript is a small strongly typed programming language that compiles to JavaScript. PureScript.org

Go ahead and run the following commands.

mkdir -p simple-linear-regression cd simple-linear-regression mkdir -p dist src static static/html static/scss touch .nvmrc bower.json package.json cd src touch LinearRegression.purs Main.purs Matrix.purs Plot.js Plot.purs UI.purs Utils.purs cd .. cd static touch html/index.html scss/index.scss cd ..

NVM

We will need to install Node.js using NVM. NVM allows us to easily switch between different versions of Node.

cd ~/Downloads wget -qO- https://raw.githubusercontent.com/creationix/nvm/v0.33.1/install.sh | bash cd ~/simple-linear-regression echo 'v5.5.0' > .nvmrc nvm use

Yarn

To speed up the downloading and installation of our dependencies, will we use Yarn.

cd ~/Downloads curl -o- -L https://yarnpkg.com/install.sh | bash

NPM

Open up your favorite text editor and copy this into the package.json file.

{ "name" : "simple-linear-regression" , "description" : "Simple linear regression calculator." , "homepage" : "http://lettier.com/simple-linear-regression/" , "license" : "Apache-2.0" , "author" : "David Lettier" , "private" : true , "scripts" : { "installPackages" : "yarn && bower install" , "buildSrc" : "pulp build" , "buildDist" : "mkdir -p dist && pulp browserify --to dist/app.js && node-sass static/scss/index.scss dist/index.css && cp -R static/images/. dist/ && cp -R static/html/. dist/" , "watchBuildDist" : "onchange './static/**/*' './src/**/*' -i -- yarn buildDist" } , "dependencies" : { "virtual-dom" : "^2.1.1" , "chartjs-color" : "^2.0.0" , "moment" : "^2.10.6" }, "devDependencies" : { "pulp" : "^10.0.0" , "purescript" : "^0.10.2" , "node-sass" : "4.3.0" , "onchange" : "^3.2.1" } }

With this file we have four NPM scripts we can run with Yarn. watchBuildDist will be really convenient when we start developing as it will build the project into dist/ each time we make a change to any of the project files. Once the project is built, we can just refresh our browser.

Bower

Open up your favorite editor and copy this into the bower.json file.

{ "name" : "simple-linear-regression" , "description" : "Simple linear regression calculator." , "homepage" : "http://lettier.com/simple-linear-regression/" , "license" : "Apache-2.0" , "authors" : [ "David Lettier" ] , "private" : true , "ignore" : [ "**/.*" , "node_modules" , "bower_components" , "output" ] , "dependencies" : { "chart.js" : "2.4.0" , "purescript-prelude" : "^2.1.0" , "purescript-console" : "^2.0.0" , "purescript-foldable-traversable" : "2.0.0" , "purescript-arrays" : "3.1.0" , "purescript-math" : "2.0.0" , "purescript-integers" : "2.1.0" , "purescript-maybe" : "2.0.1" , "purescript-globals" : "2.0.0" , "purescript-halogen" : "*" } , "devDependencies" : { "purescript-psci-support" : "^2.0.0" } }

We can now go ahead and install all of our dependencies so make sure to run the following commands.

cd simple-linear-regression yarn run installPackages

Linear Regression

(inputs[ 0 ][ 0 ], inputs[ 0 ][ 1 ], ... , inputs[ 0 ][ N - 1 ], outputs[ 0 ]) (inputs[ 1 ][ 0 ], inputs[ 1 ][ 1 ], ... , inputs[ 1 ][ N - 1 ], outputs[ 1 ]) ... (inputs[ N - 1 ][ 0 ], inputs[ 0 ][ 1 ], ... , inputs[ 0 ][ N - 1 ], outputs[ N - 1 ])

In simple terms, linear regression is the process of being given some inputs , their corresponding outputs , and finding a linear function out there in the universe that, when given the inputs we were given, hopefully produces outputs the same as or very close to the outputs we were given. If we performed our search correctly, no other linear function comes as close to the one we found. In other words, our goal is to find the best/optimal linear function that describes the relationship between the input and the output.

If a linear function produced the data we were given, our search will be fruitful. The function we find will model the relationship, between the input and output, extremely well—maybe even perfectly. However if some other non-linear process produced the data, our linear-function will be—in some cases—far from the truth.

For example, say we are given the following data.

inputs[ 0 ] : 2.8 , outputs[ 0 ] : 0.3349881501559051 inputs[ 1 ] : 2.9 , outputs[ 1 ] : 0.23924932921398243 inputs[ 2 ] : 3.0 , outputs[ 2 ] : 0.1411200080598672 inputs[ 3 ] : 3.1 , outputs[ 3 ] : 0.04158066243329049 inputs[ 4 ] : 3.141592653589793 , outputs[ 4 ] : 0.0 inputs[ 5 ] : 3.2 , outputs[ 5 ] : - 0.058374143427580086 inputs[ 6 ] : 3.3 , outputs[ 6 ] : - 0.1577456941432482 inputs[ 7 ] : 3.4 , outputs[ 7 ] : - 0.2555411020268312

We search the universe and arrive at the function y = -x + PI . This models the relationship quite well for the points given. In the range [2.8, 3.4] , we can confidently use our model to produce outputs that we were not given. However after seeing the actual function that produced the data, y = sin(x) , we realize how far off our model is.

You can see that for an input such as 1.5 , our function or model of reality 1.6416 = -1.5 + PI is far from the truth 0.975 = sin(1.5) .

Extrapolation is making a prediction outside the range of values of the predictor in the sample used to generate the model. The more removed the prediction is from the range of values used to fit the model, the riskier the prediction becomes because there is no way to check that the relationship continues to be linear. Introduction to Simple Linear Regression by Gerard E. Dallal, Ph.D.

This search for the right function is known as “fitting a model”, where we try to make our model fit the data or observations given. In some respects, it is like searching through stacks of pants to find the perfect fit or interviewing hundreds of candidates for the right culture fit.

Hypothesis Function

So what does the best linear function look like? Fortunately we a have a straight forward “template” which grows with the size of the input or “explanatory variables”—they help explain the output.

predictions[i] = coefficients[ 0 ] * 1 + coefficients[ 1 ] * inputs[i][ 0 ] + ... + coefficients[ N - 1 ] * inputs[i][ N - 1 ] = coefficients[ 0 ] * 1 + sum[ 0 ... j ... ( N - 1 )](coefficients[j + 1 ] * inputs[i][j]) = hypothesis[i]

This is known as the “hypothesis” function.

The inputs[i][j] we know from the data but we don’t know the coefficients[i] . This is what we are specifically searching for—the coefficients that minimize the difference between outputs[i] and predictions[i] for every output given. Ideally for all the outputs[i] , the difference predictions[i] - outputs[i] is always zero.

If your data points are just pairs of x and y , the hypothesis function resembles a very familiar equation.

hypothesis[i] = coefficients[ 0 ] * 1 + coefficients[ 1 ] * inputs[i] = y - intercept * 1 + slope * x[i] = b + m * x[i]

Here the coefficients we need to find are the y-intercept ( b ) and the slope ( m ). You’ll notice that this is the point-slope equation of a line.

Simple versus Multiple

For our interactive calculator, we will be performing simple linear regression where we attempt to model the relationship between one input variable and one output variable. This is opposed to multiple linear regression that models the relationship between two or more input variables (explanatory) and one output variable (the “response”). For example, if your data looks like (x, y, z) , you would use multiple instead of simple linear regression.

While our calculator will only be performing simple, our LinearRegression.purs library/module will be capable of performing multiple.

Cost Function

calculateCost :: Vector -> Matrix -> Vector -> Number calculateCost _ _ [] = 0.0 calculateCost coefficients designMatrix regressands = sum errors / size where size = lengthNum regressands regressands' = map (hypothesis coefficients) designMatrix errors = map (\ ( Tuple y' y) -> pow (y' - y) 2.0 ) (zip regressands' regressands)

Earlier we said that our goal was to find the collection of coefficients that minimize the difference between the outputs given and the predicted outputs (as given by our hypothesis function). More precisely we wish to minimize our cost or loss function.

The cost function takes the coefficients we’ve come up with, the rows of inputs given, the list of outputs given, and returns the mean squared error or MSE.

To calculate the MSE, the cost function calculates the squared error (difference) between each given and predicted output, sums these squared errors (predicted output - given output)^2 , and then divides the sum by the size of the given outputs.

If our hypothesis function is perfect, we will have a MSE of zero.

An MSE of zero, meaning that the estimator predicts observations of the parameter θ with perfect accuracy, is the ideal, but is typically not possible. Mean Squared Error - Wikipedia

Minimizing the cost function is the act of finding the smallest MSE possible. For some problems, a MSE of zero may not be possible.

Notice that we cannot change the inputs ( designMatrix ) or outputs ( regressands ) as those were given to us. The only values we can change are the coefficients . As we change the coefficients, the MSE will either increase, decrease, or stay the same. Ideally we want to keep changing the coefficients such that the MSE continuously decreases until it reaches zero.

Once the MSE never drops below a certain number or it reaches zero, we have reached the end of our search and our model has been fitted.

Gradient Descent

So far we have seen that linear regression is essentially a search problem. Our goal is to find the optimal collection of coefficients that reduce our cost function the most. But how do we find the best coefficients?

We could continuously guess, calculate the cost, and if it is less than the last cost, update the coefficients

We could find a closed-form solution such that no matter the data, we could always find the most optimal coefficients analytically using only a finite number of operations

We could find the gradient of the cost function, start with an initial guess, and then continuously evaluate and move in the opposite direction of the gradient until we have reached some cutoff point or when the gradient is zero

For the calculator we will use the last option which is known as gradient descent.

Gradient Function

Recall that our cost function takes in the coefficients we currently have and returns the mean squared error. This tells us how wrong we are and—in the case of it being zero—when we no longer need to search. But if we must keep searching, it doesn’t tell us where we should search next. This is where the gradient function comes in.

The gradient of our cost function will act like a guide. For every position that we search, the gradient function will output a vector. This vector tells us in what direction, from where we currently are, is the cost function increasing the most and what the slope/steepness/grade of that direction is. Note that to minimize the cost function, we have go in the opposite direction of the gradient.

If we ever reach a minima, maxima, or saddle point, the gradient will not have a direction and its magnitude (or slope of the graph at that point) will be zero. For this case we can either stop our search and accept our current position or we can start our search over with a new guess.

hypothesis[i] = coefficients[ 0 ] * 1 + sum [ 0 ...j...(N - 1 )](coefficients[j + 1 ] * inputs[i][j]) cost = ( 1 / ( 2 * N)) * sum [ 0 ...i...(N - 1 )]((hypothesis[i] - outputs[i]) ^ 2 ) partialDerivative[ 0 ] = ( 1 / N) * sum [ 0 ...j...(N - 1 )]((hypothesis[j] - outputs[j]) * 1 ) partialDerivative[i > 0 ] = ( 1 / N) * sum [ 0 ...j...(N - 1 )]((hypothesis[j] - outputs[j]) * inputs[j][i]) gradient = < partialDerivative[ 0 ], partialDerivative[ 1 ], ..., partialDerivative[N - 1 ] >

Each element i in the gradient (vector) is the partial derivative of the cost function with respect to coefficient i .

partialDerivative[ 0 ] = ( 1 / N ) * sum[ 0 ... j ... ( N - 1 )]((hypothesis[j] - outputs[j]) * 1 )

Remember that we always have one more coefficient than we do the number of elements in a feature vector (the number of columns in our design matrix or inputs). The zero coefficient is always times one and the derivative of coefficient[0] * 1 with respect to coefficient[0] is just 1 —hence the subtle difference for the partialDerivative of coefficient zero.

cost = ( 1 / ( 2 * N )) * sum[ 0 ... i ... ( N - 1 )]((hypothesis[i] - outputs[i]) ^ 2 )

Note that we alter our cost function a bit and add a 2 in (1 / (2 * N)) . This makes for a cleaner partial derivative and will not disrupt our search as the coefficients that minimize the cost function with the added two will also minimize the cost function without the added two.

partialDerivative :: Int -> Vector -> Matrix -> Vector -> Number partialDerivative _ [] _ _ = 0.0 partialDerivative _ _ [] _ = 0.0 partialDerivative _ _ _ [] = 0.0 partialDerivative i coefficients designMatrix regressands = if i < 0 then 0.0 else result where size = lengthNum regressands tuples = zip regressands designMatrix result = ( 1.0 / size) * (sum (map mapper tuples)) mapper :: Tuple Number Vector -> Number mapper ( Tuple _ []) = 0.0 mapper ( Tuple y regressors) = ((hypothesis coefficients regressors) - y) * xj where xj = if i == 0 then 1.0 else (fromMaybe 0.0 (regressors !! (i - 1 )))

The first line is the type signature while the next three are pattern matching for empty input.

result = ( 1.0 / size) * (sum (map mapper tuples)) mapper ( Tuple y regressors) = ((hypothesis coefficients regressors) - y) * xj

In these two lines we calculate the partialDerivative with respect to coefficient i . If we are calculating the partialDerivative for coefficient 0 then xj is one.

paritalDerivativeMapper i = partialDerivative i coefficients designMatrix regressands gradient = map paritalDerivativeMapper (range 0 (length coefficients - 1 ))

Putting it all together, we compute and return the gradient vector.

Learning Rate

The cost function tells us if we should keep searching. The gradient tells us where we should search next. We now visit the learning rate which controls how far into the opposite direction of the gradient we advance.

In terms of the gradient descent algorithm, the learning rate is a throttle we use to control how “fast” we move around our cost function as we search for the global minimum. Setting it too high will cause the search to bounce around, possibly overshooting our target. Setting it too low will bring your search to a crawl, taking hours to converge (the coefficients no longer update).

One strategy is to set the learning rate in the beginning and then keep it constant while running gradient descent. Another strategy is to continuously update it as the search progresses. We will take the latter approach and use a technique called bold driver.

learningRate' = if previousCalculatedCost < currentCalculatedCost then learningRate - (learningRate * 0.5 ) else learningRate + (learningRate * 0.5 )

Here you see our use of bold driver. If the previous cost is less than the current cost, we throttle down. Otherwise, we throttle up.

The final part of gradient descent is the actual updating of the coefficients.

updateCoefficients :: Number -> Vector -> Matrix -> Vector -> Vector updateCoefficients learningRate coefficients designMatrix regressands = result where paritalDerivativeMapper i = partialDerivative i coefficients designMatrix regressands gradient = map paritalDerivativeMapper (range 0 (length coefficients - 1 )) result = map (\ ( Tuple o g) -> o - (learningRate * g)) (zip coefficients gradient)

Using the learning rate and the gradient, we map over the coefficients and return them updated. The update function is the following.

newCoefficient[i] = oldCoefficient[i] - (learningRate * partialDerivative[i])

PRESS Statistic

As with any machine learning endeavor, you’ll want to cross-validate the model you have built. Cross-validation can help detect overfitting which is typically present when your model performs well for observations it trained on but performs poorly on observations it did not train on. For our calculator we will use the PRESS statistic method.

In statistics, the predicted residual error sum of squares (PRESS) statistic is a form of cross-validation used in regression analysis to provide a summary measure of the fit of a model to a sample of observations that were not themselves used to estimate the model. It is calculated as the sums of squares of the prediction residuals for those observations. PRESS statistic - Wikipedia

One simple way to calculate the PRESS statistic is the leave-one-out strategy.

For every input-output pair i Perform linear regression on all pairs except i Using the found coefficients, calculate the squared error (hypothesis[i] - outputs[i])^2

Sum all of the squared errors to obtain the PRESS statistic

Unfortunately this requires performing linear regression as many times as the number of observations you have. We will take another approach and use the projection or hat matrix (it maps the given outputs to the predicted outputs).

The hat matrix enters this discussion because it saves a lot of calculation. One need not actually re-calculate the regression results N different times. Rather, it is true that the PRESS residual is equal to the ordinary residual divided by 1 minus the diagonal of the hat matrix. The Hat Matrix and Regression Diagnostics by Paul Johnson

X = designMatrix or rows of given inputs hatMatrix = X * (transpose( X ) * X ) ^- 1 * transpose( X )

We can perform the needed matrix math using the library from Let’s make a Matrix Inverse Calculator with PureScript.

pressStatistic = sum[ 0 ... i ... ( N - 1 )](((hypothesis[i] - output[i]) / ( 1 - hatMatrix[i][i])) ^ 2 )

Our revised method for computing the PRESS statistic is the following.

For every input-output pair i Calculate errors[i] = (hypothesis[i] - outputs[i]) / (1 - hatMatrix[i][i]) Square each errors[i]

Sum the list of squared errors[i] .

We can now translate this outline into actual code.

predictions :: Vector predictions = map (\ regressors -> hypothesis coefficients regressors) designMatrix

First we calculate the predictions using the fitted or learned coefficients after having performed linear regression using all of the input-output pairs.

residuals :: Vector residuals = map (\ ( Tuple y y') -> y - y') (zip regressands predictions)

Next we calculate all of the residuals or errors between predicted and given.

hatMatrix :: Matrix hatMatrix = fromMaybe [] (calculateHatMatrix designMatrix)

Using the equation from up above, we obtain the hat matrix.

calculatePressStatistic :: Vector -> Vector -> Matrix -> Maybe Number calculatePressStatistic [] _ _ = Nothing calculatePressStatistic _ [] _ = Nothing calculatePressStatistic _ _ [] = Nothing calculatePressStatistic regressands coefficients designMatrix = maybeSummation where -- ... folder :: Maybe Number -> Int -> Maybe Number folder Nothing _ = Nothing folder ( Just acc) i = if isNothing maybeResidual || isNothing maybeTerm || 1.0 - term == 0.0 then Nothing else Just (acc + (pow (residual / ( 1.0 - term)) 2.0 )) where maybeResidual :: Maybe Number maybeResidual = residuals !! i residual :: Number residual = fromMaybe 0.0 maybeResidual row :: Vector row = fromMaybe [] (hatMatrix !! i) maybeTerm :: Maybe Number maybeTerm = row !! i term :: Number term = fromMaybe 0.0 maybeTerm maybeSummation :: Maybe Number maybeSummation = foldl folder ( Just 0.0 ) (range 0 (length regressands - 1 ))

Here we put it all together and return maybeSummation which may be Nothing or Just the PRESS statistic.

LinearRegression.purs

Go ahead now and open up the src/LinearRegression.purs file with your favorite text editor

module LinearRegression ( runLinearRegressionWithUnscaled , calculatePressStatistic ) where

At the top of the file we define our LinearRegression module and export two functions for other modules to use.

import Prelude import Math (pow) import Data.Ord (abs) import Data.Maybe ( Maybe (..), fromMaybe, isNothing) import Data.Either ( Either (..)) import Data.Foldable (sum, foldl) import Data.Array (zip, (!!), range, length) import Data.Tuple ( Tuple (..)) import Utils (lengthNum, containsNothing) import Matrix ( Vector , Matrix , transpose, multiplyMatrices, invertMatrix)

Next we list the various module dependencies (and their functions) we will need to perform our linear regression calculations. Utils and Matrix are custom modules that we will have to write ourselves.

runLinearRegressionWithUnscaled :: Int -> Number -> Number -> Vector -> Vector -> Matrix -> Vector runLinearRegressionWithUnscaled maxIterations maxCost learningRate coefficients regressands unscaledDesignMatrix = if containsNothing unscaledUpdatedCoefficents then [] else map (fromMaybe 0.0 ) unscaledUpdatedCoefficents

The major function in the module is runLinearRegressionWithUnscaled . The UI module will need to call this function with the data points in order to receive the found coefficients.

maxIterations maxCost learningRate coefficients regressands unscaledDesignMatrix

These are the inputs to runLinearRegressionWithUnscaled .

maxIterations is an Int

is an maxCost is a Number

is a learningRate is a Number

is a coefficients is an array of numbers or Vector

is an array of numbers or regressands (the outputs found in the data) is a Vector

(the outputs found in the data) is a unscaledDesignMatrix (the inputs found in the data) is an array of arrays of numbers or Matrix

You can see how that lines up with the following type signature.

runLinearRegressionWithUnscaled :: Int -> Number -> Number -> Vector -> Vector -> Matrix -> Vector

The final Vector on the end will be the found coefficients.

So if our data was the following.

| Regressors | Regressands | |=============|=============| | x : 1 , y : 2 , | z : 3 | | x : 4 , y : 5 , | z : 6 | | x : 7 , y : 8 , | z : 9 |

Then our function call would look like this.

runLinearRegressionWithUnscaled -- function 1000 -- maxIterations (configuration) 0.0001 -- maxCost (configuration) 0.5 -- learningRate (initial guess) [ 1.0 , 1.0 , 1.0 ] -- coefficients (initial guess) [ 3.0 , 6.0 , 9.0 ] -- regressands (the Zs) [[ 1.0 , 2.0 ], [ 4.0 , 5.0 ], [ 7.0 , 8.0 ]] -- unscaledDesignMatrix (the Xs and Ys)

After making this function call, the output will be the best coefficients found.

hypothesis :: Vector -> Vector -> Number hypothesis coefficients regressors = sum $ map (\ ( Tuple c r) -> c * r) tuples where tuples = zip coefficients ([ 1.0 ] <> regressors)

Given the found coefficients and a row of input data ( regressors ), the hypothesis function outputs a prediction.

[ { x : 1 , y : 1 } , { x : 2 , y : 2 } , { x : 3 , y : 3 } , { x : 4 , y : 4 } , { x : 5 , y : 5 } , { x : 6 , y : 6 } ]

Say our data looked like the x and y pairs up above.

[ 0.0 , 1.0 ]

And our coefficients were zero and one.

hypothesis [ 0.0 , 1.0 ] [ 7.0 ] = 7.0

Then the hypothesis would output 7.0 .

tuples = zip coefficients ([ 1.0 ] <> regressors)

Notice that we add a 1.0 to the beginning of the regressors list.

hypothesis [ 0.0 , 1.0 ] [ 7.0 ] = coefficient[ 0 ] * regressor[ 0 ] + coefficient[ 1 ] * regressor[ 1 ] = 0.0 * 1.0 + 1.0 * 7.0 = 7.0

This makes the computation convenient as the first coefficient doesn’t have a corresponding input/regressor/explanatory variable.

calculatePressStatistic :: Vector -> Vector -> Matrix -> Maybe Number calculatePressStatistic [] _ _ = Nothing calculatePressStatistic _ [] _ = Nothing calculatePressStatistic _ _ [] = Nothing calculatePressStatistic regressands coefficients designMatrix = maybeSummation where -- ...

The other major function in the module is calculatePressStatistic . The UI will use this as well to display the PRESS statistic to the user after every runLinearRegressionWithUnscaled .

Below is the entire src/LinearRegression.purs file. Make sure to copy it over.

{- (C) 2017 David Lettier lettier.com -} module LinearRegression ( runLinearRegressionWithUnscaled , calculatePressStatistic ) where import Prelude import Math (pow) import Data.Ord (abs) import Data.Maybe ( Maybe (..), fromMaybe, isNothing) import Data.Either ( Either (..)) import Data.Foldable (sum, foldl) import Data.Array (zip, (!!), range, length) import Data.Tuple ( Tuple (..)) import Utils (lengthNum, containsNothing) import Matrix ( Vector , Matrix , transpose, multiplyMatrices, invertMatrix) runLinearRegressionWithUnscaled :: Int -> Number -> Number -> Vector -> Vector -> Matrix -> Vector runLinearRegressionWithUnscaled maxIterations maxCost learningRate coefficients regressands unscaledDesignMatrix = if containsNothing unscaledUpdatedCoefficents then [] else map (fromMaybe 0.0 ) unscaledUpdatedCoefficents where transposedUnscaledDesignMatrix = transpose unscaledDesignMatrix maybeMeans = map mean transposedUnscaledDesignMatrix maybeStandardDeviations = map (\ ( Tuple m v) -> standardDeviation m v ) (zip maybeMeans transposedUnscaledDesignMatrix) scaledTransposedDesignMatrix = scaleTransposedDesignMatrix maybeMeans maybeStandardDeviations transposedUnscaledDesignMatrix scaledDesignMatrix = transpose scaledTransposedDesignMatrix scaledUpdatedCoefficents = runLinearRegressionWithScaled 0 maxIterations maxCost learningRate coefficients regressands scaledDesignMatrix unscaledUpdatedCoefficents = unscaleCoefficients maybeMeans maybeStandardDeviations scaledUpdatedCoefficents runLinearRegressionWithScaled :: Int -> Int -> Number -> Number -> Vector -> Vector -> Matrix -> Vector runLinearRegressionWithScaled currentIteration maxIterations maxCost learningRate coefficients regressands scaledDesignMatrix = if currentCalculatedCost <= maxCost || currentIteration >= maxIterations || abs (currentCalculatedCost - previousCalculatedCost) == 0.0 then coefficients' else runLinearRegressionWithScaled (currentIteration + 1 ) maxIterations maxCost learningRate' coefficients' regressands scaledDesignMatrix where updatedCoefficients = updateCoefficients learningRate coefficients scaledDesignMatrix regressands previousCalculatedCost = calculateCost coefficients scaledDesignMatrix regressands currentCalculatedCost = calculateCost updatedCoefficients scaledDesignMatrix regressands -- Bold Driver - http://www.willamette.edu/~gorr/classes/cs449/momrate.html coefficients' = if previousCalculatedCost < currentCalculatedCost then coefficients else updatedCoefficients learningRate' = if previousCalculatedCost < currentCalculatedCost then learningRate - (learningRate * 0.5 ) else learningRate + (learningRate * 0.5 ) hypothesis :: Vector -> Vector -> Number hypothesis coefficients regressors = sum $ map (\ ( Tuple c r) -> c * r) tuples where tuples = zip coefficients ([ 1.0 ] <> regressors) calculateCost :: Vector -> Matrix -> Vector -> Number calculateCost _ _ [] = 0.0 calculateCost coefficients designMatrix regressands = sum errors / size where size = lengthNum regressands regressands' = map (hypothesis coefficients) designMatrix errors = map (\ ( Tuple y' y) -> pow (y' - y) 2.0 ) (zip regressands' regressands) partialDerivative :: Int -> Vector -> Matrix -> Vector -> Number partialDerivative _ [] _ _ = 0.0 partialDerivative _ _ [] _ = 0.0 partialDerivative _ _ _ [] = 0.0 partialDerivative i coefficients designMatrix regressands = if i < 0 then 0.0 else result where size = lengthNum regressands tuples = zip regressands designMatrix result = ( 1.0 / size) * (sum (map mapper tuples)) mapper :: Tuple Number Vector -> Number mapper ( Tuple _ []) = 0.0 mapper ( Tuple y regressors) = ((hypothesis coefficients regressors) - y) * xj where xj = if i == 0 then 1.0 else (fromMaybe 0.0 (regressors !! (i - 1 ))) updateCoefficients :: Number -> Vector -> Matrix -> Vector -> Vector updateCoefficients learningRate coefficients designMatrix regressands = result where paritalDerivativeMapper i = partialDerivative i coefficients designMatrix regressands gradient = map paritalDerivativeMapper (range 0 (length coefficients - 1 )) result = map (\ ( Tuple o g) -> o - (learningRate * g)) (zip coefficients gradient) scaleTransposedDesignMatrix :: Array ( Maybe Number ) -> Array ( Maybe Number ) -> Matrix -> Matrix scaleTransposedDesignMatrix [] _ _ = [] scaleTransposedDesignMatrix _ [] _ = [] scaleTransposedDesignMatrix _ _ [] = [] scaleTransposedDesignMatrix maybeMeans maybeStandardDeviations transposedDesignMatrix = map scaleVector range' where scaleValue' :: Maybe Number -> Maybe Number -> Number -> Number scaleValue' mm ms e = fromMaybe e (scaleValue mm ms e) scaleVector :: Int -> Vector scaleVector i = map (scaleValue' (getMaybeMean i) (getMaybeStandardDeviation i)) (getRow i) range' :: Array Int range' = range 0 (length transposedDesignMatrix - 1 ) getMaybeMean :: Int -> Maybe Number getMaybeMean i = fromMaybe Nothing (maybeMeans !! i) getMaybeStandardDeviation :: Int -> Maybe Number getMaybeStandardDeviation i = fromMaybe Nothing (maybeStandardDeviations !! i) getRow :: Int -> Vector getRow i = fromMaybe [] (transposedDesignMatrix !! i) scaleValue :: Maybe Number -> Maybe Number -> Number -> Maybe Number scaleValue _ ( Just 0.0 ) unscaledValue = Nothing scaleValue ( Just mean') ( Just standardDeviation') unscaledValue = Just ((unscaledValue - mean') / standardDeviation') scaleValue _ _ _ = Nothing unscaleCoefficients :: Array ( Maybe Number ) -> Array ( Maybe Number ) -> Vector -> Array ( Maybe Number ) unscaleCoefficients [] _ _ = [] unscaleCoefficients _ [] _ = [] unscaleCoefficients _ _ [] = [] unscaleCoefficients maybeMeans maybeStandardDeviations coefficients = if length maybeMeans /= length maybeStandardDeviations || length maybeMeans /= (length coefficients - 1 ) || containsNothing maybeMeans || containsNothing maybeStandardDeviations then [] else map (\ i -> extractAndApply i unscale) (range 0 (length coefficients - 1 )) where maybeMeans' = [ Nothing ] <> maybeMeans maybeStandardDeviations' = [ Nothing ] <> maybeStandardDeviations extractAndApply :: Int -> ( Int -> Number -> Number -> Number -> Maybe Number ) -> Maybe Number extractAndApply index f = if index < 0 || index >= length maybeMeans' || index >= length maybeStandardDeviations' || index >= length coefficients then Nothing else f index (fromMaybe 0.0 maybeMean) (fromMaybe 0.0 maybeStandardDeviation) coefficient where maybeMean :: Maybe Number maybeMean = fromMaybe Nothing (maybeMeans' !! index) maybeStandardDeviation :: Maybe Number maybeStandardDeviation = fromMaybe Nothing (maybeStandardDeviations' !! index) coefficient :: Number coefficient = fromMaybe 0.0 (coefficients !! index) unscale :: Int -> Number -> Number -> Number -> Maybe Number unscale 0 _ _ coefficient = if containsNothing summands then Nothing else Just (coefficient - summation) where summation :: Number summation = foldl (\ acc x -> acc + (fromMaybe 0.0 x)) 0.0 summands summands :: Array ( Maybe Number ) summands = map (\ i -> extractAndApply i (\ _ m s c -> if s == 0.0 then Nothing else Just (c * (m / s)) ) ) (range 1 (length coefficients - 1 )) unscale _ _ 0.0 _ = Nothing unscale index _ std coefficient = if index < 0 then Nothing else Just (coefficient / std) calculatePressStatistic :: Vector -> Vector -> Matrix -> Maybe Number calculatePressStatistic [] _ _ = Nothing calculatePressStatistic _ [] _ = Nothing calculatePressStatistic _ _ [] = Nothing calculatePressStatistic regressands coefficients designMatrix = maybeSummation where predictions :: Vector predictions = map (\ regressors -> hypothesis coefficients regressors) designMatrix residuals :: Vector residuals = map (\ ( Tuple y y') -> y - y') (zip regressands predictions) hatMatrix :: Matrix hatMatrix = fromMaybe [] (calculateHatMatrix designMatrix) folder :: Maybe Number -> Int -> Maybe Number folder Nothing _ = Nothing folder ( Just acc) i = if isNothing maybeResidual || isNothing maybeTerm || 1.0 - term == 0.0 then Nothing else Just (acc + (pow (residual / ( 1.0 - term)) 2.0 )) where maybeResidual :: Maybe Number maybeResidual = residuals !! i residual :: Number residual = fromMaybe 0.0 maybeResidual row :: Vector row = fromMaybe [] (hatMatrix !! i) maybeTerm :: Maybe Number maybeTerm = row !! i term :: Number term = fromMaybe 0.0 maybeTerm maybeSummation :: Maybe Number maybeSummation = foldl folder ( Just 0.0 ) (range 0 (length regressands - 1 )) calculateHatMatrix :: Matrix -> Maybe Matrix calculateHatMatrix [] = Nothing calculateHatMatrix designMatrix = xxTxIxT where xT :: Matrix xT = transpose designMatrix xTx :: Matrix xTx = fromMaybe [] (multiplyMatrices xT designMatrix) xTxI :: Matrix xTxI = case invertMatrix xTx of ( Tuple ( Right i) m) -> m _ -> [] xxTxI :: Matrix xxTxI = fromMaybe [] (multiplyMatrices designMatrix xTxI) xxTxIxT :: Maybe Matrix xxTxIxT = multiplyMatrices xxTxI xT standardDeviation :: Maybe Number -> Vector -> Maybe Number standardDeviation _ [] = Nothing standardDeviation ( Just mean') es = Just (pow base 0.5 ) where oneOverLength = 1.0 / lengthNum es summation = sum $ map (\ e -> pow (e - mean') 2.0 ) es base = oneOverLength * summation standardDeviation _ _ = Nothing mean :: Vector -> Maybe Number mean [] = Nothing mean es = Just (sum es / lengthNum es)

UI.purs

The src/UI.purs file contains all of the logic to power our user interface—the buttons, graph, and input boxes.

module UI where

Here we define our UI module and implicitly export every function it defines.

import Prelude import Data.Generic (gShow) import Data.Foldable (foldr) import Data.Array (drop, (:), length, head, last, range) import Data.List.Lazy (replicateM, (!!)) import Data.Maybe ( Maybe (..), isNothing, fromMaybe, isJust) import Control.Monad.Eff.Random ( RANDOM , randomRange) import Control.Monad.Aff ( Aff ) import Control.Monad.Aff.Free (class Affable ) import Control.Monad.Aff.Console ( CONSOLE , log) import Halogen as H import Halogen.HTML.Core (className) import Halogen.HTML.Events.Indexed as HE import Halogen.HTML.Properties.Indexed as HP import Halogen.HTML.Indexed as HH import LinearRegression ( runLinearRegressionWithUnscaled , calculatePressStatistic ) import Plot ( PLOT , PlotData , makePlot) import Utils (maybeNumberToString, stringToMaybeNumber, arrayMinOrMax)

Here we list all of the various imports we will need. Halogen will do the heavy lifting of working with the document object model or DOM. Plot is a module we will write. It provides an application programming interface or API to draw or plot our graph on the page.

type Effects eff = H.HalogenEffects ( plot :: PLOT , console :: CONSOLE , random :: RANDOM | eff)

Our interface will have to perform various (side) effects that will interact with the “outside world”. We will have to plot our graph, write out to the console , and generate random numbers. You can think of this syntax as “whitelisting” the effects we know we will perform.

type Point = { id :: Int , x :: Maybe Number , y :: Maybe Number }

We will define our point data structure as having an id and a x and y coordinate. The Maybe Number type indicates that x and/or y maybe a number or Nothing (empty).

data Query a = PushPoint a | PopPoint a | RemovePoint Int a | RandomPoints a | UpdatePointCoordinate Int String String a | RunLinearRegression a

Our UI will consist of one big component—a coherent element or widget on the page. These “queries” are actions that will change the state or current configuration of our component. Queries could also be requests that return information about our component but we don’t have a need for those.

type State = { nextId :: Int , points :: Array Point , yIntercept :: Maybe Number , slope :: Maybe Number , pressStatistic :: Maybe Number , running :: Boolean }

This is our state data structure. It holds all of the configuration for our component.

nextId - the next highest identifier for a new point

- the next highest identifier for a new point points - a list of points

- a list of points yIntercept - the first coefficient

- the first coefficient slope - the second or last coefficient

- the second or last coefficient pressStatistic - the PRESS statistic indicating how well our model fits

- the PRESS statistic indicating how well our model fits running - a flag indicating that we are or are not running the linear regression algorithm If true, the input boxes cannot be updated

- a flag indicating that we are or are not running the linear regression algorithm

initialState :: State initialState = { nextId : 0 , points : [] , yIntercept : Nothing , slope : Nothing , pressStatistic : Nothing , running : false }

Here you see the initialState function that returns the beginning state for our component. In other words, this is the state our component will be in when you first load the page.

ui :: forall eff . H.Component State Query ( Aff ( Effects eff)) ui = H.component { render, eval } where render :: State -> H.ComponentHTML Query render state = HH.div_ [ -- ... ]

This is our component. It defines how to render or draw itself based on its current state. The component also defines how to evaluate the queries, or in our case the actions, we listed earlier.

eval :: Query ~> H.ComponentDSL State Query ( Aff ( Effects eff)) eval ( PushPoint next) = do H.modify (\ state -> state { nextId = state . nextId + 1 , points = newPoint state . nextId : state . points , yIntercept = Nothing , slope = Nothing , pressStatistic = Nothing } ) pure next

In this action we push a new point onto the state points stack.

eval ( PopPoint next) = do H.modify (\ state -> state { points = drop 1 state . points , yIntercept = Nothing , slope = Nothing , pressStatistic = Nothing } ) currentState <- H.get _ <- H.fromAff (makePlot (makePlotDataFromState currentState)) pure next

In this action we pop the top off of the state points stack (we remove a point). Since we added a point, we update our plot on the page with this new data point.

eval ( RemovePoint id next) = do H.modify (\ state -> state { points = foldr (\ point acc -> if point . id == id then acc else point : acc ) [] state . points , yIntercept = Nothing , slope = Nothing , pressStatistic = Nothing } ) currentState <- H.get _ <- H.fromAff (makePlot (makePlotDataFromState currentState)) pure next

In this query we find a specific point by its id and remove it from the list of points in the state. Since we removed a point, we update our plot on the page with this new data point.

eval ( RandomPoints next) = do let numberOfPoints = 10 xs <- H.fromEff (replicateM numberOfPoints (randomRange ( - 10.0 ) 10.0 )) ys <- H.fromEff (replicateM numberOfPoints (randomRange ( - 10.0 ) 10.0 )) H.modify (\ state -> state { points = map (\ i -> { id : i , x : xs !! i , y : ys !! i } ) (range 0 (numberOfPoints - 1 )) , yIntercept = Nothing , slope = Nothing } ) currentState <- H.get _ <- H.fromAff (makePlot (makePlotDataFromState currentState)) pure next

As a convenience, we can produce a list of random points to rapidly fill the component state with points. As before, we update our plot on the page with the state’s new list of data points.

eval ( UpdatePointCoordinate id key value next) = do H.modify (\ state -> state { points = foldr (\ point acc -> ( if point . id == id then updatePointCoordinateFromString point key value else point) : acc ) [] state . points , yIntercept = Nothing , slope = Nothing , pressStatistic = Nothing } ) currentState <- H.get _ <- H.fromAff (makePlot (makePlotDataFromState currentState)) pure next

If a user updates a point, we find it in the list and update its data structure based on the input boxes for it in the UI. Once we update the point, we update its position on the plot.

eval ( RunLinearRegression next) = do H.modify (\ state -> state { yIntercept = Nothing , slope = Nothing , pressStatistic = Nothing , running = true } ) currentState <- H.get let linearRegressionData = dataForLinearRegressionFromState currentState let result = runLinearRegressionWithUnscaled linearRegressionData . maxIterations linearRegressionData . maxCost linearRegressionData . learningRate linearRegressionData . coefficients linearRegressionData . regressands linearRegressionData . designMatrix let pressStatistic = calculatePressStatistic linearRegressionData . regressands result linearRegressionData . designMatrix log' (gShow result) H.modify (\ state -> state { yIntercept = head result , slope = last result , pressStatistic = pressStatistic , running = false } ) if isJust (head result) && isJust (last result) then do currentState' <- H.get _ <- H.fromAff (makePlot (makePlotDataFromState currentState')) pure next else pure next pure next

Lastly we come to the most important action where we run our linear regression algorithm on the state’s current points. After that, we run our PRESS statistic algorithm. If everything was successful, we update the state with the results. The results being the only two coefficients (the y-intercept and slope) and the PRESS statistic. Note that we update running as false so the user can begin altering the state’s points again.

Make sure to break out your favorite text editor and copy the following into src/UI.purs .

{- (C) 2017 David Lettier lettier.com -} module UI where import Prelude import Data.Generic (gShow) import Data.Foldable (foldr) import Data.Array (drop, (:), length, head, last, range, filter) import Data.List.Lazy (replicateM, (!!)) import Data.Maybe ( Maybe (..), isNothing, fromMaybe, isJust) import Control.Monad.Eff.Random ( RANDOM , randomRange) import Control.Monad.Aff ( Aff ) import Control.Monad.Aff.Free (class Affable ) import Control.Monad.Aff.Console ( CONSOLE , log) import Halogen as H import Halogen.HTML.Core (className) import Halogen.HTML.Events.Indexed as HE import Halogen.HTML.Properties.Indexed as HP import Halogen.HTML.Indexed as HH import LinearRegression ( runLinearRegressionWithUnscaled , calculatePressStatistic ) import Plot ( PLOT , PlotData , makePlot) import Utils (maybeNumberToString, stringToMaybeNumber, arrayMinOrMax) type Effects eff = H.HalogenEffects ( plot :: PLOT , console :: CONSOLE , random :: RANDOM | eff) type Point = { id :: Int , x :: Maybe Number , y :: Maybe Number } data Query a = PushPoint a | PopPoint a | RemovePoint Int a | RandomPoints a | UpdatePointCoordinate Int String String a | RunLinearRegression a type State = { nextId :: Int , points :: Array Point , yIntercept :: Maybe Number , slope :: Maybe Number , pressStatistic :: Maybe Number , running :: Boolean } ui :: forall eff . H.Component State Query ( Aff ( Effects eff)) ui = H.component { render, eval } where render :: State -> H.ComponentHTML Query render state = HH.div_ [ HH.div_ [ HH.div_ [ HH.b_ [ HH.text "Status: " ] , HH.text if state . running then "Running" else if hasValidPoints state then "Press run" else "Add points" ] , HH.div_ [ HH.b_ [ HH.text "Y-Intercept: " ] , HH.text if isJust state . yIntercept && hasValidPoints state then "" <> (maybeNumberToString state . yIntercept) else "?" ] , HH.div_ [ HH.b_ [ HH.text "Slope: " ] , HH.text if isJust state . slope && hasValidPoints state then "" <> (maybeNumberToString state . slope) else "?" ] , HH.div_ [ HH.b_ [ HH.text "PRESS Statistic: " ] , HH.text if isJust state . pressStatistic && hasValidPoints state then "" <> (maybeNumberToString state . pressStatistic) else "?" ] ] , HH.div_ [ HH.button [ HE.onClick (HE.input_ PushPoint ) ] [ HH.text "Push Point" ] , HH.button [ HE.onClick (HE.input_ PopPoint ) ] [ HH.text "Pop Point" ] , HH.button [ HE.onClick (HE.input_ RandomPoints ) , HP.class_ (className "randomPointsButton" ) ] [ HH.text "Random Points" ] , HH.button [ HE.onClick (HE.input_ RunLinearRegression ) , HP.class_ (className "runButton" ) ] [ HH.text "Run" ] , HH.div_ ( map (\ point -> HH.li_ [ HH.input [ HP.value (maybeNumberToString point . x) , HP.placeholder "Input the X Coordinate" , HE.onValueChange (HE.input ( UpdatePointCoordinate point . id "x" )) , HP.disabled state . running ] , HH.input [ HP.value (maybeNumberToString point . y) , HP.placeholder "Input the Y Coordinate" , HE.onValueChange (HE.input ( UpdatePointCoordinate point . id "y" )) , HP.disabled state . running ] , HH.button [ HE.onClick (HE.input_ ( RemovePoint point . id)) , HP.class_ (className "removeButton" ) ] [ HH.text "X" ] ] ) state . points ) ] ] eval :: Query ~> H.ComponentDSL State Query ( Aff ( Effects eff)) eval ( PushPoint next) = do H.modify (\ state -> state { nextId = state . nextId + 1 , points = newPoint state . nextId : state . points , yIntercept = Nothing , slope = Nothing , pressStatistic = Nothing } ) pure next eval ( PopPoint next) = do H.modify (\ state -> state { points = drop 1 state . points , yIntercept = Nothing , slope = Nothing , pressStatistic = Nothing } ) currentState <- H.get _ <- H.fromAff (makePlot (makePlotDataFromState currentState)) pure next eval ( RemovePoint id next) = do H.modify (\ state -> state { points = foldr (\ point acc -> if point . id == id then acc else point : acc ) [] state . points , yIntercept = Nothing , slope = Nothing , pressStatistic = Nothing } ) currentState <- H.get _ <- H.fromAff (makePlot (makePlotDataFromState currentState)) pure next eval ( RandomPoints next) = do let numberOfPoints = 10 xs <- H.fromEff (replicateM numberOfPoints (randomRange ( - 10.0 ) 10.0 )) ys <- H.fromEff (replicateM numberOfPoints (randomRange ( - 10.0 ) 10.0 )) H.modify (\ state -> state { points = map (\ i -> { id : i , x : xs !! i , y : ys !! i } ) (range 0 (numberOfPoints - 1 )) , yIntercept = Nothing , slope = Nothing } ) currentState <- H.get _ <- H.fromAff (makePlot (makePlotDataFromState currentState)) pure next eval ( UpdatePointCoordinate id key value next) = do H.modify (\ state -> state { points = foldr (\ point acc -> ( if point . id == id then updatePointCoordinateFromString point key value else point) : acc ) [] state . points , yIntercept = Nothing , slope = Nothing , pressStatistic = Nothing } ) currentState <- H.get _ <- H.fromAff (makePlot (makePlotDataFromState currentState)) pure next eval ( RunLinearRegression next) = do H.modify (\ state -> state { yIntercept = Nothing , slope = Nothing , pressStatistic = Nothing , running = true } ) currentState <- H.get let linearRegressionData = dataForLinearRegressionFromState currentState let result = runLinearRegressionWithUnscaled linearRegressionData . maxIterations linearRegressionData . maxCost linearRegressionData . learningRate linearRegressionData . coefficients linearRegressionData . regressands linearRegressionData . designMatrix let pressStatistic = calculatePressStatistic linearRegressionData . regressands result linearRegressionData . designMatrix log' (gShow result) H.modify (\ state -> state { yIntercept = head result , slope = last result , pressStatistic = pressStatistic , running = false } ) if isJust (head result) && isJust (last result) then do currentState' <- H.get _ <- H.fromAff (makePlot (makePlotDataFromState currentState')) pure next else pure next pure next initialState :: State initialState = { nextId : 0 , points : [] , yIntercept : Nothing , slope : Nothing , pressStatistic : Nothing , running : false } newPoint :: Int -> Point newPoint id = { id : id, x : Nothing , y : Nothing } updatePointCoordinateFromString :: Point -> String -> String -> Point updatePointCoordinateFromString point "x" s = point { x = stringToMaybeNumber s } updatePointCoordinateFromString point "y" s = point { y = stringToMaybeNumber s } updatePointCoordinateFromString point _ _ = point dataForLinearRegressionFromState :: State -> { maxIterations :: Int , maxCost :: Number , learningRate :: Number , coefficients :: Array Number , regressands :: Array Number , designMatrix :: Array ( Array Number ) } dataForLinearRegressionFromState state = { maxIterations : 10000 , maxCost : 1e-11 , learningRate : 0.05 , coefficients : [ 1.0 , 1.0 ] , regressands : map (\ { y : y } -> fromMaybe 0.0 y ) validPoints , designMatrix : map (\ { x : x } -> [fromMaybe 0.0 x]) validPoints } where validPoints = filter pointIsValid state . points coords :: State -> Array ( Array Number ) coords { points : [] } = [] coords { points : points } = foldr (\ point acc -> if isNothing point . x || isNothing point . y then acc else [fromMaybe 0.0 point . x, fromMaybe 0.0 point . y] : acc ) [] points getXValuesFromState :: State -> Array ( Number ) getXValuesFromState { points : points } = foldr (\ point acc -> if isNothing point . x || isNothing point . y then acc else fromMaybe 0.0 point . x : acc ) [] points makePlotDataFromState :: State -> PlotData makePlotDataFromState state @ { points : points , yIntercept : ( Just yIntercept) , slope : ( Just slope) } = { scatter : pointsToScatterData points , line : if length xValues > 0 then [ { x : minX , y : slope * minX + yIntercept } , { x : maxX , y : slope * maxX + yIntercept } ] else [] } where xValues = getXValuesFromState state minX = arrayMinOrMax min 0.0 xValues maxX = arrayMinOrMax max 0.0 xValues makePlotDataFromState state @ { points : points , yIntercept : Nothing , slope : Nothing } = { scatter : pointsToScatterData points , line : [] } makePlotDataFromState state @ { points : points , yIntercept : ( Just _) , slope : Nothing } = { scatter : pointsToScatterData points , line : [] } makePlotDataFromState state @ { points : points , yIntercept : Nothing , slope : ( Just _) } = { scatter : pointsToScatterData points , line : [] } pointsToScatterData :: Array Point -> Array { x :: Number , y :: Number } pointsToScatterData [] = [] pointsToScatterData points = foldr (\ point acc -> if isNothing point . x || isNothing point . y then acc else { x : fromMaybe 0.0 point . x, y : fromMaybe 0.0 point . y } : acc ) [] points hasValidPoints :: State -> Boolean hasValidPoints { points : [] } = false hasValidPoints state = (length <<< coords) state > 0 pointIsValid :: Point -> Boolean pointIsValid { id : _, x : ( Just x), y : ( Just y) } = true pointIsValid _ = false log' :: forall a b . Affable ( console :: CONSOLE | b) a => String -> a Unit log' string = H.fromAff $ when debug $ log string debug :: Boolean debug = false

Plot.js

To integrate Chart.js, which powers our graph, we’ll need to use PureScript’s foreign function interface or FFI.

Go ahead and copy this into src/Plot.js .

/* global exports */ /* (C) 2017 David Lettier lettier.com */ "use strict" ; var Chart = require ( '../../bower_components/chart.js' ) ; exports . makePlot = ( function () { var ctx = document . getElementById ( "pointChart" ) ; var plot = new Chart ( ctx , { type : 'bar' , data : { datasets : [ { label : '' , type : 'scatter' , data : [] , pointBorderColor : '#87FFBF' , pointBackgroundColor : '#87FFBF' , pointRadius : 7 , showLine : false }, { label : '' , type : 'line' , data : [] , fill : false , borderColor : '#DC84F4' , borderWidth : 5 , pointRadius : 0 , showLine : true } ] } , options : { responsive : false , legend : { display : false } , scales : { xAxes : [ { type : 'linear' , position : 'bottom' , gridLines : { color : '#aaa' , zeroLineColor : '#aaa' } , ticks : { fontColor : '#aaa' } } ] , yAxes : [ { gridLines : { color : '#aaa' , zeroLineColor : '#aaa' } , ticks : { fontColor : '#aaa' } } ] } } } ) ; return function (data) { plot . data . datasets [ 0 ]. data = data . scatter || [] ; plot . data . datasets [ 1 ]. data = data . line || [] ; plot . update () ; }; } ()) ;

Here we only export one function makePlot . This function takes in the points and draws the graph on the page.

Plot.purs

Back over in PureScript, we connect our PureScript code to our JavaScript code in src/Plot.purs .

Go ahead and copy this into src/Plot.purs .

{- (C) 2017 David Lettier lettier.com -} module Plot where import Prelude import Control.Monad.Aff ( Aff ) foreign import data PLOT :: ! foreign import makePlot :: forall aff. PlotData -> Aff (plot :: PLOT | aff) Unit type PlotData = { scatter :: Array { x :: Number , y :: Number }, line :: Array { x :: Number , y :: Number } } emptyPlotData :: PlotData emptyPlotData = { scatter : [], line : [] }

Here the major detail is the foreign import makePlot line. You can see that it takes PlotData , performs the plot update (side-effect), and returns Unit (nothing of interest or void). Note that Aff is short for asynchronous extensible effects.

Matrix.purs

Go ahead now and copy the following over to src/Matrix.purs .

{- (C) 2017 David Lettier lettier.com -} module Matrix where import Prelude import Data.Ord (abs) import Data.Foldable (foldl, sum) import Data.Tuple ( Tuple (..), fst, snd) import Data.Array (head, length, null, range, slice, tail, updateAt, zip, (!!)) import Data.Either ( Either (..)) import Data.Maybe ( Maybe ( Just , Nothing ), fromMaybe, isNothing) type Vector = Array Number type Matrix = Array Vector invertMatrix :: Matrix -> Tuple ( Either String Matrix ) Matrix invertMatrix matrix | null matrix = Tuple ( Left "Empty matrix" ) (identityMatrix (length matrix)) | not $ isSquareMatrix matrix = Tuple ( Left "Not a square matrix" ) (identityMatrix (length matrix)) | containsZeroRowOrCol matrix = Tuple ( Left "Contains zero row or column" ) (identityMatrix (length matrix)) | otherwise = result where matrixSize = length matrix identity = identityMatrix matrixSize result :: Tuple ( Either String Matrix ) Matrix result = foldl folder ( Tuple ( Right matrix) identity) (matrixToRange matrix) folder :: Tuple ( Either String Matrix ) Matrix -> Int -> Tuple ( Either String Matrix ) Matrix folder t @ ( Tuple acc @ ( Left _) _) _ = t folder t @ ( Tuple acc @ ( Right a) b) diagonal = if containsZeroRowOrCol a || null a || isNothing maybeDivisorA || divisorA == 0.0 then ( Tuple ( Left "Cannot invert" ) b) else ( Tuple ( Right clearedA) clearedB) where swapped = swapWithValidRow ( Tuple a b) diagonal swappedA = fst swapped swappedB = snd swapped divisorRowA = fromMaybe [] (swappedA !! diagonal) maybeDivisorA = divisorRowA !! diagonal divisorA = fromMaybe defaultMatrixValue maybeDivisorA multiplierA = 1.0 / divisorA multipliedA = multiplyRow swappedA diagonal multiplierA multipliedB = multiplyRow swappedB diagonal multiplierA cleared = clearColumnExceptPivot ( Tuple multipliedA multipliedB) ( Tuple diagonal diagonal) clearedA = fst cleared clearedB = snd cleared multiplyMatrices :: Matrix -> Matrix -> Maybe Matrix multiplyMatrices _ [] = Nothing multiplyMatrices [] _ = Nothing multiplyMatrices aMat bMat = if aMatColNum /= bMatRowNum then Nothing else Just cMat where aMatColNum :: Int aMatColNum = foldl (\ acc r -> length r) 0 aMat bMatRowNum :: Int bMatRowNum = length bMat tBMat :: Matrix tBMat = transpose bMat cMat :: Matrix cMat = map (\ r -> map (\ c -> sum $ map (\ ( Tuple a b) -> a * b ) (zip r c) ) tBMat ) aMat defaultMatrix :: Int -> Matrix defaultMatrix size = buildMatrix size value where value _ _ = defaultMatrixValue identityMatrix :: Int -> Matrix identityMatrix size = buildMatrix size value where value row col | row == col = 1.0 | otherwise = 0.0 buildMatrix :: Int -> ( Int -> Int -> Number ) -> Matrix buildMatrix 0 _ = [] buildMatrix size f = map (\ row -> map (\ col -> f row col) matrixRange) matrixRange where matrixRange = range 0 (size - 1 ) matrixSizeValid :: Int -> Boolean matrixSizeValid size = size <= maxMatrixSize && size >= minMatrixSize maybeMatrixValue :: Matrix -> Int -> Int -> Maybe Number maybeMatrixValue matrix row col = case matrix !! row of Nothing -> Nothing Just x -> x !! col matrixRow :: Matrix -> Int -> Vector matrixRow matrix i = fromMaybe [] (matrix !! i) firstValidRow :: Matrix -> Int -> Int -> Maybe Int firstValidRow [] _ _ = Nothing firstValidRow matrix atRowOrBelow inCol = (foldl folder { index : Nothing , value : Nothing } tuples) . index where matrixLength = length matrix lastRow = matrixLength - 1 rowRange = range atRowOrBelow lastRow column = slice atRowOrBelow matrixLength (matrixRow (transpose matrix) inCol) tuples = zip rowRange column folder :: { index :: Maybe Int , value :: Maybe Number } -> Tuple Int Number -> { index :: Maybe Int , value :: Maybe Number } folder { index : Nothing , value : Nothing } ( Tuple a b) = { index : ( Just a), value : ( Just b) } folder r @ { index : ( Just x), value : ( Just y)} ( Tuple a b) = if abs y == 1.0 then r else if abs b >= abs y || abs b == 1.0 then { index : ( Just a), value : ( Just b) } else r folder _ _ = { index : Nothing , value : Nothing } swapWithValidRow :: Tuple Matrix Matrix -> Int -> Tuple Matrix Matrix swapWithValidRow t @ ( Tuple a b) diagonal = Tuple a' b' where row = firstValidRow a diagonal diagonal yesSwap = case row of Nothing -> false Just r -> r /= diagonal swapIfYes m = if yesSwap then swapRows m diagonal (fromMaybe diagonal row) else m a' = swapIfYes a b' = swapIfYes b transpose :: Matrix -> Matrix transpose [] = [] transpose matrix = transpose' matrix [] where transpose' :: Matrix -> Matrix -> Matrix transpose' old new = if arrayHasNothing maybeHeads then new else transpose' tails (new <> [heads]) where maybeHeads = map head old maybeTails = map tail old heads = map (fromMaybe 0.0 ) maybeHeads tails = map (fromMaybe []) maybeTails arrayHasNothing :: forall a . Array ( Maybe a) -> Boolean arrayHasNothing array = case array !! 0 of Nothing -> true ( Just x) -> isNothing x swapRows :: Matrix -> Int -> Int -> Matrix swapRows [] _ _ = [] swapRows matrix a b = fromMaybe [] (updateRow b rowA (updateRow a rowB ( Just matrix))) where rowA :: Maybe Vector rowA = matrix !! a rowB :: Maybe Vector rowB = matrix !! b updateRow :: Int -> Maybe Vector -> Maybe Matrix -> Maybe Matrix updateRow i ( Just row) ( Just matrix') = updateAt i row matrix' updateRow _ _ _ = Nothing multiplyRow :: Matrix -> Int -> Number -> Matrix multiplyRow [] _ _ = [] multiplyRow matrix row multiplier = map (\ row' -> if row == row' then map (\ value -> value * multiplier) (matrixRow matrix row') else matrixRow matrix row' ) (matrixToRange matrix) clearValue :: Tuple Matrix Matrix -> Tuple Int Int -> Tuple Int Int -> Tuple Matrix Matrix clearValue ( Tuple aMat bMat) pivot @ ( Tuple pRow pCol) target @ ( Tuple tRow tCol) = Tuple aMat' bMat' where pivotRowA :: Vector pivotRowA = matrixRow aMat pRow targetRowA :: Vector targetRowA = matrixRow aMat tRow pivotValueA :: Number pivotValueA = rowValue pivotRowA pCol targetValueA :: Number targetValueA = rowValue targetRowA tCol aMat' :: Matrix aMat' = multiplyAndSubtractRows aMat tRow pRow pivotValueA targetValueA bMat' :: Matrix bMat' = multiplyAndSubtractRows bMat tRow pRow pivotValueA targetValueA rowValue :: Vector -> Int -> Number rowValue row col = fromMaybe defaultMatrixValue (row !! col) multiplyAndSubtractRows :: Matrix -> Int -> Int -> Number -> Number -> Matrix multiplyAndSubtractRows matrix leftSideRow rightSideRow leftMultiplier rightMultiplier = map (\ row -> if row == leftSideRow then leftRowValues' else matrixRow matrix row ) (matrixToRange matrix) where leftRowValues = matrixRow matrix leftSideRow rightRowValues = matrixRow matrix rightSideRow leftRowValues' = map (\ ( Tuple l r) -> leftMultiplier * l - rightMultiplier * r ) (zip leftRowValues rightRowValues) clearColumnExceptPivot :: Tuple Matrix Matrix -> Tuple Int Int -> Tuple Matrix Matrix clearColumnExceptPivot t @ ( Tuple aMat bMat) pivot @ ( Tuple pRow pCol) = t' where t' :: Tuple Matrix Matrix t' = foldl folder t (matrixToRange aMat) folder :: Tuple Matrix Matrix -> Int -> Tuple Matrix Matrix folder acc @ ( Tuple aMat' bMat') row = if row == pRow then acc else clearValue acc pivot ( Tuple row pCol) containsZeroRowOrCol :: Matrix -> Boolean containsZeroRowOrCol matrix = containsZeroRow matrix || containsZeroCol matrix containsZeroCol :: Matrix -> Boolean containsZeroCol = containsZeroRow <<< transpose containsZeroRow :: Matrix -> Boolean containsZeroRow = foldl (\ acc row -> acc || foldl (\ acc' value -> acc' && value == 0.0 ) true row) false isSquareMatrix :: Matrix -> Boolean isSquareMatrix matrix = foldl (\ acc row -> acc && length row == matrixLength) true matrix where matrixLength = length matrix matrixToRange :: Matrix -> Array Int matrixToRange [] = [] matrixToRange matrix = (range 0 (length matrix - 1 )) maxMatrixSize :: Int maxMatrixSize = 10 minMatrixSize :: Int minMatrixSize = 2 defaultMatrixValue :: Number defaultMatrixValue = 0.0

Note that we use this module to compute the Hat Matrix used in the PRESS statistic calculation.

Utils.purs

Go ahead and copy the following into src/Utils.purs .

{- (C) 2017 David Lettier lettier.com -} module Utils where import Prelude import Global (readFloat, isNaN, isFinite) import Data.Generic (gShow) import Data.Array (length, head) import Data.Maybe ( Maybe (..), fromMaybe, isNothing) import Data.Foldable (foldl) import Data.Int (toNumber) firstOrSecondValues :: ( Array Number -> Maybe Number ) -> Array ( Array Number ) -> Array Number firstOrSecondValues f [] = [] firstOrSecondValues f es = foldl innerFold [] (map f es) where innerFold acc ( Just e) = acc <> [e] innerFold acc Nothing = acc first :: Array Number -> Maybe Number first [x, y] = Just x first _ = Nothing second :: Array Number -> Maybe Number second [x, y] = Just y second _ = Nothing lengthNum :: forall a . Array a -> Number lengthNum [] = 0.0 lengthNum xs = (toNumber <<< length) xs arrayNumberHandler :: ( Number -> Number -> Number ) -> Array Number -> Number arrayNumberHandler f [x, y] = f x y arrayNumberHandler _ _ = 0.0 stringToMaybeNumber :: String -> Maybe Number stringToMaybeNumber s = maybeNumber where float :: Number float = readFloat s maybeNumber :: Maybe Number maybeNumber = if isNaN float then Nothing else Just float maybeNumberToString :: Maybe Number -> String maybeNumberToString Nothing = "" maybeNumberToString ( Just a) = gShow a isInfinity :: Number -> Boolean isInfinity = not <<< isFinite arrayMinOrMax :: forall a . ( Ord a) => (a -> a -> a) -> a -> Array a -> a arrayMinOrMax _ default [] = default arrayMinOrMax f _ [h] = h arrayMinOrMax f default values = foldl (\ acc value -> f acc value) (fromMaybe default (head values)) values numberInvalid :: Number -> Boolean numberInvalid n = isNaN n || isInfinity n containsNothing :: forall a . Array ( Maybe a) -> Boolean containsNothing xs = foldl (\ acc x -> acc || isNothing x) false xs

These are just a loose collection of functions that provide convenience.

Main.purs

At long last we reach the final PureScript file. This file is the starting point for the logic of the calculator.

Make to sure to copy it over to src/Main.purs .

{- (C) 2017 David Lettier lettier.com -} module Main where import Prelude import Data.Maybe (fromMaybe) import Control.Monad.Eff ( Eff ) import Halogen as H import Halogen.Util (awaitBody, runHalogenAff, selectElement) import Plot (emptyPlotData, makePlot) import UI ( Effects , ui, initialState) main :: Eff ( Effects ()) Unit main = runHalogenAff do body <- awaitBody uiContainer <- selectElement "#uiContainer" H.runUI ui initialState (fromMaybe body uiContainer) makePlot emptyPlotData

The important detail here is the main function. This is the start of the application (like C/C++’s main).

main accepts and returns nothing. Notice how we whitelist the Effects we defined in src/UI.purs . In procedural fashion, we get the page body and our uiContainer , fire up our UI component, and initialize the plot on the page.

Index.scss

We will need some style so copy this to static/scss/index.scss .

/* (C) 2017 David Lettier lettier.com */ $laptopLWidth : 1440px; $laptopWidth : 1024px; $tabletWidth : 768px; $color1 : rgba(51, 55, 69, 1); $color2 : rgba(245, 47, 87, 1); $color3 : rgba(76, 75, 99, 1); $color4 : rgba(175, 190, 209, 1); $color5 : rgba(237, 237, 244, 1); @mixin laptopL { @media ( max-width: #{$laptopLWidth } ) { @content; } } @mixin laptop { @media ( max-width: #{$laptopWidth } ) { @content; } } @mixin tablet { @media ( max-width: #{$tabletWidth } ) { @content; } } body { font-family: sans-serif ; margin-top: 25px ; margin-left: 25px ; background-color: #343438 ; color: white ; height: 100% ; line-height: 2 ; } input { margin-right: 4px ; outline: none ; border: none ; border-radius: 1px ; padding: 2px ; font-size: 20px ; width: 210px ; background-color: whitesmoke ; color: #222 ; } li { list-style: none ; margin-top: 3px ; } button { background-color: #6f6cb9 ; color: white ; font-weight: bold ; margin: 2px ; border-style: none ; border-radius: 1px ; outline: none ; font-size: 20px ; cursor: pointer ; -webkit-box-shadow: 0px 3px 3px 0px #190d25 ; -moz-box-shadow: 0px 3px 3px 0px #190d25 ; box-shadow: 0px 3px 3px 0px #190d25 ; } #pageContainer { height: 100% ; } #chartContainer { margin-right: 10px ; } #instructionsContainer { margin-bottom: 20px ; } #uiContainer { margin-left: 10px ; } #logoContainer { position: absolute ; bottom: 0px ; right: 0px ; clear: both ; } #logo { width: 250px ; height: 250px ; @include laptopL { width: 200px ; height: 200px ; } @include laptop { width: 150px ; height: 150px ; } @include tablet { width: 100px ; height: 100px ; } } .runButton { background-color: #00c17c ; } .removeButton { background-color: #f52f57 ; } .randomPointsButton { background-color: #4d98c5 ; } .defaultCursor { cursor: default ; } .row { display: flex ; flex-direction: row ; }

Index.html

This is the skeleton of our application. Once our component runs, the markup will change.

Make sure to copy this to static/html/index.html .

<!DOCTYPE html > <!--(C) 2017 David Lettier--> <html> <head> <title> Simple Linear Regression | Lettier.com </title> <link rel= "stylesheet" href= "index.css" > </head> <body> <div id= "pageContainer" > <div id= "instructionsContainer" > <h1> Simple Linear Regression </h1> <span> Add points by pressing <button class= "defaultCursor" > Push Point </button> . </span> <br> <span> Remove points by pressing <button class= "defaultCursor" > Pop Point </button> or <button class= "removeButton defaultCursor" > X </button> . </span> <br> <span> Find the y-intercept and the slope by pressing <button class= "runButton defaultCursor" > Run </button> . </span> </div> <div class= "row" > <div id= "chartContainer" > <canvas id= "pointChart" width= "400" height= "400" ></canvas> </div> <div id= "uiContainer" > </div> </div> </div> <script src= "app.js" ></script> </body> </html>

Build

Now that all of the code is in place, we can build the project and try out the calculator.

Make sure to run the following commands.

cd simple-linear-regression yarn run buildDist xdg-open dist/index.html # use `open` if on macOS

The Amazing Beard

With the calculator running, let’s test it out with a made-up scenario.

Imagine that you are a tax collector. In your country beards are taxed based on their length. One day your boss tells you that The Amazing Beard—a circus performer with a rather large beard—has not paid their taxes for the last 10 years. They ask you to go and collect the back taxes as well as collect the estimated tax for the current year.

You travel to the meet The Amazing Beard and ask to see their beard records going back 10 years.

| Year | Beard Length in feet | |======|======================| | 5 | 10.111222679314832 | | 6 | 12.669423969872254 | | 7 | 14.380089343033626 | | 8 | ? | | 9 | 17.223040278505103 | | 10 | 19.85133414591424 | | 11 | 22.281510710653695 | | 12 | 24.847244202509092 | | 13 | ? | | 14 | 27.001121387722204 | |------|----------------------| | 15 | ? | Current Year

As you search over the data, you notice two of the years are missing. Also, you are not sure what the beard length will be at the end of the current year.

The truth is, the beard actually grows by f(year) = 2 * year + 0 + e where e is random noise in the range [-1, 1] . But you don’t know this—you are only given the data points they recorded. Fortunately, you have the calculator with you.

You can see that the fitted model is not perfect but unbeknownst to you, the truth contains noise or irreducible error. Anyway, with your fitted model of f(year) = 1.9208460314074158 * year + 0.7777975490014306 , you can now fill out previous years eight and 13 as well as the current year 15. Note that while year 15 is an extrapolation, the PRESS statistic is fairly low.

| Year | Beard Length in feet | |======|======================| | 5 | 10.111222679314832 | | 6 | 12.669423969872254 | | 7 | 14.380089343033626 | | 8 | 16.144565800260757 | | 9 | 17.223040278505103 | | 10 | 19.85133414591424 | | 11 | 22.281510710653695 | | 12 | 24.847244202509092 | | 13 | 25.748795957297837 | | 14 | 27.001121387722204 | |------|----------------------| | 15 | 29.59048802011267 | Current Year

With the missing data filled in, you collect the owed tax and return to your boss successful.

Recap

From the ground up, we implemented linear regression, gradient descent, and the PRESS statistic. Along the way, we took a look at how each step was achieved in the interactive calculator built with PureScript, PureScript-Halogen, and Chart.js. To test out the calculator, we went through a made-up scenario where we had to both interpolate and extrapolate missing data points.

Now that you’ve seen how linear regression via gradient descent works, take a look at some other machine learning algorithms such as k-Nearest Neighbors and K-Means.