Problem

No one model works best for all possible situations (the No Free Lunch theorem, D. H. Wolpert).

The Solution

Unit testable, dependency injectable, and backtestable models in under 200 lines of code.

Dependency Injection

Why? Dependency injection lets us hot swap the classifier inside a model, and the model inside our machine learning pipeline, without touching the code around it.

Let's take a look at an end result example:

    func main() {
        // SpamHamModel with a Naive Bayes classifier plugged in
        spamNB := SpamHamModel{Classifier: &NBClassifier{}}

        // Exact same SpamHamModel with an SVM classifier plugged in
        spamSVM := SpamHamModel{Classifier: &SVMClassifier{}}

        // Exact same SpamHamModel with a neural network classifier plugged in
        spamNN := SpamHamModel{Classifier: &NNClassifier{}}

        // Reference the models so the compiler does not reject unused variables.
        _, _, _ = spamNB, spamSVM, spamNN
    }

Thank you for your time thus far... Let's expand on this.

Data structures and interfaces

    // Step 1 - Define the structure of the input data
    // You got mail!
    type Email struct {
        Author string
        Body   string
        Flag   string // Spam/Ham
    }

    // Step 2 - Define an ML classifier contract
    // Binary classifier interface - examples: SVM, NN, NB
    type Classifier interface {
        Learn(emails []Email)
        Predict(email Email) string
    }

    // Step 3 - Create a model with a "plug & play" Classifier field
    // You got mail! - Is it Spam or Ham? (Model example)
    type SpamHamModel struct {
        Classifier Classifier
    }

    func (model *SpamHamModel) Learn(emails []Email) {
        model.Classifier.Learn(emails)
    }

    func (model *SpamHamModel) Predict(email Email) string {
        return model.Classifier.Predict(email)
    }
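To see the contract at work without any external library, here is a minimal sketch of a second classifier that satisfies the same interface. `KeywordClassifier` is a hypothetical toy written for illustration (it is not part of the post's code): it flags an email as spam if it contains any word that appeared only in spam during training. The `Email`, `Classifier`, and `SpamHamModel` definitions are repeated so the file runs on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// Email, Classifier, and SpamHamModel mirror the definitions from Steps 1-3.
type Email struct {
	Author string
	Body   string
	Flag   string // "Spam" or "Ham"
}

type Classifier interface {
	Learn(emails []Email)
	Predict(email Email) string
}

type SpamHamModel struct {
	Classifier Classifier
}

func (model *SpamHamModel) Learn(emails []Email) { model.Classifier.Learn(emails) }

func (model *SpamHamModel) Predict(email Email) string { return model.Classifier.Predict(email) }

// KeywordClassifier is a hypothetical toy classifier (an assumption for this
// sketch). It remembers words that occur in spam but never in ham.
type KeywordClassifier struct {
	spamWords map[string]bool
}

func (c *KeywordClassifier) Learn(emails []Email) {
	c.spamWords = map[string]bool{}
	hamWords := map[string]bool{}
	for _, e := range emails {
		for _, w := range strings.Fields(strings.ToLower(e.Body)) {
			if e.Flag == "Spam" {
				c.spamWords[w] = true
			} else {
				hamWords[w] = true
			}
		}
	}
	// Words that also show up in ham are not useful spam signals.
	for w := range hamWords {
		delete(c.spamWords, w)
	}
}

func (c *KeywordClassifier) Predict(email Email) string {
	for _, w := range strings.Fields(strings.ToLower(email.Body)) {
		if c.spamWords[w] {
			return "Spam"
		}
	}
	return "Ham"
}

func main() {
	// The model code is unchanged; only the injected classifier differs.
	model := SpamHamModel{Classifier: &KeywordClassifier{}}
	model.Learn([]Email{
		{Body: "opportunity to earn extra money", Flag: "Spam"},
		{Body: "please take a look at this report", Flag: "Ham"},
	})
	fmt.Println(model.Predict(Email{Body: "earn money fast"})) // Spam
	fmt.Println(model.Predict(Email{Body: "lunch at noon?"}))  // Ham
}
```

Nothing in `SpamHamModel` had to change to accept the new classifier, which is the point of keeping the field typed as the interface.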

Naive Bayes Classifier

    // Step 4 - Implement Classifier(s), per the contract's terms.
    // You got mail! - Is it Spam or Ham? (Model's Brain/Classifier)
    // Needs the standard "sort" and "strings" packages, plus the bayesian and models packages.
    type NBClassifier struct {
        classifier *bayesian.Classifier
        output     []bayesian.Class
    }

    func (c *NBClassifier) Learn(emails []models.Email) {
        c.output = distinctFlags(emails)
        c.classifier = bayesian.NewClassifierTfIdf(c.output...)
        for i := 0; i < len(emails); i++ {
            c.classifier.Learn(strings.Split(emails[i].Body, " "), bayesian.Class(emails[i].Flag))
        }
        c.classifier.ConvertTermsFreqToTfIdf()
    }

    func (c *NBClassifier) Predict(email models.Email) string {
        scores, _, _ := c.classifier.LogScores(strings.Split(email.Body, " "))

        // Rank the classes by score, highest first.
        results := models.Results{}
        for i := 0; i < len(scores); i++ {
            results = append(results, models.Result{ID: i, Score: scores[i]})
        }
        sort.Sort(sort.Reverse(results))

        flags := []string{}
        for i := 0; i < len(results); i++ {
            flags = append(flags, string(c.output[results[i].ID]))
        }
        return flags[0]
    }

    // distinctFlags returns each Flag value once, in order of first appearance.
    func distinctFlags(emails []models.Email) []bayesian.Class {
        result := []bayesian.Class{}
        j := 0
        for i := 0; i < len(emails); i++ {
            for j = 0; j < len(result); j++ {
                if emails[i].Flag == string(result[j]) {
                    break
                }
            }
            if j == len(result) {
                result = append(result, bayesian.Class(emails[i].Flag))
            }
        }
        return result
    }

Unit Testing

Putting crucial pieces of production code behind interfaces like this makes it easier to follow development approaches such as TDD, because each classifier can be tested on its own through the same small contract.

Example:

    func CreateTrainingEmails() []models.Email {
        return []models.Email{
            {Body: "opportunity to earn extra money", Flag: "Spam"},
            {Body: "druggists blame classy gentry Aladdin", Flag: "Spam"},
            {Body: "please take a look at this report", Flag: "Ham"},
            {Body: "lunch at noon?", Flag: "Ham"},
        }
    }

    func CreateValidationEmails() []models.Email {
        return []models.Email{
            {Body: "opportunity to earn extra money", Flag: "Spam"},
            {Body: "druggists blame classy gentry Aladdin", Flag: "Spam"},
            {Body: "please take a look at this report", Flag: "Ham"},
            {Body: "lunch at noon?", Flag: "Ham"},
        }
    }

    func TestLearn(t *testing.T) {
        nbModel := models.SpamHamModel{Classifier: &NBClassifier{}}
        trainingSet := CreateTrainingEmails()
        validationSet := CreateValidationEmails()
        nbModel.Learn(trainingSet)
        for i := 0; i < len(validationSet); i++ {
            input := validationSet[i].Body
            expected := validationSet[i].Flag
            actual := nbModel.Predict(validationSet[i])
            Assert(t, expected, actual, input)
        }
    }

    func Assert(t *testing.T, expected string, actual string, input string) {
        if actual != expected {
            t.Error(
                "\nFOR: ", input,
                "\nEXPECTED: ", expected,
                "\nACTUAL: ", actual,
            )
        }
    }
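The same interface also makes it possible to test the model's wiring without any real classifier. The sketch below is an illustration under assumptions: `StubClassifier` and `checkDelegation` are hypothetical names that do not appear in the post's code, and the types are repeated so the file runs standalone. In a real test file the body of `checkDelegation` would sit inside a `TestXxx(t *testing.T)` function and report failures through `t.Error`.

```go
package main

import "fmt"

type Email struct {
	Author string
	Body   string
	Flag   string
}

type Classifier interface {
	Learn(emails []Email)
	Predict(email Email) string
}

type SpamHamModel struct {
	Classifier Classifier
}

func (model *SpamHamModel) Learn(emails []Email) { model.Classifier.Learn(emails) }

func (model *SpamHamModel) Predict(email Email) string { return model.Classifier.Predict(email) }

// StubClassifier is a hypothetical test double: it records what Learn
// received and returns a canned answer from Predict.
type StubClassifier struct {
	Learned []Email
	Answer  string
}

func (s *StubClassifier) Learn(emails []Email) { s.Learned = append(s.Learned, emails...) }

func (s *StubClassifier) Predict(email Email) string { return s.Answer }

// checkDelegation returns an error if SpamHamModel does not forward
// Learn and Predict to the injected classifier.
func checkDelegation() error {
	stub := &StubClassifier{Answer: "Ham"}
	model := SpamHamModel{Classifier: stub}

	model.Learn([]Email{{Body: "lunch at noon?", Flag: "Ham"}})
	if len(stub.Learned) != 1 {
		return fmt.Errorf("Learn forwarded %d emails, want 1", len(stub.Learned))
	}
	if got := model.Predict(Email{Body: "anything"}); got != "Ham" {
		return fmt.Errorf("Predict returned %q, want %q", got, "Ham")
	}
	return nil
}

func main() {
	if err := checkDelegation(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

A stub like this keeps model tests fast and deterministic, while the classifier-specific tests above cover the actual learning behavior.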

Backtesting (stay tuned for part two...)
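Until then, one plausible reading of backtesting in this setup is to train each pluggable classifier on earlier emails and score it on emails it has not seen. The sketch below assumes that reading and may differ from what part two presents. `Accuracy` and `MajorityClassifier` are hypothetical names introduced for illustration, and the example emails are made up.

```go
package main

import "fmt"

type Email struct {
	Author string
	Body   string
	Flag   string
}

type Classifier interface {
	Learn(emails []Email)
	Predict(email Email) string
}

// MajorityClassifier is a hypothetical baseline: it always predicts the
// flag that was most common in the training data.
type MajorityClassifier struct {
	majority string
}

func (c *MajorityClassifier) Learn(emails []Email) {
	counts := map[string]int{}
	best := 0
	for _, e := range emails {
		counts[e.Flag]++
		if counts[e.Flag] > best {
			best = counts[e.Flag]
			c.majority = e.Flag
		}
	}
}

func (c *MajorityClassifier) Predict(email Email) string { return c.majority }

// Accuracy trains c on train and returns the fraction of holdout emails
// whose predicted flag matches the recorded flag.
func Accuracy(c Classifier, train, holdout []Email) float64 {
	if len(holdout) == 0 {
		return 0
	}
	c.Learn(train)
	correct := 0
	for _, e := range holdout {
		if c.Predict(e) == e.Flag {
			correct++
		}
	}
	return float64(correct) / float64(len(holdout))
}

func main() {
	train := []Email{
		{Body: "opportunity to earn extra money", Flag: "Spam"},
		{Body: "please take a look at this report", Flag: "Ham"},
		{Body: "lunch at noon?", Flag: "Ham"},
	}
	holdout := []Email{
		{Body: "earn money from home", Flag: "Spam"},
		{Body: "report is attached", Flag: "Ham"},
	}
	// Any Classifier can be passed here, so candidates can be compared
	// on identical data.
	fmt.Printf("%.2f\n", Accuracy(&MajorityClassifier{}, train, holdout)) // 0.50
}
```

Because `Accuracy` accepts the interface rather than a concrete type, the same function can rank every classifier plugged into the model.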

The code is also available on GitHub if you want to test it locally: https://github.com/heupr/resources/tree/master/plugnplay