Marketing data has a distinctive shape that sets it apart from other data types. Its properties are unusual, especially when we look at the target action of a marketing campaign, whose results are highly imbalanced. Things become even more complex considering the goal of the Ubex project: to deliver only the advertisements customers are actually interested in.

Our previous article covered the first steps of our Data Scientists in building a neural network model. It was trained on the “Bank Marketing Dataset” from the “Center for Machine Learning and Intelligent Systems” (https://archive.ics.uci.edu/ml/datasets/bank+marketing) using a Deep Neural Network. As shown in that article, the final result had an accuracy of about 91.6%. But is that result good enough, especially in terms of the marketing goals of the project? And can we actually consider overall accuracy a measure of the model's prediction capabilities? Let us find out today.

As we already mentioned, marketing data is highly imbalanced: for every 10 hits we get only 1 target action. For example, if we show an advertisement banner, visitors will click it only once per 10 views. So, if our prediction model simply predicts the negative result for all cases, it will still have an overall accuracy of about 90%. This means we cannot consider overall accuracy a good measure of the model's prediction capabilities. Moreover, taking into consideration the goal of the Ubex project, it is better to correctly predict even a fraction of the true positive results, so that we show only the ads the visitor is definitely interested in. In other words, it is more important not to show spam ads (ads the visitor will not be interested in) than to show every possibly interesting ad.
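The point above can be sketched in a few lines of Python (with hypothetical 10:1 data, not the Ubex dataset): a model that always predicts "no" scores about 90% accuracy while catching zero target actions.

```python
# Hypothetical 10:1 imbalanced data: 1 click per 10 views.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0] * 100
predictions = [0] * len(labels)  # always predict "no click"

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
recall = sum(p == 1 and y == 1
             for p, y in zip(predictions, labels)) / labels.count(1)

print(accuracy)  # 0.9 -- looks good...
print(recall)    # 0.0 -- ...but not a single click is predicted
```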

Let us describe this technically. To measure binary classification quality, engineers often use the AUC (more precisely, AUROC) metric, the Area Under the ROC Curve. To understand that metric, the concept of the confusion matrix must be understood first. In binary prediction (yes/no), there are 4 types of possible outcomes:

True Positive (TP) — we correctly predicted that the result class is positive (yes). For example, we predicted that the visitor would interact with an ad, and he actually did.

False Positive (FP) — we incorrectly predicted that the result class is positive (yes). For example, we predicted that the visitor would interact with an ad, but he did not.

True Negative (TN) — we correctly predicted that the result class is negative (no). For example, we predicted that the visitor would not interact with an ad, and he indeed did not.

False Negative (FN) — we incorrectly predicted that the result class is negative (no). For example, we predicted that the visitor would not interact with an ad, but he actually did.

To get the confusion matrix, we go over all the predictions made by the model and count how many times each of those 4 types of outcomes occurs:
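As a minimal sketch (with illustrative data, not the Ubex model), the counting described above looks like this:

```python
# Count each of the 4 outcome types over all predictions.
def confusion_matrix(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, tn, fn

# 1 = visitor clicked the ad, 0 = did not
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
print(confusion_matrix(y_true, y_pred))  # (2, 1, 6, 1)
```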

It is often more convenient to have a single metric rather than several. But first, two more metrics must be introduced:

True Positive Rate (TPR), also known as sensitivity, hit rate or recall, is defined as TP / (TP + FN). Intuitively, this metric corresponds to the proportion of positive data points that are correctly classified as positive, with respect to all positive data points.

False Positive Rate (FPR), also known as fall-out, is defined as FP / (FP + TN). Intuitively, this metric corresponds to the proportion of negative data points that are mistakenly classified as positive, with respect to all negative data points.
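Using hypothetical confusion-matrix counts (TP = 2, FP = 1, TN = 6, FN = 1), both rates follow directly from the definitions above:

```python
# Hypothetical confusion-matrix cells.
tp, fp, tn, fn = 2, 1, 6, 1

tpr = tp / (tp + fn)  # recall: share of actual clicks we caught
fpr = fp / (fp + tn)  # fall-out: share of non-clicks we flagged anyway

print(round(tpr, 3))  # 0.667
print(round(fpr, 3))  # 0.143
```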

To get one single metric, we first compute the two metrics above at many different thresholds of the classifier's output (e.g. the logistic regression probability). Plotting the resulting pairs on a single graph gives us a line, the ROC curve, describing how one metric corresponds to the other:

Taking the area under that curve gives us the single AUC metric. An absolutely random classifier gives an AUC of 0.5, the straight line from (0, 0) to (1, 1). A perfect predictor with 100% accuracy gives an AUC equal to 1. The practical meaning of the curve is that increasing the rate of correctly predicted positive results comes at the cost of a corresponding increase in the rate of incorrectly predicted positive results. To get a good model, it is important to find a balance.
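The whole procedure — sweeping thresholds, computing (FPR, TPR) pairs, and integrating the curve — can be sketched in plain Python (this is a didactic implementation, not the Ubex code; real projects would use a library routine such as scikit-learn's `roc_auc_score`):

```python
# AUC via a threshold sweep over predicted scores plus trapezoidal
# integration of the resulting ROC points.
def roc_auc(y_true, scores):
    points = []
    for threshold in sorted(set(scores)) + [float("inf")]:
        preds = [1 if s >= threshold else 0 for s in scores]
        tp = sum(t == 1 and p == 1 for t, p in zip(y_true, preds))
        fp = sum(t == 0 and p == 1 for t, p in zip(y_true, preds))
        tn = sum(t == 0 and p == 0 for t, p in zip(y_true, preds))
        fn = sum(t == 1 and p == 0 for t, p in zip(y_true, preds))
        points.append((fp / (fp + tn), tp / (tp + fn)))
    points.sort()
    # trapezoidal rule over the (FPR, TPR) points
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# a classifier that separates the classes perfectly has AUC = 1.0
print(roc_auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))  # 1.0
```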

Taking this single metric into consideration, the AUC of the previously obtained Deep Neural Network results was around 0.7. Despite a seemingly good accuracy, the actual prediction capabilities of the model were not that high. So, our Data Science team went deeper into research, trying different models.

First, the Ubex Data Scientists tried Factorization Machines, Field-Aware Factorization Machines, xDeepFM and some other models. The best AUC we could get was around 0.76. Some of those models performed slightly better than the Deep Neural Network model, but most of them still showed poor results in terms of the ROC curve.

Then the engineers looked at Gradient Boosting models. At the very first attempt, Gradient Boosting showed better results, giving more than 0.8 AUC. The input data was still imbalanced, so preparing the input properly could give even better results. To balance the input, our team applied the following lambda function:

sample_weight_proc = lambda x_, y_: class_weights[y_].values

, where class_weights is a pandas Series holding the appropriate weight for each class.

Since the marketing data has 1 positive result per 10 hits, it is sensible to choose the following weights:

0 (no) → 1.0

1 (yes) → 9.0
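Putting the weights and the lambda together, a minimal sketch (assuming pandas and NumPy are available; the labels here are illustrative) looks like this:

```python
import numpy as np
import pandas as pd

# class_weights maps each class label to its weight; the lambda turns an
# array of sample labels into per-sample weights (x_ is accepted only to
# match the expected callback signature and is unused).
class_weights = pd.Series({0: 1.0, 1: 9.0})
sample_weight_proc = lambda x_, y_: class_weights[y_].values

y = np.array([0, 1, 0, 0, 1])
print(sample_weight_proc(None, y))  # [1. 9. 1. 1. 9.]
```

The resulting array can be passed as the `sample_weight` argument of the model's fit routine, so that each rare positive example counts 9 times as much as a negative one.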

Finally, using the LightGBM library with the above weight lambda function, the Ubex AI Team obtained a model with an AUC of more than 0.9. This is an impressive result, especially compared to the previous model based on the Deep Neural Network.

With such a good model, it is just a matter of time before we get the same results on real marketing data. At the moment, our AI Team is working closely with our partner, who provides Ubex with real marketing data. Ubex already has good preliminary results, but it is a little too early to show them publicly.

Stay tuned to our newsletter to get the results of cutting-edge technologies based on Artificial Intelligence. Feel free to browse our GitHub, where you can find the source code of the models discussed here.

If you would like to know more about the Ubex project, visit our website.

Written by Ubex CTO Dan Gartman

About Ubex

Ubex is the solution — a global, decentralized advertising exchange based on the fusion of Neural Networks, AI and blockchain, operated by smart contracts. The mission of Ubex is to create a global advertising ecosystem with a high level of trust and maximum efficiency. With Ubex, the process of acquiring advertising slots and selecting the most effective websites for placement is simplified, and transaction risks are minimized.

Join our Telegram group or other social media to stay updated.

Website • Telegram • Facebook • Twitter • LinkedIn • BitcoinTalk • Reddit • Instagram • YouTube • GitHub