This post is an evolving set of ideas for quantifying trust in decision-making systems and processes. For me, an empirical definition of trust is the relative distance of a decision from the ground truth. Quantifying and optimizing trust in decision-making systems is therefore highly important: it pushes the systems involved in decision making to perform consistently and closer to the future reality.

The first step to optimizing such systems, human or computational, is to develop an algorithmic approach to quantifying and optimizing trust.

The first experiment uses a measurement of distance from the center. The idea here is that, as the overall trustworthiness of a decision-making system improves over time, its decisions fall within a very short distance of the mean. Patterns that delineate systems consistently lagging behind on real-world prediction problems can also be easily identified.

k-centroids cluster analysis example

```r
## k-centroids cluster analysis (requires the flexclust package)
library(flexclust)

data("Nclus")
plot(Nclus)

## try kmeans
cl1 = kcca(Nclus, k = 4)
cl1
image(cl1)
points(Nclus)

## a barplot of the centroids
barplot(cl1)

## now use k-medians and kmeans++ initialization; cluster centroids
## should be similar...
cl2 = kcca(Nclus, k = 4, family = kccaFamily("kmedians"),
           control = list(initcent = "kmeanspp"))
cl2

## ... but the boundaries of the partitions have a different shape
image(cl2)
points(Nclus)
```

The code above is an example I pulled from CRAN (the flexclust package) as a starting point for experimenting with k-centroids cluster analysis.
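To make the distance-from-the-center idea concrete, here is a minimal Python sketch (the system names and all values are made up for illustration) that scores three hypothetical decision systems by their mean absolute distance from the ground truth; the smaller the distance, the higher the trust.

```python
import numpy as np

# Ground truth for five cases, and hypothetical predictions from
# three decision systems on the same cases (illustrative values).
truth = np.array([0.9, 0.2, 0.6, 0.4, 0.8])
systems = {
    "system_a": np.array([0.88, 0.25, 0.55, 0.42, 0.79]),  # tracks truth closely
    "system_b": np.array([0.60, 0.50, 0.50, 0.50, 0.60]),  # hedges toward the middle
    "system_c": np.array([0.10, 0.90, 0.20, 0.95, 0.05]),  # consistently far off
}

# Trust score here: mean absolute distance from the ground truth.
# A smaller distance means a more trustworthy system.
for name, preds in systems.items():
    dist = np.mean(np.abs(preds - truth))
    print(f"{name}: mean distance from ground truth = {dist:.3f}")
```

Ranking systems by this distance makes the consistently lagging system (here, the hypothetical `system_c`) immediately visible.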

Another approach to quantifying decision systems is to use log-loss. Log-loss is very interesting because of the increased penalty it assigns to systems that are very far off from the ground truth.
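To see how sharply that penalty grows, here is a small Python sketch for a single case whose true label is 1, so the log-loss contribution reduces to -log(p); the probabilities are illustrative.

```python
import math

# Log-loss contribution for one case with true label 1: penalty = -log(p).
# The penalty grows sharply as the prediction drifts away from the truth.
for p in [0.9, 0.5, 0.1, 0.01]:
    print(f"predicted {p:.2f} -> penalty {-math.log(p):.3f}")

# predicted 0.90 -> penalty 0.105
# predicted 0.50 -> penalty 0.693
# predicted 0.10 -> penalty 2.303
# predicted 0.01 -> penalty 4.605
```

A system that is confidently wrong (predicting 0.01 when the truth is 1) is penalized roughly 44 times harder than one predicting 0.9.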

Here is a simple implementation of the log-loss function. This function has a few downsides, though, which I will discuss below.

Log-loss function

```r
LogLossFunction = function(actual, predicted, eps = 1e-15) {
  ## clamp predictions away from 0 and 1 so log() stays finite
  predicted = pmin(pmax(predicted, eps), 1 - eps)
  - (sum(actual * log(predicted) +
         (1 - actual) * log(1 - predicted))) / length(actual)
}
```

The term log(1 - predicted) is the one I am wary of. What if the algorithm used for making predictions returns a value greater than 1? For most applications, simply constraining prediction values to the range between 0 and 1 fixes the issue. But there are circumstances where values greater than 1 are needed as prediction outputs; an excellent scenario is regression problems in machine learning.

In regression problems, there is no clean way to tell whether a probability-like function returning a value slightly higher than 1, when plugged into another equation, matches the real observation or not. To handle such greater-than-1 values, I have modified the code to wrap the argument of the logarithm in an absolute-value function. This prevents the log function from returning imaginary values or, in most programming environments, NaN. The modified code is included below:
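The NaN behavior described above is easy to reproduce. The following Python sketch (using numpy, with an illustrative prediction of 1.05) shows log(1 - predicted) turning into NaN for a value above 1, and the absolute-value wrapper keeping the logarithm defined.

```python
import numpy as np

# An illustrative regression-style prediction slightly above 1.
p = np.array([1.05])

with np.errstate(invalid="ignore"):
    raw = np.log(1 - p)        # log of a negative number -> nan

fixed = np.log(np.abs(1 - p))  # abs() keeps the argument positive

print(raw)   # [nan]
print(fixed)
```

With the wrapper, the penalty for a prediction of 1.05 equals that of 0.95, since both sit at distance 0.05 from 1.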

Log-loss for greater-than-1 prediction values

```r
LogLossFunction = function(actual, predicted, eps = 1e-15) {
  ## clamp only the lower bound: an upper clamp at 1 - eps would clip
  ## away the greater-than-1 values this version is meant to handle
  predicted = pmax(predicted, eps)
  - (sum(actual * log(predicted) +
         (1 - actual) * log(pmax(abs(1 - predicted), eps)))) / length(actual)
}
```

Quantifying trust in decision processes, especially in artificial intelligence systems, is important. I visualize AI systems as very similar to a bridge constructed across a deep ravine with a river flowing at breakneck speed below.

If someone builds a rickety rope bridge (very low trust scores), people have the intuition to not use the bridge to cross the ravine. On the other hand, when we build a strong steel suspension bridge with a service lifespan of 300 years and a load carrying capacity way higher than anything currently imaginable (very high trust scores), folks will use the bridge without ever thinking about the risks. The reason is quite simple: the statistical probability of the well engineered steel suspension bridge failing is very close to zero.

But the problem for AI systems today is that there are no straightforward, intuitive ways to quantify their trustworthiness. The metrics I am trying to develop will help visualize and quantify the trustworthiness of AI systems. It is very similar to the human cognitive approach to the bridge-crossing problem, but applied to AI and decision systems.

Note: This post is evolving; I will make changes as I add more content.

This work is done as part of our startup project nanoveda. To continue nanoveda’s wonderful work, we are running a crowdfunding campaign on gofundme’s awesome platform. Donation or not, please share our crowdfunding campaign and support our cause.

Donate here: gofundme page for nanoveda.

(The image of “Mother and Child, 1921” by Pablo Picasso, Spanish, worked in France, 1881–1973, from Art Institute of Chicago and published under fair use rights.

© 2016 Estate of Pablo Picasso / Artists Rights Society (ARS), New York,

The image of island rope bridge, Sa Pa, Vietnam, is an edited version of a public domain photograph obtained through Google image search. )