Still, 7% is not negligible - there is still a chance that we are making a mistake in not using the new boxes, as we did get evidence suggesting the new boxes are better. Are the results favorable enough to justify additional A/B testing?

This gets into the “expected value of perfect information” (EVPI): how valuable would a definitive answer be to the question of whether the decrease is better than 2.16%? Would we be willing to pay $10 or $100 or $1000 for an oracle’s answer?

In 93% of the cases, we believe the answer would be ‘no’: the oracle is worth nothing, since it could only tell us what we already believe (that the decrease is less than 2.26%), in which case we remain with the status quo profit of $0 and are no better off; so in those cases the oracle was worth $0. In the other 7% of the cases, the answer would be ‘yes’, and we would change our decision and make some expected profit. So the value of the oracle is $0 + the expected value of those other 7% of cases.

But in that case, our gain depends on how large the cancellation reduction is - if it’s exactly 2.17%, the gain is ~$0 and we are indifferent, but if the reduction is 3%, 4%, or even 5%, then we would have been leaving real money on the table ($52k, $72k, & $93k respectively). Of course, each of those larger gains is increasingly unlikely, so we need to go back to the posterior distribution and weight our gains per customer by averaging over the posterior distribution of possible reductions.

So because of the small possibility of a profitable box (which might be very profitable applied to all customers indefinitely), we’d be willing to pay up to a grand total of $10,773 for certain information.

That would be enough to pay for 33.2k new boxes; used as part of another A/B test, it would provide extreme power by standard p-value testing, so we might wonder whether further experimentation is profitable.
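The oracle logic above amounts to taking the expectation of the positive part of the gain distribution: EVPI = E[max(gain, 0)], since the oracle only changes anything in the cases where switching would have been profitable. A minimal sketch in R, using a made-up normal distribution of switching gains rather than the actual posterior:

```r
## Toy sketch of the EVPI calculation (hypothetical numbers, not the real posterior):
set.seed(2016)
## made-up posterior distribution of profit-if-we-switch:
gains <- rnorm(100000, mean=-33000, sd=22000)
## fraction of cases where the oracle would change our decision:
mean(gains > 0)
## the oracle is worth $0 whenever gain <= 0 (we keep the status quo),
## and worth the gain otherwise, so EVPI = E[max(gain, 0)]:
evpi <- mean(pmax(gains, 0))
evpi
```

With the actual posterior of gains substituted in, this one-liner is what produces the EVPI figure quoted above.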

How much would n more observations be worth - the “expected value of sample information” (EVSI)? The EVSI must be smaller than the n≈33.2k implied by the EVPI; we can estimate how much smaller by repeatedly simulating drawing n more observations, doing a Bayesian update, recalculating our expected profits for both choices (status quo vs new box), deciding whether to switch, and recording our profit if we do switch. This can be used to plan a fixed-sample experiment by finding the value of n which maximizes the EVSI: if n is too small (eg n=1), it cannot affect our decision, but if n is too big (eg n=100,000), it is overkill.

First, we can automate the posterior & profit analysis like so:

```r
posterior <- function (x, n) {
    betaPosterior(x, n, prior1.a=900, prior1.b=1407)$Theta_diff }
posteriorProfit <- function (x, n) {
    posteriorReduction <- posterior(x, n)
    gains <- sapply(posteriorReduction,
                    function (r) { improvementTotalGain(5.7, 0.3909, r, 344*12, 0.10) - 36732 })
    return(list(Profit=gains, Reduction=posteriorReduction)) }

gains <- posteriorProfit(x=c(175, 168), n=c(442, 439))
mean(gains$Profit)
# [1] -33019.54686
```

So with the current data, we would suffer an expected loss of $33k by switching to the new box.

It is easy to simulate collecting another datapoint, since it’s binary data without any covariates: draw a possible cancellation probability from the posterior distribution for that group, and then flip a coin with that probability.

```r
## Draw a cancellation probability from the posterior sample and flip a coin with it.
## (Indexing via `sample.int` avoids R's `sample()` gotcha, in which a length-1
## numeric argument is treated as the range `1:n` rather than as a one-element
## vector to sample from - which matters below, where we pass in a single value.)
simulateData <- function (posterior) {
    rbinom(1, 1, prob=posterior[sample.int(length(posterior), 1)]) }
```

Now we can repeatedly simulate fake data, add it to the real data, rerun the analysis, and see what the new estimated profit is from the best action. (Usually we will conclude what we already concluded - that the box is not worthwhile - and the value of the new information is then $0; but in some possible universes, the new data will change our decision.) Comparing the new estimated profit against the old profit then tells us whether the increase in profit resulting from the new datapoints justifies their cost.

```r
library(parallel)
library(plyr)

evsiEstimate <- function (x, n, n_additional, iters=1000) {
    originalPosterior <- betaPosterior(x, n, prior1.a=900, prior1.b=1407)
    gains <- posteriorProfit(x=x, n=n)
    oldProfit <- mean(gains$Profit)

    evsis <- unlist(mclapply(1:iters, ## parallelize
        function (i) {
            ## draw a set of hypothetical parameters from the posterior
            controlP      <- sample(originalPosterior$Theta_1, 1)
            experimentalP <- sample(originalPosterior$Theta_2, 1)
            ## simulate the collection of additional data
            control      <- replicate(n_additional, simulateData(controlP))
            experimental <- replicate(n_additional, simulateData(experimentalP))
            ## the old box profit is 0; what's the estimated profit of new boxes given additional data?
            simGains <- posteriorProfit(x=c(x[1]+sum(control), x[2]+sum(experimental)),
                                        n=c(n[1]+n_additional, n[2]+n_additional))
            newBoxProfit <- mean(simGains$Profit)
            oldBoxProfit <- 0
            ## choose the maximum of the two actions:
            evsi <- max(c(newBoxProfit, oldBoxProfit))
            return(evsi) }))
    return(mean(evsis)) }

## Example EVSI estimates for various possible experiment sizes:
evsiEstimate(c(175,168), c(442,439), n_additional=1)
# [1] 0
evsiEstimate(c(175,168), c(442,439), n_additional=100)
# [1] 0
evsiEstimate(c(175,168), c(442,439), n_additional=500)
# [1] 0
evsiEstimate(c(175,168), c(442,439), n_additional=1000)
# [1] 4.179743603
evsiEstimate(c(175,168), c(442,439), n_additional=2000)
# [1] 33.58719093
evsiEstimate(c(175,168), c(442,439), n_additional=3000)
# [1] 152.0107205
evsiEstimate(c(175,168), c(442,439), n_additional=4000)
# [1] 259.4423937
evsiEstimate(c(175,168), c(442,439), n_additional=5000)
# [1] 305.9021146
evsiEstimate(c(175,168), c(442,439), n_additional=6000)
# [1] 270.1474476
evsiEstimate(c(175,168), c(442,439), n_additional=7000)
# [1] 396.8461236
evsiEstimate(c(175,168), c(442,439), n_additional=8000)
# [1] 442.0281358

## Search for _n_ maximizing the EVSI minus the cost of the samples ("Expected Net Benefit"/ENBS):
optimize(function (n) { evsiEstimate(c(175,168), c(442,439), n_additional=n, iters=5000) - 0.33*n },
         interval=c(1, 20000), maximum=TRUE, tol=1)
# $maximum
# [1] 1.483752763
# $objective
# [1] -0.4896384119
```

EVSI exhibits an interesting behavior: because decisions are discrete, and contrary to what one might intuitively expect, the EVSI of eg n=1-100 can be zero while the EVSI of n=1000 can suddenly be large. Typically an EVSI curve will be zero (and hence the expected net benefit increasingly negative, since sampling costs keep accruing) for small sample sizes, where the data cannot possibly change one’s decision no matter how positive it looks; then, once the sample becomes ample enough to affect the decision, it becomes increasingly valuable until a peak is reached, diminishing returns set in, and it eventually stops improving noticeably (while the cost continues to increase linearly).
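The all-zero region can be seen with a back-of-the-envelope check: under a strong Beta(900, 1407) prior like the one used above, even if every one of the n new observations came out maximally favorable, the posterior mean can only move so far, and until it can cross a profitability threshold, no possible data changes the decision. A toy sketch (the 0.37 threshold here is a made-up illustration, not the analysis’s actual break-even point):

```r
## With a Beta(900, 1407) prior on the cancellation rate, the best possible
## outcome of n new observations (all non-cancellations) leaves the posterior
## mean at 900/(900+1407+n). Until that best case crosses the (hypothetical)
## profitability threshold, no data can change the decision, so EVSI is exactly $0.
prior.a <- 900; prior.b <- 1407
threshold <- 0.37   # made-up threshold for illustration
bestCasePosteriorMean <- function (n) { prior.a / (prior.a + prior.b + n) }
## smallest n at which even the best-case data could flip the decision:
min(which(sapply(1:5000, bestCasePosteriorMean) < threshold))
# [1] 126
```

Below that critical n the EVSI is identically zero, and above it the EVSI grows only as fast as the posterior probability of decision-flipping data does - which is why the curve stays flat and then climbs.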

In this case, the end result of the CJ experiment is that no fixed-sample extension is worthwhile, as the EVSI remains less than the cost of the samples for every possible sample size n.