roadnottaken has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am having a mysterious problem that I can't figure out. I am using Statistics::TTest to calculate p-values for several thousand pairs of distributions of numbers. I'm using these p-values to create a volcano plot, and when I plot the p-values I observe a strange artifact where many of the points are getting the same p-value. After some sleuthing, I can re-create the phenomenon with the four pairs of numbers in the code below. When I calculate p-values for these pairs in Excel, the values are all quite distinct (different by several orders of magnitude) but using Statistics::TTest I'm getting EXACTLY the same value for each pair.

The p-values are very small (around 1.6e-12), so I wonder if this isn't some sort of precision issue, but I can't figure it out. If you run the code below, it will display four identical p-values (T-test probabilities, "t_prob") despite the fact that the true p-values range from 1.7e-19 to 2.8e-29.

I have tried doing something similar using Statistics::Distributions, but I had the same problem. However, I think Statistics::TTest relies on Statistics::Distributions for these calculations. I was not able to find any other modules that perform this calculation.

I should note that the vast majority (99%) of pairs of distributions are getting correct p-values; it's just a handful that produce this strange collision on a wrong value.

Does anyone out there have any insight into what's going wrong? Help is very much appreciated, thanks!