

The very first thing a statistician wants to know before diving deep into an analysis is the distribution of the given data-set.



Why is studying the distribution of a data-set such an essential prerequisite for any statistical analysis?



The answer is simple!



For example, if your data-set is Normally distributed, you can easily predict where its observations will fall using just two parameters: its Mean and its Standard Deviation.



About 68% of the observations lie within one Standard Deviation of the Mean, about 95% within two Standard Deviations, and about 99.7% within three Standard Deviations.
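If you would like to check these percentages yourself, here is a minimal sketch. It assumes the standard identity that, for a Normal distribution, the probability of landing within k Standard Deviations of the Mean is erf(k / √2); the function name `normal_within` is just an illustrative choice.

```python
import math

def normal_within(k: float) -> float:
    """Probability that a Normal variable lies within k standard deviations of its mean."""
    # For any Normal distribution, P(|X - mean| <= k * sd) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {normal_within(k):.4f}")
```

Running it should print 0.6827, 0.9545 and 0.9973, which round to the familiar 68%, 95% and 99.7%.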



The problem is, in this diverse world, we have diverse data-sets too!



Not every data-set has a symmetric, 'bell'-shaped distribution. Additionally, the Normal Distribution describes only continuous data. What if our data is binary or discrete? Well, in that case we have other distributions, e.g. the Binomial Distribution and the Poisson Distribution.





So, what makes the Poisson Distribution unique in the crowd?



- The Poisson Distribution describes discrete counts of occurrences over a continuous interval, such as a period of time.

- It is not symmetrical; it is skewed to the right, with a long tail toward larger counts.
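To see the skew in numbers, here is a rough sketch that computes the skewness directly from the Poisson probabilities. The helper name `poisson_skewness`, the cut-off `x_max` and the sample averages are assumptions made for illustration; the probabilities are evaluated in log space only to avoid overflow for large counts.

```python
import math

def poisson_skewness(mu: float, x_max: int = 200) -> float:
    """Skewness computed numerically from the Poisson probabilities (sum truncated at x_max)."""
    xs = range(x_max + 1)
    # exp(-mu) * mu^x / x!, written in log space so large x cannot overflow.
    probs = [math.exp(-mu + x * math.log(mu) - math.lgamma(x + 1)) for x in xs]
    mean = sum(x * p for x, p in zip(xs, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(xs, probs))
    third = sum((x - mean) ** 3 * p for x, p in zip(xs, probs))
    return third / var ** 1.5

for mu in (1.0, 4.0, 16.0):
    print(mu, poisson_skewness(mu), 1 / math.sqrt(mu))
```

The skewness is always positive and should come out close to 1/√μ, so the distribution looks more and more symmetric as the average count grows.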





Well, let's set the statistical terms aside for a moment and jump into the real world!



- Number of earthquakes in a given period of time

- Number of births/deaths in a given period of time

- Number of marriages/divorces in a given period of time

- Number of suicides in a given period of time

- Number of heart-attack patients arriving at a clinic in a given period of time

- Number of bankruptcies that are filed in a given period of time

- Number of bugs per byte of code



What common characteristics have you observed in these examples?

1. "given period of time"

2. The events seem to be very rare, don't they?





Long story short, the Poisson Distribution is a



Law of rare events!

or

Limiting case in which the number of trials tends to infinity and the probability of success tends to zero!





The connection between the Poisson and Binomial Distributions



- The Poisson Distribution is a limiting case of the Binomial Distribution when the number of trials, n, gets very large and p, the probability of success, gets very small, while the average μ = np stays fixed.



- Binomial probability of getting x successes in n trials



P(x) = nCx p^x q^(n−x)



Since μ = np is the average number of successes:

p = μ/n

q = 1 − μ/n



P(x) = nCx (μ/n)^x (1−μ/n)^(n−x)

and, letting n grow while μ stays fixed,

lim n→∞ P(x)=e^(−μ) μ^x / x!
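The limit can also be watched numerically. Below is a small sketch, assuming illustrative values μ = 2 and x = 3 that are not taken from the text, which evaluates the Binomial probability with p = μ/n for growing n and compares it with the Poisson value. It needs Python 3.8 or later for `math.comb`.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """nCx * p^x * q^(n-x) with q = 1 - p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int, mu: float) -> float:
    """e^(-mu) * mu^x / x!"""
    return math.exp(-mu) * mu**x / math.factorial(x)

mu, x = 2.0, 3  # illustrative values, not from the article
target = poisson_pmf(x, mu)
for n in (10, 100, 1_000, 10_000):
    b = binomial_pmf(x, n, mu / n)  # p shrinks as n grows, so n * p stays equal to mu
    print(f"n={n:>6}  binomial={b:.6f}  gap to Poisson={abs(b - target):.6f}")
```

The gap in the last column should shrink as n increases, which is exactly what the limit above claims.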





In short,



P(x; μ) = (e^(-μ) * μ^(x)) / x!



Where,

μ = Average number of events in the given interval

e = Euler's number ≈ 2.71828

x = Number of occurrences (successes) whose probability we want: 0, 1, 2, ...
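The formula translates almost word for word into code. Here is a minimal sketch; the function name `poisson_pmf`, the input checks and the sample average of 3 are my own assumptions rather than anything prescribed above.

```python
import math

def poisson_pmf(x: int, mu: float) -> float:
    """P(x; mu) = e^(-mu) * mu^x / x! for x = 0, 1, 2, ..."""
    if x < 0:
        raise ValueError("x must be a non-negative whole number")
    if mu <= 0:
        raise ValueError("mu must be positive")
    return math.exp(-mu) * mu**x / math.factorial(x)

# Sanity check: the probabilities of all possible counts should add up to 1.
# The tail beyond x = 49 is negligible for an average of 3.
total = sum(poisson_pmf(x, 3.0) for x in range(50))
print(total)
```

Summing the probabilities is a cheap way to catch a mistyped formula: if the total is not very close to 1, something is off.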





Example:



- In an automobile manufacturing company, the probability that an electric motor is defective is 0.02. What is the probability that a sample of 400 electric motors will contain exactly 5 defective motors?



The average number of defectives in 400 motors is μ = 0.02 × 400 = 8



The probability of getting 5 defectives is:



P(5; 8) = (e^(-8) * 8^5) / 5!

= 0.00033546262 * 32768 / 120

= 0.09160
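Since the motors really follow a Binomial Distribution with n = 400 and p = 0.02, it is worth checking how good the Poisson shortcut is here. The sketch below recomputes the value above and sets it next to the exact Binomial probability; the variable names are illustrative, and `math.comb` needs Python 3.8 or later.

```python
import math

n, p, x = 400, 0.02, 5
mu = n * p  # average number of defective motors, 8

# Poisson approximation: e^(-mu) * mu^x / x!
poisson = math.exp(-mu) * mu**x / math.factorial(x)

# Exact Binomial probability: nCx * p^x * q^(n-x)
binomial = math.comb(n, x) * p**x * (1 - p) ** (n - x)

print(f"Poisson approximation: {poisson:.5f}")
print(f"Exact Binomial:        {binomial:.5f}")
```

The two results should agree to about two decimal places, which is why the much simpler Poisson formula is a reasonable stand-in when n is large and p is small.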





So, as with the Normal Distribution, what are the Mean and Variance of the Poisson Distribution?



If μ is the average number of successes occurring in a given time interval, then the Mean and the Variance of the Poisson distribution are both equal to μ.



Mean = μ

and

Variance = σ^2 = μ



From the above discussion, it can be seen that in the Poisson distribution only one parameter, μ, is needed to determine the probability of any given number of events.
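You do not have to take the Mean = Variance claim on faith. As a quick numerical sketch, using μ = 8 from the motor example and an assumed cut-off of 100 for the sum, both quantities can be computed straight from the probabilities:

```python
import math

mu = 8.0         # the average from the motor example
xs = range(101)  # the tail beyond 100 is negligible for this mu

# Poisson probabilities e^(-mu) * mu^x / x! for x = 0 .. 100
probs = [math.exp(-mu) * mu**x / math.factorial(x) for x in xs]

mean = sum(x * p for x, p in zip(xs, probs))
variance = sum((x - mean) ** 2 * p for x, p in zip(xs, probs))

print(mean, variance)
```

Both numbers should come out at 8, up to floating-point rounding.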





The Poisson Distribution is applicable where:



- The events can be counted in whole numbers, i.e. the count is discrete.

- The occurrences of events are independent.

- The average frequency of occurrence for the time period is known.

- It is possible to count how many events have occurred in a given period of time.
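To get a feel for these conditions, here is a rough simulation sketch. It assumes something the list above does not spell out: that independent events arriving at a constant average rate have exponentially distributed gaps between them, which is the standard way to simulate such a process. The function name `simulate_counts`, the rate of 3 and the seed are arbitrary choices for illustration.

```python
import random
import statistics

def simulate_counts(rate: float, n_intervals: int, seed: int = 0) -> list:
    """Count events per unit interval when the gaps between events are
    independent exponential waiting times with the given average rate."""
    rng = random.Random(seed)
    counts = [0] * n_intervals
    t = rng.expovariate(rate)       # time of the first event
    while t < n_intervals:
        counts[int(t)] += 1         # the event falls in interval [int(t), int(t) + 1)
        t += rng.expovariate(rate)  # wait for the next event
    return counts

counts = simulate_counts(rate=3.0, n_intervals=20_000)
print(statistics.mean(counts), statistics.variance(counts))
```

If the conditions hold, the sample mean and the sample variance of the counts should both land near the rate of 3. When real data shows a variance far from its mean, that is a hint that the events are not independent or the rate is not constant.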





