4 Probability Distributions

Random Variables

Discrete Random Variables

Continuous Random Variables

Permutations and Combinations

Permutations

A permutation is an arrangement of objects in a definite order.

Total number of permutations of N objects = N! (N factorial)

Where

N! = 1 * 2 * 3 *...* (N-1) * N
0! = 1

If some of the N objects are alike, for example N₁ objects of one kind, N₂ of another, …, and Nₖ of a k-th kind, with ΣNᵢ = N:

The total number of permutations of these N objects = N! / (N₁!N₂!...Nₖ!)

If only r of the N objects are taken in each permutation:

The total number of permutations of r objects taken from N objects = N! / (N - r)!

Its notation is ɴPᵣ, where ɴPɴ is the full permutation, i.e. N!
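The three counting formulas above can be checked with a short Python sketch. The helper name `permutations_with_repeats` and the sample inputs are illustrative assumptions, not part of the original notes; `math.factorial` and `math.perm` are standard-library functions (Python 3.8+).

```python
import math

def permutations_with_repeats(group_sizes):
    """Count distinct orderings of N objects made of groups of identical items.

    group_sizes lists N1, N2, ..., Nk; their sum is N.
    """
    n = sum(group_sizes)
    count = math.factorial(n)
    for size in group_sizes:
        count //= math.factorial(size)  # each division is exact
    return count

# Full permutation of 4 distinct objects: 4! = 24
full = math.factorial(4)

# Letters of "MISSISSIPPI": 1 M, 4 I, 4 S, 2 P -> 11! / (1! 4! 4! 2!) = 34650
mississippi = permutations_with_repeats([1, 4, 4, 2])

# Ordered selections of 2 objects out of 5: 5! / 3! = 20
partial = math.perm(5, 2)

print(full, mississippi, partial)
```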

Combinations

A combination is a selection of r objects from N objects, irrespective of order. In contrast, a permutation ɴPᵣ selects r objects from N objects with the order taken into account.

The total number of combinations of r objects from N distinct objects = ɴPᵣ / ᵣPᵣ
                                                                      = N! / (r!(N-r)!)

Its notation is ɴCᵣ, read as “N choose r”. The number of combinations is also known as the binomial coefficient.

Combinations are also often written using the following notation:

$$\binom{N}{r} = \frac{N!}{r!\,(N-r)!}$$
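As a quick sanity check of the relation between ordered and unordered selections, the following sketch divides the number of permutations by the number of orderings of each selection. The values N = 5 and r = 2 are arbitrary choices for illustration.

```python
import math

n, r = 5, 2

ordered = math.perm(n, r)            # 20 ordered selections of 2 out of 5
orderings_of_r = math.factorial(r)   # each selection can be ordered in 2! = 2 ways
unordered = ordered // orderings_of_r

# The result matches the built-in binomial coefficient
print(unordered, math.comb(n, r))    # 10 10
```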

The Binomial Distribution

Bernoulli Trial

Many experiments share the common element that their outcomes can be classified into one of two events, one labeled “success” and the other “failure”. Each repetition of such an experiment, with only 2 possible outcomes, is called a Bernoulli trial.

We are often interested in the result of independent, repeated Bernoulli trials, i.e. the number of successes in repeated trials.

Binomial Distribution

A binomial distribution gives us the probabilities associated with independent, repeated Bernoulli trials.

A binomial distribution describes the probability of getting r successes in N independent trials, each with success probability p:

p(X = r; N, p) = number of ways event can occur * p(one occurrence)
               = ɴCᵣ * pʳ * (1 - p)⁽ᴺ⁻ʳ⁾

More formally, in sampling a stationary Bernoulli process, with the probability of success equal to p, the probability of observing exactly r successes in N independent trials is:

$$P(X = r) = \binom{N}{r}\, p^{r} (1 - p)^{N - r}, \qquad r = 0, 1, \ldots, N$$
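A minimal Python sketch of this probability is shown below. The function name `binomial_pmf` and the coin-toss inputs are assumptions made for illustration.

```python
import math

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials,
    each with success probability p."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

# Exactly 2 heads in 4 fair coin tosses: 6 / 16 = 0.375
two_heads = binomial_pmf(2, 4, 0.5)

# The probabilities over r = 0..n sum to 1, up to floating-point rounding
total = sum(binomial_pmf(r, 10, 0.3) for r in range(11))

print(two_heads, total)
```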

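Another way of defining the binomial distribution: X = X₁ + X₂ + ... + Xɴ, where the Xᵢ are independent Bernoulli variables, each equal to 1 (success) with probability p and 0 (failure) with probability 1 - p.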

Mean and Variance

Mean:
E(Xᵢ) = Σxᵢp(xᵢ)
      = 0 * (1 - p) + 1 * p
      = p
E(X) = E(X₁ + X₂ + ... + Xɴ)
     = E(X₁) + E(X₂) + ... + E(Xɴ)
     = Np

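Variance:
Since Xᵢ is either 0 or 1, Xᵢ² = Xᵢ, so E(Xᵢ²) = E(Xᵢ) = p.
V(Xᵢ) = E(Xᵢ²) - E(Xᵢ)²
      = p - p²
      = p(1 - p)
      = pq, where q = 1 - p
Because the trials are independent, the variances add:
V(X) = V(X₁) + V(X₂) + ... + V(Xɴ)
     = Npq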
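The results E(X) = Np and V(X) = Np(1 - p) can be confirmed numerically by summing over the PMF directly. This is a sketch with arbitrarily chosen parameters (N = 20, p = 0.3); the helper `binomial_pmf` is an illustrative name.

```python
import math

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 20, 0.3
q = 1 - p

# Mean and variance computed from the definition, summing over all outcomes
mean = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
variance = sum((r - mean) ** 2 * binomial_pmf(r, n, p) for r in range(n + 1))

print(mean, n * p)          # both close to 6.0
print(variance, n * p * q)  # both close to 4.2
```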

Shape

Examples

In a family of 11 children, what is the probability that there will be more boys than girls?
Solve this problem WITHOUT using the complements rule.

Solution:
p(boy) = 0.5
N = 11
p(more boys than girls) = p(X = 6; N, 0.5) + p(X = 7; N, 0.5) + ... + p(X = 11; N, 0.5)
                        = 0.2256 + 0.1611 + 0.0806 + 0.0269 + 0.0054 + 0.0005
                        = 0.5
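The sum in this example can be reproduced exactly with rational arithmetic. The sketch below assumes, as the solution does, that p(boy) = 0.5; the variable names are illustrative.

```python
from fractions import Fraction
import math

n = 11
p_boy = Fraction(1, 2)

# One term per outcome with more boys than girls: 6, 7, ..., 11 boys
terms = [
    math.comb(n, r) * p_boy**r * (1 - p_boy)**(n - r)
    for r in range(6, n + 1)
]

p_more_boys = sum(terms)                       # exactly 1/2
rounded_terms = [round(float(t), 4) for t in terms]

print(rounded_terms)  # [0.2256, 0.1611, 0.0806, 0.0269, 0.0054, 0.0005]
print(p_more_boys)    # 1/2
```

The exact answer is 1/2 by symmetry: with 11 children there can be no tie, and boys and girls are equally likely.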

The Normal Distribution

Properties

Applications

Usage in time series:

Rules

Examples

Below are some examples of how to use standardized scores to address various questions.

Example 1

The top 5% of applicants (as measured by GRE scores) will receive scholarships.
If GRE ~ N(500, 100²), what is the GRE score to qualify for a scholarship?

Solution:
Let X = GRE, want to find x such that p(X ≥ x) = 0.05
Let Z = (X - 500) / 100 ~ N(0, 1)
For p(Z ≥ z) = 0.05, z ≈ 1.65
x = (z * 100) + 500 = 665
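The table lookup in this example can be replaced by an inverse-CDF call. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8+); it returns an unrounded z of about 1.645, so the cutoff comes out slightly below the 665 obtained with z ≈ 1.65.

```python
from statistics import NormalDist

gre = NormalDist(mu=500, sigma=100)

# p(X >= x) = 0.05 is the same as p(X <= x) = 0.95
cutoff = gre.inv_cdf(0.95)

# The corresponding standardized score
z = NormalDist().inv_cdf(0.95)

print(round(z, 3), round(cutoff, 1))  # 1.645 664.5
```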

Example 2

Family income ~ N(25000, 10000²).
If the poverty level is $10,000, what percentage of the population lives in poverty?

Solution:
Let X = family income, want to find p(X ≤ 10000).
Let Z = (X - 25000) / 10000 ~ N(0, 1)
z = (10000 - 25000) / 10000 = -1.5
p(Z ≤ -1.5) = 1 - p(Z ≤ 1.5)
            = 1 - 0.9332
            = 0.0668

Example 3

A new tax law is expected to benefit “middle income” families, those with incomes between
$20,000 and $30,000. If Family income ~ N(25000, 10000²), what percentage of the population
will benefit from the law?

Solution:
Let X = family income, want to find p(20000 ≤ X ≤ 30000)
Let Z = (X - 25000) / 10000 ~ N(0, 1)
z₀ = (20000 - 25000) / 10000 = -0.5
z₁ = (30000 - 25000) / 10000 = 0.5
p(20000 ≤ X ≤ 30000) = p(-0.5 ≤ Z ≤ 0.5)
                     = 𝐹(0.5) - 𝐹(-0.5), where 𝐹 is the standard normal CDF
                     = 2𝐹(0.5) - 1
                     = 1.383 - 1
                     = 0.383
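Examples 2 and 3 both reduce to evaluating the normal CDF, which can be sketched with `statistics.NormalDist` (Python 3.8+). The variable names are illustrative; small differences from the table-based answers come from rounding in the z-table.

```python
from statistics import NormalDist

income = NormalDist(mu=25000, sigma=10000)

# Example 2: share of families at or below the poverty level
p_poverty = income.cdf(10000)

# Example 3: share of families with income between 20,000 and 30,000
p_middle = income.cdf(30000) - income.cdf(20000)

print(round(p_poverty, 4))  # 0.0668
print(round(p_middle, 4))   # 0.3829
```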

Approximating the Binomial Distribution

For a large enough N, a binomial variable X is approximately distributed as N(Np, Npq), so the normal distribution can be used to approximate the binomial distribution.
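To see how close the approximation is, the sketch below compares an exact binomial cumulative probability with its normal estimate. The parameters (N = 100, p = 0.5, threshold 55) are arbitrary, and the half-unit continuity correction is a common refinement that the text above does not mention; it is included here as an assumption.

```python
import math
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact p(X <= k) for a binomial variable, by summing the PMF."""
    return sum(math.comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

n, p = 100, 0.5
q = 1 - p

# Normal distribution with the same mean Np and variance Npq
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))

exact = binomial_cdf(55, n, p)
# Continuity correction (an added assumption): p(X <= 55) is estimated at 55.5
normal_estimate = approx.cdf(55.5)

print(exact, normal_estimate)
```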

The Poisson Distribution

The Poisson distribution models the number of times an event happens in a fixed interval of time or space when events occur independently at a constant average rate. Examples of Poisson random variable:

Properties

Example

Let X equal the number of typos on a printed page with a mean of 3 typos per page.
What is the probability that a randomly selected page has at least 1 typo on it?

Solution:
p(X ≥ 1) = 1 - p(X = 0)
         = 1 - e⁻³3⁰ / 0!
         = 1 - e⁻³
         = 0.9502

What is the probability that a randomly selected page has at most 1 typo on it?
Solution:
p(X ≤ 1) = p(X = 0) + p(X = 1)
         = e⁻³3⁰ / 0! + e⁻³3¹ / 1!
         = e⁻³ + 3e⁻³
         = 0.1991
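Both typo probabilities follow from the Poisson PMF used in the solutions, e⁻ᵏλᵏ-style terms of the form e^(-λ) λ^k / k!. A minimal sketch, with `poisson_pmf` as an illustrative helper name:

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean number of events is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 3  # mean typos per page

p_at_least_one = 1 - poisson_pmf(0, lam)
p_at_most_one = poisson_pmf(0, lam) + poisson_pmf(1, lam)

print(round(p_at_least_one, 4))  # 0.9502
print(round(p_at_most_one, 4))   # 0.1991
```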

Approximating the Binomial Distribution

The Poisson distribution can be viewed as a limit of the binomial distribution. Suppose X ~ Binomial(N, λ/N), where N is very large and λ/N is very small. We show that the PMF of X can be approximated by the PMF of a Poisson(λ).

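\[
\begin{aligned}
P(X = k) &= \binom{n}{k}\left(\frac{\lambda}{n}\right)^{k}\left(1-\frac{\lambda}{n}\right)^{n-k} \\
         &= \frac{\lambda^{k}}{k!}\cdot\frac{n(n-1)\cdots(n-k+1)}{n^{k}}\cdot\left(1-\frac{\lambda}{n}\right)^{n}\left(1-\frac{\lambda}{n}\right)^{-k}
\end{aligned}
\]

As n → ∞ with k fixed, the second factor tends to 1, (1 - λ/n)ⁿ tends to e^(-λ), and (1 - λ/n)^(-k) tends to 1, so

\[
\lim_{n\to\infty} P(X = k) = \frac{e^{-\lambda}\lambda^{k}}{k!}
\]

In the derivation, n is the binomial distribution parameter N, λ is the Poisson distribution parameter, λ/n is the binomial distribution parameter p, and k is a fixed value of the random variable.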

An intuitive understanding is that as N becomes larger, the Poisson interval is divided into N smaller sub-intervals, each with event probability λ/N. When the sub-intervals become sufficiently small, at most one event can happen in each of them. If we regard an event occurrence in a sub-interval as a “success” in the binomial distribution, then the probability of k events in the whole interval (Poisson) is equivalent to the probability of k successes in N trials (binomial).

This is useful because Poisson PMF is much easier to compute than the binomial.
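The convergence can also be observed numerically. The sketch below fixes λ = 3 and k = 2 (arbitrary choices) and measures how far the Binomial(N, λ/N) probability is from the Poisson(λ) probability as N grows; the helper names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean number of events is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam, k = 3, 2
target = poisson_pmf(k, lam)

# Absolute gap between Binomial(N, lam/N) and Poisson(lam) for growing N
errors = [
    abs(binomial_pmf(k, n, lam / n) - target)
    for n in (10, 100, 1000, 10000)
]

for n, err in zip((10, 100, 1000, 10000), errors):
    print(n, err)  # the gap shrinks as N grows
```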