Statistics Summary Calculator

Discrete Distributions

Distribution	Formula	Mean μ	Variance σ²	Moment Generating Function
Bernoulli Distribution	f(k; p) = p if k = 1, 1 - p if k = 0, else 0	p	p(1 - p)	1 - p + pe^t
Binomial Distribution	f(k; n, p) = [n! / (k!(n - k)!)] * p^k (1 - p)^{n - k}	np	np(1 - p)	(1 - p + pe^t)ⁿ
Negative Binomial Distribution	f(n; k, p) = [(n - 1)! / ((k - 1)!(n - k)!)] * p^k (1 - p)^{n - k}	k/p	k(1 - p)/p²	[pe^t / (1 - (1 - p)e^t)]^k
Geometric Distribution	P(x = n) = p(1 - p)^{n - 1}	1/p	(1 - p)/p²	pe^t / (1 - (1 - p)e^t)
Poisson Distribution	P(k; λ) = λ^k / (e^λ k!)	λ	λ	e^{λ(e^t - 1)}
Hypergeometric Distribution	P(x; n, N, k) = (_kC_x)(_{N - k}C_{n - x}) / _NC_n	nk/N	nk(N - k)(N - n) / (N²(N - 1))	N/A
Multinomial Distribution	f(x₀, x₁, ..., x_m; n, p₀, p₁, ..., p_m) = [n! / (x₀! · x₁! · ... · x_m!)] · p₀^{x₀} · p₁^{x₁} · ... · p_m^{x_m}	np_i	np_i(1 - p_i)	(Σ p_i e^{t_i})ⁿ
Uniform Distribution (continuous on [a, b])	f(x; a, b) = 1/(b - a) if a ≤ x ≤ b, else 0	(a + b)/2	(b - a)²/12	(e^{tb} - e^{ta}) / (t(b - a))
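
The moment formulas for a discrete distribution can be sanity-checked by summing directly over its probability mass function. The sketch below does this for the binomial case, comparing against the mean np, the variance np(1 - p), and the moment generating function (1 - p + pe^t)ⁿ. The function names and the values n = 10, p = 0.3, t = 0.5 are illustrative assumptions, not part of the original page.

```python
from math import comb, exp, isclose


def binomial_pmf(k, n, p):
    """f(k; n, p) = n! / (k!(n - k)!) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def moments_from_pmf(pmf, support):
    """Return (mean, variance) by summing over a finite support."""
    mean = sum(k * pmf(k) for k in support)
    variance = sum((k - mean) ** 2 * pmf(k) for k in support)
    return mean, variance


def mgf_from_pmf(pmf, support, t):
    """Return E[e^(tX)] by summing over a finite support."""
    return sum(exp(t * k) * pmf(k) for k in support)


# Illustrative parameter values (assumptions for this example only).
n, p, t = 10, 0.3, 0.5
pmf = lambda k: binomial_pmf(k, n, p)
support = range(n + 1)

mean, variance = moments_from_pmf(pmf, support)
mgf = mgf_from_pmf(pmf, support, t)

# Compare the direct sums with the closed-form expressions.
print(isclose(mean, n * p))
print(isclose(variance, n * p * (1 - p)))
print(isclose(mgf, (1 - p + p * exp(t)) ** n))
```

The same two helper functions work for any distribution with a finite support, such as the Bernoulli or hypergeometric cases.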

Test Statistic Decision Tree:

Type	Keywords	Sample Size	Use Test
Z-score	Mean, Average	Greater than 30, or population σ is known	z = (X̄ - μ) / (σ/√n)
t-score	Mean, Average	30 or less and population σ is not known	t = (X̄ - μ) / (s/√n)
Proportion-score	Proportion (Test p), Fraction, Percentage, Rate, Probability	More than 30	z = (p̂ - p₀) / √(p₀q₀/n)
Variance σ²	Variance, Variability, Spread	N/A	χ² = (n - 1)s² / σ²
Equal Variances σ₁² = σ₂²	Equal Variances, Ratio or Difference in Variances	N/A	F = s₁² / s₂²
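
As a rough illustration of the first three rows, the following sketch computes the z, t, and proportion test statistics using only the Python standard library. The function names and sample numbers are hypothetical; the t version assumes the raw sample is available so that s can be computed with the usual n - 1 divisor.

```python
from math import sqrt
from statistics import mean, stdev


def z_statistic(sample_mean, mu0, sigma, n):
    """z = (sample mean - mu0) / (sigma / sqrt(n)), with sigma known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def t_statistic(sample, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)); s uses the n - 1 divisor."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


def proportion_z_statistic(p_hat, p0, n):
    """z = (p_hat - p0) / sqrt(p0 * q0 / n), where q0 = 1 - p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


# Illustrative inputs (assumptions for this example only).
print(z_statistic(sample_mean=52, mu0=50, sigma=10, n=100))
print(t_statistic([1, 2, 3, 4, 5], mu0=2))
print(proportion_z_statistic(p_hat=0.6, p0=0.5, n=100))
```

Each function returns only the statistic; it still has to be compared with a critical value or converted to a p-value to reach a decision.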

Confidence Interval Decision Tree

Type	Keywords	Sample Size	Use Interval
Test the Mean	Confidence Interval, Mean, Average	Greater than 30	X̄ - z_{α/2} * s/√n < μ < X̄ + z_{α/2} * s/√n
Test the Mean	Confidence Interval, Mean, Average	30 or less	X̄ - t_{α/2} * s/√n < μ < X̄ + t_{α/2} * s/√n
Test the Variance	Confidence Interval, Variance	Greater than 30	(n - 1)s²/χ²_{α/2} < σ² < (n - 1)s²/χ²_{1 - α/2}
Test the Standard Deviation	Confidence Interval, Standard Deviation	Greater than 30	√((n - 1)s²/χ²_{α/2}) < σ < √((n - 1)s²/χ²_{1 - α/2})
Test the Proportion	Confidence Interval, Proportion, Percentage, Rate, Population	Greater than 30	p̂ - z_{α/2} * √(p̂(1 - p̂)/n) < p < p̂ + z_{α/2} * √(p̂(1 - p̂)/n)
Test the Difference of Means	Confidence Interval, Difference of Means	Greater than 30	(x̄₁ - x̄₂) - z_{α/2} * √a < μ₁ - μ₂ < (x̄₁ - x̄₂) + z_{α/2} * √a
Test the Difference of Means	Confidence Interval, Difference of Means	30 or less	(x̄₁ - x̄₂) - t_{α/2} * √a < μ₁ - μ₂ < (x̄₁ - x̄₂) + t_{α/2} * √a
p̂ Confidence Interval	Confidence Interval (test p), Criteria, Characteristic, Proportion	30 or less	p̂ - z_{α/2} * √(p̂(1 - p̂)/n) < p < p̂ + z_{α/2} * √(p̂(1 - p̂)/n)

In the difference-of-means rows, a stands for the estimated variance of x̄₁ - x̄₂; for large independent samples this is commonly s₁²/n₁ + s₂²/n₂.
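
A minimal sketch of the large-sample interval for the mean from the first row of the table, using `statistics.NormalDist` for the critical value. The function name, the default 95% confidence level, and the sample numbers are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(x_bar, s, n, confidence=0.95):
    """Return (lower, upper) for mu using x_bar +/- z_{alpha/2} * s / sqrt(n)."""
    alpha = 1 - confidence
    # Two-sided interval: put alpha/2 in each tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Illustrative inputs: sample mean 100, sample standard deviation 15, n = 36.
lower, upper = mean_confidence_interval(x_bar=100, s=15, n=36)
print(f"{lower:.2f} < mu < {upper:.2f}")
```

For samples of 30 or fewer, the z critical value would be replaced by a t critical value with n - 1 degrees of freedom, which the standard library does not provide.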

Sample Size Decision Tree:

Type	Keywords	Use Formula
Sample Size for μ	Sample Size, Average, Mean	n = z_{α/2}² * σ² / ME²
Proportion Sample Size	Sample Size, Proportion, Population, Percentage, Rate	n = z_{α/2}² * p(1 - p) / ME²
μ₁ - μ₂ Sample Size	Sample Size, Difference of Means, μ₁ - μ₂	n = z_{α/2}² * (σ₁² + σ₂²) / ME²
p₁ - p₂ Sample Size	Sample Size, Difference of p, p₁ - p₂	n = z_{α/2}² * (p₁q₁ + p₂q₂) / ME²

ME is the desired margin of error (sampling error), and q = 1 - p. In the two-sample rows, n is the size of each sample. Round n up to the next whole number.
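
The one-sample formulas above translate directly into code. This sketch assumes a two-sided confidence level (so the z-score is z_{α/2}), treats the denominator term as the desired margin of error, and rounds up because a sample size must be a whole number. The function names and example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """n = z^2 * sigma^2 / margin^2, rounded up."""
    z = z_critical(confidence)
    return ceil(z**2 * sigma**2 / margin**2)


def sample_size_for_proportion(p, margin, confidence=0.95):
    """n = z^2 * p * (1 - p) / margin^2, rounded up."""
    z = z_critical(confidence)
    return ceil(z**2 * p * (1 - p) / margin**2)


# Illustrative inputs (assumptions for this example only).
print(sample_size_for_mean(sigma=15, margin=3))
print(sample_size_for_proportion(p=0.5, margin=0.05))
```

When no prior estimate of p is available, p = 0.5 gives the largest (most conservative) sample size, since p(1 - p) is maximized there.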

Hypothesis Testing Decision Tree

p-value Significance Test (observed level of significance):

Compute the test statistic (for example, the z-score), then look up the tail probability for that score in the z-table; this probability is the p-value. If the p-value is less than α, reject H₀; otherwise, fail to reject H₀.
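
The table lookup can be replaced by the standard normal CDF. The sketch below assumes a z test and a hypothetical `tail` argument to choose between one-sided and two-sided alternatives; neither the function name nor the argument comes from the original page.

```python
from statistics import NormalDist


def z_test_p_value(z, tail="two-sided"):
    """Return the p-value for an observed z-score under the standard normal."""
    cdf = NormalDist().cdf
    if tail == "upper":      # H1: parameter is greater than the null value
        return 1 - cdf(z)
    if tail == "lower":      # H1: parameter is less than the null value
        return cdf(z)
    if tail == "two-sided":  # H1: parameter differs from the null value
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two-sided'")


# Illustrative decision at alpha = 0.05 for an observed z of 2.3.
alpha = 0.05
p_value = z_test_p_value(2.3)
print(p_value, "reject H0" if p_value < alpha else "fail to reject H0")
```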

Hypothesis Testing Errors:

Type I error - Reject null hypothesis H₀ when H₀ is TRUE: Probability = α
Type II error - Fail to reject null hypothesis H₀ when H₀ is FALSE: Probability = β
Power of the Test = Probability that you reject null hypothesis H₀ when H₀ is FALSE = 1 - β
Note: Which error is the bigger mistake depends on the consequences in the problem at hand; α is fixed in advance, while β depends on the sample size and on the true parameter value.
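
To make β and the power concrete, here is a sketch for one specific case: an upper-tailed z test of H₀: μ = μ₀ against a particular alternative mean μ₁ > μ₀, with σ known. The function name and all numeric inputs are assumptions for illustration; other tests need their own power calculation.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tailed z test when the true mean is mu1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)   # rejection cutoff on the z scale
    shift = (mu1 - mu0) / (sigma / sqrt(n))   # true mean offset in standard errors
    beta = std_normal.cdf(z_alpha - shift)    # P(fail to reject H0 | mu = mu1)
    return 1 - beta


# Illustrative inputs: larger samples raise the power for the same alternative.
print(power_upper_tail_z(mu0=50, mu1=52, sigma=10, n=100))
print(power_upper_tail_z(mu0=50, mu1=52, sigma=10, n=400))
```

When μ₁ equals μ₀ the function returns α, which matches the definition of the Type I error probability.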

Finite Population Correction Factor:

If n/N > 0.05, multiply the standard error used in your confidence interval (and therefore its margin of error) by the following factor:

√((N - n)/N)
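
A short sketch of how the correction might be applied to the standard error of the mean, using the factor √((N - n)/N) and the 0.05 threshold given above. The function names and the example numbers are hypothetical.

```python
from math import sqrt


def finite_population_correction(n, N):
    """Return sqrt((N - n) / N) for sample size n and population size N."""
    return sqrt((N - n) / N)


def corrected_standard_error(s, n, N, threshold=0.05):
    """Standard error s / sqrt(n), corrected only when n/N exceeds the threshold."""
    se = s / sqrt(n)
    if n / N > threshold:
        se *= finite_population_correction(n, N)
    return se


# Illustrative inputs: the same sample drawn from a small and a large population.
print(corrected_standard_error(s=10, n=100, N=1000))    # n/N = 0.10, corrected
print(corrected_standard_error(s=10, n=100, N=10000))   # n/N = 0.01, unchanged
```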

Regression Testing and Correlation Coefficients:

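Cov(X,Y) = Σ(X_i - X̄)(Y_i - Ȳ) / n

Correlation Coefficient (r) = Cov(X,Y) / (s_x s_y), where s_x and s_y are the standard deviations of X and Y computed with the same divisor n

β = Σ(X_i - X̄)(Y_i - Ȳ) / Σ(X_i - X̄)²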

Least Squares Regression Line: ŷ = α + βx, with α = Ȳ - βX̄
Here α is the y-intercept and β is the slope of the line fitted to the points in X & Y.
α & β are chosen so that they produce the smallest possible SSE, defined below.
Sum of Squares about the Mean (SSM) = Σ(y_i - ȳ)²
Sum of Squared Residuals (SSE) = Σ(y_i - ŷ_i)²
SSE measures the total squared vertical distance between the fitted straight line and the plotted points from our data.

Coefficient of Determination (r²) = (SSM - SSE) / SSM
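
The slope, intercept, and coefficient of determination can be computed in a few lines. The sketch below follows the formulas in this section; the function name and the five data points are made up for illustration.

```python
def least_squares_fit(xs, ys):
    """Return (alpha, beta, r_squared) for the line y_hat = alpha + beta * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Slope: sum of cross-deviations over sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar

    # SSM: squared deviations about the mean; SSE: squared residuals.
    ssm = sum((y - y_bar) ** 2 for y in ys)
    sse = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = (ssm - sse) / ssm
    return alpha, beta, r_squared


# Illustrative data (assumption for this example only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
alpha, beta, r_squared = least_squares_fit(xs, ys)
print(alpha, beta, r_squared)
```

For this data the fit is approximately ŷ = 2.2 + 0.6x with r² ≈ 0.6, so the line accounts for about 60% of the variation in y about its mean.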

Large Sample Condition Requirement:

1. A random sample is selected from the target population.
2. The sample size n is large (i.e., n ≥ 30). (Due to the Central Limit Theorem, this condition guarantees that the test statistic will be approximately normal regardless of the shape of the underlying probability distribution of the population.)