Eight commonly used probability distributions and their implementation in Python

1. Uniform distribution

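1.1 Discrete uniform distribution

Each of the n possible values is equally likely: P(X = x) = 1/n. A fair six-sided die is the standard example.

1.2 Continuous uniform distribution

Every point of the interval [a, b] has the same density: f(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise.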

1.3 Python code

import numpy as np  
import matplotlib.pyplot as plt 
from scipy import stats 
 
# continuous uniform on [a, b]
a = 0
b = 50
size = 5000

X_continuous = np.linspace(a, b, size)
# scipy describes the interval as [loc, loc + scale], so scale is the width b - a
continuous_uniform = stats.uniform(loc=a, scale=b - a)
continuous_uniform_pdf = continuous_uniform.pdf(X_continuous)

# discrete uniform on 1..6 (stats.randint excludes the upper bound)
X_discrete = np.arange(1, 7)
discrete_uniform = stats.randint(1, 7)
discrete_uniform_pmf = discrete_uniform.pmf(X_discrete)

# plot both distributions side by side
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(15,5)) 
# discrete plot 
ax[0].bar(X_discrete, discrete_uniform_pmf) 
ax[0].set_xlabel("X") 
ax[0].set_ylabel("Probability") 
ax[0].set_title("Discrete Uniform Distribution") 
# continuous plot 
ax[1].plot(X_continuous, continuous_uniform_pdf) 
ax[1].set_xlabel("X") 
ax[1].set_ylabel("Probability density")
ax[1].set_title("Continuous Uniform Distribution") 
plt.show()
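
As a quick sanity check on the two distributions plotted above, the following sketch compares the mean and variance reported by SciPy with the closed-form values. The variable names (`low`, `high`, `cont_mean`, and so on) are illustrative choices, not part of the original code.

```python
import numpy as np
from scipy import stats

a, b = 0, 50          # endpoints of the continuous interval
low, high = 1, 7      # stats.randint covers low, ..., high - 1 (a six-sided die)

# scipy describes the continuous uniform as [loc, loc + scale]
continuous_uniform = stats.uniform(loc=a, scale=b - a)
discrete_uniform = stats.randint(low, high)

# closed-form moments of the continuous uniform on [a, b]
cont_mean = (a + b) / 2
cont_var = (b - a) ** 2 / 12

# closed-form moments of the discrete uniform on n consecutive integers
n_values = high - low
disc_mean = (low + high - 1) / 2
disc_var = (n_values ** 2 - 1) / 12

print("continuous:", continuous_uniform.mean(), cont_mean, continuous_uniform.var(), cont_var)
print("discrete:  ", discrete_uniform.mean(), disc_mean, discrete_uniform.var(), disc_var)
```

Both pairs should agree up to floating-point rounding.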

2. Gaussian (normal) distribution

The probability density function is f(x) = 1/(σ√(2π)) · exp(−(x − μ)² / (2σ²)), where σ is the standard deviation and μ is the mean of the distribution. Note that in a normal distribution, the mean, mode, and median are all equal.
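
To make the role of μ and σ concrete, here is a minimal sketch that writes the normal density out by hand and compares it with `stats.norm.pdf`. The helper `normal_pdf` and the values μ = 2, σ = 1.5 are assumptions chosen for illustration. It also checks that the mean, the median, and the peak of the curve coincide.

```python
import numpy as np
from scipy import stats

def normal_pdf(x, mu, sigma):
    """Normal density written out from its closed form."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

mu, sigma = 2.0, 1.5   # illustrative parameters
x = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 801)

manual = normal_pdf(x, mu, sigma)
reference = stats.norm.pdf(x, loc=mu, scale=sigma)
max_abs_diff = np.max(np.abs(manual - reference))

dist = stats.norm(loc=mu, scale=sigma)
mode_on_grid = x[np.argmax(manual)]   # grid point where the density peaks

print("largest difference from scipy:", max_abs_diff)
print("mean, median, mode:", dist.mean(), dist.median(), mode_on_grid)
```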

Python code

mu = 0 
variance = 1 
sigma = np.sqrt(variance) 
x = np.linspace(mu - 3*sigma, mu + 3*sigma, 100) 
 
plt.subplots(figsize=(8, 5)) 
plt.plot(x, stats.norm.pdf(x, mu, sigma)) 
plt.title("Normal Distribution") 
plt.show()

3. Lognormal distribution

The lognormal distribution is a continuous probability distribution for a random variable whose logarithm is normally distributed. Therefore, if the random variable X is lognormally distributed, then Y = ln(X) has a normal distribution.
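
The relationship Y = ln(X) can be checked numerically. The sketch below draws lognormal samples, takes their logarithm, and compares the sample mean and standard deviation with μ and σ. The parameter values and the fixed seed are assumptions made for the example; note that SciPy expects σ as the shape parameter `s` and exp(μ) as `scale`.

```python
import numpy as np
from scipy import stats

mu, sigma = 1.0, 0.5               # parameters of the underlying normal (illustrative)
rng = np.random.default_rng(0)     # fixed seed so the run is repeatable

# scipy's lognorm uses shape s = sigma and scale = exp(mu)
samples = stats.lognorm(s=sigma, scale=np.exp(mu)).rvs(size=100_000, random_state=rng)

# the logarithm of the samples should look normal with mean mu and std sigma
log_samples = np.log(samples)
log_mean = log_samples.mean()
log_std = log_samples.std()

print("mean of ln(X):", log_mean, "expected:", mu)
print("std of ln(X): ", log_std, "expected:", sigma)
```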

Python code

X = np.linspace(0, 6, 500)

fig, ax = plt.subplots(figsize=(8, 5))

# scipy's lognorm takes the shape s = σ and scale = exp(μ);
# loc only shifts the curve along the x-axis, so it stays at its default of 0
for mu, sigma in [(0, 1), (0, 0.5), (1, 1.5)]:
    lognorm_distribution = stats.lognorm(s=sigma, scale=np.exp(mu))
    lognorm_distribution_pdf = lognorm_distribution.pdf(X)
    ax.plot(X, lognorm_distribution_pdf, label=f"μ={mu}, σ={sigma}")

ax.set_xticks(np.arange(min(X), max(X)))
ax.set_title("Lognormal Distribution")
ax.legend()
plt.show()

4. Poisson distribution

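The Poisson distribution gives the probability that an event occurs k times within a fixed period of time, assuming the events occur independently at a constant average rate. Its probability mass function is P(X = k) = λ^k · e^(−λ) / k!

λ is the event rate in one unit of time and k = 0, 1, 2, ... is the number of occurrences.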
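
As a small illustration of how λ and k enter the probability, the sketch below evaluates the Poisson probability mass function by hand and compares it with `stats.poisson.pmf`. The helper name `poisson_pmf` is hypothetical; the values λ = 3 and k = 9 match the call used in the code of this section.

```python
import math
from scipy import stats

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with rate lam, from the closed form."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam, k = 3, 9
manual = poisson_pmf(k, lam)
reference = stats.poisson.pmf(k=k, mu=lam)   # scipy calls the rate "mu"

print("manual:", manual)
print("scipy: ", reference)
```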

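Python code

# probability of exactly 9 events when the average rate is 3 per interval
print(stats.poisson.pmf(k=9, mu=3))
X = stats.poisson.rvs(mu=3, size=500)

plt.subplots(figsize=(8, 5))
# centre one bin on each integer so every bar corresponds to a single count
plt.hist(X, bins=np.arange(X.max() + 2) - 0.5, density=True, edgecolor="black")
plt.title("Poisson Distribution")
plt.show()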

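For large λ the Poisson distribution has a curve similar to a normal distribution, with its peak near λ; for small λ it is noticeably right-skewed. λ is both the mean and the variance of the distribution.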
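
How close the shape is to a normal curve depends on λ. One rough way to see this, sketched below for a few arbitrarily chosen rates, is to look at the skewness reported by SciPy: a normal distribution has skewness 0, while the Poisson skewness equals 1/√λ and therefore shrinks as λ grows.

```python
import numpy as np
from scipy import stats

skewness = {}
for lam in (1, 3, 30, 300):   # illustrative rates
    mean, var, skew = stats.poisson.stats(mu=lam, moments="mvs")
    skewness[lam] = float(skew)
    print(f"lambda={lam:>3}: mean={float(mean):.1f}, variance={float(var):.1f}, "
          f"skewness={float(skew):.3f}, 1/sqrt(lambda)={1 / np.sqrt(lam):.3f}")
```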

5. Exponential distribution

The exponential distribution is the probability distribution of the time between events in a Poisson point process. The probability density function of the exponential distribution is as follows:

f(x; λ) = λ · e^(−λx) for x ≥ 0, and 0 otherwise

λ > 0 is the rate parameter and x is the value of the random variable.
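
The link to the Poisson process can be illustrated with a short simulation. The sketch below draws exponential waiting times with an assumed rate of λ = 3, accumulates them into arrival times, and counts the events in each unit-length window; those counts should have mean and variance close to λ, as a Poisson variable does. The seed and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 3.0            # assumed event rate per unit time
n_events = 200_000

# waiting times between consecutive events; numpy takes scale = 1 / lambda
gaps = rng.exponential(scale=1 / lam, size=n_events)
arrival_times = np.cumsum(gaps)

# count the events that fall in each complete unit-length window
n_windows = int(arrival_times[-1])
in_range = arrival_times[arrival_times < n_windows]
counts = np.bincount(in_range.astype(int), minlength=n_windows)

print("mean waiting time:", gaps.mean(), "expected:", 1 / lam)
print("events per window: mean", counts.mean(), "variance", counts.var(), "expected:", lam)
```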

Python code

X = np.linspace(0, 5, 5000)

# scipy uses scale = 1/λ, so scale=1 corresponds to λ = 1
exponential_distribution = stats.expon.pdf(X, loc=0, scale=1)

plt.subplots(figsize=(8,5))
plt.plot(X, exponential_distribution)
plt.title("Exponential Distribution")
plt.show()

6. Binomial distribution

The binomial distribution gives the probability of obtaining exactly x successes in n independent trials, each of which ends in either success or failure:

P(x) = C(n, x) · p^x · q^(n − x), where C(n, x) = n! / (x! (n − x)!)

  • P = binomial distribution probability

  • x = number of successes in n trials

  • p = probability of success in a single trial

  • q = 1 − p = probability of failure in a single trial

  • n = number of trials
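
The quantities in the list above combine into the binomial probability as sketched below. The helper `binomial_pmf` is a hypothetical name, and n = 10, p = 0.5 are illustrative values; the result is compared against `stats.binom.pmf`.

```python
from math import comb

import numpy as np
from scipy import stats

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p   # probability of failure in a single trial
    return comb(n, x) * p ** x * q ** (n - x)

n, p = 10, 0.5
xs = np.arange(n + 1)
manual = np.array([binomial_pmf(int(x), n, p) for x in xs])
reference = stats.binom.pmf(xs, n, p)

print("P(x = 5):", manual[5])
print("sum over all x:", manual.sum())
```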

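Python code

# n=10 trials per draw; n=1 would reduce to a Bernoulli distribution with only two bars
X = np.random.binomial(n=10, p=0.5, size=1000)

plt.subplots(figsize=(8, 5))
# one bin per possible number of successes, 0 through 10
plt.hist(X, bins=np.arange(12) - 0.5, edgecolor="black")
plt.title("Binomial Distribution")
plt.show()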

7. t distribution

The t-distribution is any member of the family of continuous probability distributions that arise when estimating the mean of a normally distributed population when the sample size is small and the population standard deviation is unknown.

n is a parameter called the "degrees of freedom", often abbreviated "dof". For higher values of n, the t-distribution is closer to a normal distribution.
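
The convergence toward the normal distribution can be quantified. The sketch below, using an arbitrary grid and a handful of illustrative degrees of freedom, measures the largest gap between the t density and the standard normal density.

```python
import numpy as np
from scipy import stats

x = np.linspace(-5, 5, 1001)
normal_pdf = stats.norm.pdf(x)

max_gap = {}
for df in (1, 3, 9, 30, 100):   # illustrative degrees of freedom
    # largest absolute difference between the two density curves on the grid
    max_gap[df] = float(np.max(np.abs(stats.t.pdf(x, df=df) - normal_pdf)))
    print(f"df={df:>3}: largest gap to the normal pdf = {max_gap[df]:.4f}")
```

The gap shrinks steadily as the degrees of freedom increase.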

Python code

# plot the exact densities; a KDE of only four samples per curve is too noisy to show the shape
X = np.linspace(-5, 5, 500)

plt.subplots(figsize=(8,5))
plt.plot(X, stats.t.pdf(X, df=1), label="1 d.o.f")
plt.plot(X, stats.t.pdf(X, df=3), label="3 d.o.f")
plt.plot(X, stats.t.pdf(X, df=9), label="9 d.o.f")
plt.plot(X, stats.norm.pdf(X), linestyle="--", label="normal")
plt.title("Student's t distribution")
plt.legend()
plt.show()

8. Chi-square distribution

The basic formula of the chi-square test, that is, the formula for χ², measures the deviation between the observed values and the theoretical values:

χ² = Σ (A_i − E_i)² / E_i = Σ (A_i − n·p_i)² / (n·p_i), with both sums running over i = 1, ..., k

A_i is the observed value, E_i is the theoretical value, and k is the number of observed values. The last form is the one used for the actual calculation: n is the total frequency and p_i is the theoretical probability, so n·p_i is the theoretical frequency (theoretical value).
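
A worked example may help here. The sketch below computes the statistic by hand for made-up die-roll counts (the `observed` array is invented for illustration, not data from the article) and compares it with `stats.chisquare`. The p-value is read from the chi-square distribution with k − 1 degrees of freedom, which is the usual reference distribution for this test.

```python
import numpy as np
from scipy import stats

observed = np.array([18, 22, 16, 25, 19, 20])   # A: made-up counts for the six faces
p = np.full(6, 1 / 6)                           # theoretical probability of each face
n = observed.sum()                              # total frequency
expected = n * p                                # E = n * p, the theoretical frequency

# sum of squared deviations, each scaled by its theoretical frequency
chi2_stat = np.sum((observed - expected) ** 2 / expected)

dof = len(observed) - 1                         # k - 1 degrees of freedom
p_value = stats.chi2.sf(chi2_stat, df=dof)      # upper-tail probability

reference = stats.chisquare(f_obs=observed, f_exp=expected)

print("manual statistic:", chi2_stat, "p-value:", p_value)
print("scipy statistic: ", reference.statistic, "p-value:", reference.pvalue)
```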

Python code

# start just above 0: the density for 1 degree of freedom is unbounded at x = 0
X = np.linspace(0.01, 6, 500)

plt.subplots(figsize=(8, 5))
plt.plot(X, stats.chi2.pdf(X, df=1), label="1 d.o.f")
plt.plot(X, stats.chi2.pdf(X, df=2), label="2 d.o.f")
plt.plot(X, stats.chi2.pdf(X, df=3), label="3 d.o.f")
plt.ylim(0, 1)
plt.title("Chi-squared Distribution")
plt.legend()
plt.show()

Origin blog.csdn.net/allein_STR/article/details/129507025