More about Data Science and Statistics
- Normality Tests
- Equality of variances
- Linear Correlation
- T-test
- ANOVA
- Permutation tests
- Bootstrapping
- Fitting probability distributions
- Kernel Density Estimation (KDE)
- Kolmogorov-Smirnov Test
- Cramer-Von Mises Test
Introduction
Normality tests aim to determine whether the available data could come from a population with a normal distribution. There are three main strategies for this analysis:
- Graphical representations
- Analytical methods
- Hypothesis tests
One of the most commonly used examples when discussing random variables that follow a normal distribution is human height. This statement is not arbitrary; processes whose result is the sum of many small interactions tend to converge to a normal distribution. A person's height is the result of thousands of factors that add to each other, conditioning growth.
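This convergence is the central limit theorem at work. As a quick illustration (a synthetic simulation, not real height data), summing many small independent contributions produces an approximately normal result even when each contribution is far from normal:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Each simulated "individual" is the sum of 1,000 small independent
# contributions drawn from a uniform distribution (an assumption for
# illustration; real growth factors are neither uniform nor independent).
samples = rng.uniform(0, 1, size=(10_000, 1_000)).sum(axis=1)

# Despite the uniform building blocks, the sums are close to normal:
# skewness and excess kurtosis are both near 0.
print('skewness:', stats.skew(samples))
print('excess kurtosis:', stats.kurtosis(samples))
```

The same behavior appears with almost any distribution of the individual contributions, which is why the normal distribution shows up so often in additive natural processes.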
Throughout this document, we show how to use these strategies to determine whether the weight of a group of people follows a normal distribution.
Libraries
The libraries used in this document are:
# Data processing
# ==============================================================================
import pandas as pd
import numpy as np
# Graphics
# ==============================================================================
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
plt.rcParams.update({'font.size': 10})
# Preprocessing and analysis
# ==============================================================================
import statsmodels.api as sm
from scipy import stats
# Warnings configuration
# ==============================================================================
import warnings
warnings.filterwarnings('once')
Data
The data used in this example have been obtained from the book Statistical Rethinking by Richard McElreath. The dataset contains information collected by Nancy Howell in the late 1960s about the !Kung San people, who live in the Kalahari Desert between Botswana, Namibia, and Angola.
# Data
# ==============================================================================
url = ('https://raw.githubusercontent.com/JoaquinAmatRodrigo/' +
'Estadistica-machine-learning-python/master/data/Howell1.csv')
data = pd.read_csv(url)
print(data.info())
data.head(4)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 544 entries, 0 to 543
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   height  544 non-null    float64
 1   weight  544 non-null    float64
 2   age     544 non-null    float64
 3   male    544 non-null    int64
dtypes: float64(3), int64(1)
memory usage: 17.1 KB
None
| | height | weight | age | male |
|---|---|---|---|---|
| 0 | 151.765 | 47.825606 | 63.0 | 1 |
| 1 | 139.700 | 36.485807 | 63.0 | 0 |
| 2 | 136.525 | 31.864838 | 65.0 | 0 |
| 3 | 156.845 | 53.041914 | 41.0 | 1 |
From all available data, only women older than 15 years are selected.
data = data[(data.age > 15) & (data.male == 0)]
weight = data['weight']
Graphical methods
One of the most commonly used graphical methods for normality analysis consists of representing the data using a histogram and overlaying the curve of a normal distribution with the same mean and standard deviation as the available data.
# Histogram + theoretical normal curve
# ==============================================================================
# Mean (mu) and standard deviation (sigma) values of the data
mu, sigma = stats.norm.fit(weight)
# Theoretical values of the normal in the observed range
x_hat = np.linspace(min(weight), max(weight), num=100)
y_hat = stats.norm.pdf(x_hat, mu, sigma)
# Plot
fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(x_hat, y_hat, linewidth=2, color='firebrick', label='normal')
ax.hist(x=weight, density=True, bins=30, color="#3182bd", alpha=0.5)
ax.plot(weight, np.full_like(weight, -0.01), '|k', markeredgewidth=1)
ax.set_title('Weight distribution of women older than 15 years')
ax.set_xlabel('weight')
ax.set_ylabel('Probability density')
ax.legend();
Another frequently used representation is the quantile-quantile plot (Q-Q plot). These plots compare the quantiles of the observed distribution with the theoretical quantiles of a normal distribution with the same mean and standard deviation as the data. The closer the data are to a normal distribution, the more aligned the points are around the line.
# Q-Q plot
# ==============================================================================
fig, ax = plt.subplots(figsize=(6, 3))
sm.qqplot(
data = weight,
fit = True,
line = 'q',
alpha = 0.4,
ax = ax
)
ref_line = ax.lines[1]
ref_line.set_color('black')
ref_line.set_linewidth(1.5)
ax.set_title('Q-Q plot of weight of women older than 15 years');
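To make explicit what a Q-Q plot computes under the hood, the following sketch builds the quantile pairs by hand for synthetic, normally distributed data (the sample and its parameters are assumptions for illustration, not the Howell data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=45, scale=6, size=300)  # synthetic "weights"

# Plotting positions: the probability level assigned to each ordered value
probs = (np.arange(1, len(sample) + 1) - 0.5) / len(sample)

# Sample quantiles are simply the sorted observations
sample_q = np.sort(sample)

# Theoretical quantiles of a normal with the sample's mean and std
theo_q = stats.norm.ppf(probs, loc=sample.mean(), scale=sample.std(ddof=1))

# For normal data the pairs (theo_q, sample_q) lie close to a straight
# line, so their linear correlation is close to 1.
corr = np.corrcoef(theo_q, sample_q)[0, 1]
print('correlation between quantiles:', corr)
```

Plotting `sample_q` against `theo_q` reproduces the scatter that `sm.qqplot` draws; the reference line connects selected quantile pairs.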
Analytical methods: skewness and kurtosis
The statistics of skewness and kurtosis can be used to detect deviations from normality. The following are some rules of thumb:
| Range of Skewness/Kurtosis | Interpretation | Action |
|---|---|---|
| -0.5 to +0.5 | Symmetric | Distribution is approximately normal; safe to use parametric tests |
| -1 to -0.5 or +0.5 to +1 | Moderately Skewed | Slight deviation; usually acceptable for most analyses |
| -2 to -1 or +1 to +2 | Highly Skewed | Evident deviation; acceptable for some robust tests, but proceed with caution |
| <-2 or >+2 | Extreme | Substantial non-normality; consider transforming data or using non-parametric tests |
⚠️ Warning
These rules of thumb apply specifically to Excess Kurtosis, where a perfectly normal distribution has a value of 0. This is the default metric used in most major statistical software, including SPSS, Excel, and Python (SciPy). However, be aware that some software (such as Stata) reports "Raw Kurtosis" where a normal distribution has a baseline value of 3. If your output reports Raw Kurtosis, you must subtract 3 from the reported value. Skewness generally defaults to 0 for a normal distribution across almost all major software packages, so no conversion is usually necessary.
print('Kurtosis:', stats.kurtosis(weight))
print('Skewness:', stats.skew(weight))
Kurtosis: 0.05524614843093856
Skewness: 0.032122514283202334
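Both values fall in the "approximately normal" range of the table above. Regarding the kurtosis conventions mentioned in the warning, SciPy's `stats.kurtosis` returns excess (Fisher) kurtosis by default, and its `fisher=False` argument switches to the raw (Pearson) definition. A quick sanity check on synthetic normal data shows the exact offset of 3 between the two:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)  # synthetic normal sample

excess = stats.kurtosis(x)             # Fisher definition (default): ~0 for a normal
raw = stats.kurtosis(x, fisher=False)  # Pearson definition: ~3 for a normal

# The two conventions differ by exactly 3
print('raw - excess =', raw - excess)
```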
Hypothesis tests
The Shapiro-Wilk test and D'Agostino's K-squared test are two of the most commonly used hypothesis tests to analyze normality. In both, the null hypothesis is that the data come from a normal distribution.
The p-value of these tests indicates the probability of obtaining data at least as extreme as those observed if they truly came from a normal population with the same mean and standard deviation as the sample. Therefore, if the p-value is below the chosen significance level (typically 0.05), there is sufficient evidence to reject normality.
The Shapiro-Wilk test is generally recommended only for small samples (fewer than about 50 observations), since with larger samples its high sensitivity makes it flag even small, practically irrelevant deviations from normality.
# Shapiro-Wilk test
# ==============================================================================
shapiro_test = stats.shapiro(weight)
shapiro_test
ShapiroResult(statistic=np.float64(0.9963739538348422), pvalue=np.float64(0.924083667304126))
# D'Agostino's K-squared test
# ==============================================================================
k2, p_value = stats.normaltest(weight)
print(f"Statistic = {k2}, p-value = {p_value}")
Statistic = 0.19896549779904893, p-value = 0.9053055672511008
Neither test shows evidence to reject the hypothesis that the data are normally distributed (p-value very close to 1).
When these tests are used to verify the conditions of parametric methods, for example a t-test or an ANOVA, it is important to keep in mind that, like all hypothesis tests, they gain statistical power as the sample size grows, so it becomes easier to find evidence against the null hypothesis of normality. At the same time, the larger the sample size, the less sensitive parametric methods are to departures from normality. For this reason, conclusions should not be based solely on the p-value of the test, but should also take into account the graphical representation and the sample size.
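A small synthetic experiment illustrates this effect (the t distribution and the sample sizes are arbitrary choices for illustration). Both samples come from the same mildly non-normal distribution; only the sample size differs:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Mildly non-normal data: Student's t with 20 degrees of freedom
# (excess kurtosis of about 0.375, visually very close to a normal)
small = rng.standard_t(df=20, size=50)
large = rng.standard_t(df=20, size=5000)

# Same kind of deviation in both samples; the test typically fails to
# detect it in the small sample, while the large sample gives it far
# more power and tends to produce a much smaller p-value.
print('n=50   p-value:', stats.shapiro(small).pvalue)
print('n=5000 p-value:', stats.shapiro(large).pvalue)
```

Whether the large-sample p-value crosses 0.05 depends on the random draw, but across repeated runs it is systematically smaller than the small-sample one, even though the practical deviation from normality is identical.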
Consequences of lack of normality
The inability to assume normality primarily affects parametric hypothesis tests (t-test, ANOVA, ...) and regression models. The main consequences of lack of normality are:
- Least squares estimators are no longer efficient (of minimum variance).
- Confidence intervals for model parameters and significance tests are only approximate, not exact.
The statistical tests presented require that the population from which the sample comes have a normal distribution, not the sample itself. If the sample is normally distributed, it can be accepted that the population of origin is as well. Conversely, if the sample is not normally distributed but there is certainty that the population of origin is, it may still be justified to accept the results of the parametric tests as valid.
Session information
import session_info
session_info.show(html=False)
-----
matplotlib          3.10.8
numpy               2.2.6
pandas              2.3.3
scipy               1.15.3
session_info        v1.0.1
statsmodels         0.14.6
-----
IPython             9.8.0
jupyter_client      8.7.0
jupyter_core        5.9.1
-----
Python 3.13.11 | packaged by Anaconda, Inc. | (main, Dec 10 2025, 21:28:48) [GCC 14.3.0]
Linux-6.14.0-37-generic-x86_64-with-glibc2.39
-----
Session information updated at 2026-01-14 13:07
Bibliography
OpenIntro Statistics: Fourth Edition by David Diez, Mine Çetinkaya-Rundel, Christopher Barr
Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52(3/4), 591-611
D'Agostino, R. B. (1971), "An omnibus test of normality for moderate and large sample size", Biometrika, 58, 341-348
D'Agostino, R. and Pearson, E. S. (1973), "Tests for departure from normality", Biometrika, 60, 613-622
https://www.itl.nist.gov/div898/handbook/prc/section2/prc213.htm
Citation instructions
How to cite this document?
If you use this document or any part of it, we appreciate you citing it. Thank you very much!
Normality analysis with Python by Joaquín Amat Rodrigo, available under an Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0 DEED) license at https://cienciadedatos.net/documentos/pystats06-normality-tests-python.html
Did you like the article? Your help is important
Your contribution will help me continue generating free educational content. Thank you very much! 😊
This document created by Joaquín Amat Rodrigo is licensed under Attribution-NonCommercial-ShareAlike 4.0 International.
Allowed:
- Share: copy and redistribute the material in any medium or format.
- Adapt: remix, transform, and build upon the material.
Under the following terms:
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NonCommercial: You may not use the material for commercial purposes.
- ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
