Introduction

Imagine you manage an e-commerce platform and want to sort your product catalog based on the quality perceived by users. You have a review system where customers can rate each product as "positive" or "negative". A first approach to ranking products is to calculate the percentage of positive reviews ($k/N$) and sort them from highest to lowest. However, this method has a major problem: it does not account for the statistical confidence provided by the sample size (number of reviews).

To illustrate this problem, consider the following three products with their respective reviews:

| Product | Positive reviews | Total reviews | % positive reviews |
|---|---|---|---|
| Bluetooth Headphones | 1 | 1 | 100% |
| Bestseller Novel | 4 | 5 | 80% |
| Gaming Laptop | 95 | 100 | 95% |

The simple percentage of positive reviews places the headphones with a single review ahead of a laptop with 95 positive reviews out of 100. Intuitively we know the laptop is a better product, but the simple percentage completely ignores the statistical confidence provided by the sample size.
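As a minimal sketch of the naive ranking described above, the snippet below sorts the three example products by $k/N$. The DataFrame name `toy` and its column names are illustrative choices, not part of the analysis that follows.

```python
import pandas as pd

# Three example products from the table above (column names are illustrative)
toy = pd.DataFrame({
    'product': ['Bluetooth Headphones', 'Bestseller Novel', 'Gaming Laptop'],
    'k': [1, 4, 95],     # positive reviews
    'N': [1, 5, 100],    # total reviews
})

# Naive score: fraction of positive reviews, ignoring the sample size
toy['pct_positive'] = toy['k'] / toy['N']

naive_ranking = (
    toy.sort_values('pct_positive', ascending=False)
    .reset_index(drop=True)
)
print(naive_ranking)
```

The single-review product ends up first because the score carries no information about how many reviews support it.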

This is a classic problem in statistics known as the problem of proportion estimation with small samples, and it appears in many real systems:

  • Ranking forum posts or social media content by number of "likes" or "upvotes".
  • Ranking restaurants in a city by number of positive reviews.
  • Ranking athletes by their success rate (baskets, serves, penalties, etc.).

This document studies and compares these four statistical approaches:

  • Bayesian Average (Empirical Bayes): Assumes every product is "average" until it proves otherwise with enough reviews.

  • Beta-Binomial Model: The theoretical basis of the Bayesian average, which also allows computing 95% credibility intervals.

  • Wilson Interval Lower Bound: Ranks by the reasonable worst-case performance of each product, not by its mean.

  • Minimum Threshold + Simple Percentage: Excludes from the ranking products with fewer than $N_{min}$ reviews.

Libraries

The libraries used in this document are:

# Data processing
# ==============================================================================
import numpy as np
import pandas as pd

# Statistics
# ==============================================================================
from scipy.stats import norm, beta

# Visualization
# ==============================================================================
import matplotlib.pyplot as plt
from IPython.display import display  # explicit import so display() also works outside notebooks

Statistical Foundations

The Problem with Simple Proportions

When a customer leaves a positive or negative review, we are dealing with a Bernoulli trial: the outcome is binary (positive = 1, negative = 0) with an unknown probability $p$ of being positive. The total number of positive reviews $k$ out of $N$ total reviews follows a Binomial distribution:

$$k \sim \text{Binomial}(N,\, p)$$

The most natural estimate of $p$ is the maximum likelihood estimator:

$$\hat{p}_{simple} = \frac{k}{N}$$

This estimator is unbiased and consistent, but presents a critical problem when $N$ is small: its variance is very high.

$$\text{Var}(\hat{p}) = \frac{p(1-p)}{N}$$

For a given $p$, this variance is largest at $N=1$ and shrinks only in proportion to $1/N$. With a single review, the estimator can only take the values 0 or 1, so it has no discriminative power.
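To make the effect of $N$ concrete, the following sketch evaluates the standard error $\sqrt{p(1-p)/N}$ for several sample sizes. The value `p_true = 0.8` and the list `N_values` are assumptions chosen only for illustration.

```python
import numpy as np

p_true = 0.8                        # assumed true positive rate (illustrative)
N_values = [1, 5, 20, 100, 400]     # illustrative sample sizes

# Standard error of the estimator k/N: sqrt(p(1-p)/N)
std_error = np.sqrt(p_true * (1 - p_true) / np.array(N_values))

for n, se in zip(N_values, std_error):
    print(f'N = {n:>3d}  ->  standard error of k/N = {se:.3f}')
```

Because the standard error falls with $\sqrt{N}$, going from 1 review to 100 reviews reduces it by a factor of 10.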

✏️ Note

The problem is not the calculation itself, but the uncertainty. A product with 1 positive review out of 1 total review has a 100% rate, but that single data point does not tell us whether its true probability of a positive review is 50%, 80%, or 99%. We need more data to reduce that uncertainty.

Bayesian Average (Empirical Bayes)

The Empirical Bayes solution starts from an intuitive idea: every new product is, in principle, as good as the catalog average, until its reviews prove otherwise. The formula adds pseudo-counts to each product based on the global mean:

$$\hat{p}_{bayes} = \frac{k + C \cdot \mu}{N + C}$$

Where the parameters are:

| Parameter | Description |
|---|---|
| $k$ | Number of positive reviews for the product |
| $N$ | Total number of reviews for the product |
| $\mu$ | Average percentage of positive reviews across the entire catalog (weighted global mean: $\sum k_i / \sum N_i$) |
| $C$ | Confidence constant: how much weight we give to the global average. Typically set to the mean of $N$ across the catalog |

Why does it work? The formula can be rewritten as a weighted average between the global mean and the product's observed rate:

$$\hat{p}_{bayes} = \underbrace{\frac{N}{N+C}}_{\text{weight of data}} \cdot \frac{k}{N} \;\;+\;\; \underbrace{\frac{C}{N+C}}_{\text{weight of prior}} \cdot \mu$$

When $N \ll C$, the prior weight dominates and the estimate is pulled toward $\mu$. When $N \gg C$, the data weight dominates and the estimate converges to $k/N$. The parameter $C$ therefore sets the number of reviews at which a product's own data and the catalog average carry equal weight: at $N = C$ both weights are exactly $1/2$, and beyond that point the system "trusts" the product's data more than the prior.
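The equivalence between the pseudo-count form and the weighted-average form can be checked numerically. This is a small sketch; the function names and the values `mu = 0.6` and `C = 50` are illustrative assumptions, not estimates from any catalog.

```python
def bayesian_average(k, N, mu, C):
    """Pseudo-count form: (k + C*mu) / (N + C)."""
    return (k + C * mu) / (N + C)


def shrinkage_form(k, N, mu, C):
    """Equivalent weighted average of the observed rate and the prior mean."""
    w_data = N / (N + C)      # weight of the product's own data
    w_prior = C / (N + C)     # weight of the global mean
    return w_data * (k / N) + w_prior * mu


mu, C = 0.6, 50   # illustrative values, not estimated from data

for k, N in [(1, 1), (40, 50), (950, 1000)]:
    pseudo = bayesian_average(k, N, mu, C)
    weighted = shrinkage_form(k, N, mu, C)
    print(f'N = {N:>4d} | data weight = {N / (N + C):.3f} | '
          f'pseudo-count form = {pseudo:.4f} | weighted form = {weighted:.4f}')
```

The middle case has $N = C$, so the data weight is exactly 0.5 and the estimate sits halfway between $k/N$ and $\mu$.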

⚠️ When NOT to use the Bayesian Average?

This method assumes that the global mean μ is a meaningful reference for all products. If the catalog is very heterogeneous (for example, mixing textbooks with electronics and clothing), the global mean loses significance as a prior. In that case, it is preferable to compute μ and C by product category.
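One possible way to compute the prior per category is a pandas `groupby` with `transform`, sketched below. The mini-catalog, the column names and the choice of the category mean of $N$ as $C$ are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical mini-catalog; names and counts are made up for illustration
catalog = pd.DataFrame({
    'product':  ['A', 'B', 'C', 'D'],
    'category': ['Books', 'Books', 'Electronics', 'Electronics'],
    'k':        [10, 40, 1, 30],    # positive reviews
    'N':        [10, 50, 4, 60],    # total reviews
})

grouped = catalog.groupby('category')
sum_k = grouped['k'].transform('sum')
sum_N = grouped['N'].transform('sum')
mean_N = grouped['N'].transform('mean')

# Category-level prior mean (weighted) and confidence constant (mean N)
catalog['mu_cat'] = sum_k / sum_N
catalog['C_cat'] = mean_N

# Bayesian Average using each product's own category as the prior
catalog['pct_bayesian_cat'] = (
    (catalog['k'] + catalog['C_cat'] * catalog['mu_cat'])
    / (catalog['N'] + catalog['C_cat'])
)
print(catalog)
```

Each product is now shrunk toward the mean of its own category rather than toward a single catalog-wide value.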

💡 Tip: choosing C

A practical and common choice is to use the average number of reviews per product across the entire catalog. This means a new product must accumulate at least as many reviews as the typical product before the system fully trusts its real data. Note that C is a hyperparameter: a larger C means more aggressive shrinkage toward the global mean, while a smaller C lets even products with few reviews rely on their own data.

Sensitivity of estimates to C: The following table shows how the Bayesian Average estimate of two products changes as $C$ varies. Notice how a larger $C$ forces both products closer to the global mean $\mu$, regardless of their observed rate.

# Sensitivity of Bayesian Average to hyperparameter C
# ==============================================================================
mu_global = 367 / 600  # weighted global mean for this catalog

products_sensitivity = {
    'Bluetooth Headphones': {'k': 1, 'N': 1},
    'Gaming Laptop': {'k': 95, 'N': 100},
}

C_values = [5, 10, 25, 50, 75, 150, 300]

rows = []
for C_val in C_values:
    row = {'C': C_val}
    for name, d in products_sensitivity.items():
        row[name] = (d['k'] + C_val * mu_global) / (d['N'] + C_val)
    rows.append(row)

sensitivity_df = pd.DataFrame(rows).set_index('C')
sensitivity_df.loc['mu (prior)'] = {name: mu_global for name in products_sensitivity}
sensitivity_df.style.format('{:.3f}').set_caption(
    'Bayesian Average by C value (last row = global mean μ used as prior)'
)
Bayesian Average by C value (last row = global mean μ used as prior)
  Bluetooth Headphones Gaming Laptop
C    
5 0.676 0.934
10 0.647 0.919
25 0.627 0.882
50 0.619 0.837
75 0.617 0.805
150 0.614 0.747
300 0.613 0.696
mu (prior) 0.612 0.612

Connection to the Beta-Binomial Model

The Bayesian Average formula is not arbitrary: it is exactly the mean of the posterior distribution of a Bayesian model with a Beta prior and Binomial likelihood.

If we assume a prior $p \sim \text{Beta}(\alpha_0, \beta_0)$ with:

$$\alpha_0 = C \cdot \mu \qquad \beta_0 = C \cdot (1 - \mu)$$

After observing $k$ positive reviews out of $N$ total, the posterior distribution is:

$$p \mid k,N \sim \text{Beta}(\alpha_0 + k,\; \beta_0 + N - k)$$

The mean of this posterior distribution is exactly:

$$\mathbb{E}[p \mid k, N] = \frac{\alpha_0 + k}{\alpha_0 + \beta_0 + N} = \frac{k + C \cdot \mu}{N + C}$$

This is exactly the Bayesian Average formula. An additional advantage of knowing the full posterior distribution is that 95% credibility intervals can be computed analytically with scipy.stats.beta.ppf, without the need for simulation.

$$\text{CI}_{95\%} = \left[\text{Beta}^{-1}(0.025,\; \alpha_{post},\; \beta_{post}),\; \text{Beta}^{-1}(0.975,\; \alpha_{post},\; \beta_{post})\right]$$

✏️ Note: Why is PyMC not needed?

PyMC is a probabilistic programming library that uses MCMC simulation to approximate complex posterior distributions. For this problem, Beta-Binomial conjugacy gives us an exact analytical solution: the posterior distribution has a known closed form. No simulation is needed; it is enough to compute $\alpha_{post}$ and $\beta_{post}$ and evaluate the distribution function with scipy.stats.beta.
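The closed-form posterior can be sketched for a single product as follows. The helper name `beta_posterior_summary` and the values `mu = 0.6`, `C = 20`, `k = 8`, `N = 9` are hypothetical and serve only to illustrate the calculation.

```python
from scipy.stats import beta


def beta_posterior_summary(k, N, mu, C, level=0.95):
    """Posterior mean and equal-tailed interval for a Beta(C*mu, C*(1-mu)) prior."""
    alpha_post = C * mu + k
    beta_post = C * (1 - mu) + (N - k)
    tail = (1 - level) / 2
    posterior = beta(alpha_post, beta_post)   # frozen Beta distribution
    return {
        'mean': posterior.mean(),
        'lower': posterior.ppf(tail),
        'upper': posterior.ppf(1 - tail),
    }


# Illustrative values, not estimated from any catalog
summary = beta_posterior_summary(k=8, N=9, mu=0.6, C=20)
bayes_avg = (8 + 20 * 0.6) / (9 + 20)

print(summary)
print(f'Bayesian Average: {bayes_avg:.6f}')
```

The posterior mean returned by `scipy.stats.beta` coincides with the Bayesian Average computed from the pseudo-count formula.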

Wilson Interval Lower Bound

The Wilson interval is a confidence interval for proportions that remains valid even for small $N$, unlike the Wald interval (the classic normal approximation $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/N}$), which can produce values outside $[0, 1]$ and whose coverage degrades severely with few observations. Rather than ranking by the point estimate $k/N$, this method ranks by the lower bound of the interval: the most pessimistic value of $p$ that is still statistically compatible with the observed data at a given confidence level.

The lower bound of the Wilson interval at confidence level $1-\alpha$ is:

$$w^- = \frac{\hat{p} + \frac{z^2}{2N} - z\sqrt{\frac{\hat{p}(1-\hat{p})}{N} + \frac{z^2}{4N^2}}}{1 + \frac{z^2}{N}}$$

Where $\hat{p} = k/N$ and $z = z_{1-\alpha/2}$ is the quantile of the standard normal distribution (e.g., $z = 1.96$ for a 95% confidence level).

Intuitive example:

  • Bluetooth Headphones (1/1, 100%): high uncertainty → its lower bound is approximately 21%.
  • Gaming Laptop (95/100, 95%): low uncertainty → its lower bound is approximately 89%.
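The lower bound formula translates directly into a small scalar function. This is a sketch under the assumption that $N \geq 1$; the function name `wilson_lower_bound` is an illustrative choice.

```python
from math import sqrt

from scipy.stats import norm


def wilson_lower_bound(k, N, confidence=0.95):
    """Lower bound of the Wilson score interval for k successes in N trials."""
    if N <= 0:
        raise ValueError('N must be positive; the bound is undefined without reviews')
    z = norm.ppf(1 - (1 - confidence) / 2)   # two-sided normal quantile
    p_hat = k / N
    centre = p_hat + z**2 / (2 * N)
    margin = z * sqrt(p_hat * (1 - p_hat) / N + z**2 / (4 * N**2))
    return (centre - margin) / (1 + z**2 / N)


# The two products from the intuitive example
for name, k, N in [('Bluetooth Headphones', 1, 1), ('Gaming Laptop', 95, 100)]:
    print(f'{name}: k/N = {k / N:.2f}, Wilson lower bound = {wilson_lower_bound(k, N):.4f}')
```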

✏️ Note

It is a conservative approach: to move up the ranking, a product must prove it is good by accumulating reviews, not just get lucky with the first ones.

✏️ Note: Why not the Wald interval?

The classic Wald interval $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/N}$ has two failure modes with small samples: (1) it can produce bounds outside $[0, 1]$ or collapse to a single point (e.g., a product with 1 positive out of 3 reviews gets a lower bound of about $-0.20$, and a product with 0 positives out of 3 gets the zero-width interval $[0, 0]$), and (2) its actual coverage probability can be far below the nominal level (e.g., a nominal 95% interval may achieve only 85% actual coverage). Brown, Cai & DasGupta (2001) showed that the Wilson interval provides substantially better coverage properties over a wide range of $p$ and $N$. This is a main reason why ranking systems often prefer Wilson to the simpler Wald formula.
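Both failure modes of the Wald interval can be checked with a short calculation. The sketch below computes the Wald bounds for 1 positive out of 3 reviews and then the exact coverage of both intervals by summing Binomial probabilities over all outcomes. The function names and the values `p = 0.1`, `N = 10` are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import binom, norm

z = norm.ppf(0.975)   # quantile for a nominal 95% level


def wald_interval(k, N):
    p_hat = k / N
    half = z * np.sqrt(p_hat * (1 - p_hat) / N)
    return p_hat - half, p_hat + half


def wilson_interval(k, N):
    p_hat = k / N
    centre = p_hat + z**2 / (2 * N)
    half = z * np.sqrt(p_hat * (1 - p_hat) / N + z**2 / (4 * N**2))
    denom = 1 + z**2 / N
    return (centre - half) / denom, (centre + half) / denom


def exact_coverage(interval_fn, p, N):
    """Probability that the interval contains p, summed over all outcomes k."""
    k = np.arange(N + 1)
    lower, upper = interval_fn(k, N)
    covers = (lower <= p) & (p <= upper)
    return binom.pmf(k, N, p)[covers].sum()


# Failure mode 1: bounds outside [0, 1]
print('Wald interval for 1/3:', wald_interval(1, 3))

# Failure mode 2: coverage below the nominal level (illustrative p and N)
p, N = 0.1, 10
wald_cov = exact_coverage(wald_interval, p, N)
wilson_cov = exact_coverage(wilson_interval, p, N)
print(f'Wald coverage:   {wald_cov:.3f}')
print(f'Wilson coverage: {wilson_cov:.3f}')
```

For this particular choice of $p$ and $N$, the Wald lower bound for 1/3 is negative and the Wald coverage falls well below the nominal 95%, while the Wilson coverage stays much closer to it.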

Minimum Threshold + Simple Percentage

The simplest approach consists of:

  1. Define a threshold $N_{min}$ (for example, 10 reviews).
  2. Exclude from the ranking all products with $N < N_{min}$. They are shown as "not enough reviews".
  3. For products that meet the threshold, sort by $\hat{p}_{simple} = k/N$.

This approach is easy to implement and communicate to users. Its main limitation is the hard boundary: a product with 9 reviews is treated identically to one with 0 reviews, while a product with 10 mediocre reviews enters the ranking ahead of a product with 9 excellent ones. The choice of $N_{min}$ is also arbitrary — common values in production systems range from 5 to 50 depending on the category — and it typically requires business context to justify.

The Bayesian and Wilson approaches avoid this binary exclusion by applying a soft penalty: all products appear in the ranking, but those with few reviews are automatically pushed down by the conservative estimate.
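The difference between the hard boundary and a soft penalty can be illustrated with two hypothetical products on either side of the threshold. The prior parameters `mu = 0.6` and `C = 20` and the review counts are assumptions made only for this sketch.

```python
mu, C = 0.6, 20      # illustrative prior parameters, not estimated from data
N_min = 10           # illustrative threshold

# Two hypothetical products on either side of the threshold: (k, N)
products = {
    'nine excellent reviews': (9, 9),
    'ten mediocre reviews': (6, 10),
}

scores = {}
for name, (k, N) in products.items():
    scores[name] = {
        'in_threshold_ranking': N >= N_min,         # hard exclusion
        'pct_simple': k / N,
        'pct_bayesian': (k + C * mu) / (N + C),     # soft penalty
    }
    print(name, scores[name])
```

The threshold drops the product with nine excellent reviews entirely, whereas the Bayesian Average keeps it in the ranking and still places it above the product with ten mediocre reviews.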

Example

Data

A catalog of 8 products is created to illustrate the behavior of each method.

# Data
# ==============================================================================
data = {
    'product': [
        'Bluetooth Headphones',  # 1/1: new and perfect
        'Gaming Laptop',         # 95/100: excellent veteran
        'Phone Case',            # 200/400: mediocre with long history
        'Bestseller Novel',      # 4/5: few but very good reviews
        'Mechanical Keyboard',   # 2/5: few mixed reviews
        'Office Chair',          # 8/9: just below the 10-review threshold
        'Running Shoes',         # 45/60: good with mid-length history
        'Power Bank',            # 12/20: mediocre with sufficient history
    ],
    'category': [
        'Electronics', 'Electronics', 'Accessories', 'Books',
        'Electronics', 'Furniture', 'Fashion', 'Electronics',
    ],
    'positive_reviews': [1, 95, 200, 4, 2, 8, 45, 12],
    'total_reviews':    [1, 100, 400, 5, 5, 9, 60, 20],
}

data = pd.DataFrame(data)
data
product category positive_reviews total_reviews
0 Bluetooth Headphones Electronics 1 1
1 Gaming Laptop Electronics 95 100
2 Phone Case Accessories 200 400
3 Bestseller Novel Books 4 5
4 Mechanical Keyboard Electronics 2 5
5 Office Chair Furniture 8 9
6 Running Shoes Fashion 45 60
7 Power Bank Electronics 12 20

Simple Percentage

# Simple percentage
# ==============================================================================
data['pct_simple'] = data['positive_reviews'] / data['total_reviews']

ranking_simple = (
    data[['product', 'positive_reviews', 'total_reviews', 'pct_simple']]
    .sort_values('pct_simple', ascending=False)
    .reset_index(drop=True)
)
ranking_simple
product positive_reviews total_reviews pct_simple
0 Bluetooth Headphones 1 1 1.000000
1 Gaming Laptop 95 100 0.950000
2 Office Chair 8 9 0.888889
3 Bestseller Novel 4 5 0.800000
4 Running Shoes 45 60 0.750000
5 Power Bank 12 20 0.600000
6 Phone Case 200 400 0.500000
7 Mechanical Keyboard 2 5 0.400000

With the simple percentage, the Bluetooth Headphones (1/1) rank first with 100%, ahead of the Gaming Laptop (95/100). The Office Chair (8/9, ~89%) also ranks near the top, in third place, despite having very few reviews.

Bayesian Average

The first step is to compute the global catalog parameters:

  • $\mu$: the average percentage of positive reviews across the entire catalog.

  • $C$: the confidence constant, chosen as the mean number of reviews per product.

# Global catalog parameters
# ==============================================================================
mu = data['positive_reviews'].sum() / data['total_reviews'].sum()
C = data['total_reviews'].mean()
print(f'Global average (mu): {mu}')
print(f'Confidence constant (C): {C}')
Global average (mu): 0.6116666666666667
Confidence constant (C): 75.0

✏️ Note: weighted mean vs. arithmetic mean

$\mu$ is computed as the weighted global mean $\sum k_i / \sum N_i$, not as the arithmetic mean of the individual proportions, $\text{mean}(k_i / N_i)$. The two generally differ: in this catalog the weighted mean is about 0.612, while the arithmetic mean of the eight proportions is about 0.736. The weighted version gives more influence to products with more reviews, which is appropriate: it reflects the true proportion of positive reviews across all user interactions in the catalog. The arithmetic mean would give a product with 1 review the same influence as a product with 10,000 reviews, distorting the prior.
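The two definitions of the mean can be compared directly on the review counts of this catalog. The variable names below are illustrative; the counts are the ones defined in the data section.

```python
import numpy as np

# Review counts of the 8-product catalog used in this example
positive = np.array([1, 95, 200, 4, 2, 8, 45, 12])
total = np.array([1, 100, 400, 5, 5, 9, 60, 20])

# Weighted global mean: every review counts equally
mu_weighted = positive.sum() / total.sum()

# Arithmetic mean of proportions: every product counts equally
mu_arithmetic = (positive / total).mean()

print(f'Weighted mean   (sum k / sum N): {mu_weighted:.4f}')
print(f'Arithmetic mean (mean of k/N):   {mu_arithmetic:.4f}')
```

The arithmetic mean is pulled upward by the products with very few reviews and high observed rates, which is exactly the kind of noise the prior is meant to dampen.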

The next step is to apply the Bayesian Average formula to each product.

# Bayesian Average
# ==============================================================================
k = data['positive_reviews']
N = data['total_reviews']

data['pct_bayesian'] = (k + C * mu) / (N + C)

ranking_bayesian = (
    data[['product', 'positive_reviews', 'total_reviews', 'pct_simple', 'pct_bayesian']]
    .sort_values('pct_bayesian', ascending=False)
    .reset_index(drop=True)
)
ranking_bayesian
product positive_reviews total_reviews pct_simple pct_bayesian
0 Gaming Laptop 95 100 0.950000 0.805000
1 Running Shoes 45 60 0.750000 0.673148
2 Office Chair 8 9 0.888889 0.641369
3 Bestseller Novel 4 5 0.800000 0.623437
4 Bluetooth Headphones 1 1 1.000000 0.616776
5 Power Bank 12 20 0.600000 0.609211
6 Mechanical Keyboard 2 5 0.400000 0.598437
7 Phone Case 200 400 0.500000 0.517632

The Bayesian Average corrects the ranking as expected. The Gaming Laptop rises to first place: with 100 reviews ($N > C$), its observed rate (95%) carries more weight than the prior. The Bluetooth Headphones drop substantially: with only 1 review ($N \ll C$), their estimate is almost entirely determined by the prior $\mu$, not by the single positive review. This is the shrinkage effect of Bayesian estimation. The Phone Case illustrates the opposite regime: with 400 reviews (more than five times $C$), the prior carries only about 16% of the weight ($C/(N+C) = 75/475$) and the estimate (~52%) stays close to its raw proportion (50%), which is well below the catalog average.

Beta-Binomial Model: 95% Credibility Interval

The Beta-Binomial model allows us to compute 95% credibility intervals for each product. We sort the ranking by the posterior mean (which coincides with the Bayesian Average), but also display the full interval.

# Beta prior parameters
# ==============================================================================
alpha_0 = C * mu
beta_0 = C * (1 - mu)

# Posterior distribution parameters per product
# ==============================================================================
alpha_post = alpha_0 + k
beta_post = beta_0 + (N - k)

# 95% credibility interval
# ==============================================================================
data['cred_95_lower'] = beta.ppf(0.025, alpha_post, beta_post)
data['cred_95_upper'] = beta.ppf(0.975, alpha_post, beta_post)

cols_beta = ['product', 'total_reviews', 'pct_simple', 'pct_bayesian', 'cred_95_lower', 'cred_95_upper']
(
    data[cols_beta]
    .sort_values('pct_bayesian', ascending=False)
    .reset_index(drop=True)
)
product total_reviews pct_simple pct_bayesian cred_95_lower cred_95_upper
0 Gaming Laptop 100 0.950000 0.805000 0.743307 0.860095
1 Running Shoes 60 0.750000 0.673148 0.592037 0.749399
2 Office Chair 9 0.888889 0.641369 0.536522 0.739839
3 Bestseller Novel 5 0.800000 0.623437 0.515312 0.725716
4 Bluetooth Headphones 1 1.000000 0.616776 0.505611 0.722118
5 Power Bank 20 0.600000 0.609211 0.509667 0.704397
6 Mechanical Keyboard 5 0.400000 0.598437 0.489657 0.702554
7 Phone Case 400 0.500000 0.517632 0.472692 0.562431

The credibility interval reveals the true uncertainty behind each point estimate:

  • Bluetooth Headphones (1 review): the interval ranges from ~51% to ~72%. It is the widest in the catalog, though only marginally wider than those of the other products with few reviews. With $C = 75$ pseudo-reviews and only 1 real observation, the prior has 75× more weight than the data, so the posterior is almost entirely determined by the prior. This explains why the interval is centered near the catalog mean (~61%) rather than near the observed 100%.
  • Gaming Laptop (100 reviews): the interval is narrower (~74% to ~86%). With $N > C$, the 100 real reviews outweigh the prior and the estimate leans toward the observed data.
  • Phone Case (400 reviews): the interval is the narrowest of all (~47% to ~56%). With $N$ more than five times $C$, the posterior is dominated by the 400 observed reviews, and the system knows with high precision that this is a mediocre product.

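This is a key property of Bayesian shrinkage: the width of the credibility interval is governed by the effective sample size $N + C$ (real reviews plus pseudo-reviews), not by $N$ alone, while the ratio $N/C$ determines how far the center of the interval moves from $\mu$ toward $k/N$. This is why even a product with a single review gets a moderately narrow interval: the prior already contributes the equivalent of $C$ observations.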
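To see how the interval width relates to the amount of information, the sketch below compares the exact width of the 95% interval with a normal approximation based on the posterior variance of a Beta distribution, $m(1-m)/(N+C+1)$, where $m$ is the posterior mean. The helper names are illustrative and the approximation is an assumption of this sketch, not part of the original analysis.

```python
import numpy as np
from scipy.stats import beta

mu, C = 367 / 600, 75.0   # catalog values computed earlier in this example


def credible_width(k, N):
    """Exact width of the 95% equal-tailed interval of the Beta posterior."""
    a = C * mu + k
    b = C * (1 - mu) + (N - k)
    return beta.ppf(0.975, a, b) - beta.ppf(0.025, a, b)


def normal_approx_width(k, N):
    """Approximate width from the Beta variance m(1-m) / (N + C + 1)."""
    m = (k + C * mu) / (N + C)
    return 2 * 1.96 * np.sqrt(m * (1 - m) / (N + C + 1))


for name, k, N in [('Bluetooth Headphones', 1, 1),
                   ('Gaming Laptop', 95, 100),
                   ('Phone Case', 200, 400)]:
    print(f'{name:<22s} N + C = {N + C:>5.0f} | '
          f'exact width = {credible_width(k, N):.3f} | '
          f'approx width = {normal_approx_width(k, N):.3f}')
```

The width shrinks as $N + C$ grows, so the prior alone already keeps the interval of a one-review product from spanning most of $[0, 1]$.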

Wilson Interval Lower Bound

The Wilson interval lower bound ranks products by their reasonable worst-case performance.

# Wilson Interval Lower Bound
# ==============================================================================
confidence = 0.95  # confidence level
z = norm.ppf(1 - (1 - confidence) / 2)  # z = 1.96 for 95%

p_hat = data['pct_simple']

data['wilson_lower'] = (
    (p_hat + z**2 / (2 * N) - z * np.sqrt(p_hat * (1 - p_hat) / N + z**2 / (4 * N**2)))
    / (1 + z**2 / N)
)

ranking_wilson = (
    data[['product', 'positive_reviews', 'total_reviews', 'pct_simple', 'wilson_lower']]
    .sort_values('wilson_lower', ascending=False)
    .reset_index(drop=True)
)
ranking_wilson
product positive_reviews total_reviews pct_simple wilson_lower
0 Gaming Laptop 95 100 0.950000 0.888250
1 Running Shoes 45 60 0.750000 0.627679
2 Office Chair 8 9 0.888889 0.565000
3 Phone Case 200 400 0.500000 0.451235
4 Power Bank 12 20 0.600000 0.386582
5 Bestseller Novel 4 5 0.800000 0.375535
6 Bluetooth Headphones 1 1 1.000000 0.206549
7 Mechanical Keyboard 2 5 0.400000 0.117621

The Wilson lower bound is the most conservative of the scores compared here. It penalizes uncertainty heavily: the Bluetooth Headphones (1/1) drop nearly to the bottom of the ranking (7th of 8) because their lower bound is very low (~21%), and the Bestseller Novel (4/5) falls from 4th to 6th. The Office Chair (8/9) keeps its third place, but its score drops from ~89% to ~57%.

⚠️ Edge case: products with zero positive reviews ($\hat{p} = 0$)

When $\hat{p} = 0$, the term $\hat{p}(1-\hat{p}) = 0$ and the numerator of the Wilson formula cancels exactly, so $w^- = 0 / (1 + z^2/N) = 0$. The lower bound is well-defined in this case. If $\hat{p} = 1$ (all reviews positive), the formula is also well-defined: the lower bound simplifies to $w^- = N / (N + z^2)$, which approaches 1 only asymptotically as $N$ grows, while the upper bound $w^+$ equals exactly 1. In production systems, it is common to apply a clamp $\max(0, w^-)$ to guard against tiny negative values caused by floating-point rounding, and to handle products with no reviews at all ($N = 0$), where the formula is undefined, as a separate case.
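A vectorized version of the Wilson lower bound that handles these edge cases might look like the sketch below. The function name `wilson_lower_safe` and the decision to assign a score of 0 to products without reviews are assumptions of this example, not a rule stated in the text.

```python
import numpy as np
from scipy.stats import norm


def wilson_lower_safe(k, N, confidence=0.95):
    """Wilson lower bound for arrays of counts, with N = 0 and rounding handled."""
    k = np.asarray(k, dtype=float)
    N = np.asarray(N, dtype=float)
    z = norm.ppf(1 - (1 - confidence) / 2)

    # Replace N = 0 by 1 only to avoid division by zero; those rows are masked below
    N_safe = np.where(N > 0, N, 1.0)
    p_hat = k / N_safe
    centre = p_hat + z**2 / (2 * N_safe)
    margin = z * np.sqrt(p_hat * (1 - p_hat) / N_safe + z**2 / (4 * N_safe**2))
    lower = (centre - margin) / (1 + z**2 / N_safe)

    # Assumption: products without reviews receive a score of 0
    lower = np.where(N > 0, lower, 0.0)

    # Clamp to [0, 1] to absorb floating-point rounding
    return np.clip(lower, 0.0, 1.0)


# Edge cases: 0/3 (no positives), 3/3 (all positives), 0/0 (no reviews), 1/1
print(wilson_lower_safe([0, 3, 0, 1], [3, 3, 0, 1]))
```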

Minimum Threshold

The minimum threshold method excludes products with fewer than 10 reviews.

# Minimum review threshold
# ==============================================================================
N_min = 10
print(f'Products with N >= {N_min} reviews (included in the ranking):')
ranking_threshold = (
    data.loc[data['total_reviews'] >= N_min, ['product', 'positive_reviews', 'total_reviews', 'pct_simple']]
    .sort_values('pct_simple', ascending=False)
    .reset_index(drop=True)
)
display(ranking_threshold)

print(f'\nProducts excluded from the ranking (N < {N_min}):')
excluded = data.loc[data['total_reviews'] < N_min, ['product', 'positive_reviews', 'total_reviews']]
excluded
Products with N >= 10 reviews (included in the ranking):
product positive_reviews total_reviews pct_simple
0 Gaming Laptop 95 100 0.95
1 Running Shoes 45 60 0.75
2 Power Bank 12 20 0.60
3 Phone Case 200 400 0.50
Products excluded from the ranking (N < 10):
product positive_reviews total_reviews
0 Bluetooth Headphones 1 1
3 Bestseller Novel 4 5
4 Mechanical Keyboard 2 5
5 Office Chair 8 9

With $N_{min} = 10$, the Bluetooth Headphones, Bestseller Novel, Mechanical Keyboard, and Office Chair are excluded from the ranking. The latter, with 89% positive reviews from 9 reviews, does not appear in the ranking despite having the third-highest simple percentage.

Ranking Comparison

The following chart shows the position (rank) of each product under each of the four methods. Products with the same position across all methods are stable; products whose rank changes significantly are those most affected by uncertainty in their reviews.

# Ranking comparison across methods
# ==============================================================================
rank_df = data[['product']].copy()

rank_df['Simple %'] = (
    data['pct_simple']
    .rank(ascending=False, method='min')
    .astype(int)
)

rank_df['Bayesian Avg'] = (
    data['pct_bayesian']
    .rank(ascending=False, method='min')
    .astype(int)
)

rank_df['Wilson LB'] = (
    data['wilson_lower']
    .rank(ascending=False, method='min')
    .astype(int)
)

included_mask = data['total_reviews'] >= N_min

rank_df['Min Threshold'] = (
    data['pct_simple']
    .where(included_mask)
    .rank(ascending=False, method='min')
)

rank_df = rank_df.set_index('product')

display(rank_df)

# Bump chart
fig, ax = plt.subplots(figsize=(8, 5))

for product, ranks in rank_df.iterrows():
    ax.plot(
        rank_df.columns,
        ranks,
        marker='o',
        linewidth=2,
        label=product
    )

ax.invert_yaxis()

ax.set_ylabel('Rank')
ax.set_title('Ranking comparison across methods')

ax.grid(axis='y', linestyle='--', alpha=0.3)

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

ax.legend(
    title='Product',
    bbox_to_anchor=(1.02, 1),
    loc='upper left'
)

plt.tight_layout()
plt.show()
Simple % Bayesian Avg Wilson LB Min Threshold
product
Bluetooth Headphones 1 5 7 NaN
Gaming Laptop 2 1 1 1.0
Phone Case 7 8 4 4.0
Bestseller Novel 4 4 6 NaN
Mechanical Keyboard 8 7 8 NaN
Office Chair 3 3 3 NaN
Running Shoes 5 2 2 2.0
Power Bank 6 6 5 3.0
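One simple way to quantify the instability visible in the table above is the range between the best and worst rank of each product. The sketch below rebuilds the rank table from the printed values and computes that range; the column name `rank_range` is an illustrative choice, and the Min Threshold column is left out because excluded products have no rank.

```python
import pandas as pd

# Ranks copied from the comparison table (Min Threshold omitted because of NaN)
ranks = pd.DataFrame(
    {
        'Simple %':     [1, 2, 7, 4, 8, 3, 5, 6],
        'Bayesian Avg': [5, 1, 8, 4, 7, 3, 2, 6],
        'Wilson LB':    [7, 1, 4, 6, 8, 3, 2, 5],
    },
    index=[
        'Bluetooth Headphones', 'Gaming Laptop', 'Phone Case', 'Bestseller Novel',
        'Mechanical Keyboard', 'Office Chair', 'Running Shoes', 'Power Bank',
    ],
)

# Difference between the worst and best position across the three methods
ranks['rank_range'] = ranks.max(axis=1) - ranks.min(axis=1)

print(ranks.sort_values('rank_range', ascending=False))
```

Products with a large range, such as the Bluetooth Headphones, are the ones whose position depends most on how each method treats uncertainty.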

Conclusions

The methods studied solve in different ways the same fundamental problem: the instability of the simple percentage with small samples.

| Method | Complexity | When to use it | Advantage | Disadvantage |
|---|---|---|---|---|
| Simple % | Very low | Never as a final ranking | Intuitive and easy to communicate | Overvalues products with few reviews |
| Bayesian Average | Low | Unified ranking where all products appear from day one | Fair to new and established products; easy to interpret | Requires the global mean $\mu$ to be representative of the catalog |
| Beta-Binomial | Low | Same as Bayesian, plus credibility interval | Provides full uncertainty; no MCMC required | Slightly more complex to implement |
| Wilson Interval | Low-medium | Highly competitive environments where consistency is rewarded | Very conservative; avoids the "lucky start" effect | Penalizes new products heavily even if they show a good early signal |
| Minimum Threshold | Very low | Simple systems where temporary exclusion is acceptable | Easy to implement and communicate to users | Completely excludes new products from the ranking |

Recommendation by use case:

  • Use the Bayesian Average if you want a unified, fair ranking where all products (new and established) appear from day one without penalizing anyone too harshly. It is a common choice on e-commerce platforms.
  • Add the Beta-Binomial credibility interval if you need to communicate uncertainty to the user (for example, showing a confidence bar alongside the star rating).
  • Use the Wilson Interval if the context is highly competitive and you want to be strict, such as in comment or app review rankings, where top results have an enormous impact.
  • Use the minimum threshold only if implementation simplicity is a requirement and temporary exclusion of new products is acceptable for your business.

Session Information

import session_info
session_info.show(html=False)
-----
matplotlib          3.10.8
numpy               1.26.4
pandas              2.2.3
plotly              6.7.0
scipy               1.11.4
session_info        v1.0.1
-----
IPython             9.10.1
jupyter_client      8.8.0
jupyter_core        5.9.1
-----
Python 3.11.15 (main, Mar 11 2026, 17:20:07) [GCC 14.3.0]
Linux-6.17.0-29-generic-x86_64-with-glibc2.39
-----
Session information updated at 2026-06-01 14:10

Citation Instructions

How to cite this document?

If you use this document or any part of it, we appreciate you citing it. Thank you!

Statistical Product Ranking: Empirical Bayes, Wilson Interval, and Minimum Threshold by Joaquín Amat Rodrigo, available under an Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0 DEED) license at https://www.cienciadedatos.net/documentos/pystats13-statistical-ranking-python.html

Creative Commons Licence

This document created by Joaquín Amat Rodrigo is licensed under Attribution-NonCommercial-ShareAlike 4.0 International.

Permitted:

  • Share: copy and redistribute the material in any medium or format.

  • Adapt: remix, transform, and build upon the material.

Under the following terms:

  • Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

  • NonCommercial: You may not use the material for commercial purposes.

  • ShareAlike: If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.