![](https://crypto4nerd.com/wp-content/uploads/2023/06/0C7HRUKJcvbwx6iNV-1024x681.jpeg)
The binomial distribution is a widely used statistical distribution that Data Scientists should be familiar with, as it appears in numerous contexts. One notable example is binary classification in supervised learning, where the cross-entropy loss is the negative log-likelihood of the binomial (Bernoulli) distribution. In this post, we will explore the intuition, theory, and examples associated with this distribution.
The binomial distribution is a discrete distribution that measures the likelihood of achieving a specific number of successes in a given number of trials. For instance, it can answer questions such as “What is the probability of obtaining 2 heads from 5 coin flips?” In this context, each trial results in either a success (the coin landing on heads) or a failure (the coin landing on tails). Each individual trial is known as a Bernoulli trial, and each essentially poses a binary (yes-no) question.
Since each trial is binary, if the probability of success is p, then the probability of failure is 1 − p. Consequently, the probability mass function (PMF) of a single Bernoulli trial takes the following form:
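$$P(X = k) = p^{k}(1 - p)^{1 - k}, \qquad k \in \{0, 1\}$$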
where X is a random variable following the Bernoulli distribution and k is the outcome of the trial. Notice that if k = 1, the probability is just p, and if k = 0 it is 1 − p.
The binomial distribution combines Bernoulli trials to give the probability of a specific number of successes, k, across a number of trials, n. To derive the binomial PMF, we incorporate both the binomial coefficient and the number of trials, n, into the Bernoulli PMF:
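$$P(X = k) = \binom{n}{k} p^{k}(1 - p)^{n - k}$$

where the binomial coefficient $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ counts the number of ways k successes can be arranged among n trials, and X now counts the total number of successes.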
In general, the binomial distribution applies when the following conditions hold:
- Number of trials, n, is fixed
- Each trial is independent
- Each trial has two outcomes
- The probability of a success, p, is the same for every trial
Let’s go back to the question we posed before: “What is the probability of obtaining 2 heads from 5 coin flips?”
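Assuming a fair coin (p = 0.5, an assumption not stated explicitly in the question), plugging n = 5 and k = 2 into the binomial PMF gives:

$$P(X = 2) = \binom{5}{2}(0.5)^{2}(0.5)^{3} = 10 \times 0.03125 = 0.3125$$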
Notice that the probability of obtaining 2 heads is reasonably small. It is important to remember that this is the probability for exactly 2 heads. Therefore, there are additional possibilities where three, four, or even five heads occur.
To gain a deeper understanding, let’s visualize the distribution of probabilities by plotting it as a function of the number of successes. Essentially, we will be displaying the probability mass function (PMF).
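Below is a minimal sketch of how such a plot could be produced with `scipy` and `matplotlib`; it is an illustration rather than the exact code from the GitHub repository linked at the end, and the parameters (n = 5 flips of a fair coin) are assumed from the question above.

```python
# Sketch: plot the binomial PMF as a function of the number of successes.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom

n = 5      # number of trials (coin flips) - assumed from the example above
p = 0.5    # probability of success (heads) on each trial - assumed fair coin

k = np.arange(0, n + 1)       # possible numbers of successes: 0, 1, ..., n
pmf = binom.pmf(k, n, p)      # P(X = k) for each k

plt.bar(k, pmf)
plt.xlabel("Number of successes (heads)")
plt.ylabel("Probability")
plt.title(f"Binomial PMF: n={n}, p={p}")
plt.show()
```

Rerunning the same sketch with a larger `n` (e.g. 50) produces the wider, increasingly bell-shaped distribution discussed below.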
We observe that the most probable outcome sits at the expected value, with heads on half of the flips (a success proportion of 0.5), which makes sense. However, do you notice any other characteristics regarding the shape of the distribution? What if we plot it for 50 trials?
Notice how it increasingly resembles a normal distribution. This phenomenon is explained by the central limit theorem, which states that as the number of independent trials grows, the distribution of their sum approaches a normal distribution.
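To make the resemblance concrete, here is a short sketch (again an illustration, not the original post's code) that overlays the normal density with matching mean np and variance np(1 − p) on the 50-trial binomial PMF:

```python
# Sketch: compare the binomial PMF for n = 50 with its normal approximation.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom, norm

n, p = 50, 0.5
k = np.arange(0, n + 1)

mean = n * p                     # mean of the binomial distribution
std = np.sqrt(n * p * (1 - p))   # standard deviation of the binomial distribution

plt.bar(k, binom.pmf(k, n, p), label="Binomial PMF")
plt.plot(k, norm.pdf(k, mean, std), color="red", label="Normal approximation")
plt.xlabel("Number of successes (heads)")
plt.ylabel("Probability")
plt.legend()
plt.show()
```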
Link here for an article that explains the central limit theorem in more depth
In this blog post, we have explored the binomial distribution. This discrete distribution calculates the probability of achieving a specific number of successes within a given number of trials. The binomial distribution finds application in diverse industries, including commodity trading, insurance, and supply chain operations. Therefore, it is a valuable concept for Data Scientists to be aware of.
The full code is available on my GitHub here:
(All emojis designed by OpenMoji — the open-source emoji and icon project. License: CC BY-SA 4.0)