Probability Part - 1

Aryan
Mar 10
8 min read

Updated: Jun 9

TERMINOLOGY

Random Experiment :-> An experiment is called a random experiment if it satisfies the following two conditions:

(i) It has more than one possible outcome.

(ii) It is not possible to predict the outcome in advance.

Trial :-> Trial refers to a single execution of a random experiment. Each trial produces an outcome.

Outcome :-> Outcome refers to a single possible result of a trial.

Sample Space :-> The sample space of a random experiment is the set of all possible outcomes that can occur. Generally, a random experiment has one sample space.

Event :-> An event is a specific set of outcomes of a random experiment or process; in other words, it is a subset of the sample space. An event can include a single outcome or multiple outcomes. One random experiment can have many events.

EXAMPLES

1. Rolling a Die

Random Experiment: Rolling a fair six-sided die.
Trial: Each individual roll of the die.
Outcome: A specific number appearing on the die (e.g., 4).
Sample Space (S): {1, 2, 3, 4, 5, 6}
Event: A subset of the sample space, e.g., rolling an even number {2, 4, 6}.

2. Tossing a Coin Twice

Random Experiment: Tossing a fair coin two times.
Trial: Each time the coin is tossed.
Outcome: A specific sequence of heads (H) or tails (T), e.g., (H, T).
Sample Space (S): {(H, H), (H, T), (T, H), (T, T)}
Event: A subset of the sample space, e.g., getting at least one head {(H, H), (H, T), (T, H)}.
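
The coin-toss example can also be written out in code. The following is a minimal Python sketch, not part of the original material; the names sample_space and at_least_one_head are chosen here purely for illustration.

```python
from itertools import product

# Sample space for tossing a coin twice: every ordered pair of H/T.
sample_space = set(product("HT", repeat=2))

# Event "at least one head": the subset of outcomes that contain an H.
at_least_one_head = {outcome for outcome in sample_space if "H" in outcome}

print(sorted(sample_space))
print(sorted(at_least_one_head))

# An event is always a subset of the sample space.
print(at_least_one_head <= sample_space)
```

Representing outcomes as tuples and events as sets keeps the code close to the definitions: the sample space is a set of outcomes, and an event is a subset of it.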

3. Titanic Example (Survival Analysis)

Random Experiment: Observing the fate of a randomly selected passenger on the Titanic.
Trial: Selecting a single passenger from the dataset.
Outcome: The survival status of that passenger (e.g., Survived).
Sample Space (S): {Survived, Not Survived}
Event: A subset of the sample space, e.g., the event that a passenger survived {Survived}.

TYPES OF EVENTS

Simple Event: Also known as an elementary event, a simple event consists of exactly one outcome.
- For example, when rolling a fair six-sided die, getting a 3 is a simple event.
Compound Event: A compound event consists of two or more simple events.
- For example, when rolling a die, the event "rolling an odd number" is a compound event because it consists of three simple events: rolling a 1, rolling a 3, or rolling a 5.
Independent Events: Two events are independent if the occurrence of one event does not affect the probability of the occurrence of the other event.
- For example, if you flip a coin and roll a die, the outcome of the coin flip does not affect the outcome of the die roll.
Dependent Events: Events are dependent if the occurrence of one event does affect the probability of the occurrence of the other event.
- For example, if you draw two cards from a deck without replacement, the outcome of the first draw affects the outcome of the second draw because there are fewer cards left in the deck.
Mutually Exclusive Events: Two events are mutually exclusive (or disjoint) if they cannot both occur at the same time.
- For example, when rolling a die, the events "roll a 2" and "roll a 4" are mutually exclusive because a single roll of the die cannot result in both a 2 and a 4.
Exhaustive Events: A set of events is exhaustive if at least one of the events must occur when the experiment is performed.
- For example, when rolling a die, the events "roll an even number" and "roll an odd number" are exhaustive because one or the other must occur on any roll.
Impossible Event and Certain Event:
- Impossible Event: An event that can never happen. Example: Rolling a 7 on a standard six-sided die.
- Certain Event: An event that always happens. Example: Rolling a number between 1 and 6 on a six-sided die.
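
Several of these event types can be checked with plain set operations. The sketch below is illustrative only: the helper names (is_simple, are_mutually_exclusive, and so on) are hypothetical, and independent and dependent events are left out because checking them requires probabilities rather than set membership alone.

```python
S = {1, 2, 3, 4, 5, 6}  # sample space for one roll of a six-sided die

def is_simple(event):
    """A simple (elementary) event contains exactly one outcome."""
    return len(event) == 1

def are_mutually_exclusive(a, b):
    """Two events are mutually exclusive if they share no outcome."""
    return a.isdisjoint(b)

def are_exhaustive(*events):
    """Events are exhaustive if together they cover the whole sample space."""
    return set().union(*events) == S

def is_impossible(event):
    """An impossible event contains no outcome of the sample space."""
    return event.isdisjoint(S)

def is_certain(event):
    """A certain event contains every outcome of the sample space."""
    return S <= event

even, odd = {2, 4, 6}, {1, 3, 5}

print(is_simple({3}), is_simple(odd))    # simple vs. compound event
print(are_mutually_exclusive({2}, {4}))  # "roll a 2" and "roll a 4"
print(are_exhaustive(even, odd))         # even or odd must occur
print(is_impossible({7}), is_certain({1, 2, 3, 4, 5, 6}))
```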

WHAT IS PROBABILITY?

In simplest terms, probability is a measure of the likelihood that a particular event will occur. It is a fundamental concept in statistics and is used to make predictions and informed decisions in a wide range of disciplines, including science, engineering, medicine, economics, and social sciences.

Probability is usually expressed as a number between 0 and 1, inclusive:

A probability of 0 means that an event will not happen.
A probability of 1 means that an event will certainly happen.
A probability of 0.5 means that an event will happen half the time (or that it is as likely to happen as not to happen).

EMPIRICAL PROBABILITY

Definition: Empirical probability (also known as experimental probability) is calculated based on actual experiments or past data, rather than theoretical assumptions.

Example 1: Coin Toss

Suppose a fair coin is tossed 100 times.
It lands on heads 55 times.
Empirical Probability of getting heads: P(H) = 55/100 = 0.55 or 55%

Example 2: Drawing Marbles

A bag contains 50 marbles (20 Red, 15 Blue, 15 Green).
A marble is drawn, recorded, and replaced, repeating for 200 trials.
Observed results:
- Red marble: 80 times
- Blue marble: 70 times
- Green marble: 50 times
Empirical Probability of drawing a red marble: P(R) = 80/200 = 0.4 or 40%
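
The marble example reduces to dividing each observed count by the number of trials. A small Python sketch of that calculation follows; the variable names are illustrative, and the counts are the ones listed in the example.

```python
from fractions import Fraction

# Observed counts from the 200 recorded draws in the marble example.
observed = {"Red": 80, "Blue": 70, "Green": 50}
total_trials = sum(observed.values())

# Empirical probability = times the event was observed / number of trials.
empirical = {color: Fraction(count, total_trials)
             for color, count in observed.items()}

for color, p in empirical.items():
    print(f"P({color}) = {p} = {float(p):.2f}")
```

In this data the empirical value for red (0.40) happens to equal the bag's actual proportion of red marbles (20/50), while blue (0.35 observed vs. 15/50 = 0.30) and green (0.25 observed vs. 0.30) differ from theirs. Empirical probabilities depend on the trials actually observed.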

THEORETICAL PROBABILITY

Definition: Theoretical probability is used when all outcomes in a sample space are equally likely to occur.

Formula : P(E) = (Number of favorable outcomes) / (Total number of outcomes in the sample space)

Example 1: Tossing a Fair Coin 3 Times

Sample Space (S): {HHH,HHT,HTH,HTT,THH,THT,TTH,TTT}

Total outcomes = 8
Favorable outcomes for exactly 2 heads:
- HHT, HTH, THH (3 outcomes)
Probability : P(Exactly 2 Heads) = 3/8 = 0.375 or 37.5%

Example 2: Rolling Two Dice - Sum = 7

Total Outcomes:

Since each die has 6 faces, total possible outcomes: 6 * 6 = 36

Favorable Outcomes for Sum = 7: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)

Number of favorable outcomes = 6
Probability: P(Sum = 7) = 6/36 = 1/6 ≈ 0.1667 or about 16.67%
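
Both theoretical examples follow the same recipe: list the equally likely outcomes, count the favorable ones, and divide. Here is a minimal Python sketch of that recipe; theoretical_probability is a hypothetical helper written for this illustration, and it assumes every outcome passed in is equally likely.

```python
from fractions import Fraction
from itertools import product

def theoretical_probability(sample_space, is_favorable):
    """Favorable outcomes / total outcomes (assumes equally likely outcomes)."""
    outcomes = list(sample_space)
    favorable = [o for o in outcomes if is_favorable(o)]
    return Fraction(len(favorable), len(outcomes))

# Example 1: exactly 2 heads in 3 tosses of a fair coin.
three_tosses = product("HT", repeat=3)
p_two_heads = theoretical_probability(three_tosses, lambda o: o.count("H") == 2)

# Example 2: the sum of two fair dice equals 7.
two_dice = product(range(1, 7), repeat=2)
p_sum_seven = theoretical_probability(two_dice, lambda o: sum(o) == 7)

print(p_two_heads, float(p_two_heads))   # 3/8 0.375
print(p_sum_seven, f"{float(p_sum_seven):.4f}")
```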

UNDERSTANDING RANDOM VARIABLE

A random variable is a function that assigns numerical values to the possible outcomes of a random experiment. It plays a fundamental role in probability and statistics, helping to model and analyze uncertain events. Random variables can be categorized into two types: discrete, which take on specific countable values, and continuous, which can assume any value within a given range.

Key Points About Random Variable

A random variable is a function that maps outcomes of a random process to numerical values.
It can be classified as discrete (taking specific values) or continuous (taking values from a continuous range).
Random variables help quantify uncertainty and are used to construct probability distributions.
They are widely applied in probability theory, data science, and financial risk analysis.

Types of Random Variable

Discrete Random Variables

A discrete random variable assigns specific values to different outcomes of an experiment. These values are countable, and each has an associated probability. For example, rolling a six-sided die results in one of six distinct numbers, each with a probability of 1/6.

Example: Suppose a coin is flipped three times, and we define X as the number of heads observed. The function X can take values from the set {0, 1, 2, 3}, corresponding to cases where no heads, one head, two heads, or three heads appear.

Continuous Random Variables

A continuous random variable can take an infinite number of values within a given range. These variables are often used to measure physical quantities such as time, temperature, or weight.

Example: If Y represents the total amount of rainfall in a city over a year, it could be any value within a certain range, such as 25.4 cm or 25.432 cm, meaning the number of possible values is infinite.

Example of a Random Variable

A typical example of a random variable comes from coin tossing. It also shows a probability distribution in which the possible values are not equally likely. If the random variable Y is the number of heads we get from tossing two coins, then Y could be 0, 1, or 2. This means that we could have no heads, one head, or two heads on a two-coin toss.

The two coins can land in four equally likely ways: TT, HT, TH, and HH. Therefore, P(Y=0) = 1/4, since only one of the four outcomes (TT) has no heads. Similarly, the probability of getting two heads (HH) is P(Y=2) = 1/4. One head can occur in two ways, HT and TH, so P(Y=1) = 2/4 = 1/2.
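
The two-coin example can be reproduced by treating Y literally as a function on outcomes. The sketch below is one illustrative way to do that in Python; the names outcomes, Y, and distribution are chosen for this example only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of tossing two coins.
outcomes = list(product("HT", repeat=2))

def Y(outcome):
    """Random variable: maps an outcome to its number of heads."""
    return outcome.count("H")

# Count how many outcomes map to each value of Y, then divide by the total.
counts = Counter(Y(o) for o in outcomes)
distribution = {y: Fraction(n, len(outcomes)) for y, n in sorted(counts.items())}

for y, p in distribution.items():
    print(f"P(Y={y}) = {p}")
```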

PROBABILITY DISTRIBUTION

A probability distribution is a list of all of the possible values of a random variable along with their corresponding probabilities.

Example 1: Tossing a Fair Coin

Let X be the random variable representing the number of heads obtained in a single toss of a fair coin.

Possible values of X:

X = 0 (if we get tails)
X = 1 (if we get heads)

Probability distribution:

X	0	1
P(X)	1/2	1/2

Example 2: Rolling One Die

Let X be the random variable representing the outcome when rolling a fair six-sided die.

Possible values of X : {1,2,3,4,5,6}

Since each outcome is equally likely, the probability distribution is:

X	1	2	3	4	5	6
P(X)	1/6	1/6	1/6	1/6	1/6	1/6

Example 3: Rolling Two Dice

Let X be the random variable representing the sum of numbers obtained when rolling two fair six-sided dice. The possible values of X range from 2 to 12.

We determine P(X) by counting the number of ways each sum can occur :

SUM X	2	3	4	5	6	7	8	9	10	11	12
WAYS TO GET X	1	2	3	4	5	6	5	4	3	2	1
P(X)	1/36	2/36	3/36	4/36	5/36	6/36	5/36	4/36	3/36	2/36	1/36
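
The counts in the two-dice table can be checked by enumerating all 36 outcomes. A short Python sketch, with illustrative variable names:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count the number of (die1, die2) pairs that produce each sum.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))
total = sum(ways.values())  # 36 equally likely outcomes

# Probability of each sum = ways to get it / total outcomes.
pmf = {s: Fraction(n, total) for s, n in ways.items()}

print("SUM  WAYS  P(X)")
for s in sorted(ways):
    print(f"{s:>3}  {ways[s]:>4}  {ways[s]}/{total}")
```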

PROBABILITY DISTRIBUTION FUNCTION

When the number of possible values of a random variable increases, calculating probabilities individually becomes difficult. That’s why we use a Probability Distribution Function (PDF) to simplify the process.

If a random variable has many possible values, calculating the probability of each one separately is inefficient.
A Probability Distribution Function provides a formula or pattern to compute probabilities systematically.
It is especially useful for random variables with very many possible values and for continuous variables, where listing a probability for every individual value is impractical or impossible.
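
As one concrete illustration of replacing a table with a formula, the probabilities for the sum of two fair dice can be written as P(X = x) = (6 - |x - 7|) / 36 for x from 2 to 12. This closed form is not stated in the original text; the Python sketch below uses it as an assumption and verifies it against a brute-force count.

```python
from fractions import Fraction
from itertools import product

def dice_sum_pmf(x):
    """P(X = x) for the sum of two fair six-sided dice, from a single formula."""
    if not 2 <= x <= 12:
        return Fraction(0)
    return Fraction(6 - abs(x - 7), 36)

# Cross-check the formula against counting all 36 outcomes.
rolls = list(product(range(1, 7), repeat=2))
for x in range(2, 13):
    counted = Fraction(sum(1 for a, b in rolls if a + b == x), len(rolls))
    assert dice_sum_pmf(x) == counted

print(dice_sum_pmf(7))  # 1/6
```

Once the formula is available, any single probability can be computed directly without building or storing the whole table.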

MEAN OF A RANDOM VARIABLE

The mean of a random variable, often called the expected value, is essentially the average outcome of a random process that is repeated many times. More technically, it is a weighted average of the possible values of the random variable, where each value is weighted by its probability of occurrence.

Example: Rolling a Die

Since each outcome is equally likely, the probability distribution is :

X	1	2	3	4	5	6
P(X)	1/6	1/6	1/6	1/6	1/6	1/6

We need to find the mean of the random variable X, which represents the average value over multiple trials.

Example Calculation

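Suppose we have a set of numbers: 5, 3, 4, 5, 3, 3. The mean of these numbers is calculated as :

Mean = (5 + 3 + 4 + 5 + 3 + 3) / 6 = 23/6 ≈ 3.83

We can also express this calculation differently, by grouping equal values and weighting each value by how often it appears :

Mean = 3 × (3/6) + 4 × (1/6) + 5 × (2/6) = 23/6 ≈ 3.83

This approach provides a new perspective on calculating the mean while yielding the same result.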

Mean of a Fair Die Roll

Since each face of the die appears with a probability of 1/6, the mean value of a die roll is :

E[X] = 1 × (1/6) + 2 × (1/6) + 3 × (1/6) + 4 × (1/6) + 5 × (1/6) + 6 × (1/6) = 21/6 = 3.5

Thus, the expected value (mean) of rolling a fair six-sided die is 3.5.
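
The weighted-average view of the mean translates directly into code. The sketch below is illustrative: expected_value is a hypothetical helper that takes a mapping from values to probabilities, applied first to the fair die and then to the list 5, 3, 4, 5, 3, 3 using relative frequencies as weights.

```python
from collections import Counter
from fractions import Fraction

def expected_value(distribution):
    """Weighted average: sum of value * probability over all values."""
    return sum(x * p for x, p in distribution.items())

# Fair six-sided die: every face has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(expected_value(die))  # 7/2

# The list of numbers, with each value weighted by its relative frequency.
numbers = [5, 3, 4, 5, 3, 3]
freq = {x: Fraction(n, len(numbers)) for x, n in Counter(numbers).items()}

# The weighted form and the plain average give the same result.
print(expected_value(freq), Fraction(sum(numbers), len(numbers)))
```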

VARIANCE OF A RANDOM VARIABLE

The variance of a random variable is a statistical measurement that describes how far the values of the random variable spread out from the mean (expected value).

Example: Rolling a Die

Since each outcome is equally likely, the probability distribution is:

X	1	2	3	4	5	6
P(X)	1/6	1/6	1/6	1/6	1/6	1/6

Now, we will calculate the variance.

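As we know, the standard formula for variance is :

Var(X) = E[(X - μ)²] = Σ (x - μ)² × P(X = x), where μ = E[X]

For a fair die, μ = 3.5, so :

Var(X) = [(1 - 3.5)² + (2 - 3.5)² + (3 - 3.5)² + (4 - 3.5)² + (5 - 3.5)² + (6 - 3.5)²] × (1/6) = 17.5/6 ≈ 2.92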
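
A short Python sketch of the variance of a fair die roll, assuming the definition Var(X) = Σ (x - μ)² × P(X = x) with μ = E[X]. The variable names are illustrative, and the second expression uses the equivalent identity Var(X) = E[X²] - μ² as a cross-check.

```python
from fractions import Fraction

# Fair six-sided die: every face has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

# Mean (expected value) of the die roll.
mean = sum(x * p for x, p in die.items())

# Variance as the probability-weighted squared deviation from the mean.
variance = sum((x - mean) ** 2 * p for x, p in die.items())

# Cross-check with the equivalent form E[X^2] - mean^2.
variance_alt = sum(x ** 2 * p for x, p in die.items()) - mean ** 2

print(mean, variance, f"{float(variance):.4f}")
print(variance == variance_alt)
```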

CONTINGENCY TABLE

A contingency table (also known as a cross-tabulation or crosstab) is a type of table used in statistics to display the frequency distribution of variables. It is commonly used to analyze the relationship between two categorical variables.

Structure of a Contingency Table

A contingency table is structured in a matrix format, where:

Rows represent categories of one variable.
Columns represent categories of another variable.
Cells contain the frequency counts (number of occurrences) for each combination of row and column variables.

Example of a Contingency Table

Consider a study that examines the relationship between gender and preference for a product.

Gender	Likes Product	Dislikes Product	Total
Male	30	20	50
Female	25	25	50
Total	55	45	100
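
The Total row and column of the table are the sums of the cell counts. A minimal Python sketch using plain dictionaries (the variable names are illustrative):

```python
# Cell counts from the gender / product-preference example.
table = {
    "Male":   {"Likes Product": 30, "Dislikes Product": 20},
    "Female": {"Likes Product": 25, "Dislikes Product": 25},
}

# Row totals: sum across the columns of each row.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Column totals: sum each column across all rows.
columns = list(next(iter(table.values())))
col_totals = {col: sum(table[row][col] for row in table) for col in columns}

# Grand total: every observation counted once.
grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(grand_total)
```

When the data is available as one record per respondent rather than as counts, a library function such as pandas.crosstab with margins=True can build the same table, totals included.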

Types of Contingency Tables

2×2 Contingency Table – Contains two rows and two columns (excluding totals).
Larger Contingency Tables – Can have multiple rows and columns if more categories are involved.

© 2025 Aryan Upadhyay