
Probability Part - 1

  • Writer: Aryan
  • Mar 10
  • 8 min read

Updated: Jun 9

TERMINOLOGY

 

  1. Random Experiment :-> An experiment is called a random experiment if it satisfies the following two conditions:

(i) It has more than one possible outcome.

(ii) It is not possible to predict the outcome in advance.

 

  2. Trial :-> A trial is a single execution of a random experiment. Each trial produces an outcome.

 

  3. Outcome :-> An outcome is a single possible result of a trial.

 

  4. Sample Space :-> The sample space of a random experiment is the set of all possible outcomes that can occur. Generally, one random experiment has one sample space.

 

  5. Event :-> An event is a specific set of outcomes from a random experiment or process. Essentially, it is a subset of the sample space. An event can include a single outcome, or it can include multiple outcomes. One random experiment can have multiple events.


EXAMPLES

 

1. Rolling a Die

  • Random Experiment: Rolling a fair six-sided die.

  • Trial: Each individual roll of the die.

  • Outcome: A specific number appearing on the die (e.g., 4).

  • Sample Space (S): {1, 2, 3, 4, 5, 6}

  • Event: A subset of the sample space, e.g., rolling an even number {2, 4, 6}.

 

2. Tossing a Coin Twice

  • Random Experiment: Tossing a fair coin two times.

  • Trial: Each time the coin is tossed.

  • Outcome: A specific sequence of heads (H) or tails (T), e.g., (H, T).

  • Sample Space (S): {(H, H), (H, T), (T, H), (T, T)}

  • Event: A subset of the sample space, e.g., getting at least one head {(H, H), (H, T), (T, H)}.

 

3. Titanic Example (Survival Analysis)

  • Random Experiment: Observing the fate of a randomly selected passenger on the Titanic.

  • Trial: Selecting a single passenger from the dataset.

  • Outcome: The survival status of that passenger (e.g., Survived).

  • Sample Space (S): {Survived, Not Survived}

  • Event: A subset of the sample space, e.g., the event that a passenger survived {Survived}.
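
To make this terminology concrete, here is a minimal Python sketch of the die example, assuming nothing beyond the standard library (the variable names are only illustrative):

```python
import random

# Sample space of the random experiment "roll a fair six-sided die"
sample_space = {1, 2, 3, 4, 5, 6}

# The event "roll an even number" is a subset of the sample space
even_event = {2, 4, 6}

# One trial = one execution of the random experiment, producing one outcome
outcome = random.choice(sorted(sample_space))

print("Outcome of this trial:", outcome)
print("Event 'even number' occurred:", outcome in even_event)
```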


TYPES OF EVENTS

 

  1. Simple Event: Also known as an elementary event, a simple event consists of exactly one outcome.

    • For example, when rolling a fair six-sided die, getting a 3 is a simple event.


  2. Compound Event: A compound event consists of two or more simple events.

    • For example, when rolling a die, the event "rolling an odd number" is a compound event because it consists of three simple events: rolling a 1, rolling a 3, or rolling a 5.


  3. Independent Events: Two events are independent if the occurrence of one event does not affect the probability of the occurrence of the other event.

    • For example, if you flip a coin and roll a die, the outcome of the coin flip does not affect the outcome of the die roll.


  4. Dependent Events: Events are dependent if the occurrence of one event does affect the probability of the occurrence of the other event.

    • For example, if you draw two cards from a deck without replacement, the outcome of the first draw affects the outcome of the second draw because there are fewer cards left in the deck.


  5. Mutually Exclusive Events: Two events are mutually exclusive (or disjoint) if they cannot both occur at the same time.

    • For example, when rolling a die, the events "roll a 2" and "roll a 4" are mutually exclusive because a single roll of the die cannot result in both a 2 and a 4.


  6. Exhaustive Events: A set of events is exhaustive if at least one of the events must occur when the experiment is performed.

    • For example, when rolling a die, the events "roll an even number" and "roll an odd number" are exhaustive because one or the other must occur on any roll.


  7. Impossible Event and Certain Event:

    • Impossible Event: An event that can never happen. Example: Rolling a 7 on a standard six-sided die.

    • Certain Event: An event that always happens. Example: Rolling a number between 1 and 6 on a six-sided die.


WHAT IS PROBABILITY?

 

In simplest terms, probability is a measure of the likelihood that a particular event will occur. It is a fundamental concept in statistics and is used to make predictions and informed decisions in a wide range of disciplines, including science, engineering, medicine, economics, and social sciences.

Probability is usually expressed as a number between 0 and 1, inclusive:

  • A probability of 0 means that an event will not happen.

  • A probability of 1 means that an event will certainly happen.

  • A probability of 0.5 means that an event will happen half the time (or that it is as likely to happen as not to happen).


EMPIRICAL PROBABILITY

 

Definition: Empirical probability (also known as experimental probability) is calculated based on actual experiments or past data, rather than theoretical assumptions.

P(Event) = (Number of times the event occurred) / (Total number of trials)

Example 1: Coin Toss

  • Suppose a fair coin is tossed 100 times.

  • It lands on heads 55 times.

  • Empirical Probability of getting heads: P(H) = 55/100 = 0.55, or 55%


Example 2: Drawing Marbles

  • A bag contains 50 marbles (20 Red, 15 Blue, 15 Green).

  • A marble is drawn, recorded, and replaced, repeating for 200 trials.

  • Observed results:

    • Red marble: 80 times

    • Blue marble: 70 times

    • Green marble: 50 times

  • Empirical Probability of drawing a red marble: P(R) = 80/200 = 0.4, or 40%
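
Empirical probabilities like these can also be estimated by simulation. The sketch below, a minimal illustration using Python's random module, repeats the draw-and-replace experiment and divides the observed count by the number of trials:

```python
import random

# Bag from Example 2: 20 red, 15 blue, 15 green marbles
bag = ["R"] * 20 + ["B"] * 15 + ["G"] * 15

trials = 200
red_count = sum(1 for _ in range(trials) if random.choice(bag) == "R")

# Empirical probability = observed frequency / number of trials
print("Empirical P(Red) =", red_count / trials)
```

Because the draws are random, the printed value fluctuates around 0.4 from run to run; with more trials it settles closer to the theoretical value of 20/50.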


THEORETICAL PROBABILITY

 

Definition: Theoretical probability is calculated by reasoning about the sample space, under the assumption that all outcomes are equally likely to occur.

Formula :

P(Event) = (Number of favorable outcomes) / (Total number of possible outcomes)

Example 1: Tossing a Fair Coin 3 Times

Sample Space (S): {HHH,HHT,HTH,HTT,THH,THT,TTH,TTT}

  • Total outcomes = 8

  • Favorable outcomes for exactly 2 heads:

    • HHT, HTH, THH (3 outcomes)

  • Probability: P(Exactly 2 Heads) = 3/8 = 0.375, or 37.5%


Example 2: Rolling Two Dice - Sum = 7

Total Outcomes:

  • Since each die has 6 faces, total possible outcomes: 6 * 6 = 36

Favorable Outcomes for Sum = 7: (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)

  • Number of favorable outcomes = 6

  • Probability: P(Sum = 7) = 6/36 = 1/6 ≈ 16.67%
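
Since theoretical probability only requires counting equally likely outcomes, results like this can be verified by brute-force enumeration, as in this small sketch:

```python
from itertools import product

# All 36 equally likely outcomes of rolling two dice
outcomes = list(product(range(1, 7), repeat=2))

favorable = [o for o in outcomes if sum(o) == 7]

print(len(favorable), "favorable outcomes out of", len(outcomes))
print("P(sum = 7) =", len(favorable) / len(outcomes))   # 6/36 ≈ 0.1667
```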


UNDERSTANDING RANDOM VARIABLE

 

A random variable is a function that assigns numerical values to the possible outcomes of a random experiment. It plays a fundamental role in probability and statistics, helping to model and analyze uncertain events. Random variables can be categorized into two types: discrete, which take on specific countable values, and continuous, which can assume any value within a given range.

 

Key Points About Random Variables


  • A random variable is a function that maps outcomes of a random process to numerical values.

  • It can be classified as discrete (taking specific values) or continuous (taking values from a continuous range).

  • Random variables help quantify uncertainty and are used to construct probability distributions.

  • They are widely applied in probability theory, data science, and financial risk analysis.


Types of Random Variables

 

Discrete Random Variables


A discrete random variable assigns specific values to different outcomes of an experiment. These values are countable, and each has an associated probability. For example, rolling a six-sided die results in one of six distinct numbers, each with a probability of 1/6.

Example: Suppose a coin is flipped three times, and we define X as the number of heads observed. The function X can take values from the set {0, 1, 2, 3}, corresponding to cases where no heads, one head, two heads, or three heads appear.
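
Viewing X as a function can be made explicit in code. This small sketch (the names are only illustrative) maps each of the eight outcomes of three flips to the number of heads it contains:

```python
from itertools import product

# Sample space for three coin flips: ('H','H','H'), ('H','H','T'), ...
sample_space = list(product("HT", repeat=3))

# The random variable X maps each outcome to a number: the count of heads
def X(outcome):
    return outcome.count("H")

print({outcome: X(outcome) for outcome in sample_space})
# X takes values in {0, 1, 2, 3}
```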

 

Continuous Random Variables


A continuous random variable can take an infinite number of values within a given range. These variables are often used to measure physical quantities such as time, temperature, or weight.

Example: If Y represents the total amount of rainfall in a city over a year, it could be any value within a certain range, such as 25.4 cm or 25.432 cm, meaning the number of possible values is infinite.

 

Example of a Random Variable


A typical example of a random variable comes from tossing coins, and it shows that the values of a random variable need not be equally likely. If the random variable Y is the number of heads we get from tossing two coins, then Y can be 0, 1, or 2: no heads, one head, or both heads on a two-coin toss.

The two coins can land in four equally likely ways: TT, HT, TH, and HH. Therefore P(Y = 0) = 1/4, since only one of these outcomes (two tails, TT) gives no heads. Similarly, the probability of getting two heads (HH) is also 1/4. Exactly one head, however, can occur in two ways, HT and TH, so P(Y = 1) = 2/4 = 1/2.
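
A quick sketch to verify these values by enumerating the four equally likely ways two coins can land:

```python
from itertools import product
from fractions import Fraction

ways = list(product("HT", repeat=2))   # HH, HT, TH, TT: four equally likely ways
for k in range(3):
    count = sum(1 for w in ways if w.count("H") == k)
    print(f"P(Y = {k}) =", Fraction(count, len(ways)))
# P(Y = 0) = 1/4, P(Y = 1) = 1/2, P(Y = 2) = 1/4
```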


PROBABILITY DISTRIBUTION

 

A probability distribution is a list of all of the possible outcomes of a random variable along with their corresponding probability values.

 

Example 1: Tossing a Fair Coin


Let X be the random variable representing the number of heads obtained in a single toss of a fair coin.

Possible values of X:

  • X = 0 (if we get tails)

  • X = 1 (if we get heads)

 

Probability distribution:

X    | 0   | 1
P(X) | 1/2 | 1/2

Example 2: Rolling One Die


Let X be the random variable representing the outcome when rolling a fair six-sided die.

Possible values of X : {1,2,3,4,5,6}

Since each outcome is equally likely, the probability distribution is:

X    | 1   | 2   | 3   | 4   | 5   | 6
P(X) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6

 

Example 3: Rolling Two Dice


Let X be the random variable representing the sum of numbers obtained when rolling two fair six-sided dice. The possible values of X range from 2 to 12.

We determine P(X) by counting the number of ways each sum can occur :

SUM X         | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12
WAYS TO GET X | 1    | 2    | 3    | 4    | 5    | 6    | 5    | 4    | 3    | 2    | 1
P(X)          | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36
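
The table above can be rebuilt programmatically by enumerating all 36 outcomes, for example with collections.Counter (a minimal sketch):

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Count how many of the 36 equally likely outcomes give each sum
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in sorted(ways):
    print(total, ways[total], Fraction(ways[total], 36))
# e.g. a sum of 7 occurs 6 ways, so P(X = 7) = 1/6
```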


PROBABILITY DISTRIBUTION FUNCTION

 

When the number of possible values of a random variable increases, calculating probabilities individually becomes difficult. That’s why we use a Probability Distribution Function (PDF) to simplify the process.

 

  • If a random variable has many possible values, calculating the probability of each one separately is inefficient.

  • A Probability Distribution Function provides a formula or pattern to compute probabilities systematically.

  • It helps in handling large datasets and continuous variables, where exact probability calculations become impractical.
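
As an illustration of the idea, the binomial probability mass function is one such formula: it gives the probability of exactly k heads in n fair coin tosses directly, without listing all 2^n outcomes (the function name below is only illustrative):

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """P(X = k) for the number of successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 2 heads in 3 tosses of a fair coin (matches the earlier 3/8)
print(binomial_pmf(2, 3))   # 0.375
```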


MEAN OF A RANDOM VARIABLE

 

The mean of a random variable, often called the expected value, is essentially the average outcome of a random process that is repeated many times. More technically, it is a weighted average of the possible values of the random variable, where each value is weighted by its probability of occurrence.

 

Example: Rolling a Die


Since each outcome is equally likely, the probability distribution is :

X    | 1   | 2   | 3   | 4   | 5   | 6
P(X) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6


We need to find the mean of the random variable X, which represents the average value over multiple trials.

 

Example Calculation

Suppose we have a set of numbers: 5, 3, 4, 5, 3, 3. The mean of these numbers is calculated as :

Mean = (5 + 3 + 4 + 5 + 3 + 3) / 6 = 23/6 ≈ 3.83

We can also express this calculation differently :

Mean = 5 × (2/6) + 3 × (3/6) + 4 × (1/6) = 23/6 ≈ 3.83, i.e., each distinct value weighted by the fraction of times it occurs.

This approach provides a new perspective on calculating the mean while yielding the same result.

 

Mean of a Fair Die Roll

Since each face of the die appears with a probability of 1/6, the mean value of a die roll is :

E(X) = 1 × (1/6) + 2 × (1/6) + 3 × (1/6) + 4 × (1/6) + 5 × (1/6) + 6 × (1/6) = 21/6 = 3.5

Thus, the expected value (mean) of rolling a fair six-sided die is 3.5.

In general, for a discrete random variable X, E(X) = Σ xᵢ × P(X = xᵢ).
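
The same weighted-average calculation can be written as a short sketch; any discrete distribution supplied as value/probability pairs would work the same way:

```python
from fractions import Fraction

# Probability distribution of a fair die
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

# Expected value: each value weighted by its probability
mean = sum(x * p for x, p in distribution.items())
print(mean)   # 7/2, i.e. 3.5
```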

VARIANCE OF A RANDOM VARIABLE

 

The variance of a random variable is a statistical measure of how far its values spread out around the mean (expected value).

 

Example: Rolling a Die

Since each outcome is equally likely, the probability distribution is:

 

X    | 1   | 2   | 3   | 4   | 5   | 6
P(X) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6

 

Now, we will calculate the variance.

As we know, the standard formula for variance is :

Var(X) = E[(X − μ)²] = Σ (xᵢ − μ)² × P(X = xᵢ), which is equivalent to Var(X) = E[X²] − (E[X])²
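
Applying that formula to the fair die: E[X²] = (1 + 4 + 9 + 16 + 25 + 36) / 6 = 91/6, so Var(X) = 91/6 − (7/2)² = 35/12 ≈ 2.92. The same calculation as a short sketch:

```python
from fractions import Fraction

# Probability distribution of a fair die
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in distribution.items())                     # 7/2
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())   # E[(X - mean)^2]

print(variance)   # 35/12, i.e. about 2.92
```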

CONTINGENCY TABLE

 

A contingency table (also known as a cross-tabulation or crosstab) is a type of table used in statistics to display the frequency distribution of variables. It is commonly used to analyze the relationship between two categorical variables.

 

Structure of a Contingency Table


A contingency table is structured in a matrix format, where:

  • Rows represent categories of one variable.

  • Columns represent categories of another variable.

  • Cells contain the frequency counts (number of occurrences) for each combination of row and column variables.


Example of a Contingency Table


Consider a study that examines the relationship between gender and preference for a product.

Gender | Likes Product | Dislikes Product | Total
Male   | 30            | 20               | 50
Female | 25            | 25               | 50
Total  | 55            | 45               | 100

Types of Contingency Tables


  1. 2×2 Contingency Table – Contains two rows and two columns (excluding totals).

  2. Larger Contingency Tables – Can have multiple rows and columns if more categories are involved.
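
Contingency tables are usually built directly from raw categorical data. Below is a minimal sketch assuming pandas is available; the small dataset is made up purely to mirror the structure of the example above:

```python
import pandas as pd

# Toy data: one row per respondent
df = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female", "Male", "Female"],
    "Preference": ["Likes", "Dislikes", "Likes", "Likes", "Likes", "Dislikes"],
})

# Cross-tabulate the two categorical variables, with row and column totals
table = pd.crosstab(df["Gender"], df["Preference"], margins=True, margins_name="Total")
print(table)
```

pd.crosstab counts the occurrences of each Gender/Preference combination, and margins=True appends the row and column totals.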


