
Probability Part - 2
- Aryan

- Mar 12
- 5 min read
JOINT PROBABILITY
Let's say we have two random variables, X and Y. The joint probability of X and Y, denoted as P(X=x, Y=y), is the probability that X takes the value x and Y takes the value y simultaneously.
Let X be a random variable representing the passenger class (Pclass). It can take values from the set {1,2,3} corresponding to the three passenger classes.
Let Y be a random variable representing the survival status of a passenger. It can take values from the set {0,1}, where 0 indicates that the passenger did not survive, and 1 indicates survival.
To analyze the joint probabilities of these variables, we construct a contingency table based on the Titanic dataset:
Pclass \ Survived    0 (died)    1 (survived)
1                    80          136
2                    97          87
3                    372         119
Understanding the Contingency Table
The contingency table displays the number of passengers corresponding to different combinations of X (Pclass) and Y (Survived).
For example:
The value 80 in the table represents passengers who were in 1st class (Pclass = 1) and did not survive (Survived = 0).
The value 87 represents passengers who were in 2nd class (Pclass = 2) and survived (Survived = 1).
Calculating Joint Probabilities
The joint probability of specific values of X and Y is given by:
P(X = x, Y = y) = (number of passengers with X = x and Y = y) / (total number of passengers)
Using the data from the contingency table:
P(X = 1, Y = 0) = 80/891 ≈ 0.0898
P(X = 2, Y = 1) = 87/891 ≈ 0.0976
Similarly, we can compute joint probabilities for all other combinations of X and Y.
Interpreting the Probability Table
The final probability table represents the joint probabilities of all possible combinations of Pclass and survival status. These probabilities help us understand the relationship between passenger class and survival chances in the Titanic dataset.
Pclass \ Survived    0 (died)    1 (survived)
1                    0.0898      0.1526
2                    0.1089      0.0976
3                    0.4175      0.1336
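If you want to reproduce the joint probability table in code, here is a minimal sketch. It assumes only the cell counts from the contingency table above (not the raw dataset) and uses pandas:

```python
import pandas as pd

# Counts from the contingency table above
# rows: Pclass (1, 2, 3); columns: Survived (0 = died, 1 = survived)
counts = pd.DataFrame(
    [[80, 136],
     [97, 87],
     [372, 119]],
    index=pd.Index([1, 2, 3], name="Pclass"),
    columns=pd.Index([0, 1], name="Survived"),
)

# Joint probability table: divide every cell by the total number of passengers
joint = counts / counts.values.sum()
print(joint.round(4))

# P(X = 1, Y = 0): 1st class and did not survive
print(joint.loc[1, 0])  # ~0.0898
```

If the raw Titanic data is already loaded as a DataFrame df, pd.crosstab(df["Pclass"], df["Survived"], normalize="all") should produce the same joint probability table in one step.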
MARGINAL PROBABILITY
Marginal probability, also known as unconditional probability, refers to the probability of an event occurring irrespective of the outcome of other events. When dealing with random variables, the marginal probability of a variable is simply the probability of that variable taking a certain value, regardless of the values of other variables.
Definition
For a discrete random variable X , the marginal probability of X=x is given by:
P(X = x) = Σ_y P(X = x, Y = y)
Similarly, for a discrete random variable Y, the marginal probability of Y=y is:
P(Y = y) = Σ_x P(X = x, Y = y)
This means that marginal probability is obtained by summing the joint probabilities over the other variable.
Example: Titanic Passenger Data
Let:
X be the Pclass (Passenger Class: 1st, 2nd, or 3rd) of a passenger.
Y be the survival status (0 = died, 1 = survived) of a passenger.
The following table shows the number of passengers categorized by survival status and passenger class, together with the row and column totals:

Pclass \ Survived    0 (died)    1 (survived)    Total
1                    80          136             216
2                    97          87              184
3                    372         119             491
Total                549         342             891
549 represents the total number of people who died, irrespective of their passenger class.
342 represents the total number of people who survived, irrespective of their passenger class.
216 represents the number of people in 1st class, irrespective of whether they survived or not.
891 is the total number of passengers.
Calculating Marginal Probabilities
To find the marginal probability of a category, we divide the total count of that category by the overall total (891).
Probability of Death (Y=0)
P(Y = 0) = 549/891 = 0.6162
Probability of Survival (Y=1)
P(Y = 1) = 342/891 = 0.3838
Probability of being in Pclass 1 (X=1)
P(X = 1) = 216/891 = 0.2424
Probability of being in Pclass 2 (X=2)
P(X = 2) = 184/891 = 0.2065
Probability of being in Pclass 3 (X=3)
P(X = 3) = 491/891 = 0.5511
Interpretation
P(Y=0) = 0.616 means that 61.6% of passengers died, regardless of their passenger class.
P(X=2) = 0.207 means that 20.7% of passengers were in second class, irrespective of whether they survived or not.
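As a quick check of these numbers, here is a small sketch (assuming only the counts from the table above) that computes the marginal probabilities by summing the joint counts over the other variable:

```python
import numpy as np

# Joint counts: rows = Pclass 1, 2, 3; columns = Survived 0, 1
counts = np.array([[80, 136],
                   [97, 87],
                   [372, 119]])
total = counts.sum()                       # 891 passengers

p_survived = counts.sum(axis=0) / total    # P(Y=0), P(Y=1) -> [0.6162, 0.3838]
p_pclass = counts.sum(axis=1) / total      # P(X=1), P(X=2), P(X=3) -> [0.2424, 0.2065, 0.5511]

print(p_survived.round(4))
print(p_pclass.round(4))
```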
CONDITIONAL PROBABILITY
Conditional probability is a measure of the probability of an event occurring, given that another event has already occurred.
If the event of interest is A and event B has already occurred, the conditional probability of A given B is usually written as P(A|B).
P(A|B) = P(A ∩ B) / P(B), where P(B) > 0
Three unbiased coins are tossed. What is the conditional probability that at least two coins show heads, given that at least one coin shows heads?
Conditional Probability of At Least Two Heads Given at Least One Head

Sample Space: {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Total outcomes: 8
Outcomes for B (at least one head): 7
Outcomes for A∩B (at least two heads): 4
P(A|B) = P(A ∩ B) / P(B) = (4/8) / (7/8) = 4/7 ≈ 0.571
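A brute-force way to sanity-check this result is to enumerate all eight outcomes; the sketch below simply counts the outcomes in B and in A ∩ B:

```python
from itertools import product

# All 2**3 = 8 equally likely outcomes of tossing three fair coins
outcomes = list(product("HT", repeat=3))

B = [o for o in outcomes if o.count("H") >= 1]   # at least one head: 7 outcomes
A_and_B = [o for o in B if o.count("H") >= 2]    # at least two heads within B: 4 outcomes

print(len(A_and_B) / len(B))                     # 4/7 ~ 0.5714
```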
Two fair six-sided dice are rolled. What is the conditional probability that the sum of the numbers rolled is 7, given that the first die shows an odd number?

We need to find P(A|B), where:
A is the event that the sum is 7.
B is the event that the first die is odd.
Total outcomes: 36
Favorable cases for B (first die is 1,3,5): 18
Favorable cases for A∩B (sum is 7, first die is odd): 3
P(A|B) = 3/18 = 1/6 ≈ 0.167
Two fair six-sided dice are rolled, denoted as D1 and D2. What is the conditional probability that D1 equals 2, given that the sum of D1 and D2 is less than or equal to 5?
We need to find P(D1=2 | D1 + D2 ≤ 5)
The favorable outcomes for D1 + D2 ≤ 5 are:
(1,1),(1,2),(1,3),(1,4),(2,1),(2,2),(2,3),(3,1),(3,2),(4,1)
Total: 10 cases.
The cases where D1=2 are:
(2,1),(2,2),(2,3)
Total: 3 cases.
Thus, the probability is:
P(D1 = 2 | D1 + D2 ≤ 5) = 3/10 = 0.3
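Both dice answers can be verified the same way, by enumerating the 36 equally likely outcomes and counting the favorable cases for each conditioning event:

```python
from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice
rolls = list(product(range(1, 7), repeat=2))

# P(sum is 7 | first die is odd)
odd_first = [(d1, d2) for d1, d2 in rolls if d1 % 2 == 1]       # 18 outcomes
sum7_odd = [(d1, d2) for d1, d2 in odd_first if d1 + d2 == 7]   # 3 outcomes
print(len(sum7_odd) / len(odd_first))                           # 1/6 ~ 0.1667

# P(D1 = 2 | D1 + D2 <= 5)
small_sum = [(d1, d2) for d1, d2 in rolls if d1 + d2 <= 5]      # 10 outcomes
d1_is_2 = [(d1, d2) for d1, d2 in small_sum if d1 == 2]         # 3 outcomes
print(len(d1_is_2) / len(small_sum))                            # 3/10 = 0.3
```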
INTUITION BEHIND THE CONDITIONAL PROBABILITY FORMULA
The intuition behind the formula for conditional probability, P(A | B) = P(A ∩ B) / P(B), is based on the concept of reducing our sample space.
The denominator, P(B), is the probability of event B occurring. When we say we want to know the probability of A given B, we’re effectively saying that B has occurred and therefore B is our new “universe” or sample space. So we’re not considering cases when B didn’t occur anymore, and we’re normalizing by the probability of B.
The numerator, P(A ∩ B), is the joint probability of A and B, meaning both A and B occur. So in the context of our new universe where B has occurred, P(A ∩ B) represents the cases where A also occurs.
By dividing the joint probability P(A ∩ B) by the probability of B (P(B)), we effectively find the proportion of times that A occurs given that B has occurred.
In summary, the conditional probability of A given B is just the joint probability of A and B happening (A and B together in the original "universe" where everything can happen), normalized by the probability of B (the new "universe" where B is known to have occurred).
Independent Events
Independent events are events where the occurrence of one event does not affect the occurrence of another.
Examples:
Flipping a coin and rolling a die.
Drawing a card with replacement from a deck.
Dependent Events
Dependent events are events where the occurrence of one event does affect the occurrence of another.
Examples:
Drawing a card without replacement from a deck.
Picking a marble from a bag and then picking another without putting the first one back.
Mutually Exclusive Events
Mutually exclusive events are events that cannot both occur at the same time. In other words, if one event occurs, the other cannot.
Examples:
Getting a head and a tail in a single coin flip (only one can happen).
Drawing a red card and a black card in a single draw from a deck.
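One way to make independence concrete: for independent events the joint probability factors as P(A ∩ B) = P(A) · P(B), while for mutually exclusive events P(A ∩ B) = 0. The small simulation sketch below (the choice of events is just illustrative) checks the factorization for a coin flip and a die roll:

```python
import random

random.seed(0)
N = 100_000

heads = six = heads_and_six = 0
for _ in range(N):
    coin = random.choice("HT")      # flip a fair coin
    die = random.randint(1, 6)      # roll a fair die
    heads += (coin == "H")
    six += (die == 6)
    heads_and_six += (coin == "H" and die == 6)

p_a = heads / N                     # P(head) ~ 0.5
p_b = six / N                       # P(six) ~ 1/6
p_ab = heads_and_six / N            # P(head and six)

# For independent events, P(A and B) should match P(A) * P(B) (~1/12 ~ 0.083)
print(round(p_ab, 3), round(p_a * p_b, 3))
```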

BAYES THEOREM
Bayes' Theorem is a fundamental concept in the field of probability and statistics that describes how to update the probabilities of hypotheses when given evidence. It's used in a wide variety of fields, including machine learning, statistics, and game theory.
The theorem is named after Thomas Bayes, who introduced an early version of the rule in his work on probability theory. Bayes' Theorem provides a way to revise existing predictions or theories (update probabilities) given new evidence.
P(A|B) = P(B|A) · P(A) / P(B)
MATHEMATICAL PROOF
By the definition of conditional probability,
P(A|B) = P(A ∩ B) / P(B) and P(B|A) = P(A ∩ B) / P(A).
Rearranging the second equation gives P(A ∩ B) = P(B|A) · P(A).
Substituting this into the first equation yields
P(A|B) = P(B|A) · P(A) / P(B),
which is Bayes' Theorem.
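To see the theorem in action on the Titanic numbers used earlier, here is a small check (the 136 is the number of 1st-class survivors, i.e. 216 − 80 from the contingency table above). It computes P(Pclass = 1 | Survived = 1) both directly and via Bayes' rule:

```python
# Counts taken from the contingency table above
total = 891                  # all passengers
n_first = 216                # passengers in 1st class
n_survived = 342             # passengers who survived
n_first_and_survived = 136   # 1st-class passengers who survived (216 - 80)

p_first = n_first / total
p_survived = n_survived / total
p_survived_given_first = n_first_and_survived / n_first

# Bayes' Theorem: P(Pclass=1 | Survived=1) = P(Survived=1 | Pclass=1) * P(Pclass=1) / P(Survived=1)
via_bayes = p_survived_given_first * p_first / p_survived

# Direct computation from the counts
direct = n_first_and_survived / n_survived

print(round(via_bayes, 4), round(direct, 4))   # both ~ 0.3977
```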