Probability ranges from 0 to 1. The probability of an event A and the probability of its complement (A not occurring) sum to 1. For mutually exclusive events A and B, the probability of either A or B occurring is the sum of their probabilities.
Mutually exclusive: Two events are mutually exclusive if, in a single trial, the occurrence of one rules out the occurrence of the other.
For any two events A and B, whether or not they are mutually exclusive, the probability of their union can be expressed in terms of their intersection as
$$ P(A \cup B) = P(A) + P(B) - P(A \cap B) $$
If A and B are mutually exclusive, then $P(A \cap B) = 0$
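As a quick numeric check of the rules above, here is a minimal sketch using P(A) = 0.5, P(B) = 0.2 and P(A ∩ B) = 0.05, the same values used with the Venn diagram. The helper names `union_probability` and `complement_probability` are illustrative, not from any library, and exact fractions are used only to avoid floating-point noise.

```python
from fractions import Fraction

def union_probability(p_a, p_b, p_a_and_b=0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

def complement_probability(p_a):
    """Complement rule: P(not A) = 1 - P(A)."""
    return 1 - p_a

p_a = Fraction(1, 2)         # P(A) = 0.5
p_b = Fraction(1, 5)         # P(B) = 0.2
p_a_and_b = Fraction(1, 20)  # P(A and B) = 0.05

# Overlapping events: subtract the intersection once
p_union = union_probability(p_a, p_b, p_a_and_b)
print(float(p_union))                         # 0.65

# If A and B were mutually exclusive, the intersection term would be 0
print(float(union_probability(p_a, p_b)))     # 0.7

print(float(complement_probability(p_a)))     # 0.5
```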
The reason we subtract $P(A \cap B)$ can be seen from the Venn diagram below. The probabilities of A and B are 0.5 and 0.2, and the probability of both A and B occurring is 0.05. Adding P(A) and P(B) counts the intersection twice, so we subtract it once to avoid double counting.
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib_venn import venn2

# Region sizes: A only = 0.5 - 0.05, B only = 0.2 - 0.05, A and B = 0.05
venn2(subsets=(0.45, 0.15, 0.05), set_labels=('A', 'B'))
plt.show()
Conditional Probability
Conditional probability is often more helpful in explaining a situation than an unconditional probability, because it takes into account what is already known.
Given two events A and B with non-zero probabilities, the probability of A occurring, given that B has occurred, is
$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$
and $$ P(B|A) = \frac{P(A \cap B)}{P(A)} $$
$P(A|B)$, the probability of A given that B occurs, is the ratio of the probability of both A and B occurring, $P(A \cap B)$, to the probability of B occurring, $P(B)$. Thus if A and B are mutually exclusive, $P(A \cap B) = 0$ and the conditional probability is zero: knowing that B occurred rules out A.
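The definition can be sketched in a few lines of Python. The function name `conditional_probability` is hypothetical, and the inputs reuse the Venn diagram values P(A) = 0.5, P(B) = 0.2, P(A ∩ B) = 0.05.

```python
from fractions import Fraction

def conditional_probability(p_a_and_b, p_b):
    """Return P(A|B) = P(A and B) / P(B); undefined when P(B) = 0."""
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_a_and_b / p_b

p_a = Fraction(1, 2)         # P(A) = 0.5
p_b = Fraction(1, 5)         # P(B) = 0.2
p_a_and_b = Fraction(1, 20)  # P(A and B) = 0.05

p_a_given_b = conditional_probability(p_a_and_b, p_b)
p_b_given_a = conditional_probability(p_a_and_b, p_a)
print(float(p_a_given_b))  # 0.25
print(float(p_b_given_a))  # 0.1

# Mutually exclusive events have P(A and B) = 0, so P(A|B) = 0
p_exclusive = conditional_probability(Fraction(0), p_b)
print(float(p_exclusive))  # 0.0
```

Note that $P(A|B)$ and $P(B|A)$ differ here even though they share the same numerator, because they are normalized by different events.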
Example: Consider the case of insurance fraud. The table below gives the number of claims for each insurance type, split into fraudulent and non-fraudulent claims.
import pandas as pd

# Claim counts by insurance type, split by fraud status
df = pd.DataFrame([[6, 1, 3, 'Fraudulent'], [14, 29, 47, 'Not Fraudulent']],
                  columns=['Fire', 'Auto', 'Other', 'Status'])
df

| | Fire | Auto | Other | Status |
|---|---|---|---|---|
| 0 | 6 | 1 | 3 | Fraudulent |
| 1 | 14 | 29 | 47 | Not Fraudulent |
The total number of claims is 100, of which 10 are fraudulent. Thus 10% of all claims are fraud. However, if we know the type of claim (a predictor variable), we can refine our estimate of whether a given claim is fraudulent.
To answer the question "what is the probability that a claim is fraudulent, given that it is a Fire claim?":
$$ P(\text{fraud} \mid \text{fire policy}) = \frac{P(\text{fire} \cap \text{fraud})}{P(\text{fire policy})} $$
$$ P(\text{fraud} \mid \text{fire policy}) = \frac{0.06}{0.20} = 0.30 $$
That is, 30% of claims are fraudulent, given that they are of type Fire.
Here, 30% is called the conditional probability and the overall 10% is called the unconditional or marginal probability. For judging an individual claim, the conditional probability is more informative than the unconditional one: a Fire claim is three times as likely to be fraudulent as a claim picked at random.
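The same calculation can be repeated for every claim type. The sketch below uses plain Python with the counts from the table above; the dictionary layout and variable names are illustrative choices, not part of the original notebook.

```python
from fractions import Fraction

# Claim counts copied from the table above
claims = {
    "Fire":  {"fraud": 6, "not_fraud": 14},
    "Auto":  {"fraud": 1, "not_fraud": 29},
    "Other": {"fraud": 3, "not_fraud": 47},
}

total_claims = sum(sum(counts.values()) for counts in claims.values())
total_fraud = sum(counts["fraud"] for counts in claims.values())

# Marginal (unconditional) probability of fraud
p_fraud = Fraction(total_fraud, total_claims)

# Conditional probability of fraud given each claim type:
# P(fraud | type) = P(type and fraud) / P(type)
p_fraud_given_type = {}
for claim_type, counts in claims.items():
    p_type = Fraction(sum(counts.values()), total_claims)
    p_type_and_fraud = Fraction(counts["fraud"], total_claims)
    p_fraud_given_type[claim_type] = p_type_and_fraud / p_type

print(f"P(fraud) = {float(p_fraud):.3f}")
for claim_type, p in p_fraud_given_type.items():
    print(f"P(fraud | {claim_type}) = {float(p):.3f}")
```

With these counts, Fire claims carry a fraud rate of 0.300, against 0.033 for Auto and 0.060 for Other, while the marginal rate is 0.100.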
Bayesian conditional probability
Bayes' theorem builds on conditional probability, specifically on prior and posterior probabilities. It states that, if A and B are any events whose probabilities are not 0 or 1, then:
$$ P(A|B) = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|\bar A)P(\bar A)} $$
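To make the formula concrete, here is a minimal sketch that reverses the earlier question: given that a claim is fraudulent, how likely is it to be a Fire claim? The mapping A = "Fire claim", B = "fraudulent" and the function name `bayes_posterior` are assumptions made for illustration; the numbers come from the fraud table.

```python
from fractions import Fraction

def bayes_posterior(p_b_given_a, p_a, p_b_given_not_a):
    """P(A|B) from P(B|A), P(A) and P(B|not A), using P(not A) = 1 - P(A)."""
    numerator = p_b_given_a * p_a
    denominator = numerator + p_b_given_not_a * (1 - p_a)
    return numerator / denominator

# A = "claim is a Fire claim", B = "claim is fraudulent"
p_fire = Fraction(20, 100)               # prior P(A): 20 of 100 claims are Fire
p_fraud_given_fire = Fraction(6, 20)     # P(B|A): 6 of 20 Fire claims are fraud
p_fraud_given_not_fire = Fraction(4, 80) # P(B|not A): 1 Auto + 3 Other of 80

p_fire_given_fraud = bayes_posterior(
    p_fraud_given_fire, p_fire, p_fraud_given_not_fire
)
print(p_fire_given_fraud)  # 3/5
```

The result agrees with reading the table directly: 6 of the 10 fraudulent claims are Fire claims.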
In practice, we expand the $\bar A$ case into several alternatives. If $A_1, \dots, A_k$ are mutually exclusive and exhaustive states of nature and $B_1, \dots, B_m$ are $m$ possible mutually exclusive observable events, then,
$$ P(A_i | B_j) = \frac{P(B_j | A_i)P(A_i)}{P(B_j | A_1)P(A_1) + P(B_j | A_2)P(A_2) + ... + P(B_j | A_k)P(A_k)} $$
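The general form can be sketched as a function that normalizes prior times likelihood over all states. Here the states $A_i$ are taken to be the three claim types and the observed event $B_j$ is "the claim is fraudulent"; that mapping and the function name `posteriors` are assumptions for illustration.

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Return {state: P(A_i | B_j)} given priors P(A_i) and likelihoods P(B_j | A_i)."""
    # Numerators of Bayes' theorem, one per state
    joint = {state: likelihoods[state] * priors[state] for state in priors}
    # Denominator: total probability of the observed event B_j
    evidence = sum(joint.values())
    return {state: joint[state] / evidence for state in joint}

# Priors P(A_i): share of each claim type among all 100 claims
priors = {
    "Fire": Fraction(20, 100),
    "Auto": Fraction(30, 100),
    "Other": Fraction(50, 100),
}
# Likelihoods P(B_j | A_i): fraud rate within each claim type
likelihoods = {
    "Fire": Fraction(6, 20),
    "Auto": Fraction(1, 30),
    "Other": Fraction(3, 50),
}

post = posteriors(priors, likelihoods)
for state, p in post.items():
    print(f"P({state} | fraud) = {float(p):.2f}")
```

Because the states are mutually exclusive and exhaustive, the posteriors sum to 1 (here 0.60, 0.10 and 0.30).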
In machine learning terms, consider $A_1, \dots, A_k$ as the $k$ possible classes and $B_j$ as an observed value of a predictor variable. The Naive Bayes classifier estimates the conditional probabilities $P(B_j | A_i)$ from training data and later uses them to predict $P(A_i | B_j)$.