Hypothesis testing
The goal of hypothesis testing is to answer a simple yes / no question about a population parameter. There are two types of hypotheses: $H_{0}$, the null hypothesis, and $H_{A}$, the alternative hypothesis.
The steps followed are:
- set up the hypothesis (null, alternate)
- choose the significance level $\alpha$ (the corresponding confidence level is $1-\alpha$)
- determine the rejection region (on the z curve)
- compute the test statistic (a z score, from which the p value follows)
- make a decision
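The rejection-region step can be sketched in a few lines of Python. This is a minimal illustration, assuming a z-based test and using only the standard library; the function name `rejection_region` and the tail labels `"right"`, `"left"` and `"two"` are hypothetical names chosen for this sketch.

```python
from statistics import NormalDist


def rejection_region(alpha, tail):
    """Return the critical z value(s) that bound the rejection region.

    tail is "right", "left" or "two" (labels assumed for this sketch).
    """
    z = NormalDist()  # standard normal: mean 0, SD 1
    if tail == "right":
        # Reject H0 when the test statistic is above this value.
        return (z.inv_cdf(1 - alpha),)
    if tail == "left":
        # Reject H0 when the test statistic is below this value.
        return (z.inv_cdf(alpha),)
    if tail == "two":
        # alpha is split evenly between the two tails.
        c = z.inv_cdf(1 - alpha / 2)
        return (-c, c)
    raise ValueError("tail must be 'right', 'left' or 'two'")


for tail in ("right", "left", "two"):
    print(tail, [round(c, 3) for c in rejection_region(0.05, tail)])
```

The direction of the inequality in $H_{A}$ decides which tail to use: $>$ gives a right-tailed test, $<$ a left-tailed test, and $\ne$ a two-tailed test.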
Rules in hypothesis testing
- No equal sign in $H_{A}$. Only $\ne, <, >$ signs
- Put what you want to test in $H_{A}$, unless that would violate rule 1 (the claim contains an equal sign); in that case put it in $H_{0}$
- Believe $H_{0}$ unless the dataset shows otherwise
- When we make our decision, we either reject $H_{0}$ or fail to reject it.
Examples of formulating hypotheses
- Nitrate levels are unsafe if > 10 ppm. Test if our water is unsafe on average.
- $H_{0} => \mu \le 10 \text{ ppm}$
- $H_{A} => \mu > 10 \text{ ppm}$
- Test if a coin is fair.
- $H_{0} => p(h)=0.5$
- $H_{A} => p(h)\ne 0.5$ The equality goes in $H_{0}$ because the alternative hypothesis cannot contain an equal sign.
Type 1 and type 2 errors
For a jury trial, our motto is innocent until proven guilty. Hence
$H_{0} => \text{innocent}$, since we either reject $H_{0}$ or fail to reject it (we never prove it)
$H_{A} => \text{guilty}$
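Type 1 error: False positive
- we reject $H_{0}$ when it is actually true (the jury convicts an innocent person)
- $\alpha$ = p(type 1 error) = p(rejecting $H_{0}$ when it is actually true)
Type 2 error: False negative
- we fail to reject $H_{0}$ when $H_{A}$ is true (the jury acquits a guilty person)
- $\beta$ = p(type 2 error) = p(failing to reject $H_{0}$ when $H_{A}$ is true)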
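In practice, we fix $\alpha$ (commonly at $0.05$) and calculate the power of the test, $\beta' = (1-\beta)$, which is the probability of correctly rejecting $H_{0}$ when $H_{A}$ is true.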
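To make the relationship between $\alpha$, $\beta$ and power concrete, here is a minimal sketch for a right-tailed z-test. It assumes the population standard deviation is known and that a specific true mean under $H_{A}$ is given; the function name `right_tailed_power` and all the numbers below are hypothetical values chosen for illustration, not data from this section.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_power(mu0, mu_true, sigma, n, alpha):
    """Power (1 - beta) of a right-tailed z-test of H0: mu <= mu0."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # reject H0 when TS > z_crit
    # If the true mean is mu_true, the test statistic is centred here:
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # Probability that TS lands in the rejection region.
    return 1 - z.cdf(z_crit - shift)


# Hypothetical inputs, for illustration only.
alpha = 0.05
power = right_tailed_power(mu0=10.0, mu_true=10.5, sigma=1.0, n=25, alpha=alpha)
beta = 1 - power
print(f"alpha = {alpha}, beta = {beta:.3f}, power = {power:.3f}")
```

When the true mean equals the hypothesized mean, the function returns $\alpha$ itself, since rejecting $H_{0}$ is then a type 1 error. Raising $n$ or moving the true mean further from the hypothesized one increases the power.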
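Testing your hypothesis
You calculate the test statistic as $$TS = \frac{\bar x - \mu}{\frac{s}{\sqrt n}}$$ where $\bar x$ is the sample mean, $\mu$ is the hypothesized mean from $H_{0}$, $s$ is the sample standard deviation and $n$ is the sample size. You either reject or fail to reject $H_{0}$ by comparing the TS against the critical z value for the chosen $\alpha$ (equivalently, by comparing the p value of the TS against $\alpha$).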
Example
A sample of 49 batteries is tested for lifetime. The sample SD is 15.0 hours and the mean longevity is 1006.2 hours. Is it possible to claim the batteries last longer than 1000 hours on average?
$\bar x = 1006.2$, $n=49$, $s=15$, $\alpha = 0.01$ assumed. $$H_{0} => \mu \le 1000$$ $$H_{A} => \mu > 1000$$
Find the test statistic $$TS = \frac{1006.2-1000}{15/\sqrt{49}}$$ $$TS=2.89$$ This is a right-tailed test because $H_{A}$ uses $>$: we reject $H_{0}$ if the test statistic exceeds the critical z score for the chosen $\alpha$.
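The critical z score for $\alpha=0.01$ in a right-tailed test is 2.326 (2.576 is the two-tailed value, which matches a 99% CI). The TS is greater than the critical z score, so we reject $H_{0}$: the mean battery life is significantly greater than 1000 hours at the 1% level.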
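The battery example can be checked numerically. The sketch below uses only Python's standard library and, like the worked example, treats the test statistic as a z score (a reasonable approximation for $n=49$). It uses the one-sided critical value for $\alpha=0.01$, since the test is right-tailed; the variable names are choices made for this sketch.

```python
from math import sqrt
from statistics import NormalDist

# Values from the battery example.
x_bar, mu0, s, n, alpha = 1006.2, 1000.0, 15.0, 49, 0.01

z = NormalDist()  # standard normal: mean 0, SD 1

ts = (x_bar - mu0) / (s / sqrt(n))  # test statistic
z_crit = z.inv_cdf(1 - alpha)       # one-sided critical value
p_value = 1 - z.cdf(ts)             # right-tail area beyond TS

reject_h0 = ts > z_crit
print(f"TS = {ts:.2f}, critical z = {z_crit:.3f}, p value = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Comparing the TS with the critical value and comparing the p value with $\alpha$ always lead to the same decision; here both say to reject $H_{0}$.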