Aaron Zhu

Summary

Hypothesis testing in statistics is a method for making inferences about population parameters using sample data, involving the specification of null and alternative hypotheses and assessing the likelihood of observing the sample data under the null hypothesis against a predetermined significance level to make evidence-based decisions.

Abstract

The article "What is Hypothesis Testing in Statistics: Motivation and Interpretation" delves into the concept of hypothesis testing as a fundamental statistical technique for decision-making based on sample data rather than intuition. It emphasizes the importance of evidence-based conclusions by comparing sample statistics to population parameters through the framework of null and alternative hypotheses. The process acknowledges the possibility of errors, specifically Type I and Type II errors, and uses the significance level (α) to control the probability of incorrectly rejecting a true null hypothesis. The article also discusses the practicality of hypothesis testing when collecting complete data is impractical, the role of the critical value in decision-making, and the various fields where hypothesis testing is applied, including medicine, engineering, and social sciences.

Opinions

  • The author posits that hypothesis testing is essential due to the impracticality of collecting complete population data, which can be costly and time-consuming.
  • It is noted that decisions should not be based solely on sample means that seem to favor the alternative hypothesis; instead, a structured approach using hypothesis testing is necessary to avoid errors.
  • The significance level (α) is a predetermined threshold that quantifies the acceptable risk of making a Type I error, with common values being 0.1, 0.05, and 0.01.
  • The article suggests that the use of hypothesis testing is superior to gut feeling as it provides a systematic way to manage the risk of making incorrect decisions.
  • The author advocates for the power of hypothesis testing in making informed decisions across various disciplines, highlighting its role in enabling statisticians to determine the validity of claims with limited data.
  • The piece encourages readers to critically evaluate statistical claims by considering whether they are supported by hypothesis testing.
  • The author provides further reading on related statistical concepts, indicating a commitment to educating readers on the broader applications and implications of statistical analysis.

What is Hypothesis Testing in Statistics: Motivation and Interpretation

The Power of Evidence-Based Decision Making

Photo by Scott Graham on Unsplash

Are you curious about how statisticians determine whether a claim is true or false? Do you want to know how statisticians make decisions based on data instead of just gut feeling? Look no further, because today we’re diving into the world of hypothesis testing!

What is Hypothesis Testing in Statistics?

In statistics, hypothesis testing is a method for making inferences about a population parameter based on a sample statistic.

It involves specifying a null hypothesis (e.g., no difference or relationship between the variables being tested) and an alternative hypothesis (which represents the opposite assumption). For example,

  • The null hypothesis: The average household income in Los Angeles is $80,000 (in other words, it is not different from $80,000).
  • The alternative hypothesis: The average household income in Los Angeles is greater than $80,000.

If our sample data favors the alternative hypothesis (i.e., the probability of observing data at least as extreme as our sample, assuming the null hypothesis is true, is lower than the significance level, α), we reject the null hypothesis.

Otherwise, we don’t reject the null hypothesis because we don’t have sufficient statistical evidence to do so.
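The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: it assumes a right-tailed Z test with a known population standard deviation, and the sample mean, σ, and sample size are made-up numbers chosen only for the household income example.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, p_value, reject) for H0: mean = mu0 vs. H1: mean > mu0."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Probability, assuming H0 is true, of a sample mean at least this large.
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value, p_value < alpha


# Hypothetical inputs (assumptions for illustration, not real survey data).
z, p_value, reject = right_tailed_z_test(
    sample_mean=85_000, mu0=80_000, sigma=30_000, n=100
)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

With these assumed inputs the p-value falls just under 0.05, so the null hypothesis is rejected at α = 0.05 but would not be rejected at α = 0.01.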

What is the motivation for Hypothesis Testing?

But why do we need hypothesis testing? Can’t we just collect all the data and make a decision then?

Unfortunately, it’s not always practical or possible to collect all the data. Plus, collecting all the data can be costly and time-consuming. Hypothesis testing allows us to make decisions and draw conclusions with our limited data.

Can we just compute a statistic from the sample (e.g., the sample mean) and reject the null hypothesis whenever it appears to favor the alternative hypothesis?

The answer is obviously NO.

A sample statistic will almost never exactly match the population parameter, and the conclusions we draw might vary from one sample to another. One sample might favor the null hypothesis while another sample might support the alternative hypothesis.

Since we’re using sample information to draw a conclusion about the population parameter, we need to consider the possibility of errors.

  • A type I error means rejecting the null hypothesis when it is actually true (i.e., a false positive). Its probability is denoted by α.
  • A type II error means failing to reject the null hypothesis when it is false (i.e., a false negative). Its probability is denoted by β.

Hypothesis testing is a useful tool for making decisions based on sample data while quantifying and controlling the likelihood of these errors.
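To make the two error types concrete, here is a rough sketch that computes both probabilities for a simple rule: "reject the null hypothesis when the sample mean exceeds a threshold". Everything numeric here is an assumption for illustration: a normally distributed sample mean, σ = 30,000, n = 100, and a hypothetical true mean of $88,000 for the case where the null hypothesis is false.

```python
from math import sqrt
from statistics import NormalDist


def error_probabilities(threshold, mu0, mu_true, sigma, n):
    """Error rates for the rule 'reject H0 when the sample mean exceeds threshold'."""
    se = sigma / sqrt(n)
    # Type I: H0 is true (mean = mu0), yet the sample mean lands above the threshold.
    type_1 = 1.0 - NormalDist(mu0, se).cdf(threshold)
    # Type II: the mean is really mu_true, yet the sample mean stays at or below the threshold.
    type_2 = NormalDist(mu_true, se).cdf(threshold)
    return type_1, type_2


# Hypothetical thresholds and parameters (assumptions, not from the article).
results = []
for threshold in (83_000, 85_000, 87_000):
    t1, t2 = error_probabilities(
        threshold, mu0=80_000, mu_true=88_000, sigma=30_000, n=100
    )
    results.append((threshold, t1, t2))
    print(f"threshold ${threshold:,}: P(type I) = {t1:.3f}, P(type II) = {t2:.3f}")
```

Raising the threshold lowers the chance of a type I error but raises the chance of a type II error, which is why the threshold should be chosen deliberately rather than by gut feeling.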

What is the predetermined significance level (α) in Hypothesis Testing?

Is the significance level the probability of the null hypothesis being true or false?

The answer is NO.

But what exactly is the significance level (α)?

Before a hypothesis test, we acknowledge that there is a probability of making a type I error when the null hypothesis is correct because we’re only relying on the limited sample data to make a conclusion about the entire population.

We can predetermine the probability of making a type I error by setting the significance level (α).

The typical significance levels are 0.1, 0.05, and 0.01. For example, α = 0.05 means that if the null hypothesis is true and we repeat the hypothesis test many times with different sets of sample data, we expect to incorrectly reject the null hypothesis about 5% of the time (type I error) and to correctly not reject it about 95% of the time.
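This repeated-sampling interpretation can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: incomes are drawn from a normal distribution (real incomes are skewed), the null hypothesis is true by construction, and σ, n, the number of trials, and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulate_rejection_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Share of right-tailed Z tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Draw a fresh sample from a population whose mean really is mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        if z > z_crit:
            rejections += 1
    return rejections / trials


rate = simulate_rejection_rate(
    mu0=80_000, sigma=30_000, n=50, alpha=0.05, trials=4_000
)
print(f"Share of tests that rejected a true H0: {rate:.3f}")
```

The printed share should land close to 0.05, though not exactly on it, because the simulation itself is based on a finite number of random samples.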

When to reject or not reject the null hypothesis?

In our example above,

  • The null hypothesis: The average household income in Los Angeles is $80,000.
  • The alternative hypothesis: The average household income in Los Angeles is greater than $80,000.

Suppose we collect sample data and compute the sample average. How high does the sample average need to be for us to reject the null hypothesis?

Is $90,000 high, or is $85,000 high enough? We would definitely need more than just gut feeling. It’s important to come up with a reasonable threshold.

In hypothesis testing, the significance level (α) gives us the necessary information to calculate the critical value (threshold) which we can use to determine whether we would reject or not reject the null hypothesis.

One advantage of using this method is we are able to manage the potential risk of making type I errors when using the critical value to make decisions.

Image by author

How to Find a Critical Value in hypothesis testing?

For the right-tailed test in our example, the critical value for the sample mean is μ + Z × σ / √n, where:

  • μ is the population parameter (e.g., population mean) in the null hypothesis.
  • σ is the population standard deviation. If σ is unknown, we can estimate it using the sample standard deviation, s.
  • n is the sample size.
  • Z is the Z value (critical Z-score) associated with a given α. If σ is unknown or the sample size is less than 30, we would use the t statistic instead to produce a more reliable result.
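As a rough sketch of how these pieces combine, the snippet below assumes the critical value for a right-tailed test takes the usual form μ + Z × σ / √n, with Z taken from the standard normal distribution. The values of σ and n are hypothetical, and the function name is just an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def critical_sample_mean(mu0, sigma, n, alpha=0.05):
    """Threshold above which the sample mean leads to rejecting H0 (right-tailed Z test)."""
    # Z value that leaves a probability of alpha in the upper tail.
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return mu0 + z_alpha * sigma / sqrt(n)


# Hypothetical inputs (assumptions): sigma = 30,000 and n = 100.
threshold = critical_sample_mean(mu0=80_000, sigma=30_000, n=100, alpha=0.05)
print(f"Reject H0 when the sample mean exceeds ${threshold:,.0f}")

for sample_mean in (85_000, 84_000):
    decision = "reject H0" if sample_mean > threshold else "do not reject H0"
    print(f"sample mean ${sample_mean:,}: {decision}")
```

Under these assumptions the threshold sits a little below $85,000, so a sample mean of $85,000 leads to rejection while $84,000 does not. If σ were estimated from the sample, the t distribution would replace the normal distribution here (for example via `scipy.stats.t.ppf`, which this sketch does not use).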

As you can see, hypothesis testing is a powerful tool that allows us to make decisions and draw conclusions with limited data. It’s used in a wide range of fields such as medicine, engineering, and social sciences. Next time you see a statistic being thrown around, ask yourself if it’s supported by a hypothesis test!

