Category:Tests of Statistical Significance

From Displayr

Most analyses are not based on all the relevant data. Surveys, for example, typically involve collecting data from a sample of the total population. Customer databases typically only contain a subset of all the potential customers for an organization.

Analyses based on such subsets will usually produce results that differ from those that would have been obtained if every observation in the population had been included. For example, even though there are more than 300 million Americans, you might interview only 200 of them in a particular survey. It is inevitable that the results from these 200 will differ from the results that would be obtained if all 300 million were interviewed.
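To make the point about samples concrete, the following is a minimal simulation sketch. It assumes a hypothetical characteristic shared by exactly 40% of the population and runs five surveys of 200 people each. The function name, the 40% figure and the number of surveys are illustrative assumptions and do not come from any real study.

```python
import random

def sample_proportions(true_share, sample_size, n_surveys, seed=0):
    """Simulate repeated surveys and return the share observed in each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = []
    for _ in range(n_surveys):
        # Each respondent independently has the characteristic with
        # probability true_share (a stand-in for sampling a huge population).
        hits = sum(rng.random() < true_share for _ in range(sample_size))
        results.append(hits / sample_size)
    return results

# Hypothetical setup: 40% of the population has the characteristic.
estimates = sample_proportions(true_share=0.40, sample_size=200, n_surveys=5)
for i, estimate in enumerate(estimates, start=1):
    print(f"Survey {i}: {estimate:.1%}")
```

Each simulated survey reports a slightly different percentage, even though the underlying population never changes. This variation between samples is what tests of statistical significance are designed to account for.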

Statisticians have developed a variety of rules of thumb to help distinguish between results that are reflective of the population and results that are not. These rules of thumb are variously known as tests of statistical significance, statistical tests, hypothesis tests, and significance tests.

Statistical tests are used in two quite different ways:

  • To test hypotheses that were formulated at the time the research was designed (formal hypothesis testing).
  • To search through large quantities of data and identify interesting patterns (data exploration).

The formal hypothesis testing approach is more prevalent in academic research, and data exploration is more prevalent in commercial research, although both academic and commercial research make use of both approaches.

If you are unfamiliar with the theory of tests of statistical significance, it is important to first read the section on formal hypothesis testing, as it introduces many of the key concepts that determine how data exploration is carried out.

Formal hypothesis testing

Tests of statistical significance were invented to help researchers test whether certain beliefs about how the world works are true. Examples of the types of problems addressed using formal hypothesis tests are:

  • Does social isolation increase the risk of suicide?
  • Will modifying the ingredients in a Big Mac lead to an increase or decrease in sales?

The key outcome of a formal hypothesis test is something called a p-value. Refer to Formal Hypothesis Testing for an explanation of formal hypothesis testing and p-values.
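As a rough illustration of what a p-value measures, the sketch below computes an exact two-sided binomial p-value: the probability of a result at least as far from an even split as the one observed, assuming the true split is 50/50. The scenario (a hypothetical taste test in which 9 of 10 people prefer a modified recipe) and the function name are assumptions made for illustration. This is one simple test among many, not necessarily the one used by any particular software.

```python
from math import comb

def two_sided_p_value(successes, trials):
    """Exact two-sided binomial p-value for the null hypothesis of a 50/50 split."""
    # Under a 50/50 null every sequence of outcomes is equally likely,
    # so the probability of k successes is C(trials, k) / 2**trials.
    distance = abs(successes - trials / 2)
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= distance  # at least as extreme as observed
    )
    return extreme / 2 ** trials

# Hypothetical taste test: 9 of 10 people prefer the modified recipe.
print(two_sided_p_value(9, 10))  # 22/1024, about 0.021
# An even 5 of 10 split is as unremarkable as a result can be.
print(two_sided_p_value(5, 10))  # 1.0
```

A small p-value, such as the 0.021 in the first case, means that a result this lopsided would be rare if there were really no preference either way.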

Data exploration

Most analyses of surveys involve the creation and interpretation of tables, which are usually referred to as crosstabs. An example is shown below. It is standard practice in the analysis of surveys to use statistical tests to automatically read tables, identifying results that warrant further investigation. In this example, arrows are used to indicate the results of the significance tests. Other approaches are to use colors and letters of the alphabet. See Crosstabs at mktresearch.org for a gentle introduction to reading tables and Statistical Tests on Tables for more information on how statistical tests are displayed on tables.

[Image: TableArrowsNoCorrection.png]

The p-values computed by statistical tests rely on a variety of technical assumptions. One of these assumptions is that only a single statistical test is being conducted in the entire study. That is, the standard tests used to compute p-values all assume that the testing is not being used for data exploration. See Multiple Comparisons (Post Hoc Testing) for more detail on this problem and its solutions.
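The sketch below gives a rough sense of why the single-test assumption matters. It computes the chance that at least one test in a study comes out significant purely by chance, under two simplifying assumptions that are not from the text above: the tests are independent of one another, and there are no real differences in the population. The function name and the 0.05 significance level are illustrative choices.

```python
def chance_of_false_alarm(n_tests, alpha=0.05):
    """Probability that at least one of n_tests is significant by chance alone."""
    # Assumes independent tests and that no real differences exist, so each
    # test has probability alpha of being flagged as significant.
    return 1 - (1 - alpha) ** n_tests

for n_tests in (1, 5, 20, 100):
    print(f"{n_tests:>3} tests: {chance_of_false_alarm(n_tests):.0%}")
```

Under these assumptions a single test has a 5% chance of a false alarm, but with 20 tests the chance of at least one rises to roughly 64%. A typical crosstab contains many more than 20 cells, which is why results flagged during data exploration warrant further investigation rather than immediate acceptance.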

Also known as

Hunting through data looking for patterns is also known as exploratory data analysis, data mining, data snooping and data dredging.