Beware of False Positives
At the recent Data Science Summit, keynote speaker Nate Silver, founder of FiveThirtyEight, warned folks who want to use analytic modeling to drive decision-making. Nate talked about the dark side of statistical modeling and the potential costs of Type I and Type II errors (oh, oh, here comes a lecture on statistics!!). A key aspect of using statistical modeling to drive decision-making is to understand 1) the cost of making inaccurate decisions (Type I and Type II errors), and 2) the ramifications of spending more money to reduce the probability of making those errors. Note: a special thank you again to Dr. Pedro DeSouza for his help to ensure the accuracy and legibility of this blog.
Explaining Type I and Type II Errors
In statistical test theory, the notion of statistical error is an integral part of hypothesis testing. The statistical test requires an unambiguous statement of a null hypothesis, which usually corresponds to a default “state of nature” (e.g., “this person is healthy”, “this accused is not guilty”, or “this product is not broken”). The alternative hypothesis is the negative or opposite of the null hypothesis (i.e., “this person is not healthy”, “this accused is guilty” or “this product is broken”). Two types of errors can occur as a result of the statistical model:
- A Type I error (false positive or false alarm) occurs when the null hypothesis is true, but it is rejected. In other words, the statistical model indicates that a given condition is present (condition is true, rejecting the null hypothesis) when it actually is not present.
- A Type II error (false negative) occurs when the null hypothesis is false but the test fails to reject it.
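To make the two error types concrete, here is a minimal simulation sketch (my own illustration, not from the talk): a simple two-sided test of a mean, run many times, once with the null hypothesis actually true and once with it actually false.

```python
import random
import statistics

# Illustration only (assumed setup, not from the original post).
# Null hypothesis H0: the population mean is 0. We reject H0 when the
# sample mean is far from 0 relative to its standard error.

random.seed(42)

def rejects_null(sample, mu0=0.0, z_crit=1.96):
    """Two-sided z-style test: reject H0 if the sample mean is far from mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return abs(mean - mu0) / se > z_crit

trials, n = 2000, 30

# Case 1: H0 is true (true mean really is 0). Any rejection is a Type I error.
type1 = sum(
    rejects_null([random.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(trials)
)

# Case 2: H0 is false (true mean is 0.5). Failing to reject is a Type II error.
type2 = sum(
    not rejects_null([random.gauss(0.5, 1.0) for _ in range(n)])
    for _ in range(trials)
)

print(f"Type I rate  (false positives): {type1 / trials:.3f}")
print(f"Type II rate (false negatives): {type2 / trials:.3f}")
```

With a 1.96 cutoff, the simulated Type I rate lands near the textbook 5%, while the Type II rate depends entirely on the true effect size and sample size.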
Based on the real-life consequences of an error, one type may be more serious than the other. For example, NASA engineers would prefer to throw out an electronic circuit that is really fine:
The null hypothesis is that the circuit is fine, and in reality the circuit is truly fine. But the statistical model indicates that it might be broken so the circuit is thrown out. This is a Type I error, incorrectly rejecting the null hypothesis when it is true.
But it is better to throw out a good circuit that the analytic model flags as possibly broken than to fly a broken circuit on a spacecraft because the model deems it likely fine:
The null hypothesis is that the circuit is fine when in reality it is broken. But the statistical model indicates that it is likely not broken so the circuit is used anyway. This is a Type II error, failing to reject the null hypothesis when it is false.
In this situation, a Type I error raises the project cost, but a Type II error could be catastrophic and risk the entire mission.
The chart below shows the four possible states that can come out of an analytic model, and where the Type I and Type II errors fall.
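The four states in the chart map onto the familiar confusion-matrix vocabulary; a tiny sketch of that mapping (standard terminology, my own layout rather than the post's chart):

```python
# The four possible outcomes of a test against a null hypothesis H0.
# Keys: (true state of nature, model's decision) -> outcome label.
states = {
    ("H0 true",  "fail to reject H0"): "correct (true negative)",
    ("H0 true",  "reject H0"):         "Type I error (false positive)",
    ("H0 false", "fail to reject H0"): "Type II error (false negative)",
    ("H0 false", "reject H0"):         "correct (true positive)",
}

for (truth, decision), outcome in states.items():
    print(f"{truth:9} | {decision:18} -> {outcome}")
```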
Cost of Type I and Type II Errors
To determine the economic cost of statistical modeling effectiveness, we first need to quantify the costs of Type I and Type II errors. Second, we need to determine the costs and benefits of investing more money and time to improve the statistical model to reduce the Type I and Type II errors.
For example, let’s say that we are trying to determine the potential impact of Type I and Type II errors on the decision of whether a woman should undergo further cancer treatment. A medical test would be run to estimate the likelihood that the woman has cancer (based upon parental history, lifestyle, and potentially other predictive variables). The results of the medical test and the corresponding statistical model would yield the following four statistical states (see table below):
The first step is to understand the “costs” associated with the Type I and Type II errors that would come out of the statistical model (see table below):
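As a sketch of that first step, here is how error rates and per-error costs combine into an expected cost per screened patient. All numbers below are invented for illustration; they are not from the post's table:

```python
# Hypothetical numbers (assumptions for illustration, not real medical data).
p_cancer = 0.01          # prevalence in the screened population (assumed)
type1_rate = 0.08        # P(test flags cancer | healthy)  -- false positive rate
type2_rate = 0.10        # P(test misses cancer | cancer)  -- false negative rate
cost_type1 = 5_000       # unnecessary follow-up treatment and worry
cost_type2 = 500_000     # missed cancer: delayed, far costlier treatment

# Expected error cost = P(error) x cost(error), summed over both error types.
expected_cost = (
    (1 - p_cancer) * type1_rate * cost_type1   # false-positive contribution
    + p_cancer * type2_rate * cost_type2       # false-negative contribution
)
print(f"Expected error cost per patient screened: ${expected_cost:,.2f}")  # $896.00
```

Even with these made-up figures, the structure of the calculation shows why a rare but catastrophic Type II error can dominate the total cost despite its low probability.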
The second step would be to determine the additional costs of gathering more data in order to reduce the Type I and Type II errors. It is key to note that there is likely no way to totally eliminate these errors, so it becomes a trade-off between the costs of the errors themselves and the costs of reducing them (see chart below).
Both Type I and Type II errors cause problems for individuals and corporations. A false positive in medicine (with a null hypothesis of health) causes unnecessary worry or treatment, while a false negative gives the patient the dangerous illusion of good health, and the patient might not get an available treatment. A false positive in manufacturing quality control (with a null hypothesis of a product being well-made) discards a product that is actually well made, while a false negative stamps a broken product as operational. A false positive in scientific research (with a null hypothesis of no effect) suggests an effect that is not actually there, while a false negative fails to detect an effect that is there.
Minimizing statistical modeling errors is not a simple issue; for any given sample size, the effort to reduce one type of error generally increases the other. Whether to spend the extra money to improve your statistical model depends on the costs associated with Type I and Type II errors, and on the costs to reduce those errors.
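That trade-off can be seen directly under a standard normal model (an assumption I am making for illustration): sweeping the rejection threshold for a fixed sample size drives the Type I rate down while the Type II rate climbs.

```python
from math import erf, sqrt

# Assumed setup (illustration only): two-sided test of a mean with fixed
# n = 30, known sigma = 1, and a true effect of 0.5 under the alternative.
# Tightening z_crit lowers alpha (Type I) but raises beta (Type II).

def normal_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

n, effect, sigma = 30, 0.5, 1.0
shift = effect * sqrt(n) / sigma  # standardized true effect under H1

alphas, betas = [], []
for z_crit in (1.28, 1.64, 1.96, 2.58):
    alpha = 2 * (1 - normal_cdf(z_crit))                              # Type I rate under H0
    beta = normal_cdf(z_crit - shift) - normal_cdf(-z_crit - shift)   # Type II rate under H1
    alphas.append(alpha)
    betas.append(beta)
    print(f"z_crit={z_crit:.2f}  Type I={alpha:.3f}  Type II={beta:.3f}")
```

As the loop tightens the threshold, alpha falls monotonically and beta rises, which is exactly the tension the paragraph above describes: for a fixed sample size, you can only trade one error for the other.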