Should we use one-sided or two-sided P values in tests of significance?

John Ludbrook

doi:10.1111/1440-1681.12086

Clin Exp Pharmacol Physiol. 2013 Jun;40(6):357-61. doi: 10.1111/1440-1681.12086.

Author

John Ludbrook¹

Affiliation

¹ Department of Surgery, The University of Melbourne, Melbourne, Victoria, Australia. ludbrook@bigpond.net.au

PMID: 23551169
DOI: 10.1111/1440-1681.12086

Abstract

'P' stands for the probability, ranging in value from 0 to 1, that results from a test of significance. It can also be regarded as the strength of evidence against the statistical null hypothesis (H₀). When H₀ is evaluated by statistical tests based on distributions such as t, normal or Chi-squared, P can be derived from one tail of the distribution (one-sided or one-tailed P), or it can be derived from both tails (two-sided or two-tailed P). Distinguished statisticians, the authors of statistical texts, the authors of guidelines for human and animal experimentation and the editors of biomedical journals give confusing advice, or none at all, about the choice between one- and two-sided P values. Such a choice is available only when there are no more than two groups to be compared. I argue that the choice between one- and two-sided P values depends on the alternative hypothesis (H₁), which corresponds to the scientific hypothesis. If H₁ is non-specific and merely states that the means or proportions in the two groups are unequal, then a two-sided P is appropriate. However, if H₁ is specific and, for example, states that the mean or proportion of Group A is greater than that of Group B, then a one-sided P may be used. The form that H₁ will take if H₀ is rejected must be stipulated a priori, before the experiment is conducted. It is essential that authors state whether the P values resulting from their tests of significance are one- or two-sided.
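
As a rough illustration of the distinction between one- and two-sided P values, the following Python sketch derives both from a single test statistic. It is not taken from the article: it assumes a test statistic z that follows the standard normal distribution under H₀, and the function name `p_values_from_z` and the example value z = 1.80 are hypothetical.

```python
from statistics import NormalDist


def p_values_from_z(z):
    """Return (one-sided upper-tail P, two-sided P) for a standard normal statistic z.

    Assumption for this sketch: z is positive when Group A exceeds Group B.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1

    # One-sided P for the specific H1 "Group A > Group B":
    # area in the upper tail beyond z.
    one_sided = 1.0 - std_normal.cdf(z)

    # Two-sided P for the non-specific H1 "Group A != Group B":
    # area in both tails beyond |z|.
    two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))

    return one_sided, two_sided


# Hypothetical test statistic, chosen only for illustration.
one_sided, two_sided = p_values_from_z(1.80)
print(f"one-sided P = {one_sided:.4f}")
print(f"two-sided P = {two_sided:.4f}")
```

With this hypothetical z, the one-sided P falls below 0.05 while the two-sided P does not, which is why the form of H₁ has to be fixed before the experiment and the kind of P value reported. If the observed difference lies in the direction opposite to that stated in H₁ (a negative z here), the one-sided P exceeds 0.5.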

MeSH terms

Animals
Data Interpretation, Statistical*
Humans
Probability*
Research Design*