Effect size: How strong are your statistical results, even your significant ones?

In a previous post, I discussed what a significant result means and doesn’t mean. A significant result may mean very little: not necessarily anything to inspire you to throw a party or make a supportive parent proud.

What it doesn’t tell you is how strong, or meaningful, your result is.

Why can a significant result not be meaningful?

Because the computation of statistical significance depends on your sample size, trivial differences may be significant. For example, with a sample size of 400, a correlation of .1 would be significant (two-tailed p < .05) but explain a mere 1% of the variance in your outcome variable. In many situations, such a result may not be practically or meaningfully significant. I recently came across a report on a large survey of a few thousand observations that was littered with “p < .001” and interpretations of the associated significant relationships, with no mention of their small magnitudes and practical meaninglessness.
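To see how sample size, not effect strength, drives significance, here is a small sketch in Python (assuming SciPy is available for the t-distribution; the function name is my own). It computes the two-tailed p-value of a fixed correlation of .1 at several sample sizes:

```python
from math import sqrt
from scipy import stats  # assumed available for the t-distribution

def corr_p_value(r, n):
    """Two-tailed p-value for a Pearson correlation r observed in a sample of n."""
    t = r * sqrt(n - 2) / sqrt(1 - r ** 2)
    return 2 * stats.t.sf(abs(t), df=n - 2)

# The same r = .1 (explaining r**2 = 1% of the variance) crosses the
# p < .05 line purely as a function of n:
for n in (100, 300, 400, 1000):
    print(n, round(corr_p_value(0.1, n), 3))
```

The correlation, and hence the 1% of variance explained, never changes; only the p-value does.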

All a significant result tells you is that your correlation, mean difference, or other effect is probably nonzero, i.e., you probably have some effect. However, it is very important to determine the meaningfulness of your result. You need to assess how strong the effect is.

How can we measure the strength of an effect?

Happily, we have the effect size, a safety net that tells us the magnitude of an effect independently of its significance and of the sample size. Through the effect size, we can assess how strong a relationship or difference is.

What is the effect size?

The effect size is the meaningfulness, or practical significance, of a difference or relationship.

The effect size was pioneered by Jacob Cohen, beginning with his 1962 work on power analysis. It is one of four interrelated quantities that together determine the probability of correctly rejecting the null hypothesis, i.e., of finding a true significant result.

These four variables are the sample size, the significance level, the effect size, and the statistical power. Specifying any three of them determines the fourth. Thus, in any quantitative dissertation proposal, one should conduct a power analysis to estimate the required sample size by specifying the values of the other three variables. Cohen provides tables for sample size calculations, and dedicated software can do the same; G*Power is a freely available example.
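As a rough illustration (not a replacement for G*Power or Cohen's tables), the sample size per group for a two-sample t-test can be approximated with the standard normal-approximation formula. This is a sketch under that approximation; the function name is my own:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sample t-test, using the
    normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-tailed alpha
    z_power = z.inv_cdf(power)          # quantile corresponding to desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

# Medium effect (d = .5), alpha = .05, power = .80:
print(n_per_group(0.5))  # about 63; exact t-based tools such as G*Power give 64
```

The approximation slightly underestimates the exact t-based answer, but it makes the interplay of the four quantities concrete: halve the effect size and the required n roughly quadruples.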

Now back to the effect size.

How do you judge the size of an effect?

In his books, Cohen provides formulae for standardised indices, with cutoff values for assessing the magnitude of various effects. Additionally, in 1992 he published a “bare bones treatment” of power analysis and effect size for eight popular statistical tests, titled “Quantitative methods in psychology: A power primer”. It is accessible, friendly, and very useful.

How are the effect size cutoffs determined?

Cohen defines a medium effect size as “likely to be visible to the naked eye of a careful observer”, a small effect size as “noticeably smaller than medium but not so small as to be trivial”, and a large effect size as the same distance above the medium value as the small effect size is below it.

Although he determined these cutoffs subjectively, they have become a convention for assessing the magnitude of relationships and differences. For example, the effect size index for the difference between two independent means, d, has cutoff values of .2, .5, and .8 for small, medium, and large effect sizes, respectively. Cutoffs for the correlation coefficient, r, are .1, .3, and .5 for small, medium, and large relationships, respectively.
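Both the index and the labels are easy to compute by hand. Below is a minimal Python sketch of Cohen's d (mean difference standardised by the pooled standard deviation) together with a helper that applies the conventional cutoffs; the function names and the toy data are my own:

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(x, y):
    """Cohen's d: difference in means standardised by the pooled SD."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)

def size_label(value, cutoffs=(0.2, 0.5, 0.8)):
    """Map |effect size| onto Cohen's conventional labels.
    Default cutoffs are for d; pass (0.1, 0.3, 0.5) for r."""
    small, medium, large = cutoffs
    v = abs(value)
    if v < small:
        return "trivial"
    if v < medium:
        return "small"
    if v < large:
        return "medium"
    return "large"

# Toy example with two small groups of scores:
group_a = [23, 25, 28, 30, 32]
group_b = [20, 22, 24, 27, 29]
d = cohens_d(group_a, group_b)
print(round(d, 2), size_label(d))
```

The same `size_label` helper works for r by passing the correlation cutoffs from the paragraph above.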

Cohen’s 1992 article has indices and cutoff values for six other tests of relationships and differences, and the full range of effect sizes is covered in his books. Effect size calculations are also often provided in the output of statistics programmes.

These effect sizes and their criteria are indispensable in quantitative data analysis. They are also often a requirement for publishable quantitatively based manuscripts. I’ve saved a copy of Cohen’s 1992 article in my ‘Gold’ folder where I keep the academic treasures that I come across.

Contact me at [email protected] if you need help with your quantitative data analysis.