How Strong are Your Statistical Results? The Concept of Effect Size

In a previous post, I spoke about what a statistically significant result means and doesn’t mean. A significant result may mean very little, not necessarily anything to inspire you to throw a party or make a supportive parent proud.

A significant result does not tell you how strong – or meaningful – your result is.

Why can a significant result not be meaningful?

Because the computation of statistical significance depends on your sample size, trivial differences may be significant. For example, with a sample size of 400, a correlation of .1 would be significant (p < .05) yet explain a mere 1% of the variance in your outcome variable. In many situations, such a result is not practically or meaningfully significant. I recently came across a report on a large survey of a few thousand observations. It was littered with ‘p < .001’ and interpretations of the associated significant relationships but never mentioned the small magnitudes and practical meaninglessness of these results.
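If you want to see this arithmetic for yourself, here is a minimal sketch in Python (my illustration, assuming SciPy is available) that computes the two-tailed p-value for a correlation of .1 at several sample sizes:

```python
import numpy as np
from scipy import stats

def p_value_for_r(r, n):
    """Two-tailed p-value for a Pearson correlation r with n observations."""
    t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)  # t statistic with n - 2 df
    return 2 * stats.t.sf(abs(t), df=n - 2)

# The same r = .1 (1% of variance explained) goes from non-significant
# to highly significant purely as the sample grows.
for n in (100, 300, 400, 1000, 3000):
    print(f"n = {n:5d}: p = {p_value_for_r(0.1, n):.4f}")
```

The effect never changes; only the p-value does.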

All a significant result tells you is that your correlation, mean difference, or other effect is probably nonzero, i.e., you probably have some effect. Thus, it is very important to determine the meaningfulness or strength of the effect.

How can we measure the strength of an effect?

Happily, we have the effect size, a safety net which tells us the size of an effect independent of its significance and the sample size. Using the effect size, we can assess how strong a relationship or difference is.

What is the effect size?

The effect size is the meaningfulness or practical significance of a difference or relationship.

The effect size was pioneered by Jacob Cohen, beginning with his 1962 review of statistical power and developed fully in his book Statistical Power Analysis for the Behavioral Sciences. It is one of four interrelated variables that together determine the probability of correctly rejecting the null hypothesis, i.e., of finding a true significant result.

These four variables are the sample size, the significance level, the effect size, and statistical power. Given any three of the four, one can determine the fourth. Thus, in any quantitative dissertation proposal or methodology chapter, one should do a power analysis to estimate the required sample size by specifying the values of the other three variables. Cohen provides tables for these sample size calculations, and software can do the same; G*Power, for example, is freely available power analysis software.
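As an illustration of solving for one variable given the other three, here is a minimal sketch using Python’s statsmodels (one option among many; G*Power offers the same calculation in a graphical interface):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Solve for the sample size per group of an independent-samples t-test,
# given the effect size, significance level, and desired power.
n_per_group = analysis.solve_power(effect_size=0.5,  # medium effect (Cohen's d)
                                   alpha=0.05,       # significance level
                                   power=0.80)       # desired statistical power
print(f"Required sample size per group: {n_per_group:.0f}")  # about 64
```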

Or, at a minimum, perform a post hoc power analysis to calculate the power of your statistical tests retrospectively. Again, software packages offer this at the press of a button.
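The same tool can run the calculation in reverse for a post hoc check: fix the sample size you actually obtained and solve for power. A minimal sketch, with hypothetical numbers:

```python
from statsmodels.stats.power import TTestIndPower

# Post hoc: with 40 participants per group and a medium effect (d = 0.5),
# what power did the test actually have?
achieved_power = TTestIndPower().solve_power(effect_size=0.5, nobs1=40, alpha=0.05)
print(f"Achieved power: {achieved_power:.2f}")  # roughly 0.60
```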

Now back to the effect size.

How do you judge the size of an effect?

In his books, Cohen provides formulae for standardised indices with cutoff values for assessing the magnitude of various effects. Additionally, in 1992, he published a “bare bones treatment” of power analysis and effect size for eight popular statistical tests: the article “A power primer”, which appeared in Psychological Bulletin. It is accessible, friendly, and very useful.

How are the effect size cutoffs determined?

Cohen defines a medium effect size as “likely to be visible to the naked eye of a careful observer”, a small effect size as “noticeably smaller than medium but not so small as to be trivial”, and a large effect size as the same distance above the medium value as the small effect size is below it.

Although he determined these cutoffs subjectively, they have become a convention for assessing the magnitude of relationships and differences. For example, the effect size index for the difference between two independent means, d, has cutoff values of .2, .5, and .8 for small, medium, and large effect sizes, respectively. Cutoffs for the correlation coefficient, r, are .1, .3, and .5 for small, medium, and large relationships, respectively.
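To make the d index concrete, here is a minimal Python sketch (my illustration, with simulated data) that computes Cohen’s d from the pooled standard deviation and labels it with the conventional cutoffs:

```python
import numpy as np

def cohens_d(group1, group2):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * np.var(group1, ddof=1) +
                  (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2)
    return (np.mean(group1) - np.mean(group2)) / np.sqrt(pooled_var)

def label(d):
    """Classify |d| against Cohen's cutoffs of .2, .5, and .8."""
    d = abs(d)
    if d < 0.2:
        return "trivial"
    if d < 0.5:
        return "small"
    if d < 0.8:
        return "medium"
    return "large"

rng = np.random.default_rng(0)
a = rng.normal(100, 15, size=50)  # hypothetical scores, group A
b = rng.normal(92, 15, size=50)   # hypothetical scores, group B
d = cohens_d(a, b)
print(f"d = {d:.2f} ({label(d)})")
```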

Cohen’s 1992 article has indices and cutoff values for six other tests of relationships and differences, and the full range of effect sizes is covered in his books. Effect size calculations are also often provided in the output of statistics programmes.

These effect sizes and their criteria are indispensable in quantitative data analysis. Reporting them is also often a requirement for publishing quantitative manuscripts.

I’ve saved a copy of Cohen’s 1992 article in my ‘Gold’ folder where I keep the academic treasures I come across.

Contact me at [email protected] if you need help with your methodology, quantitative data analysis, or any aspect of your dissertation.