
Reliability and Dependability in Your Dissertation


The reliability and dependability of a beloved other is a privilege. In a dissertation, these qualities are required.

Quantitative dissertations require you to describe reliability; qualitative dissertations require you to speak of dependability.

There are two types of reliability. The first type is the reliability of the overall study; the other type is the reliability of the scores of your measurement instrument. Many students confuse them.

The reliability or dependability of your research

Let us begin with the reliability or dependability of your study as a whole. Here, we are speaking of the trust or trustworthiness of your empirical research.

Even if you have selected your research approach, design, and sampling methods appropriately to answer your research question, and even if the scores of your instruments are reliable, your research may not be considered reliable or dependable if you keep no record of your interviews, if you are vague about whom you interviewed and when the interviews took place, or if there are shortcomings in describing how you performed data cleaning and analysis.

For your research to be considered reliable or dependable, you need to provide details and evidence of your research setting, sample information, procedure, data cleaning methods, findings, and more.

Unreliable research is one of the main reasons for the replication crisis in science, in which many researchers have failed to reproduce the results of others, and even their own. Shortcomings in documentation are also a key reason for the retraction of journal articles.

Keep meticulous records of every aspect of your research and document them in your dissertation, either in the main body or in the appendices. Describe your record keeping in the section of your dissertation under the reliability of your study.

Let us now move on to the reliability of measurement instruments, which concerns quantitative and mixed methods research specifically.

Reliability or dependability of the scores of the measurement instrument

One more clarification. It is incorrect to speak of the reliability of a measurement instrument or test. Rather, we speak of the reliability of the scores of the measurement instrument.

In this context, the reliability of the scores of a measurement instrument refers to their consistency: consistency when the measurement is repeated (test-retest reliability), consistency of scores across different versions of the same test (parallel forms reliability), consistency of scores given by different raters of the same individual (inter-rater reliability), and internal consistency reliability, measured by coefficient alpha, better known as Cronbach’s alpha.

As most quantitative dissertations do not involve repeated measures, parallel forms, or multiple raters, students usually describe the internal consistency reliability of their measurement instruments’ scores, reporting it with Cronbach’s alpha.

However, most students report alpha incorrectly. In fact, alpha is one of the most frequently misunderstood and misinterpreted statistics in theses, dissertations, proposals, and other academic publications involving measurement 🥲.

What is Cronbach’s alpha?

Cronbach’s alpha is a measure of the internal consistency of the scores of a measurement instrument. It is the most commonly reported type of reliability.

If the internal consistency of the scores of a measurement instrument is low, there is heterogeneity (a lack of consistency) among the scores of the instrument’s items, making it uncertain what the total score is measuring and implying the presence of random error in the scores.
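To make the calculation concrete, here is a minimal sketch in Python; the function and the sample responses are invented purely for illustration. Alpha is k/(k - 1) multiplied by (1 minus the sum of the item variances divided by the variance of the total score).

```python
# A minimal sketch of computing Cronbach's alpha from raw item scores.
# The data are hypothetical: 6 respondents answering a 4-item Likert scale.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) score matrix:
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # sample variance per item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the total score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

scores = np.array([
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
])
print(f"alpha = {cronbach_alpha(scores):.2f}")  # ~ .96 here: the items move together
```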

Why is alpha misunderstood?

First, as we have discussed, Cronbach’s alpha refers to the reliability of measurement, not the reliability of the overall research or study. However, many students write about alpha in the section intended to discuss the reliability of the research. The discussion of alpha must be confined to the section on instrumentation.

Second, some students conducting qualitative interviews say they will calculate alpha for the questions in their qualitative interview instrument. However, alpha can only be computed for a quantitative measurement instrument.

Third, internal consistency does not imply unidimensionality. A high alpha value does not mean that the items in an instrument measure the same thing. Rather, it means that each item is correlated with, and measures something similar to, at least some of the other items; the instrument may still be multidimensional.
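A quick simulation makes the point. In this toy example of my own (reusing the cronbach_alpha helper sketched above), two independent latent factors each drive half of an eight-item scale, yet alpha still comes out near .8:

```python
# Toy simulation: a two-dimensional instrument with a respectable alpha.
# Items 1-4 follow one latent factor; items 5-8 follow a second,
# uncorrelated factor. (Reuses cronbach_alpha defined earlier.)
import numpy as np

rng = np.random.default_rng(42)
n = 1000
factor_a = rng.normal(size=n)            # latent trait behind items 1-4
factor_b = rng.normal(size=n)            # independent trait behind items 5-8
noise = 0.6 * rng.normal(size=(n, 8))    # item-specific random error

items = np.empty((n, 8))
items[:, :4] = factor_a[:, None] + noise[:, :4]
items[:, 4:] = factor_b[:, None] + noise[:, 4:]

print(f"alpha = {cronbach_alpha(items):.2f}")  # ~ .79, yet two dimensions
```

A factor analysis of these items would reveal the two dimensions immediately; alpha alone cannot.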

Fourth, unless the measurement instrument is unidimensional (check this via factor analysis or similar), the total score may not be interpretable, even if alpha is high.

Fifth, you cannot assume that an alpha value of .7 (a commonly used criterion) indicates sufficient reliability or internal consistency. The cutoff is arbitrary, and alpha itself increases with the number of items in the instrument, as the sketch below illustrates.
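The item-count effect is easiest to see in the standardized form of alpha, which depends only on the number of items k and the average inter-item correlation. A back-of-the-envelope calculation, with the correlation fixed at an arbitrary .3:

```python
# Standardized alpha as a function of the number of items k, holding the
# average inter-item correlation fixed at r = .3:
#     alpha_std = k * r / (1 + (k - 1) * r)
r = 0.3
for k in (3, 5, 10, 20):
    alpha_std = k * r / (1 + (k - 1) * r)
    print(f"k = {k:2d}  ->  alpha = {alpha_std:.2f}")
# k =  3  ->  alpha = 0.56
# k =  5  ->  alpha = 0.68
# k = 10  ->  alpha = 0.81
# k = 20  ->  alpha = 0.90
```

Nothing about the items improved between k = 3 and k = 20; the scale simply got longer.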

Sixth, a very high alpha value is not necessarily good because it may imply redundant items in the instrument that should be dropped. If alpha has a value of .9 or above, you should drop redundant items from the scale.

Seventh, you cannot assume that published reliability values will apply to your sample and test situation. You need to report the alpha value obtained in your own study alongside the published values and clarify the different samples on which they are based.

Finally, it is inappropriate to calculate the alpha value for a knowledge test comprising different facets or dimensions. Calculate alpha per facet. Alpha is meaningful only if calculated using items designed to measure the same construct.
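In code, calculating alpha per facet simply means slicing out each facet’s item columns and computing alpha on each block. Reusing the two-factor items matrix simulated above, whose two blocks happen to behave like two facets:

```python
# Alpha computed per facet (per block of items) rather than for the
# whole instrument. Reuses `items` and `cronbach_alpha` from above.
for name, cols in {"facet 1": slice(0, 4), "facet 2": slice(4, 8)}.items():
    print(f"{name}: alpha = {cronbach_alpha(items[:, cols]):.2f}")
# facet 1: alpha = 0.92
# facet 2: alpha = 0.92
```

Computed this way, each alpha is both higher and interpretable, because each block measures a single construct.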

Why is the reliability of your scores so important?

First, it is important to have reliable scores because poor score reliability reduces the potential for evaluating change. Any instrument that produces unreliable scores measures inconsistently. Therefore, changes in inconsistent scores cannot imply real change. This is an obvious problem in testing the effect of an intervention.

Second, the error variance in the scores of an unreliable measurement instrument affects what is being measured because the scores are full of random errors. This means that the validity of the scores—what the scores of the scale are supposed to be measuring—is compromised. Therefore, measurement instruments that produce scores with poor reliability will also have compromised validity. Another real problem.

The third, fourth, and fifth reasons also concern random error. Random error, or noise, in unreliable scores reduces the power of statistical tests. Recall that power is the probability of finding a significant difference if it exists. However, this is increasingly difficult with random error in our scores because we need larger differences for significance. Therefore, unreliability reduces the power of statistical tests and, in turn, reduces the associated effect sizes. This also reduces the observed correlations between our scores and other measures, so that the instrument is a poorer predictor of outcomes than it should be. These are serious consequences of research involving poor score reliability.
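Classical test theory expresses the correlation point directly: the observed correlation between two measures is the true correlation between the underlying constructs, shrunk by the square root of the product of the two scores’ reliabilities. A minimal sketch with invented numbers:

```python
# Attenuation of an observed correlation by unreliable scores:
#     r_observed = r_true * sqrt(r_xx * r_yy)
import math

r_true = 0.50              # hypothetical true correlation between constructs
r_xx, r_yy = 0.60, 0.70    # hypothetical reliabilities of the two scores

r_observed = r_true * math.sqrt(r_xx * r_yy)
print(f"observed correlation = {r_observed:.2f}")  # ~ .32, down from .50
```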

In summary, it is essential that your overall research is reliable and that the measurement instruments that you use in your dissertation measure reliably. I hope to have convinced you of this.

Sources:  Bloomberg, 2023; Cortina, 1993; Cronbach, 1951; Gardner, 1995; Kline, 2011; Revelle & Condon, 2019; Taber, 2018.

Contact me at [email protected] if you need help with the methodology, analysis, or any aspect of your dissertation.