There are several ways that the internal validity of your research can be compromised. You should assess their likelihood before conducting your study and decide afterward whether they were a problem.
History: Events that occur between measurements, or experiences that differ between two groups, can affect the outcome of a study.
Maturation: If your measurements occur over a long time (as in longitudinal studies), your participants may mature and develop physically or psychologically. Over the short term, maturation refers to factors like fatigue, boredom, etc.
Testing: When participants become practiced at the measure of the dependent variable (DV), the threat to internal validity called "testing" has occurred. If people become alerted to the nature or purpose of the study because of the testing situation, any changes in behavior may not be due to the independent variable (IV). It is helpful to use nonreactive measurements.
Instrumentation: Changes in the way that DVs are measured, whether by a mechanical instrument or by a person applying different criteria, make your data inconsistent.
Statistical Regression: Sometimes extreme scores fall far from the mean not because the person measured is really exceptional, but because of high levels of random error. On repeated testing, the scores will migrate toward the mean, where they "should be". You might observe changes that are not experimentally reliable.
Selection: If you choose participants for different groups who are not comparable to begin with, your results will be suspect because differences in measurements may reflect initial differences.
Mortality (or Attrition): If you rely on multiple measurements over a long period of time, you may find that some people do not return for repeated measurements. The people who do not come back may differ in important ways from those who do. Consequently, any conclusions you draw may reflect a different population at the end than at the beginning.
***************************************
Interactions with Selection (based on maturation, history, or treatment): Sometimes we may want to compare groups that, over time, show differences not because of the effect of the IV, but because one group may be affected differently by the threats to internal validity relative to another. For instance, if one group differs from another by being more sensitive to the testing situation, there may be a selection by testing interaction.
Diffusion or Imitation of Treatments: Sometimes if one group becomes aware that another group is getting a different treatment, it might cause the first group to act differently, imitating or copying the behaviors of the second.
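The statistical regression threat described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: each observed score is modeled as a stable "true" score plus independent random error, and the function name, the score scale (mean 100, SD 10), the error size, and the 5% cutoff are all hypothetical choices made for illustration. No treatment is applied between the two tests, yet the group selected for extreme scores on the first test scores closer to the mean on the second.

```python
import random
import statistics


def simulate_regression_to_mean(n=10_000, top_fraction=0.05, seed=0):
    """Return the mean first-test and second-test scores of the people
    who scored highest on the first test (hypothetical illustration)."""
    rng = random.Random(seed)

    # Assumption: observed score = stable true score + random error.
    true_scores = [rng.gauss(100, 10) for _ in range(n)]
    test1 = [t + rng.gauss(0, 10) for t in true_scores]
    test2 = [t + rng.gauss(0, 10) for t in true_scores]

    # Select the participants with the most extreme (highest) first-test scores.
    k = int(n * top_fraction)
    top = sorted(range(n), key=lambda i: test1[i], reverse=True)[:k]

    mean_first = statistics.mean([test1[i] for i in top])
    mean_second = statistics.mean([test2[i] for i in top])
    return mean_first, mean_second


first, second = simulate_regression_to_mean()
print(f"Extreme group, test 1 mean: {first:.1f}")
print(f"Extreme group, test 2 mean: {second:.1f}")
```

Under these assumptions the second mean falls between the overall mean and the first mean. A researcher who had given this extreme group a treatment between the two tests could easily mistake that drop for a treatment effect.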
There are several factors that can reduce a study's external validity, limiting the extent to which you can generalize your results and conclusions beyond the confines of your study. The types of generalization that we are interested in include extending the results to other populations, other situations, and other times.
THREATS BASED ON RESEARCH METHODS
Interaction of Testing and Treatment: Does a pretest sensitize participants to the nature of the research? If so, the results may not generalize to people who are not similarly sensitized.
Interaction of Selection and Treatments: When a treatment is effective only with a specific group of participants (e.g., students), generalization may be undermined. It is not always clear whether you should generalize from one group to another, although experience and research help answer the question.
Reactive Arrangements: If participants begin to change responses due to simple exposure to the research setting, independent of the IV, the generalizability of the results will be suspect. Included in this problem are demand characteristics, which reflect participants' desires to figure out what a study is "really" about.
Multiple Treatment Interference: In repeated measures designs, when testing in one condition influences a participant's responses in a subsequent condition, the generalizability of the results may be problematic.
THREATS BASED ON PARTICIPANTS
Convenience Sampling: When we use a sample at hand rather than random or probabilistic sampling, there is no guarantee that the behaviors of these people represent behaviors of other groups. The limited samples we use in psychology generally involve young, white, well-educated, female college students. (Research with animals often involves only white rats.)
(Last modified January 15, 2004)