ASSESSMENT, MEASUREMENT, EVALUATION & RESEARCH
Threats to External and Internal Validity

Developed by: W. Huitt, J. Hummel, D. Kaeck
Last revised: January, 1999




External Validity

External validity concerns whether the results of an experiment can be generalized to the entire population from which the study's samples were drawn. This question cannot be settled conclusively, either mathematically or by induction. Careful subject selection, control procedures, and the type of design strengthen the inference that the findings are representative, but the induction remains inconclusive. Scientists therefore place great emphasis on replicating results through repeated experimentation to bolster the generalizability of the initial findings.

Internal Validity

Internal validity is even more basic: it concerns whether one can conclude that the independent variable produced the observed differences. External validity is secondary to, and dependent upon, adequate attention to the threats to internal validity. Campbell and Stanley (1963) describe eight such threats.

  1. History effects arise when behavior is measured at different points in time. Observed differences may then reflect the impact of the independent variable, or they may reflect extraneous and unwanted events (e.g., war, famine, or other cultural change) over which the experimenter has no control. History is a threat to conclusions drawn from longitudinal studies. The more time that elapses between measurements, the greater the risk of a history effect. History is NOT a threat to cross-sectional studies, which are conducted at a single point in time.
  2. Maturation alone may produce changes over time (e.g., nervous system growth or fatigue) that lead to behavioral changes unrelated (to an indeterminate degree) to experience or to the impact of an experimental (experiential) variable. The vital and continuing nature-nurture controversy is thus evident here.
  3. The selection threat translates to the cohort effect for developmentalists. In cross-sectional studies, comparing individuals in their 30s with individuals in their 70s on some task or opinionnaire runs the risk that behavioral differences actually have nothing to do with age; rather, they may result from the differing educational, cultural, and nutritional/health habits that characterized living conditions for each generation. Note that a classification variable (age) is being examined within a cross-sectional approach. Many developmental researchers argue that "age" as a variable is irrelevant in any case, since the real interest lies in experiential variables, neuro-maturational factors, and their interplay.
  4. Testing effects consist of either reactivity to being tested or practice/learning from exposure to repeated testing. Longitudinal studies that require participants to take the same tests on a number of occasions are subject to this threat to internal validity.
  5. Statistical regression occurs where repeated measures are used and is particularly evident when participants are selected for study because they are extreme on the classification variable of interest, e.g., intelligence. On retest, extreme scores tend to drift systematically toward the mean rather than remain stable or become more extreme. Regression effects may obscure treatment effects or developmental changes.
  6. Instrumentation problems are more of a concern in longitudinal studies, where, over significant periods of time, researchers leave the study and testing instruments become invalid because of cultural change (tests are typically revised/renormed every 10 years). A change in experimenters may introduce different observers or techniques, which could disrupt the continuity of measurement.
  7. Mortality, or attrition of subjects, is a major threat to a lengthy longitudinal study, since the sample remaining at the end of the study is unlikely to be comparable to the initial sample. For example, the surviving sample is likely to be healthier, more stable in lifestyle, and so forth.
  8. Biases in sample selection are threats to both the cross-sectional and longitudinal approaches, although, because of the cost and time commitment involved, they are especially devastating to conclusions drawn from a longitudinal study. Scientists have questioned the famed Terman study of giftedness on these grounds, since teachers participated in the initial selection of children.
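
The statistical regression threat described in item 5 can be illustrated with a small simulation. The Python sketch below is not part of the original authors' material. It assumes that each observed score is a stable true score plus independent random measurement error, and the population values (a mean of 100, standard deviations of 15 and 10) are arbitrary illustrative choices. No treatment is applied between the two testings, so any change in the selected group's mean is due to regression alone.

```python
import random
import statistics


def simulate_test_retest(n=10_000, true_sd=15.0, error_sd=10.0, seed=0):
    """Return (test1, test2) score lists for n simulated participants.

    Assumption (illustrative only): observed score = stable true score
    + independent measurement error. Nothing changes between testings.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(100.0, true_sd) for _ in range(n)]
    test1 = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    test2 = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return test1, test2


def extreme_group_means(test1, test2, top_fraction=0.10):
    """Select the top scorers on test 1; return their mean on each test."""
    cutoff_index = int(len(test1) * (1.0 - top_fraction))
    cutoff = sorted(test1)[cutoff_index]
    selected = [i for i, score in enumerate(test1) if score >= cutoff]
    mean1 = statistics.fmean(test1[i] for i in selected)
    mean2 = statistics.fmean(test2[i] for i in selected)
    return mean1, mean2


test1, test2 = simulate_test_retest()
first_mean, retest_mean = extreme_group_means(test1, test2)
print(f"Top 10% on test 1, mean at test 1: {first_mean:.1f}")
print(f"Same participants, mean at retest: {retest_mean:.1f}")
```

Under these assumptions, the group selected for its extreme first scores averages closer to the population mean on retest, even though nothing was done to its members. A researcher who had applied a treatment between the two testings could mistake this drift for a treatment effect, or could see a real effect masked by it.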


All materials on this website [http://www.edpsycinteractive.org] are, unless otherwise stated, the property of William G. Huitt. Copyright and other intellectual property laws protect these materials. Reproduction or retransmission of the materials, in whole or in part, in any manner, without the prior written consent of the copyright holder, is a violation of copyright law.