Wednesday, 29 October 2008

M ED 1.11 Validity in Research Thesis

Statistical Validity

In psychology, validity has two distinct fields of application. The first involves test validity, a concept that has evolved with the field of psychometrics but which textbooks still commonly gloss over in explaining that it is the degree to which a test measures what it was designed to measure. The second involves research design. Here the term refers to the degree to which a study supports the intended conclusion drawn from the results. In the Campbellian tradition, this latter sense divides into four aspects: support for the conclusion that the causal variable caused the effect variable in the specific study (internal validity), support that the same effect generalizes to the population from which the sample was drawn (statistical conclusion validity), support for the intended interpretation of the variables (construct validity), and support for the generalization of the results beyond the studied population (external validity).

An early definition of test validity identified it with the degree of correlation between the test and a criterion. Under this definition, one can show that reliability of the test and the criterion places an upper limit on the possible correlation between them (the so-called validity coefficient). Intuitively, this reflects the fact that reliability involves freedom from random error and random errors do not correlate with one another. Thus, the less random error in the variables, the higher the possible correlation between them. Under these definitions, a test cannot have high validity unless it also has high reliability. However, the concept of validity has expanded substantially beyond this early definition and the classical relationship between reliability and validity need not hold for alternative conceptions of reliability and validity. Within classical test theory, predictive or concurrent validity (correlation between the predictor and the predicted) cannot exceed the square root of the correlation between two versions of the same measure — that is, reliability limits validity.
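The ceiling that reliability places on validity can be illustrated numerically. The following is only a minimal sketch under classical test theory assumptions; the function name `max_validity` and the reliability values are hypothetical, chosen for illustration and not taken from any real test.

```python
import math


def max_validity(reliability_test: float, reliability_criterion: float = 1.0) -> float:
    """Upper bound on the test-criterion correlation in classical test theory.

    The bound is the square root of the product of the two reliabilities.
    With an error-free criterion (reliability 1.0) it reduces to the square
    root of the test's own reliability.
    """
    for r in (reliability_test, reliability_criterion):
        if not 0.0 <= r <= 1.0:
            raise ValueError("reliabilities must lie in [0, 1]")
    return math.sqrt(reliability_test * reliability_criterion)


# Hypothetical reliabilities, for illustration only.
bound_perfect_criterion = max_validity(0.81)      # criterion assumed error-free
bound_both_noisy = max_validity(0.81, 0.64)       # both measures contain random error
print(bound_perfect_criterion, bound_both_noisy)
```

Under these assumed values, lowering the reliability of either measure lowers the highest validity coefficient that could be observed, which is the sense in which reliability limits validity.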

Test validity can be assessed in a number of ways, and thorough test validation typically involves more than one line of evidence in support of the validity of an assessment method (e.g. structured interview, personality survey, etc.). The current Standards for Educational and Psychological Testing follow Samuel Messick in discussing various types of validity evidence for a single summative validity judgment. These include construct related evidence, content related evidence, and criterion related evidence, which breaks down into two subtypes (concurrent and predictive) according to the timing of the data collection.

Construct related evidence involves the empirical and theoretical support for the interpretation of the construct. Such lines of evidence include statistical analyses of the internal structure of the test including the relationships between responses to different test items. They also include relationships between the test and measures of other constructs. As currently understood, construct validity is not distinct from the support for the substantive theory of the construct that the test is designed to measure. As such, experiments designed to reveal aspects of the causal role of the construct also contribute to construct validity evidence.

Content related evidence involves the degree to which the content of the test matches a content domain associated with the construct. For example, a test of the ability to add two-digit numbers should cover the full range of combinations of digits. A test with only one-digit numbers, or only even numbers, would not have good coverage of the content domain. Content related evidence typically involves subject matter experts (SMEs) evaluating test items against the test specifications.

Criterion related evidence involves the correlation between the test and a criterion variable (or variables) taken as representative of the construct. For example, employee selection tests are often validated against measures of job performance. Measures of risk of recidivism among those convicted of a crime can be validated against measures of recidivism. If the test data and criterion data are collected at the same time, this is referred to as concurrent validity evidence. If the test data are collected first in order to predict criterion data collected at a later point in time, then this is referred to as predictive validity evidence.

Face validity is an estimate of whether a test appears to measure a certain criterion; it does not guarantee that the test actually measures phenomena in that domain. Indeed, when a test is subject to faking (malingering), low face validity might make the test more valid.

In contrast to test validity, assessment of the validity of a research design generally does not involve data collection or statistical analysis but rather evaluation of the design in relation to the desired conclusion on the basis of prevailing standards and theory of research design.

Types of validity

Internal validity

Internal validity is an inductive estimate of the degree to which conclusions about causal relations are likely to be true, in view of the measures used, the research setting, and the whole research design. Good experimental techniques, in which the effect of an independent variable on a dependent variable is studied under highly controlled conditions, usually allow for higher degrees of internal validity than, for example, single-case designs.

External validity

External validity concerns the extent to which one may safely generalize the (internally valid) causal inference (a) from the sample studied to the defined target population and (b) to other populations (i.e. across time and space).

Ecological validity

This issue is closely related to external validity and covers the question of the degree to which your experimental findings mirror what you can observe in the real world (ecology = the science of interaction between an organism and its environment). Ecological validity is whether the results can be applied to real-life situations. Typically in science, you have two domains of research: passive-observational and active-experimental. The purpose of experimental designs is to test causality, so that you can infer that A causes B or B causes A. But sometimes ethical and/or methodological restrictions prevent you from conducting an experiment (e.g. how does isolation influence a child's cognitive functioning?). Then you can still do research, but it is correlational rather than causal: A occurs together with B. Both techniques have their strengths and weaknesses. To get an experimental design you have to control for all interfering variables. That's why you conduct your experiment in a laboratory setting. While gaining internal validity (excluding interfering variables by keeping them constant) you lose ecological validity because you establish an artificial lab setting. On the other hand, with observational research you can't control for interfering variables (low internal validity), but you can measure in the natural (ecological) environment, that is, at the place where behavior occurs.

Construct validity

Construct validity refers to the totality of evidence about whether a particular operationalization of a construct adequately represents what is intended by the theoretical account of the construct being measured. (One demonstrates that an element is valid by relating it to another element that is supposedly valid.) There are two approaches to construct validity, sometimes referred to as 'convergent validity' and 'divergent validity' (the latter is also called discriminant validity).

Intentional validity

Intentional validity asks, "Do the constructs we chose adequately represent what we intend to study?" Constructs must be specific enough to be distinguished from one another. (E.g., is it intelligence or cunning?)

Validity proves no bias

Representation validity or translation validity

Representation validity is concerned about how well the constructs or abstractions translate into observable measures. There are two primary questions to be answered.

  • Do the subconstructs properly define the construct (if you break up the main abstractions into smaller abstractions or definitions)?
  • Do the observations properly interpret, measure, or test the constructs?

One way to argue positively, albeit a very weak argument, is to claim face validity for the construct/observable relationship. Basically this is making the following claim: on the face of it, it seems like a good translation. This weak argument can be strengthened by a consensus of experts. Another way to argue positively is to claim content validity for the construct/observable relationship. To do this one must check the operationalization against the relevant content domain for the construct, that is, the extent to which the tests (i.e., the observable measures) measure the content of the subject being tested, so that all the important content areas are covered adequately.

Content validity

This is a non-statistical type of validity that involves “the systematic examination of the test content to determine whether it covers a representative sample of the behaviour domain to be measured” (Anastasi & Urbina, 1997, p. 114).

A test has content validity built into it by careful selection of which items to include (Anastasi & Urbina, 1997). Items are chosen so that they comply with the test specification, which is drawn up through a thorough examination of the subject domain. Foxcroft et al. (2004, p. 49) note that the content validity of a test can be improved by using a panel of experts to review the test specifications and the selection of items. The experts will be able to review the items and comment on whether the items cover a representative sample of the behaviour domain.

Face validity

Face validity is very closely related to content validity. Content validity depends on a theoretical basis for judging whether a test assesses all domains of a certain criterion (e.g. does assessing addition skills yield a good measure of mathematical skills? To answer this you have to know which different kinds of arithmetic skills mathematical skills include), whereas face validity relates to whether a test appears to be a good measure or not. This judgment is made on the "face" of the test, so it can also be made by an amateur.

Criterion validity

Criterion-related validity reflects the success of measures used for prediction or estimation. There are two types of criterion-related validity: concurrent and predictive validity. A good example of criterion-related validity is in the validation of employee selection tests; in this case scores on a test or battery of tests are correlated with employee performance scores.

Concurrent validity

Concurrent validity refers to the degree to which the operationalization correlates with other measures of the same construct that are measured at the same time. Going back to the selection test example, this would mean that the tests are administered to current employees and then correlated with their scores on performance reviews.

Predictive validity

Predictive validity refers to the degree to which the operationalization can predict (or correlate with) other measures of the same construct that are measured at some time in the future. Again, with the selection test example, this would mean that the tests are administered to applicants, all applicants are hired, their performance is reviewed at a later time, and then their scores on the two measures are correlated.
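The correlation step in the selection test example can be sketched as follows. This is only an illustration: the helper `pearson_r` and both score lists are hypothetical, and a real validation study would use far larger samples.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: selection test scores at hiring and
# supervisor performance ratings collected at a later review.
test_scores = [52, 61, 47, 70, 66, 58]
performance = [3.1, 3.8, 2.9, 4.4, 4.0, 3.5]

validity_coefficient = pearson_r(test_scores, performance)
print(round(validity_coefficient, 3))
```

The same calculation serves for concurrent validity; only the timing of the criterion measurement differs.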

Convergent validity

Convergent validity refers to the degree to which a measure is correlated with other measures that it is theoretically predicted to correlate with.

Discriminant validity

Discriminant validity describes the degree to which the operationalization does not correlate with other operationalizations that it theoretically should not be correlated with.
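Convergent and discriminant evidence are usually read off a pattern of correlations. The sketch below assumes three hypothetical measures: a new scale, an established scale of the same construct, and a theoretically unrelated measure. All names and scores are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; assumes equal-length, non-constant sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for six respondents.
new_scale = [1, 2, 3, 4, 5, 6]
established_scale = [2, 1, 4, 3, 6, 5]    # same construct: should correlate highly
unrelated_measure = [3, 5, 1, 6, 2, 4]    # different construct: should not

convergent_r = pearson_r(new_scale, established_scale)
discriminant_r = pearson_r(new_scale, unrelated_measure)

# Evidence of construct validity is the pattern: high convergent, low discriminant.
print(round(convergent_r, 2), round(discriminant_r, 2))
```

A single pair of correlations like this is at most one small piece of evidence; the judgment rests on the accumulated pattern across studies.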

Statistical conclusion validity

Factors jeopardizing validity

Campbell and Stanley (1963) define internal validity as the basic requirements for an experiment to be interpretable — did the experiment make a difference in this instance? External validity addresses the question of generalizability — to whom can we generalize this experiment's findings?

Internal validity

Eight extraneous variables can interfere with internal validity:

1. History, the specific events occurring between the first and second measurements in addition to the experimental variables

2. Maturation, processes within the participants as a function of the passage of time (not specific to particular events), e.g., growing older, hungrier, more tired, and so on.

3. Testing, the effects of taking a test upon the scores of a second testing.

4. Instrumentation, changes in calibration of a measurement tool or changes in the observers or scorers may produce changes in the obtained measurements.

5. Statistical regression, operating where groups have been selected on the basis of their extreme scores.

6. Selection, biases resulting from differential selection of respondents for the comparison groups.

7. Experimental mortality, or differential loss of respondents from the comparison groups.

8. Selection-maturation interaction and similar interactions, e.g., in multiple-group quasi-experimental designs.

External validity

Four factors jeopardizing external validity or representativeness are:

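9. Reactive or interaction effect of testing: a pretest might increase or decrease participants' sensitivity to the experimental variable, so that results may not generalize to an unpretested population.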

10. Interaction effects of selection biases and the experimental variable.

11. Reactive effects of experimental arrangements, which would preclude generalization about the effect of the experimental variable upon persons being exposed to it in non-experimental settings

12. Multiple-treatment interference, where effects of earlier treatments are not erasable.

Internal Validity

Internal validity is the validity of (causal) inferences in scientific studies, usually based on experiments as experimental validity [1].

Details

Inferences are said to possess internal validity if a causal relation between two variables is properly demonstrated [2] [3]. A causal inference may be based on a relation when three criteria are satisfied:

  1. the "cause" precedes the "effect" in time (temporal precedence),
  2. the "cause" and the "effect" are related (covariation), and
  3. there are no plausible alternative explanations for the observed covariation (nonspuriousness) [4].

In scientific experimental settings, researchers often manipulate a variable (the independent variable) to see what effect it has on a second variable (the dependent variable)[5]. For example, a researcher might manipulate the dosage of a particular drug between experimental groups to see what effect it has on health. In this example, the researcher wants to make a causal inference, namely, that different doses of the drug may be held responsible for observed changes or differences. When the researcher may confidently attribute the observed changes or differences in the dependent variable to the independent variable, and when he can rule out other explanations (or rival hypotheses), then his causal inference is said to be internally valid[6].

In many cases, however, the magnitude of effects found in the dependent variable may not just depend on

  • variations in the independent variable,
  • the power of the instruments and statistical procedures used to measure and detect the effects, and
  • the choice of statistical methods (see: Statistical conclusion validity).

Rather, a number of variables or circumstances uncontrolled for (or uncontrollable) may lead to additional or alternative explanations (a) for the effects found and/or (b) for the magnitude of the effects found. Internal validity, therefore, is more a matter of degree than of either-or, and that is exactly why research designs other than true experiments may also yield results with a high degree of Internal Validity.

In order to allow for inferences with a high degree of internal validity, precautions may be taken during the design of the scientific study. As a rule of thumb, conclusions based on correlations or associations may only allow for lesser degrees of internal validity than conclusions drawn on the basis of direct manipulation of the independent variable. And, when viewed only from the perspective of Internal Validity, highly controlled true experimental designs (i.e. with random selection, random assignment to either the control or experimental groups, reliable instruments, reliable manipulation processes, and safeguards against confounding factors) may be the "gold standard" of scientific research. By contrast, however, the very strategies employed to control these factors may also limit the generalizability or External Validity of the findings.

Threats to internal validity

Confounding

A major threat to the validity of causal inferences is confounding: changes in the dependent variable may instead be attributable to the existence of, or variations in the degree of, a third variable which is related to the manipulated variable. Where spurious relationships cannot be ruled out, rival hypotheses to the researcher's original causal hypothesis may be developed.

Selection (bias)

Selection bias refers to the problem that, at pre-test, differences between groups exist that may interact with the independent variable and thus be 'responsible' for the observed outcome. Researchers and participants bring to the experiment a myriad of characteristics, some learned and others inherent: for example, sex, weight, hair, eye, and skin color, personality, mental capabilities, and physical abilities, but also attitudes like motivation or willingness to participate.

During the selection step of the research study, if an unequal number of test subjects have similar subject-related variables there is a threat to the internal validity. For example, a researcher created two test groups, the experimental and the control groups. The subjects in both groups are not alike with regard to the independent variable but similar in one or more of the subject-related variables. It would be difficult for the researcher to determine if the discrepancy in the groups is due to the independent variable or to the subject-related variables. Selection bias may be reduced when selection/inclusion processes are controlled for and group assignment is randomized. However, in most cases, it may never be ruled out completely as relevant between-group differences may go unnoticed.
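Random assignment to groups, mentioned above as a way to reduce selection bias, can be sketched in a few lines. The function name `randomly_assign`, the participant IDs, and the seed are assumptions made for illustration; the fixed seed only makes the example reproducible.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two groups of (near) equal size.

    Returns (experimental, control). With an odd number of participants the
    control group receives the extra person.
    """
    rng = random.Random(seed)
    shuffled = list(participants)   # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant IDs.
participant_ids = list(range(1, 21))
experimental, control = randomly_assign(participant_ids, seed=42)
print(sorted(experimental))
print(sorted(control))
```

Randomization makes systematic between-group differences unlikely on average, but in any single small sample relevant differences can still arise by chance, which is why the threat can never be ruled out completely.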

History

Events outside of the study/experiment or between repeated measures of the dependent variable may affect participants' responses to experimental procedures. Often, these are large-scale events (natural disaster, political change, etc.) that affect participants' attitudes and behaviors such that it becomes impossible to determine whether any change on the dependent measures is due to the independent variable or to the historical event.

Maturation

Subjects change during the course of the experiment or even between measurements. For example, young children might mature and their ability to concentrate may change as they grow up. Both permanent changes, such as physical growth, and temporary ones, like fatigue, provide "natural" alternative explanations; thus, they may change the way a subject would react to the independent variable. So upon completion of the study, the researcher may not be able to determine whether the discrepancy is due to time or to the independent variable.

Repeated testing

Repeatedly measuring the participants may lead to bias. Participants may remember the correct answers or may be conditioned to know that they are being tested. Repeatedly taking (the same or similar) intelligence tests usually leads to score gains, but instead of supporting the conclusion that the underlying skills have changed for good, these gains provide a good rival hypothesis and so threaten internal validity.

Instrument change

The instrument used during the testing process can change the experiment. This also refers to observers being more concentrated or primed. If any instrumentation changes occur, the internal validity of the main conclusion is affected, as alternative explanations are readily available.

Regression toward the mean

This type of error occurs when subjects are selected on the basis of extreme scores (ones far away from the mean) on a first test and then score closer to the mean on a second test. For example, when in 3rd grade children with the worst reading skills are selected to participate in a nationwide reading skills program, may it be validly concluded that positive changes at the end of the program are due to the educational efforts? A good alternative explanation is regression toward the mean, a statistical artifact of selecting on extreme scores. If the children had been tested again before the start of the program, they would likely have obtained less extreme scores, even if repeated testing effects could have been ruled out.
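Regression toward the mean can be demonstrated with a small simulation in which no program is applied at all. The sketch below assumes each observed score is a stable true score plus independent random error; the function name, the distribution parameters, and the seed are hypothetical choices for illustration.

```python
import random


def simulate_retest(n=10_000, selected_fraction=0.1, seed=0):
    """Select the lowest scorers on a first test and compare their retest mean.

    Each observed score = true skill + independent measurement error.
    Nothing changes between the two tests, so any improvement in the
    selected group is a regression artifact, not a treatment effect.
    """
    rng = random.Random(seed)
    true_skill = [rng.gauss(100, 10) for _ in range(n)]
    first_test = [t + rng.gauss(0, 10) for t in true_skill]
    second_test = [t + rng.gauss(0, 10) for t in true_skill]

    k = int(n * selected_fraction)
    lowest = sorted(range(n), key=lambda i: first_test[i])[:k]

    first_mean = sum(first_test[i] for i in lowest) / k
    retest_mean = sum(second_test[i] for i in lowest) / k
    return first_mean, retest_mean


first_mean, retest_mean = simulate_retest()
print(f"selected group, first test: {first_mean:.1f}")
print(f"selected group, retest:     {retest_mean:.1f}")
```

In this simulated setting the selected group scores noticeably closer to the population mean on the retest even though no intervention took place, which is the rival explanation a design without a comparison group cannot exclude.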

Mortality/differential attrition

This error occurs if inferences are made on the basis of only those participants that have participated from the start to the end. However, participants may have dropped out of the study before completion, and maybe even due to the study or programme or experiment itself. For example, the percentage of group members having quit smoking at post-test was found much higher in a group having received a quit-smoking training program than in the control group. However, in the experimental group only 60% have completed the program. If this attrition is systematically related to any feature of the study, the administration of the independent variable, the instrumentation, or if dropping out leads to relevant bias between groups, a whole class of alternative explanations is possible that account for the observed differences.
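A small worked example shows how much the conclusion can depend on who is counted. The figures below are invented for illustration (only the 60% completion rate echoes the example above), and treating every dropout as still smoking is a deliberately conservative assumption, not a finding.

```python
def quit_rates(enrolled, completed, quit_among_completers):
    """Return (completers-only quit rate, quit rate over everyone enrolled).

    The second rate counts every dropout as still smoking, which is an
    assumption made here for illustration.
    """
    if not 0 < completed <= enrolled or not 0 <= quit_among_completers <= completed:
        raise ValueError("inconsistent counts")
    completers_only = quit_among_completers / completed
    all_enrolled = quit_among_completers / enrolled
    return completers_only, all_enrolled


# Hypothetical counts: 100 people per group, 60% of the program group finish.
program = quit_rates(enrolled=100, completed=60, quit_among_completers=30)
control = quit_rates(enrolled=100, completed=100, quit_among_completers=25)

print("completers only:", program[0], "vs", control[0])
print("all enrolled:   ", program[1], "vs", control[1])
```

With these assumed counts, the program looks twice as effective as the control when only completers are analysed, but the advantage shrinks to a few percentage points once dropouts are included.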

Selection-maturation interaction

This occurs when the subject-related variables (color of hair, skin color, etc.) and the time-related variables (age, physical size, etc.) interact. In the fruit experiment, the ages of the children in one school are 4-12 and in the other 4-9 years. If a discrepancy between the two groups arises between testings, the discrepancy may be due to the differences in the age ranges.

Diffusion

If treatment effects spread from treatment groups to control groups, a lack of differences between experimental and control groups may be observed. This does not mean, however, that the independent variable has no effect or that there is no relationship between dependent and independent variable.

Compensatory rivalry/resentful demoralization

Behaviour in the control groups may alter as a result of the study. For example, control group members may work extra hard to see that the expected superiority of the experimental group is not demonstrated. Again, this does not mean that the independent variable produced no effect or that there is no relationship between dependent and independent variable. Conversely, changes in the dependent variable may arise only because a demoralized control group works less hard or is less motivated, not because of the independent variable.

Experimenter bias

Experimenter bias occurs when the individuals who are conducting an experiment inadvertently affect the outcome by unconsciously behaving differently toward members of control and experimental groups. The risk of experimenter bias can be greatly reduced through the use of double-blind study designs, in which the experimenter is not aware of the condition to which a participant belongs.

References

  1. Mitchell, M. and Jolley, J. (2001). Research Design Explained (4th ed.). New York: Harcourt.
  2. Brewer, M. (2000). Research Design and Issues of Validity. In Reis, H. and Judd, C. (eds.), Handbook of Research Methods in Social and Personality Psychology. Cambridge: Cambridge University Press.
  3. Shadish, W., Cook, T., and Campbell, D. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin.
  4. ibid.
  5. Levine, G. and Parkinson, S. (1994). Experimental Methods in Psychology. Hillsdale, NJ: Lawrence Erlbaum.
  6. Liebert, R. M. & Liebert, L. L. (1995). Science and behavior: An introduction to methods of psychological research. Englewood Cliffs, NJ: Prentice Hall.

External Validity

External validity is the validity of generalized (causal) inferences in scientific studies, usually based on experiments as experimental validity.[1]

Inferences about cause-effect relationships based on a specific scientific study are said to possess external validity if they may be generalized from the unique and idiosyncratic settings, procedures and participants to other populations and conditions[2][3]. Causal inferences said to possess high degrees of external validity can reasonably be expected to apply (a) to the target population of the study (i.e. from which the sample was drawn) (also referred to as population validity), and (b) to the universe of other populations (e.g. across time and space).

The most common loss of external validity comes from the fact that experiments using human participants often employ small samples obtained from a single geographic location or with idiosyncratic features (e.g. volunteers). Because of this, one cannot be sure that the conclusions drawn about cause-effect relationships actually apply to people in other geographic locations or without these features.

Threats to external validity

"A threat to external validity is an explanation of how you might be wrong in making a generalization."[4] Generally, generalizability is limited when the cause (i.e. the independent variable) depends on other factors; therefore, all threats to external validity interact with the independent variable.

  • Aptitude-Treatment-Interaction: The sample may have certain features that may interact with the independent variable, limiting generalizability. For example, inferences based on comparative psychotherapy studies often employ specific samples (e.g. volunteers, highly depressed, no comorbidity). If psychotherapy is found effective for these sample patients, will it also be effective for non-volunteers or the mildly depressed or patients with concurrent other disorders?
  • Situation: All situational specifics (e.g. treatment conditions, time, location, lighting, noise, treatment administration, investigator, timing, scope and extent of measurement, etc.) of a study potentially limit generalizability.
  • Pre-Test Effects: If cause-effect relationships can only be found when pre-tests are carried out, then this also limits the generality of the findings.
  • Post-Test Effects: If cause-effect relationships can only be found when post-tests are carried out, then this also limits the generality of the findings.
  • Reactivity (Placebo, Novelty, and Hawthorne Effects): If cause-effect relationships are found they might not be generalizable to other settings or situations if the effects found only occurred as an effect of studying the situation.
  • Rosenthal Effects: Inferences about cause-consequence relationships may not be generalizable to other investigators or researchers.

External, internal, and ecological validity

In many studies and research designs, there may be a "trade-off" between internal validity and external validity: when measures are taken or procedures implemented with the aim of increasing internal validity, these measures may also limit the generalizability of the findings. This situation has led many researchers to call for "ecologically valid" experiments. By that they mean that experimental procedures should resemble "real-world" conditions. They criticize the lack of ecological validity in many laboratory-based studies with a focus on artificially controlled and constricted environments. External validity and ecological validity are closely related in the sense that causal inferences based on ecologically valid research designs often allow for higher degrees of generalizability than those obtained in an artificially produced lab environment. However, this is not always the case: some findings produced in ecologically valid research settings may hardly be generalizable, and some findings produced in highly controlled settings may claim near-universal external validity. Thus, external and ecological validity are independent: a study may possess external validity but not ecological validity, and vice versa.

Qualitative research

Within the qualitative research paradigm, external validity is replaced by the concept of transferability. Transferability is the ability of research results to transfer to situations with similar parameters, populations and characteristics.[5]

References

  1. Mitchell, M. & Jolley, J. (2001). Research Design Explained (4th ed.). New York: Harcourt.
  2. Brewer, M. (2000). Research Design and Issues of Validity. In Reis, H. & Judd, C. (eds.), Handbook of Research Methods in Social and Personality Psychology. Cambridge: Cambridge University Press.
  3. Shadish, W., Cook, T., & Campbell, D. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin.
  4. Trochim, William M. The Research Methods Knowledge Base, 2nd Edition.
  5. Lincoln, Y.S. & Guba, E.G. (1986). But is it rigorous? Trustworthiness and authenticity in naturalistic evaluation. In D.D. Williams (Ed.), Naturalistic evaluation (pp. 73-84). New Directions for Program Evaluation, 30. San Francisco, CA: Jossey-Bass.

Construct Validity

In social science and psychometrics, construct validity refers to whether a scale measures or correlates with a theorized psychological construct (such as "fluid intelligence"). It is related to the theoretical ideas behind the personality trait under consideration; a concept that does not exist in the physical sense may be proposed as a way of organising how personality can be viewed.[1] The unobservable idea of a unidimensional easier-to-harder dimension must be "constructed" in the words of human language and graphics.

A construct is not restricted to one set of observable indicators or attributes. It is common to a number of sets of indicators. Thus, "construct validity" can be evaluated by statistical methods that show whether or not a common factor can be shown to exist underlying several measurements using different observable indicators. This view of a construct rejects the operationist view that a construct is neither more nor less than the operations used to measure it.

Evaluation of construct validity requires examining the correlation of the measure being evaluated with variables that are known to be related to the construct purportedly measured by the instrument, or for which there are theoretical grounds for expecting a relationship. This is consistent with the multitrait-multimethod matrix approach to examining construct validity described in Campbell & Fiske's landmark paper (1959). Correlations that fit the expected pattern contribute evidence of construct validity. Construct validity is a judgment based on the accumulation of correlations from numerous studies using the instrument being evaluated.

There are variants of construct validity:

References

  1. Pennington, Donald (2003). Essential Personality. Arnold, p.37. ISBN 0340761180.

Content Validity

In psychometrics, content validity (also known as logical validity) refers to the extent to which a measure represents all facets of a given social construct. For example, a depression scale may lack content validity if it only assesses the affective dimension of depression but fails to take into account the behavioral dimension. An element of subjectivity exists in relation to determining content validity, which requires a degree of agreement about what a particular personality trait such as extraversion represents. A disagreement about a personality trait will prevent a high content validity from being attained.[1]

Content validity is related to face validity, though content validity should not be confused with face validity. The latter is not validity in the technical sense; it refers, not to what the test actually measures, but to what it appears superficially to measure. Face validity pertains to whether the test "looks valid" to the examinees who take it, the administrative personnel who decide on its use, and other technically untrained observers. Content validity requires more rigorous statistical tests than face validity, which only requires an intuitive judgement. Content validity is most often addressed in academic and vocational testing, where test items need to reflect the knowledge actually required for a given topic area (e.g., history) or job skill (e.g., accounting). In clinical settings, content validity refers to the correspondence between test items and the symptom content of a syndrome.

One widely used method of measuring content validity was developed by C. H. Lawshe. It is essentially a method for gauging agreement among raters or judges regarding how essential a particular item is. Lawshe (1975) proposed that each rater on the judging panel respond to the following question for each item: "Is the skill or knowledge measured by this item essential/useful but not essential/not necessary to the performance of the construct?" According to Lawshe, if more than half the panelists indicate that an item is essential, that item has at least some content validity. Greater levels of content validity exist as larger numbers of panelists agree that a particular item is essential. Using these assumptions, Lawshe developed a formula termed the content validity ratio:

                            CVR = (ne - N/2)/(N/2)

where CVR is the content validity ratio, ne is the number of panelists indicating "essential", and N is the total number of panelists. The minimum values of the CVR needed to ensure that agreement is unlikely to be due to chance are given in the following table:

Number of Panelists     Minimum Value
5                       .99
6                       .99
7                       .99
8                       .75
9                       .78
10                      .62
11                      .59
12                      .56
13                      .54
14                      .51
15                      .49
20                      .42
25                      .37
30                      .33
35                      .31
40                      .29
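The content validity ratio and the table lookup can be sketched in a few lines of Python. This is a minimal illustration, not Lawshe's own procedure: the function names content_validity_ratio and meets_minimum are hypothetical, and the MIN_CVR values are simply copied from the table above, so only the panel sizes listed there are supported.

```python
# Minimum CVR values by panel size, copied from the table above.
MIN_CVR = {
    5: 0.99, 6: 0.99, 7: 0.99, 8: 0.75, 9: 0.78, 10: 0.62,
    11: 0.59, 12: 0.56, 13: 0.54, 14: 0.51, 15: 0.49,
    20: 0.42, 25: 0.37, 30: 0.33, 35: 0.31, 40: 0.29,
}


def content_validity_ratio(n_essential, n_panelists):
    """Compute CVR = (ne - N/2) / (N/2) for one item."""
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half = n_panelists / 2
    return (n_essential - half) / half


def meets_minimum(n_essential, n_panelists):
    """Check an item's CVR against the tabled minimum for this panel size."""
    if n_panelists not in MIN_CVR:
        raise KeyError(f"no tabled minimum for a panel of {n_panelists}")
    return content_validity_ratio(n_essential, n_panelists) >= MIN_CVR[n_panelists]


# Example: 9 of 10 panelists rate an item "essential".
cvr = content_validity_ratio(9, 10)   # (9 - 5) / 5 = 0.8
print(f"CVR = {cvr:.2f}, meets minimum: {meets_minimum(9, 10)}")
```

Note that the ratio is 0 when exactly half the panel rates an item essential, 1 when everyone does, and negative when fewer than half do.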

References

  1. Pennington, Donald (2003). Essential Personality. Arnold, p.37. ISBN 0340761180.

Ecological Validity

Ecological validity is a form of validity in a research study. For a research study to possess ecological validity, the methods, materials and setting of the study must approximate the real-life situation that is under investigation.[1] Unlike internal and external validity, ecological validity is not necessary to the overall validity of a study.[2]

External vs. ecological validity

Ecological validity is often confused with external validity (which deals with the ability of a study's results to generalize). While these forms of validity are closely related, they are independent: a study may possess external validity but not ecological validity, and vice versa.[1][2] For example, mock-jury research is designed to study how people might act if they were jurors during a trial, but many mock-jury studies simply provide written transcripts or summaries of trials, and do so in classroom or office settings. Such experiments do not approximate the actual look, feel and procedure of a real courtroom trial, and therefore lack ecological validity. However, the more important concern is external validity: if the results from such mock-jury studies generalize to real trials, then the research is valid as a whole, despite its ecological shortcomings. Nonetheless, improving the ecological validity of an experiment typically improves the external validity as well.

References

  1. Brewer, M. (2000). Research Design and Issues of Validity. In Reis, H. and Judd, C. (eds), Handbook of Research Methods in Social and Personality Psychology. Cambridge: Cambridge University Press.
  2. Shadish, W., Cook, T., and Campbell, D. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin.
