
In statistical terms,
validity is defined as the proportion of true variance that is relevant to the
purposes of the test. By 'relevant' we mean variance that is attributable to
the variable or characteristic that the test measures. But did you know that
there are several types of validity?
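One common way to write the statistical definition above, assuming (as in classical test theory) that the components of score variance are uncorrelated, is to split the observed variance into a relevant part, an irrelevant but systematic part, and error. The symbols are illustrative and do not come from the original text.

```latex
\sigma_X^2 = \sigma_{\mathrm{rel}}^2 + \sigma_{\mathrm{irr}}^2 + \sigma_E^2,
\qquad
\text{validity} = \frac{\sigma_{\mathrm{rel}}^2}{\sigma_X^2}
```

Here \sigma_{\mathrm{rel}}^2 is the variance due to the characteristic the test is meant to measure, \sigma_{\mathrm{irr}}^2 is systematic variance due to other characteristics, and \sigma_E^2 is error variance.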
In this sense,
the validity of a test is generally defined by means of either:
● The relationship between its scores and some external criterion measure, or
● The extent to which the test measures a specific hypothesized underlying
trait or “construct”.
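The first of these definitions, the relationship with an external criterion, is commonly summarized as a correlation between test results and the criterion measure. The following is a minimal sketch of that calculation; the function name `pearson_r` and the data are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Invented data: test scores and an external criterion measure for six people.
test_scores = [12, 15, 9, 20, 17, 11]
criterion = [3.1, 3.6, 2.5, 4.4, 3.9, 3.0]

validity_coefficient = pearson_r(test_scores, criterion)
print(round(validity_coefficient, 3))
```

A coefficient near 1 in this toy data only reflects how the numbers were chosen; real criterion correlations are typically much lower.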
Validity in psychometric terms
In psychometric terms,
validity is a concept that has gone through a long evolutionary process. At the
beginning, Muñiz (1996) approached validity from a specific position, which
held that "a test is valid for that with which it correlates."
Now, validity is
understood as a global evaluative judgment of the degree to which empirical
evidence and theoretical assumptions support the sufficiency and
appropriateness of the interpretations made, considering not only the items
but also the way people respond and the context of the evaluation.
So what is validated
is not the test itself but the inferences made from it. This has two
consequences:
● The person responsible for the validity of a test is no longer only its
constructor, but also its user.
● The validity of a test
is not established once and for all. It is the result of the collection of
evidence and theoretical assumptions that occur in an evolutionary and continuous
process. This includes all the experimental, statistical and philosophical
questions through which scientific hypotheses and theories are evaluated.
In this context, the
concept of validity refers to the appropriateness, meaning, and usefulness of
specific inferences made from test scores. Test validation is the process of
accumulating evidence to support such inferences. Thus, validity is a unitary
concept. Although evidence can be accumulated in many ways, validity always
refers to the degree to which that evidence supports the inferences made from
the scores.
Types of evidence
In 1954, a committee
chaired by L. J. Cronbach, acting on behalf of the American Psychological
Association (APA), distinguished four types of validity:
● Content validity.
● Predictive validity.
● Concurrent validity.
● Construct validity.
Currently, it is agreed, from a
scientific point of view, that the only admissible validity is construct
validity (Messick, 1995).
Validity and its aspects
Within the study of validity, the
evidence is related to six aspects:
● Content (the relevance and
representativeness of the test).
● Substantive (the theoretical reasons
for the observed consistency of the responses).
● Structural (internal configuration
of the test and dimensionality).
● Generalization (the degree to which
inferences made from the test can be generalized to other populations,
situations, or tasks).
● External (relationships of the test
with other tests and constructs).
● Consequential (ethical and social consequences of the test).
Thus, within this
unified view of validity, the traditional types can be understood as
strategies for gathering evidence. As previously mentioned, these are content
validity, predictive validity, concurrent validity, and construct validity.
Types of validity: content validity
This type of
validity answers the following question: are the items that make up the
test really a representative sample of the content domain or behavioral domain
that interests us?
To be clear,
a behavioral domain or field is a hypothetical grouping of all
possible items that cover a particular psychological area. For example, a
vocabulary test should be an adequate sample of the domain of possible items
in this area.
In this sense, content
validity is a "measure" of the adequacy of sampling. We say “measure”
in quotation marks, since this type of validity consists of a series of
estimates or opinions. These estimates do not provide a quantitative index of
validity.
This type of validity
is associated above all with achievement tests (mathematics tests, history…).
For its determination, the test questions are systematically compared with the
behavioral domain of the postulated content.
For example, suppose we have a
list of 500 words that we expect students in a course to be able to spell
correctly. Their performance on a test of some of these words matters only as
an indication of their ability to spell all 500 words correctly. The test
therefore has content validity only to the extent that it provides an
adequate sample of the 500 words it represents.
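The idea of an adequate sample can be illustrated with a rough coverage check. The sketch below assumes, purely for illustration, that each of the 500 words is tagged with a spelling pattern, and compares how often each pattern appears in the domain and in a test. The tags, the 40-word test length, and the function names are all invented. The result is not a content validity index, since content validity rests on judgment rather than on a number. It only flags tests that obviously under-represent part of the domain.

```python
import random
from collections import Counter


def category_proportions(items):
    """Return the share of items falling in each category."""
    counts = Counter(category for _, category in items)
    total = len(items)
    return {category: count / total for category, count in counts.items()}


def max_coverage_gap(domain, test_items):
    """Largest absolute difference between domain and test category shares."""
    domain_share = category_proportions(domain)
    test_share = category_proportions(test_items)
    return max(abs(share - test_share.get(category, 0.0))
               for category, share in domain_share.items())


# Hypothetical domain: 500 words, each tagged with an invented spelling pattern.
patterns = ["silent letters", "double consonants", "irregular vowels", "regular"]
domain = [(f"word{i}", patterns[i % 4]) for i in range(500)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
random_test = rng.sample(domain, 40)  # 40 words drawn at random from the domain
skewed_test = [item for item in domain if item[1] == "regular"][:40]  # one pattern only

random_gap = max_coverage_gap(domain, random_test)
skewed_gap = max_coverage_gap(domain, skewed_test)
print(random_gap, skewed_gap)
```

The test built from a single pattern shows a much larger gap than the randomly drawn one, which is the sense in which it is a poorer sample of the 500 words.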