Module 1 - Introduction To Psychometrics
Preface
Given this, I could make (and have made; see Bandalos & Kopp, 2012) the argument that some
knowledge of the theory and practice of testing is essential for an informed citizenry.
Preface…
Unfortunately, even in professions such as teaching, in which knowledge of
testing is clearly essential, the majority of training programs provide little
instruction in this area (Plake, Impara, & Fagan, 1993; Schaefer & Lissitz,
1987; Wise & Conoley, 1993). Aiken, West, and Millsap (2008) reported that
fewer than half of the professors teaching in graduate-level psychology
departments felt their students had adequate knowledge of even the most basic
measurement concepts.
[Diagram: psychometrics framed by the questions What, How, When, Who, and Where]
Contents
1. Common Terminologies
2. What is Psychometrics?
3. Why Psychometrics Matters
4. Observable Behavior and Unobservable Psychological Attributes
5. Assumptions in Psychological Testing and Assessment
6. Definition of Psychological Tests
7. Types of Psychological Tests
8. Principles of Psychological Testing
9. Who Uses Psychological Tests and for What Purposes
10. Why Control the Use of Psychological Tests?
11. Challenges to Measurement in Psychology
12. Psychometrics as a Profession
13. Guidelines for Critiquing a Psychological Test
Common Terminologies
Psychometry
The measurement of psychological characteristics.
Psychometrics
The science of Psychometry, i.e., evaluating the characteristics of tests
designed to measure psychological attributes of people.
Psychometrists
Persons trained in using psychometric tools under the guidance of a psychologist
or a psychometrician.
Psychometricians
Persons trained in measurement theory as applied to psychological
measurement; they propose and evaluate methods for developing new tests and
other measurement instruments.
Psychological Assessment
Measurement
Psychological Test and Testing
Survey
What is Psychometrics?
The branch of psychology that deals with the design, administration, and interpretation of
quantitative tests for the measurement of psychological variables such as intelligence,
aptitude and personality traits.
Psychometrics is the study of the operations and procedures used to measure variability
in behavior and to connect those measurements to psychological phenomena.
It is the science concerned with evaluating attributes of psychological tests.
This evaluation addresses three attributes: (1) the type of information generated by a
psychological test, (2) the reliability of the data the test yields, and (3) the validity of the
data obtained from the test.
Why Psychometrics Matters
1. Whether you wish to be a practitioner of behavioral science, a behavioral researcher or a
sophisticated member of modern society, your life is likely to be affected by psychological
measurement.
4. Without a solid understanding of the basic principles of psychological measurement, test users
risk misinterpreting or misusing information.
Why Psychometrics Matters…
5. Such misinterpretation or misuse might harm patients, students, clients, employees, and
applicants, and it can lead to lawsuits for the test user.
6. Proper test interpretation and use can be extremely valuable for test users and beneficial for test
takers.
7. Whether your area is psychology, education or other behavioral sciences, measurement is at the
heart of your research process.
8. If something is not measured well, it cannot be studied with any scientific validity.
9. If you wish to interpret your research findings in a meaningful and accurate manner, then you
must critically evaluate your data.
10. Given the widespread use and importance of psychological measurement, it is crucial to
understand the process, properties, and other aspects of psychological measurements.
Observable Behavior and Unobservable Psychological Attributes
• In the behavioral sciences, the observable events that are measured are typically some
kind of behavior.
• Sometimes psychologists measure a behavior because they are interested in that specific
behavior in its own right.
• For example, some psychologists have studied the way facial expressions affect the
perception of emotions.
• More often, however, psychologists are interested in an unobservable psychological
attribute, state, or process. In such cases, they identify some type of observable behavior
that represents that particular attribute, state, or process.
• They measure the behavior and try to interpret those measurements in terms of the
unobservable psychological characteristics reflected in the behavior.
• For example, to identify which of two persons has greater working memory capacity, we
might observe their performance on a recall task. First, the number of items recalled
supplies the observable behavior; second, for interpretation, the recall task must be
theoretically linked to the unobservable mental attribute.
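As a toy illustration of that two-step logic, here is a minimal Python sketch; the recall data, the scoring rule, and the interpretation are invented assumptions for this example, not material from the slides:

```python
# Toy illustration (invented data): an observable behavior, the number of
# items recalled on a memory task, is used as an indicator of an
# unobservable attribute, working memory capacity.

recall_scores = {"Person A": 7, "Person B": 5}  # items recalled out of 10

# Step 1 (measurement): record the observable behavior.
# Step 2 (interpretation): assume a theoretical link, namely that
# recalling more items reflects greater working memory capacity.
higher = max(recall_scores, key=recall_scores.get)
print(f"{higher} recalled more items; under the assumed theoretical link, "
      "this is interpreted as greater working memory capacity.")
```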
Psychological Assessment, Psychological Tests, Measurements, and Surveys
Types of Psychological Tests
1. Behavioural Dimension
• Cognitive Dimension: Intelligence, Aptitude, Achievement
• Affective Dimension: Interest, Values, Attitude, Personality
• Psychomotor Dimension
4. Mode of Administration
• Individual Testing
• Group Testing
• Paper-Pencil Testing
5. Mode of Scoring
6. Criterion of Scoring
• Objective
• Subjective
7. Mode of Interpretation
• Norm-Referenced Interpretation
• Criterion-Referenced Interpretation
9. Rate of Response
• Speed Test
• Power Test
Comparison of Criterion-Referenced
and Norm-Referenced Measures
Although criterion-referenced and norm-referenced measures are developed so that scores will be
interpreted differently, they have characteristics in common as well as distinct differences.
According to Gronlund (1988, p. 4), the two types of measures share several common characteristics.
Comparison of Criterion-Referenced
and Norm-Referenced Measures…
Type of Interpretation
• Criterion-Referenced Measures: Absolute interpretation; the amount of the attribute measured is specified based on predetermined criteria.
• Norm-Referenced Measures: Relative interpretation; the amount of the attribute measured is compared to that of others.
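The contrast can be made concrete with a small Python sketch; the 40-item test, the 75% cut score, and the norm-group mean and SD below are invented for illustration and do not come from these slides:

```python
from statistics import NormalDist

raw_score, n_items = 32, 40      # hypothetical examinee on a 40-item test

# Criterion-referenced (absolute) interpretation: compare to a fixed standard.
cut = 0.75                       # illustrative mastery cut score (75% correct)
pct = raw_score / n_items
print(f"Criterion-referenced: {pct:.0%} correct -> "
      f"{'mastery' if pct >= cut else 'non-mastery'}")

# Norm-referenced (relative) interpretation: compare to other examinees.
norm_mean, norm_sd = 27.0, 4.0   # invented norm-group statistics
z = (raw_score - norm_mean) / norm_sd
percentile = NormalDist().cdf(z) * 100
print(f"Norm-referenced: z = {z:.2f}, roughly the {percentile:.0f}th percentile")
```

The same raw score thus yields two different statements: mastery relative to a standard, and standing relative to a norm group.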
Comparison of Criterion-Referenced
and Norm-Referenced Measures…
Purpose of Testing
• Criterion-Referenced Measures: To assess the amount of an attribute, or the material known, by each person in isolation from others.
• Norm-Referenced Measures: To spread out persons across a continuum on the attribute measured.
Distribution of test scores …
Standardized Versus Non-standardized Measures
As used in this text, the term "standardized" is applied to measures that have four essential
characteristics:
1. A fixed set of items or operations designed to measure a clearly defined concept,
attribute, or behavior;
2. Explicit rules and procedures for administration and scoring;
3. Provision of norms to assist in interpreting scores;
4. An ongoing development process that involves careful testing, analysis, and revision in
order to assure high technical quality.
Standardized Versus Non-standardized Measures…
Construction
• Standardized (Advantage): Involves input of experts; the method of construction is designed to enhance technical quality, reliability, and validity; the procedure used in construction and testing is usually described. (Disadvantage): Costly and time-consuming; requires adequate resources.
• Non-standardized (Advantage): May be carried out in situations in which time and resources are limited; a short span of time is required between planning and use of the measure. (Disadvantage): The construction procedure is variable and does not necessarily assure high quality; the procedure generally is not described in detail; the amount of expert input is variable and may be unknown.
Content
• Standardized (Advantage): Measures attributes or behaviors that are common to a variety of settings and situations; is applicable to many settings; reflects widespread consensus rather than localized emphasis; is applicable across time and locale; is well …
• Non-standardized (Advantage): Well adapted to specialized needs and emphasis; flexibility allows adaptation to changes in materials or procedures; allows inclusion of controversial or timely information.
Standardized Versus Non-standardized Measures…
Psychometrics
• Standardized (Advantage): Reliability (internal consistency and test-retest) is high, yielding stable results; procedures used to establish reliability and validity are reported, so they are known to the user; items and operations have high discriminating power. (Disadvantage): The stability of scores results in insensitivity to minor fluctuations that it may be desirable to measure.
• Non-standardized (Advantage): The technical properties to be optimized are determined based on the purposes of the measure (e.g., qualitative studies). (Disadvantage): Technical properties frequently are unknown and may be highly variable, depending on the construction procedures used.
Administration and Scoring
• Standardized (Advantage): Established procedures provide consistency, giving comparable results; the effects of different testing conditions and environments are minimized; centralized or automated scoring is cost-efficient for large-scale efforts. (Disadvantage): Inflexibility precludes altering the measure to fit individual circumstances and resources; may be costly and time-consuming; scheduling of administration and return of scored results may be controlled externally.
• Non-standardized (Advantage): Procedures can be developed based on specific needs and resources; flexible procedures permit last-minute alterations; local and/or hand scoring is cost-efficient for small samples; the time lag between administration and scoring is determined by the user. (Disadvantage): Consistency between different administrations of the same measure is variable; different rules may be applied in scoring, thus yielding incomparable results.
Standardized Versus Non-standardized Measures…
Interpretation of Scores
• Standardized (Advantage): Scores can be uniformly compared with norm groups, often at the national level; …
• Non-standardized (Advantage): Comparisons and interpretations can be geared to specific needs and unique …
Questions to Guide Evaluation of Standardized Measures
1. Purpose: Are the stated purpose and recommended uses for the measure congruent with the purpose for which it will be
employed? Will the measure yield the desired information?
2. Conceptual basis: Is the theoretical model that guided the development of the measure identical to (or, at the very least,
compatible with) the model being employed? What are the assumptions and potential biases underlying the measure?
Are the values inherent in the development of the measure congruent with those that are to be maximized in the current
situation?
3. Content: Is the content of the measure appropriate without modification for the use intended? Is it up to date? Is the
content appropriate for the ages, reading abilities, and frames of reference of potential subjects?
4. Technical quality: What types of reliability and validity have been established? What is the nature of evidence
supporting the reliability and validity of the measure? How was the measure developed and tested? What were the
qualifications of the individuals involved?
5. Norms: How were norms established? How was the referent group selected and what are its characteristics? Are the
norms appropriate and sufficiently detailed for use as a basis of comparison? Are the norms clear and easily
interpretable? Are they up to date?
Questions to Guide Evaluation of Standardized Measures…
6. Administration: Are clear and explicit instructions provided for administration? What resources are required
for administration? How easy, costly, and time-consuming is the administration? Is training required? What
about subject burden?
7. Scoring: Is the measure hand- or machine-scored? What costs or special equipment are required? How likely
are errors to occur in scoring? What is the time required for scoring?
8. Interpretation: Can scores be easily and consistently interpreted? Are materials provided to aid in
interpretation?
9. Cost: What is the cost for employing the measure, including purchase, administration, and scoring costs?
What is the cost (if any) to subjects? Are the costs proportional to the relative importance of the information
that will be obtained?
10. Critical reviews: What are the evaluations provided by others who have used the measure? What problems,
strengths, and weaknesses have been identified?
Stages in the Development and Validation
of Criterion-Referenced Measures
1. Specify the conceptual model of the measure.
2. Specify the purpose(s) of the measure.
3. Explicate objective(s) or the domain definition.
4. Prepare test specifications including:
a. Method of administration
b. Number or proportion of items that will focus on each objective or subscale
c. Type of items and how they will be created
d. Test restrictions and givens
e. General scoring rules and procedures
5. Construct the measure including:
a. Develop a pool of items or tasks matched to the objective(s) or subscales
b. Review items or tasks to determine content validity and their appropriateness
c. Select items after editing or deleting poorly developed items from the item pool
d. Assemble the measure (including preparation of directions, scoring keys, answer sheets, etc.)
6. Set standards or cut score for interpreting results.
7. Field-test or administer the measure.
8. Assess reliability and validity of the measure (including determining the statistical properties of items,
and deleting and revising items further based on empirical data); a minimal sketch of such item statistics follows this list.
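As a hedged illustration of step 8, the following Python sketch computes two classical item statistics, item difficulty and a corrected item-rest correlation, on an invented 0/1-scored response matrix; the data and the scoring are assumptions for this example, not content from the slides:

```python
import numpy as np

# Invented 0/1 (incorrect/correct) responses: 6 examinees x 4 items.
responses = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
])
totals = responses.sum(axis=1)

# Item difficulty: proportion of examinees answering each item correctly.
difficulty = responses.mean(axis=0)

# Item discrimination: correlation of each item with the rest of the test
# (the item itself is removed from the total to avoid inflating the value).
for j in range(responses.shape[1]):
    rest = totals - responses[:, j]
    r = np.corrcoef(responses[:, j], rest)[0, 1]
    print(f"Item {j + 1}: difficulty = {difficulty[j]:.2f}, "
          f"item-rest r = {r:.2f}")
```

Items with very extreme difficulty or low item-rest correlations are candidates for the deletion or revision that step 8 describes.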
Principles of Psychological Testing
By principles of psychological testing we mean the basic concepts and fundamental
ideas that underlie all psychological and educational tests.
Reliability
• Reliability refers to the accuracy, dependability, consistency, or repeatability of test
results.
• In more technical terms, reliability refers to the degree to which test scores are free of
measurement error.
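For instance, internal consistency is often estimated with Cronbach's alpha. The sketch below is one minimal Python/NumPy implementation using invented item scores; the slides do not prescribe this particular coefficient:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a persons-by-items matrix of item scores:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Invented 5-person x 4-item Likert-type data, for illustration only.
data = np.array([
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
])
print(f"Cronbach's alpha = {cronbach_alpha(data):.2f}")
```

Higher values indicate that the items vary together, i.e., that the total score is consistent across items.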
Validity
• Validity refers to the degree to which a certain inference or interpretation based on a
test is appropriate.
• Another principle of psychological testing concerns how a test is created or
constructed.
Names and Web Addresses of Test Publishers
Publisher: Educational Testing Service (ETS)
Website: www.ets.org
Popular Published Tests: Graduate Record Examination (GRE) General Test and Subject Tests; SAT Reasoning Test and SAT Subject Tests; Test of English as a Foreign Language (TOEFL); Graduate Management Admission Test (GMAT)
Who Uses Psychological Tests and for What Purposes…
Clinical Settings
Who Uses Psychological Tests and for What Purposes…
Organizational Settings
Why Control the Use of Psychological Tests?
There are two principal reasons for controlling the use of psychological tests:
1. To ensure that the test is given by a qualified examiner and that the scores are
properly used;
2. To prevent general familiarity with the test content, which would invalidate the test.
Qualified Examiner:
The need for a qualified examiner is evident in each of the three major aspects of the testing
situation:
1. Selection of the test
2. Administration and scoring of the test
3. Interpretation of scores
Why Control the Use of Psychological Tests?
Security of Test Content and Communication of Test Information
• Test content clearly has to be restricted in order to forestall deliberate efforts to
fake scores.
• Ensuring the security of specific test content need not, and should not, interfere
with the effective communication of testing information to test takers, concerned
professionals, and the general public.
• Such communication serves several purposes.
Challenges to Measurement in Psychology
• A third important challenge is that the measures used in the behavioral sciences tend to
differ from those used in the physical sciences.
• A fourth challenge is score sensitivity: a psychological measure may not be sensitive
enough to discriminate between real differences.
• A final challenge is an apparent lack of awareness of psychometric information.
Psychometrics as a Profession
Psychometricians can find work in a wide variety of fields and job environments,
including:
Hospitals and mental clinics
Educational settings (school, universities…)
Large corporations like software development companies
Market research, employee selection & training, performance analysis (industrial-
organizational psychology)
Consultant
Policy Advocacy
The work of Dr. Kevin McGrew, a psychometrician, offers a good example of a career in
this field. He is a school psychologist and an educational psychologist, as well as the
director and founder of the Institute for Applied Psychometrics (IAP).
Guidelines for Critiquing a Psychological Test
To make informed decisions about tests, one needs to know how to critique a test properly. A critique of a test
is an analysis of the test. A good critique answers many of the following questions. Not all questions can be
answered for all tests, as some questions are not relevant to all tests.