Introduction to Psychological
Testing & Assessment
PRAGYA LODHA, MA, PSY.D (HON.)
VISITING FACULTY, EARLY CAREER MENTAL HEALTH RESEARCHER
Outline
What is a psychological test?
Difference between psychological testing & psychological
assessment
What are the types of psychological tests?
What are the uses of psychological tests?
Characteristics of a good psychological test
What is a psychological test?
an objective & standardized measure of a sample of a sample of behaviour-
Anastasi
process of measuring psychology-related variables by means of devices or
procedures designed to obtain a sample of behavior- McGraw
standardized instrument designed to measure objectively one or more aspects of
human personality by means of samples of verbal or non-verbal responses or
behaviour - Freeman
What is psychological assessment?
the gathering and integration of psychology-related data for the purpose of
making a psychological evaluation that is accomplished through the use of tools
such as tests, interviews, case studies, behavioral observation, and specially
designed apparatuses and measurement procedures- McGraw
Difference between test & assessment
Testing | Assessment
It is a part of assessment | A more comprehensive term
Can be one or more tools, used separately | A compilation of one or more tools
Mostly provides one score or a narrow understanding of the individual | Provides an overall interpretation of the individual
Can be used without special training in assessment | Requires special training in assessment
Is one kind of assessment method | Includes different kinds of methods, such as interview, test, observation
The main purpose is to test one kind of behavior | Usually has a referral question or purpose for doing the assessment
Example: DAT, 16PF, TAT, WISC | Example: assessment of learning disability
Types of tests
Purpose of the test | Administration of the test | Format of the test | Response of the test | Time limit of the test
Intelligence test | Group test | Paper-pencil test | Verbal test | Speed test
Personality test | Individual test | Performance test | Non-verbal test | Power test
Aptitude test | | | |
Interest test | | | |
Types of tests used in practice
Personality tests
Intelligence tests / Cognitive and Aptitude tests (latent traits)
Job knowledge and achievement tests (logical / mechanical / numerical)
Career counselling tests
Rating scales
Neuropsychological tests
Psychological tests are used for analyzing,
describing and evaluating individuals to
predict and guide their behaviour.
What are the uses of psychological tests?
Clinical setting
School setting
Corporate setting
Vocational setting
Military setting
Research setting
What are the uses of psychological tests?
Clinical setting: maladjustment, psychopathology, screening
School setting: learning difficulty, emotional-behavioural problems, merit, failures
Corporate setting: selection, eligibility, screening
Vocational setting: aptitude, interest, personality characteristics
Military setting: selection, maladjustment, screening for fitness
Research setting: data collection, identifying patterns, analysis
Uses of tests in counselling
Diagnostic utility
Screening for psychopathology / maladjustment
Treatment planning
Understanding of patterns (emotional / behavioural)
Understanding of emotional and personality organisation
Prognosis & progress evaluation- pre & post test
Characteristics of a good psychological test
Standardization
Objectivity
Sampling
Reliability
Validity
Standardisation
uniformity of procedure not only in administering & scoring of the test but also in
interpreting the test results
one of the most important steps in the development of a psychological test
standardized tests are those that have clearly specified procedures for
administration, scoring and interpretation, in addition to norms
a standardized test is then open to use with the larger population
Standardisation is achieved in many ways
1) Establishment of norms - in this process the test is administered to a group of people to see the scores
that are typically obtained. These scores become the standard / typical scores. In this way, a test taker
can make sense of his or her score by comparing it to the standard / typical scores (a short worked sketch
follows this list). These norms help the examiner in interpreting test results.
2) The test constructor provides detailed information regarding the test - he / she describes the exact
materials to be used while administering the test, time limits, oral instructions, preliminary demonstrations,
ways of handling queries from test takers & the testing conditions under which the test should be
administered.
3) The scoring procedure is thoroughly explained in the test manual - the test constructor provides
detailed guidelines, with examples, about scoring the test, and sees to it that biases on the part of the
examiner that might affect the results are eliminated & that scoring is made objective &
quantifiable.
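As a concrete illustration of establishing norms (point 1 above), here is a minimal sketch assuming a small made-up normative sample: a raw score is converted into a z-score and a percentile rank so that it can be compared with the typical scores of the norm group. The sample values are invented for illustration and do not come from any real test manual.

```python
from statistics import mean, pstdev

def norm_referenced_report(raw_score, normative_sample):
    """Interpret a raw score against a normative sample (hypothetical data).

    Returns the z-score (distance from the norm-group mean in SD units)
    and the percentile rank (share of the norm group scoring at or below it).
    """
    m = mean(normative_sample)
    sd = pstdev(normative_sample)
    z = (raw_score - m) / sd
    percentile = 100 * sum(s <= raw_score for s in normative_sample) / len(normative_sample)
    return z, percentile

# Hypothetical normative sample of 20 raw scores on an imaginary test
norms = [12, 15, 18, 20, 21, 22, 23, 24, 25, 25, 26, 27, 28, 29, 30, 31, 33, 35, 36, 40]
z, pct = norm_referenced_report(29, norms)
print(f"z = {z:+.2f}, percentile rank = {pct:.0f}")
```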
Objectivity
the purpose of standardizing a test & giving it uniformity in administration, scoring &
interpretation is to make a psychological test as objective as possible
no test is purely objective in practice
objectivity is the goal of test construction & has been achieved to a considerable
extent in most tests
achieved by: establishing uniformity in testing situations, which includes factors like time
limits, instructions & materials to be used & preliminary demonstrations, as well as
controlling the physical environment of testing (lighting, noise, ventilation, etc.)
Objectivity is the degree to which equally
competent scorers obtain the same
results - free of personal opinion & biased
judgement.
Gronlund & Linn (1995)
Sampling
process of selecting the portion of the universe deemed to be representative of the whole
population
every test is designed to be used with a certain segment of the population
sampling also determines the group with which a test is to be used
Validity
the test measures what it purports to measure / what it is supposed to measure
concerns what the test measures and how well it does so
Types:
1. Face validity: it is not valid in a technical sense. Face validity is the extent to which items on test
appear to be meaningful and relevant
2. Content validity: the extent to which the content of the test provides an adequate representation of
the conceptual domain it is designed to cover. It essentially involves the systematic examination of
the test content to determine whether it covers a representative sample of the behavior domain to be
measured.
Reliability
a measure of the test’s consistency
also can be a measure of a test’s internal consistency; a useful test is consistent over time
Types:
1. Test – retest reliability: consistency of scores obtained by the same person when retested with the
same test on different occasions, or with different sets of equivalent items or under variable
examining conditions
2. Split-half reliability: also called odd-even reliability, because the reliability of a test is determined
by splitting the test into two equal halves & then determining the coefficient of reliability by
correlating the scores on the two halves (a minimal sketch follows below).
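A minimal sketch of split-half reliability, assuming a small made-up matrix of 0/1 item scores: items are split into odd and even halves, the half totals are correlated, and the standard Spearman-Brown correction (r_full = 2r / (1 + r)) steps the half-test correlation up to an estimate for the full-length test. The data are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sqrt(sum((a - mx) ** 2 for a in x)) * sqrt(sum((b - my) ** 2 for b in y)))

def split_half_reliability(scores):
    """scores: one list of item scores per test taker (hypothetical data).

    Splits items into odd/even halves, correlates the half totals, and
    applies the Spearman-Brown correction: r_full = 2r / (1 + r).
    """
    odd = [sum(person[0::2]) for person in scores]
    even = [sum(person[1::2]) for person in scores]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)

# 6 test takers x 8 items, scored 0/1 (invented for illustration)
scores = [
    [1, 1, 1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 1],
    [1, 1, 0, 1, 1, 1, 0, 1],
]
print(f"split-half reliability = {split_half_reliability(scores):.2f}")
```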
Can a test be valid and not reliable?
Can a test be reliable and not valid?
PRECISION versus ACCURACY
Cost & time Cultural
efficiency fulfillment
Effectiveness
Classical Test Theory (CTT)
the theoretical framework that underpins conventional psychometric testing
Observed score (X) = True Score (T) + Error (E)
aims to ensure reliability, precision, and accuracy of psychometric test scores by minimizing error
error is estimated using reliability coefficients, particularly test-retest reliability and internal
consistency
higher levels of reliability generally indicate lower levels of error, and thus greater
congruence between the true score and the observed score
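A minimal sketch of the CTT decomposition under invented assumptions: true scores and independent random errors are simulated, observed scores are formed as X = T + E, and reliability is estimated as the proportion of observed-score variance that is true-score variance, var(T) / var(X). The distributions and sample size are arbitrary choices for illustration.

```python
import random
from statistics import pvariance

random.seed(0)

# Simulate true scores T and independent random errors E for 1,000 test takers
true_scores = [random.gauss(50, 10) for _ in range(1000)]   # true ability
errors = [random.gauss(0, 5) for _ in range(1000)]          # measurement error
observed = [t + e for t, e in zip(true_scores, errors)]     # X = T + E

# Under CTT, reliability is the proportion of observed variance due to true scores
reliability = pvariance(true_scores) / pvariance(observed)
print(f"estimated reliability = {reliability:.2f}")  # roughly 10^2 / (10^2 + 5^2) = 0.80
```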
CTT
Difficulty index: a measure of individual test item difficulty, calculated as the proportion of students
who answered the item correctly out of the total number of students:
Difficulty level = Number of correct responses / Total number of students
The greater the number of students attempting an item correctly, the lower the difficulty level of the item.
Discrimination index: a measure of the effectiveness of an item in discriminating between high-
and low-ability students on a test. The notion is that high-ability students will tend to choose the
right answer, while low-ability students will tend to choose the wrong answer; high or low ability is
based purely on performance on the test (a worked sketch of both indices follows after this list):
Discrimination index = (Number of correct responses in higher group − Number of correct responses in lower group) / Total number of students in each group
Distractor analysis: conducted to evaluate the efficiency of each distractor for a multiple-choice
question. Ideally, all distractors should be equally plausible to students who do not know the right
answer. It is advisable to eliminate distractors that are never chosen, which means they are not
working.
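A minimal sketch of both indices on an invented 0/1 response matrix: the difficulty level is the proportion correct on each item, and the discrimination index compares the top- and bottom-scoring halves of the group. The half split and the data are assumptions made for illustration; in practice item analyses often use the top and bottom 27% of scorers instead.

```python
# Rows: students, columns: items; 1 = correct, 0 = wrong (hypothetical data)
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]

n_students = len(responses)
n_items = len(responses[0])

# Difficulty level = number of correct responses / total number of students
difficulty = [sum(row[j] for row in responses) / n_students for j in range(n_items)]

# Rank students by total score, then split into an upper and a lower half
ranked = sorted(responses, key=sum, reverse=True)
upper, lower = ranked[: n_students // 2], ranked[n_students // 2:]

# Discrimination index = (correct in upper group - correct in lower group) / group size
discrimination = [
    (sum(row[j] for row in upper) - sum(row[j] for row in lower)) / len(upper)
    for j in range(n_items)
]

for j in range(n_items):
    print(f"item {j + 1}: difficulty = {difficulty[j]:.2f}, discrimination = {discrimination[j]:.2f}")
```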
Limitations
estimates can be unstable when the sample size is small
estimates on difficulty level and discrimination index are sample dependent, and this
dependency reduces their utility
ability cannot be judged just on the basis of the number of items answered correctly; rather,
item attributes, such as difficulty level, should also be taken into account
Item Response Theory
also called modern test theory; increasingly replacing CTT
a more complex approach to analyzing tests; a paradigm that changes how item banks are
developed, test forms are designed, tests are delivered (adaptive or linear-on-the-fly), and
scores are produced
while CTT works well when assessments use a uniform set of questions, it is very limited for
creating item-banked assessments
IRT accounts for differences in question difficulty, item discrimination, and guessing, all of
which require parametrisation in item-banked assessments (see the sketch below)
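As an illustration of how these parameters enter a model, the sketch below evaluates the three-parameter logistic (3PL) item response function, one common IRT model, in which a is discrimination, b is difficulty, and c is the guessing (lower-asymptote) parameter. The specific parameter values are invented for illustration.

```python
from math import exp

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model:
    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    theta: ability, a: discrimination, b: difficulty, c: guessing parameter."""
    return c + (1 - c) / (1 + exp(-a * (theta - b)))

# Hypothetical item: moderately discriminating, average difficulty, 20% guessing floor
a, b, c = 1.2, 0.0, 0.20
for theta in (-2, -1, 0, 1, 2):
    print(f"ability {theta:+d}: P(correct) = {p_correct_3pl(theta, a, b, c):.2f}")
```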
Assumptions
Monotonicity – The assumption that as the trait level increases, the probability
of a correct response also increases
Unidimensionality – The model assumes that there is one dominant latent trait being
measured and that this trait is the driving force for the responses observed for each item in
the measure
Local Independence – Responses given to the separate items in a test are mutually
independent given a certain level of ability.
Invariance – We are allowed to estimate the item parameters from any position on the
item response curve. Accordingly, we can estimate the parameters of an item from any
group of subjects who have answered the item.