
Midterm Lessons

PROFED 6 (ASSESSMENT IN LEARNING)

Imelda P. Oruga
Sept-Oct 2024
What are tests for?

• Inform learners and teachers of the strengths and weaknesses of the process
• Motivate learners to review or consolidate specific material
• Guide the planning/development of the ongoing teaching process
• Create a sense of accomplishment
• Determine if the objectives have been achieved
• Encourage improvement

Guidelines for Test Construction

Planning the Test

Setting the test objectives:
• Why define?
• How to set?
• What are the levels of thinking skills?

Creating a table of specification:
• Why create?
• How to create?
• What are the types?
Test blueprint: a tool used by the teacher to design a test

• A two-way chart which describes the topics to be covered in a test and the number of items or points which will be associated with each topic
• A chart that provides a graphic representation of the content of a course or of curriculum elements and the educational objectives

A table of specifications:
• Ensures that the instructional objectives and what the test captures match
• Ensures that the test developer will not overlook details that are considered essential to a good test
• Makes developing a test easier and more efficient
• Ensures that the test will sample all important content areas and processes
• Is useful in planning and organizing
• Offers an opportunity for teachers and students to clarify achievement expectations
1. The TOS requires a thorough knowledge of Bloom's Revised Taxonomy.

2. The TOS requires, as reference, the budgeted lessons (the allocation of time per topic in every grading period with respect to the desired total number of days/time to be spent for the grading period).

3. The TOS requires some simple mathematical computations that result in a proportional allocation of test items per topic.

4. The TOS requires that previous experiences be recalled; to some extent, it likewise requires the imagination of the TOS constructor to concretize the actual teaching-learning process based on previous encounters in the classroom, in order to determine more or less the domain(s) on which to base the questions.

5. The TOS constructor shall likewise prepare the budgeted lesson to accompany the TOS from the first to the fourth grading period.
FORMATS OF TEST TABLE OF SPECIFICATION

• One-way TOS
• Two-way TOS
• Three-way TOS
ONE-WAY TOS

TOPIC | TEST OBJECTIVE | NO. OF HOURS SPENT | FORMAT AND PLACEMENT OF ITEMS | NO. AND PERCENT OF ITEMS
Theories and concepts | Recognize the important concepts in personality theories | 0.5 | Multiple Choice, Items #1-5 | 5 (10.0%)
Psychoanalytic theories | Identify the different theories of personality under the Psychoanalytic model | 1.5 | Multiple Choice, Items #6-20 | 15 (30%)
etc. | | | |
TOTAL | | 5 | | 50 (100%)
TWO-WAY TOS

CONTENT | TIME SPENT | NO. & PERCENT OF ITEMS | KD | LEVEL OF COGNITIVE BEHAVIOR, ITEM FORMAT, NO. AND PLACEMENT OF ITEMS: R | U | AP | AN | E | C
Theories and concepts | 0.5 hr | 5 (10%) | F, C |
Psychoanalytic theories | 1.5 hrs | 15 (30%) | F, C |
SCORING | | | | 1 point per item | 2 points per item | 3 points per item
OVERALL TOTAL | | 50 (100%) | | 20 | 20 | 20

* Legend: KD = Knowledge Dimensions (Factual, Conceptual, Procedural, Metacognitive)
Three-Way TOS

Table of Specification

Column headings: LESSON | COMPETENCIES | Days | Number of Items | % of Items | Item Placement | COGNITIVE LEVELS (Remembering, Understanding, Applying, Analyzing, Evaluating, Creating) | LEVEL OF DIFFICULTY (Easy 30%, Average 50%, Difficult 20%)

SEQUENCE
1. generates patterns.***
2. illustrates an arithmetic sequence.

Arithmetic Sequence
3. determines arithmetic means and nth term of an arithmetic sequence.***
4. finds the sum of the terms of a given arithmetic sequence.***

Geometric Sequence
5. illustrates a geometric sequence.
6. differentiates a geometric sequence from an arithmetic sequence.
7. differentiates a finite geometric sequence from an infinite geometric sequence.
8. determines geometric means and nth term of a geometric sequence.***
9. finds the sum of the terms of a given finite or infinite geometric sequence.***

HARMONIC and FIBONACCI SEQUENCE
10. illustrates other types of sequences (e.g., harmonic, Fibonacci).
11. solves problems involving sequences.

POLYNOMIAL
12. performs division of polynomials using long division and synthetic division.
13. proves the Remainder Theorem and the Factor Theorem.
14. factors polynomials.
15. illustrates polynomial equations.
16. proves the Rational Root Theorem.
17. solves polynomial equations.
18. solves problems involving polynomials and polynomial equations.

TOTAL
Tips in Preparing the Table of Specifications (TOS)

• Don't make it overly detailed. It's best to identify major ideas and skills rather than specific details.
• Use a cognitive taxonomy that is most appropriate to your discipline, including non-subject-specific skills like communication skills, graphic skills or computational skills if such are important to your evaluation of the answer.
• MATCH the question to the appropriate level of thinking skills (level of difficulty).
What is Bloom's Taxonomy?

• Bloom's Taxonomy is a classification of thinking organized by levels of complexity. It gives teachers and students an opportunity to learn and practice a range of thinking and provides a simple structure for many different kinds of questions.

What is the REVISED BLOOM'S TAXONOMY?

The Revised Bloom's Taxonomy provides the measurement tool for thinking. The changes in RBT occur in three broad categories:
• Terminology
• Structure
• Emphasis
A. Visual Comparison of the Two Taxonomies (Terminology Changes)

1956            2001
Evaluation      Creating
Synthesis       Evaluating
Analysis        Analyzing
Application     Applying
Comprehension   Understanding
Knowledge       Remembering

(Based on Pohl, 2000, Learning to Think, Thinking to Learn, p. 8)


REMEMBERING

THE LEARNER IS ABLE TO RECALL, RESTATE AND REMEMBER LEARNED INFORMATION.
- RECOGNIZING
- LISTING
- DESCRIBING
- IDENTIFYING
- RETRIEVING
- NAMING
- LOCATING
- FINDING

CAN YOU RECALL INFORMATION?
Sample Questions for Remembering
• What is ___?
• Where is ___?
• How did ___ happen?
• Why did ___?
• When did ___?
• How would you show ___?
• Who were the main ___?
• Which one ___?
• How is ___?
UNDERSTANDING

THE LEARNER GRASPS THE MEANING OF INFORMATION BY INTERPRETING AND TRANSLATING WHAT HAS BEEN LEARNED.
- INTERPRETING
- EXEMPLIFYING
- SUMMARIZING
- INFERRING
- PARAPHRASING
- CLASSIFYING
- COMPARING
- EXPLAINING

CAN YOU EXPLAIN IDEAS OR CONCEPTS?
Sample Questions for Understanding
• State in your own words…
• Which are facts? Opinions?
• What does this mean…?
• Is this the same as…?
• Give an example.
• Select the best definition.
• Condense this paragraph.
• What would happen if…?
• What part doesn't fit?
• How would you compare? Contrast?
• What is the main idea of…?
• How would you summarize…?

Questions with what, where, why and how, whose answers can be taken from between the lines of the text through organizing, comparing, translating, interpreting, extrapolating, classifying, summarizing and stating main ideas, fall under understanding.
APPLYING

THE LEARNER MAKES USE OF INFORMATION IN A CONTEXT DIFFERENT FROM THE ONE IN WHICH IT WAS LEARNED.
- IMPLEMENTING
- CARRYING OUT
- USING
- EXECUTING

CAN YOU USE THE INFORMATION IN ANOTHER FAMILIAR SITUATION?
Sample Questions for Applying

• How would you organize ___ to show ___?
• How would you show your understanding of ___?
• What facts would you select to show ___?
• What elements would you change ___?
• What other way would you plan to ___?
• What questions would you ask in an interview with ___?
• How would you apply what you learned to develop ___?
• How would you solve ___ using what you have learned?
ANALYZING

THE LEARNER BREAKS LEARNED INFORMATION INTO ITS PARTS TO BEST UNDERSTAND THAT INFORMATION.
- COMPARING
- ORGANIZING
- DECONSTRUCTING
- ATTRIBUTING
- OUTLINING
- FINDING
- STRUCTURING
- INTEGRATING

CAN YOU BREAK INFORMATION INTO PARTS TO EXPLORE UNDERSTANDINGS AND RELATIONSHIPS?
Sample Questions for Analyzing
• Which statement is relevant?
• What is the conclusion?
• What does the author believe? Assume?
• Make a distinction between ___.
• What ideas justify the conclusion?
• Which is the least essential statement?
• What literary form is used?
EVALUATING

THE LEARNER MAKES DECISIONS BASED ON IN-DEPTH REFLECTION, CRITICISM AND ASSESSMENT.
- CHECKING
- HYPOTHESIZING
- CRITIQUING
- EXPERIMENTING
- JUDGING
- TESTING
- DETECTING
- MONITORING

CAN YOU JUSTIFY A DECISION OR COURSE OF ACTION?
Sample Questions for Evaluating
• What fallacies, consistencies, inconsistencies appear?
• Which is more important?
• Do you agree ___?
• What information would you use ___?
• Do you agree with the ___?
• How would you evaluate ___?
CREATING

THE LEARNER CREATES NEW IDEAS AND INFORMATION USING WHAT HAS BEEN PREVIOUSLY LEARNED.
- DESIGNING
- CONSTRUCTING
- PLANNING
- PRODUCING
- INVENTING
- DEVISING
- MAKING

CAN YOU GENERATE NEW PRODUCTS, IDEAS, OR WAYS OF VIEWING THINGS?
Sample Questions for Creating

• Can you design a ___?
• What possible solution to ___?
• How many ways can you ___?
• Can you create a proposal which would ___?
B. STRUCTURAL CHANGES

Bloom's original cognitive taxonomy was a one-dimensional form. Knowledge within it embraced the factual, the conceptual and the procedural, but these were never fully understood or used by teachers, because most of what educators were given in training consisted of a simple chart with the listing of levels and related accompanying verbs.
The Revised Bloom's Taxonomy takes the form of a two-dimensional table. The first dimension is the Knowledge Dimension, or the kind of knowledge to be learned; the second is the Cognitive Process Dimension, or the process used to learn.

The Knowledge Dimensions | The Cognitive Process Dimensions: Remembering | Understanding | Applying | Analyzing | Evaluating | Creating
Factual
Conceptual
Procedural
Metacognitive
Conceptual Knowledge

- is knowledge of classifications, principles, generalizations, theories, models or structures pertinent to a particular disciplinary area.

Factual Knowledge

- refers to the essential facts, terminology, details or elements students must know or be familiar with in order to solve a problem in a discipline.
Procedural Knowledge

- refers to information or knowledge that helps students to do something specific to a discipline, subject, or area of study. It also refers to methods of inquiry, very specific or finite skills, algorithms, techniques and particulars.

Meta-cognitive Knowledge

- is strategic or reflective knowledge about how to go about solving problems and cognitive tasks, including contextual and conditional knowledge and knowledge of self.
C. CHANGE IN EMPHASIS

Emphasis is the third and final category of changes. Emphasis is placed upon the taxonomy's use as a more "authentic tool for curriculum planning, instructional delivery and assessment".
• More authentic tool for curriculum planning, instructional delivery and assessment
• Aimed at a broader audience
• Easily applied to all levels of schooling
• The revision emphasizes explanation and description of subcategories.
BLOOM'S REVISED TAXONOMY: Suggested Percentage Allocation

Higher-order thinking (30% in total):
CREATING (10%): generating new ideas, products, or ways of viewing things. Designing, constructing, planning, producing, inventing.
EVALUATING (10%): justifying a decision or course of action. Checking, hypothesizing, critiquing, experimenting, judging.
ANALYZING (10%): breaking information into parts to explore understandings and relationships. Comparing, organizing, deconstructing, interrogating, finding.

APPLYING (20%): using information in another familiar situation. Implementing, carrying out, using, executing.
UNDERSTANDING (20%): explaining ideas or concepts. Interpreting, summarizing, paraphrasing, classifying, explaining.
REMEMBERING (30%): recalling information. Recognizing, listing, describing, retrieving, naming, finding.
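For a concrete sense of what this allocation means, the short Python sketch below (an illustration added here, not part of the original slides) converts the suggested percentages into item counts; for a 50-item test it gives the 15-10-10-5-5-5 split used in the TOS examples later in this lesson.

```python
# Minimal sketch: turn the suggested percentage allocation into item
# counts for a test of a given length (illustration only).
SUGGESTED_SHARES = {
    "Remembering": 0.30,
    "Understanding": 0.20,
    "Applying": 0.20,
    "Analyzing": 0.10,
    "Evaluating": 0.10,
    "Creating": 0.10,
}

def items_per_level(total_items: int) -> dict:
    """Allocate test items to each cognitive level by its share."""
    return {level: round(share * total_items)
            for level, share in SUGGESTED_SHARES.items()}

print(items_per_level(50))
# {'Remembering': 15, 'Understanding': 10, 'Applying': 10,
#  'Analyzing': 5, 'Evaluating': 5, 'Creating': 5}
```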
How to Construct a Table of Specification

1. Determine the desired number of test items.

2. List the topics with the corresponding allocation of time.
• The reference is the budgeted lesson.
TABLE OF SPECIFICATIONS
Subject ____  Grade ____  Grading Period ____  School Year ____

Topic | Time Spent/Frequency | DOMAINS (Remembering, Understanding, Applying, Analyzing, Evaluating, Creating) | Total Number of Test Items: Actual | Adjusted
1.  | 3 | |
2.  | 4 | |
3.  | 1 | |
4.  | 6 | |
5.  | 8 | |
6.  | 5 | |
7.  | 8 | |
8.  | 2 | |
9.  | 4 | |
10. | 4 | |
TOTAL | 45 | | 50 (desired)
How to Construct a Table of Specification

3. Determine the total number of items per topic by using the formula: the time spent (frequency) per topic, divided by the total frequency in the grading period, multiplied by the total number of items.

   Items per topic = (Time Spent/Frequency per Topic ÷ Total Frequency in the Grading Period) × Total Number of Items

Example: (3 ÷ 45) × 50 = 3.33
TABLE OF SPECIFICATIONS
Subject ____  Grade ____  Grading Period ____  School Year ____

Topic | Time Spent/Frequency | Total Number of Test Items: Actual | Adjusted
1.  | 3 | 3.33 |
2.  | 4 | 4.44 |
3.  | 1 | 1.11 |
4.  | 6 | 6.66 |
5.  | 8 | 8.88 |
6.  | 5 | 5.55 |
7.  | 8 | 8.88 |
8.  | 2 | 2.22 |
9.  | 4 | 4.44 |
10. | 4 | 4.44 |
TOTAL | 45 | 49.95 | 50 (desired)
How to Construct a Table of Specification

4. Round off the values to whole numbers.
TABLE OF SPECIFICATIONS
Subject ____  Grade ____  Grading Period ____  School Year ____

Topic | Time Spent/Frequency | Total Number of Test Items: Actual | Adjusted
1.  | 3 | 3.33 | 3
2.  | 4 | 4.44 | 4
3.  | 1 | 1.11 | 1
4.  | 6 | 6.66 | 7
5.  | 8 | 8.88 | 9
6.  | 5 | 5.55 | 6
7.  | 8 | 8.88 | 9
8.  | 2 | 2.22 | 2
9.  | 4 | 4.44 | 4
10. | 4 | 4.44 | 4
TOTAL | 45 | 49.95 | 49 (desired: 50)
How to Construct a Table of Specification

5. Adjust or balance by either adding to or subtracting from any of the topic totals, so that the sum amounts to the desired number of test items.
TABLE OF SPECIFICATIONS
Subject ____  Grade ____  Grading Period ____  School Year ____

Topic | Time Spent/Frequency | Total Number of Test Items: Actual | Adjusted
1.  | 3 | 3.33 | 3
2.  | 4 | 4.44 | 4
3.  | 1 | 1.11 | 1+1
4.  | 6 | 6.66 | 7
5.  | 8 | 8.88 | 9
6.  | 5 | 5.55 | 6
7.  | 8 | 8.88 | 9
8.  | 2 | 2.22 | 2
9.  | 4 | 4.44 | 4
10. | 4 | 4.44 | 4
TOTAL | 45 | 49.95 | 49 + 1 = 50
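The arithmetic in steps 3 to 5 can be scripted. Here is a minimal Python sketch (added for illustration) that reproduces the computation for the sample data above. One assumption to note: the slide balances the total by the constructor's judgment (adding the extra item to topic 3), while this sketch adjusts the topic with the largest rounding gap, so the two may pick different topics.

```python
# Minimal sketch of TOS steps 3-5: proportional allocation, rounding,
# and balancing so the item counts sum to the desired total.
time_spent = [3, 4, 1, 6, 8, 5, 8, 2, 4, 4]   # frequencies from the slide
desired_items = 50
total_freq = sum(time_spent)                   # 45

# Step 3: proportional (actual) allocation per topic
actual = [f / total_freq * desired_items for f in time_spent]

# Step 4: round off to whole numbers (here this sums to 49)
adjusted = [round(a) for a in actual]

# Step 5: add or subtract one item at a time until the sum matches,
# picking the topic whose rounded count strays most from its actual value
while sum(adjusted) != desired_items:
    step = 1 if sum(adjusted) < desired_items else -1
    gaps = [(a - adj) * step for a, adj in zip(actual, adjusted)]
    adjusted[gaps.index(max(gaps))] += step

print([f"{a:.2f}" for a in actual])   # 3.33, 4.44, 1.11, 6.67, ...
print(adjusted, sum(adjusted))        # sums to exactly 50
```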
How to Construct a Table of Specification

6. Scatter the items per topic per domain.
• Determine the number of items per level of complexity of the cognitive domain. In this case we already have a pre-computed allocation of 30-20-20-30, the final 30 being the three higher-order levels (10-10-10).
TABLE OF SPECIFICATIONS
Subject ____  Grade ____  Grading Period ____  School Year ____

Topic | Time Spent/Frequency | R | U | Ap | An | E | C | Actual | Adjusted
1.  | 3 | | | | | | | 3.33 | 3
2.  | 4 | | | | | | | 4.44 | 4
3.  | 1 | | | | | | | 1.11 | 1+1
4.  | 6 | | | | | | | 6.66 | 7
5.  | 8 | | | | | | | 8.88 | 9
6.  | 5 | | | | | | | 5.55 | 6
7.  | 8 | | | | | | | 8.88 | 9
8.  | 2 | | | | | | | 2.22 | 2
9.  | 4 | | | | | | | 4.44 | 4
10. | 4 | | | | | | | 4.44 | 4
TOTAL | 45 | 15 | 10 | 10 | 5 | 5 | 5 | 49.95 | 50
      |    | 30% | 20% | 20% | 30% (Higher-order Thinking) | | |
How to Construct a Table of Specification

7. On the basis of your experience and analysis, start allocating the items with respect to the total number of items per domain and the total number of items per topic, beginning with the higher-order thinking domains and working down to remembering. It is suggested that the order of complexity from creating to remembering not be altered.

• Review the topics, reflect on previous experiences, and imagine the teaching-learning processes (TLP) that can go with the topics. You may use teaching guides and other similar materials.
• Be mindful of the total points per topic.
TABLE OF SPECIFICATIONS
Subject ____  Grade ____  Grading Period ____  School Year ____

Topic | Time Spent/Frequency | R | U | Ap | An | E | C | Actual | Adjusted

(On this slide, allocation has begun: 1 item has been placed for topic 4, 2 items for topic 5, and 2 items for topic 7 in the domain columns; consistent with step 7, allocation starts from the higher-order domains. The rest of the grid is still blank.)

TOTAL | 45 | 15 | 10 | 10 | 5 | 5 | 5 | 49.95 | 50
Examples of Student Activities and Verbs for Revised Bloom's Cognitive Levels (Jacobs & Chase, 1992:19)

Example matrix:

The Knowledge Dimension | Remember | Understand | Apply | Analyze | Evaluate | Create
Facts | list | paraphrase | classify | outline | rank | categorize
Concepts | recall | explain | show | contrast | criticize | modify
Processes | outline | estimate | produce | diagram | defend | design
Procedures | reproduce | give an example | relate | identify | critique | plan
Principles | state | convert | solve | differentiate | conclude | revise
Meta-cognitive | proper use | interpret | discover | infer | predict | actualize

Possible reasons for faulty test questions:

• Questions are copied verbatim from the book or other resources.
• The course outline was not consulted.
• Too much consideration is given to reducing printing cost.
• There is no TOS, or the TOS was made after making the test.
Factors to consider in preparing test questions (Oriondo & Antonio, 1984)

• Purpose of the test
• Time available to prepare, administer and score the test
• Number of students to be tested
• Skill of the teacher in writing the test
• Facilities available for reproducing the test
"To be able to prepare a GOOD TEST, one has to have a mastery of the subject matter, knowledge of the pupils to be tested, skill in verbal expression and the use of the different test formats."

Evaluating Educational Outcomes (Oriondo & Antonio, 1984)
What are the major categories and formats of traditional tests?

Selected-response tests:
• Multiple choice
• True-false / Alternative response
• Matching type

Constructed-response tests:
• Short answer test
• Essay test
• Problem-solving test
What does a well-constructed test do?

Well-constructed tests motivate students and reinforce learning. They enable teachers to assess the students' mastery of course objectives. Tests also provide feedback on teaching, often showing what was or was not communicated clearly.
Guidelines for Writing Multiple Choice Items

A multiple choice (MC) item is characterized by the following components:

• Stem - the initial part of the item, in which the task is stated.
• Options - the set of response choices presented under the stem.
• Key - the correct response option.
• Distractors - the incorrect response options.
Guidelines for Writing Multiple Choice Items

The stem may be a direct question, or an incomplete statement with options that complete the statement.

Note: The direct question is generally easier to develop and to understand.
Guidelines for Writing Multiple Choice Items (from TIMSS 2003)

1. The stem should have enough information to make the task clear and unambiguous to students.

First Draft: Solve the equation 25 - X = 19.

Revision: What number should go in the blank to make the number sentence true?
25 - ____ = 19
Guidelines for Writing Multiple Choice Items (from TIMSS 2003)

2. Do not include extraneous information in the stem. This may confuse students.

First Draft: Mang Gorio has 180 eggs that he has collected on his farm. He wants to take them to the market 3 km away. Before he takes them he must put them in cartons. Each carton holds 12 eggs. How many cartons does Mang Gorio need?
Revision: Eggs are packed 12 to a carton. How many cartons are needed to pack 180 eggs?

A. 13    C. 15
B. 14    D. 18
Guidelines for Writing Multiple Choice Items (from TIMSS 2003)

3. Use a direct question rather than a directive in the stem.

First Draft: Find the area of a rectangle with sides 2 cm and 6 cm.

Revision: What is the area of a rectangle with sides 2 cm and 6 cm?
Guidelines for Writing Multiple Choice Items (from TIMSS 2003)

4. Include "of the following" in the stem if there is no universally agreed-upon answer to the question.

First Draft: Which is the best conductor of electricity?

Revision: Which of the following is the best conductor of electricity?
A. air      C. rubber
B. copper   D. water
Guidelines for Writing Multiple Choice Items (from TIMSS 2003)

5. Make sure there is only one correct or best answer.

First Draft: Which animal is hatched from eggs?
A. spider   C. rabbit
B. snake    D. carabao

Revision: Which animal is hatched from eggs?
A. goat     C. rabbit
B. snake    D. carabao
6. Do not provide hints to the answer in the options. An essay or constructed-response item may be more appropriate than a multiple choice item.
7. Avoid using trick distractors.
8. Observe the rules of grammar and syntax.
9. Make sure all options are parallel in length, level of complexity and grammatical structure.
Guidelines for Writing Multiple Choice Items

10. Arrange the options in logical order.
11. Reduce the reading burden in the options by moving the word/s to the stem.
12. Avoid reference to "you" or "your".
13. Avoid using "none of these" and "all of these" as response options.
Guidelines for Writing Multiple Choice Items

14. Avoid the use of specific determiners that qualify the response options, providing clues to the correct options:
• "never" and "always" tend to appear in incorrect options;
• "some", "sometimes", and "may" tend to appear in correct options.

15. Make sure that the stem or options of one question do not answer another question, or rule out distractors in another question.
TRUE-FALSE ITEMS

True-false items require students to identify statements as correct or incorrect. Only two responses are possible in this item format.
Guidelines for Writing True-False Items

1. Each statement should include only one idea. The idea should be stated in the main point of the item rather than in some trivial detail.

FIRST DRAFT: The true-false item as seen by Newton takes little time to prepare.
REVISION: The true-false item takes little time to prepare.

2. Each statement should be short and simple.

FIRST DRAFT: True-false items provide for adequate sampling of objectives and can be scored rapidly.
REVISION: True-false items provide for adequate sampling of objectives. True-false items can be scored rapidly.
3. Qualifiers such as "few", "many", "seldom", "always", "never", "small", "large", and so on should be avoided. They make the statements vague and indefinite.

FIRST DRAFT: True-false items are seldom prone to guessing.
REVISION: True-false items are prone to guessing.

4. Negative statements should be used sparingly.
5. Double negatives should be avoided.
6. Statements of opinions or facts should be attributed to some important person or organization.
7. The number of true and false statements should be equal whenever possible.
Matching Items

A matching item is a selection-type item consisting of stimuli (or stems) called premises, and a series of options called responses.
Guidelines for Writing Matching Items

1. Include only materials that belong to the same category.
2. Keep premises short and place the responses on the right side.
3. Use more responses than premises and allow the responses to be used more than once.
4. Place the matching items on one page.
Guidelines for Writing Short Answer / Fill-in-the-Blank Items

1. State the item clearly and precisely so that only one correct answer is acceptable.
2. Begin with a question and shift to an incomplete statement later to achieve precision and conciseness.
3. Leave the blank at the end of the statement.
4. Focus on one important idea instead of trivial detail, and leave only one blank.
5. Avoid giving clues to the correct answer.
Guidelines for Writing Essay Items

1. State questions that require a clear, specific, and narrow task or topic to be performed.
2. Give enough of a time limit for answering each essay question.
3. Require students to answer all questions.
4. Make it clear to students whether spelling, punctuation, content, clarity, and style are to be considered in scoring the essay questions.
5. Grade each essay question by the point method, using well-defined criteria (a rubric).
6. Evaluate all of the students' responses to one question before going to the next question.
7. Evaluate answers to essay questions without identifying the students.
8. If possible, two or more correctors should be employed to ensure reliable results.
Writing Completion Items

• Completion items require the students to associate an incomplete statement with a word or phrase recalled from memory.
Guidelines in Completion Items

• As a general rule, it is best to use ONE blank in a completion item.
• The blank should be placed NEAR or at the END of the sentence.
• Give clear instructions indicating whether synonyms will be correct and whether spelling will be a factor in scoring.
• Avoid using direct statements from the textbook with a word or two missing.
• All blanks for all items should be of equal length, and long enough to accommodate the longest response.
Writing Arrangement Items

• Arrangement items are used for testing knowledge of sequence and order.
Guidelines in Arrangement Items

• Items to be arranged should belong to one category only.
• Provide instructions on the rationale for arrangement or sequencing.
• Specify the response code students have to use in arranging the items.
• Provide sufficient space for writing the answer.
Writing Completion-Drawing Items

• A completion-drawing item is one wherein an incomplete drawing is presented which the student has to complete.
Guidelines in Completion-Drawing Items

• Provide instruction on how the drawing will be completed.
• Present the drawing to be completed.
Writing Correction Items

• A correction item is similar to the completion item, except that some words or phrases have to be changed to make the sentence correct.
Guidelines in Correction Items

• Underline or italicize the word or phrase to be corrected in a sentence.
• Specify in the instructions where students will write their correction of the underlined or italicized word or phrase.
• Write items that measure higher levels of cognitive behavior.
Sample

Directions: Change the underlined word or phrase to make each of the following statements correct. Write your answer on the space before each number.

1. Inflation caused by increased demand is known as oil-push.
2. Inflation is the phenomenon of falling prices.
3. Expenditure on non-food items increases with increased income according to Keynes.
4. The additional cost of producing an additional unit of a product is average cost.
Writing Identification Items

• An identification item is one wherein an unknown specimen is to be identified by name or other criterion.
Guidelines in Identification Items

• The directions of the test should indicate clearly what has to be identified.
• Sufficient space has to be provided for the answer to each item.
• The question should not be copied verbatim from the book.
Sample

Directions: The following are phrase definitions of terms. Opposite each number, write the term defined.

1. Weight divided by volume
2. Degree of hotness or coldness of a body
3. Changing speed of a moving body
4. Ratio of resistance to effort
Writing Enumeration Items

• An enumeration item is one wherein the student has to list down parts or elements/components of a given concept or topic.

Guidelines in Enumeration Items

• The exact number of expected answers has to be specified.
• Spaces for the writing of answers have to be provided and should be of the same length.
Writing Analogy Items

• An analogy item consists of a pair of words which are related to each other. This type of item is often used in measuring the student's skill in sensing associations between paired words or concepts.
Guidelines in Analogy Items

• The pattern of relationship in the first pair of words must be the same pattern in the second pair.
• Options must be related to the correct answer.
• The principle of parallelism has to be observed in writing options.
• More than three options have to be included in each analogy item to lessen guessing.
• All items must be grammatically consistent.
Sample

Sampaguita : Philippines - Rose of Sharon : Korea

Bonifacio : Philippines - ________ : USA
a. Jefferson
b. Lincoln
c. Madison
d. Washington
Writing Interpretive Items

• An interpretive test item is often used in testing higher cognitive behavior.
Guidelines in Interpretive Items

• The interpretive exercise must be related to the instruction provided the students.
• The material to be presented to the students should be new to them, but similar to what was presented during instruction.
• Written passages should be as brief as possible.
• The students have to interpret, apply, analyze and comprehend in order to answer a given question in the exercise.
Writing Short Explanation Items

• This type of item is similar to an essay test but requires a short response, usually a sentence or two. This type of question is good practice for students in expressing themselves concisely.

Guidelines in Short Explanation Items

• Specify in the instructions of the test the number of sentences that the students can use in answering the question.
• Make the question brief and to the point so that students are not confused.
When to Use Essay or Objective Tests

Essay tests are appropriate when:
• the group to be tested is small and the test is not to be reused;
• you wish to encourage and reward the development of student skill in writing;
• you are more interested in exploring the student's attitudes than in measuring his/her achievement.

Objective tests are appropriate when:
• the group to be tested is large and the test may be reused;
• highly reliable scores must be obtained as efficiently as possible;
• impartiality of evaluation, fairness, and freedom from possible test-scoring influences are essential.
Either essay or objective tests can be used to:
• measure almost any important educational achievement a written test can measure;
• test understanding and ability to apply principles;
• test ability to think critically;
• test ability to solve problems.
Matching Learning Objectives with Test Items

Instructions: Below are four test item categories labeled A, B, C, and D. Following these test item categories are sample learning objectives. On the line to the left of each learning objective, place the letter of the most appropriate test item category.

A = Objective Test Item (multiple choice, true-false, matching)
B = Performance Test Item
C = Essay Test Item (extended response)
D = Essay Test Item (short answer)

1. Name the parts of the human skeleton
2. Appraise a composition on the basis of its organization
3. Demonstrate safe laboratory skills
4. Cite four examples of satire that Twain uses in Huckleberry Finn
5. Design a logo for a web page
6. Describe the impact of a bull market
7. Diagnose a physical ailment
8. List important mental attributes necessary for an athlete
9. Categorize great American fiction writers
10. Analyze the major causes of learning disabilities
Unless assessment
improves
the teaching-learning
process,
it serves no purpose at all.
POINTS TO PONDER…
A good lesson makes a good question
A good question makes a good content
A good content makes a good test
A good test makes a good grade
A good grade makes a good student
A good student makes a good COMMUNITY
Jesus Ochave Ph.D.
VP Research Planning & Development
Philippine Normal University
What is test reliability?

Reliability is the consistency of the measure under 3 conditions:

• when tested on the same person
• when retested on the same measure
• similarity of responses across items that measure the same characteristic

What are the factors affecting reliability?

• The number of test items
• Individual differences
• The external environment
What are the different ways to establish test reliability?

How to determine the reliability of a test depends on:
• the variable you are measuring
• the type of test
• the number of versions of the test
Methods in Testing Reliability

• Test-retest
• Parallel forms
• Split-half
• Test of internal consistency
• Inter-rater reliability
Test-retest

How is the reliability done? Correlate the test scores from the 1st and 2nd administrations.

What statistic is used? The Pearson product-moment correlation.

Applicable for tests that measure stable variables, e.g. aptitude and psychomotor behavior.
TEST-RETEST

• Test-retest reliability refers to the extent to which a test or measure administered at one time is correlated with the same test or measure administered to the same people at another time.
• If the correlation between separate administrations of the test is high (e.g. 0.7 or higher), then the test has good test-retest reliability.
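As a quick illustration (with made-up scores, not data from the slides), the test-retest estimate is simply the Pearson correlation between the two administrations:

```python
# Minimal sketch: test-retest reliability as the Pearson correlation
# between two administrations of the same test (hypothetical scores).
from statistics import correlation  # Python 3.10+

first_admin  = [35, 42, 28, 45, 39, 31, 47, 36]
second_admin = [37, 40, 30, 46, 38, 33, 45, 35]

r = correlation(first_admin, second_admin)
print(f"test-retest reliability r = {r:.2f}")  # 0.7 or higher is good
```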
CONDITIONS

• the same experimental tools
• the same observer
• the same measuring instrument, used under the same conditions
• the same location
• repetition over a short period of time
• the same objectives
DISADVANTAGES

• It takes a long time for results to be obtained.
• If the interval is too brief, participants may recall information from the first test, which could bias the results.
• If the interval is too long, it is feasible that the participants could have changed in some important way, which could also bias the results.
Split-half

How is the reliability done? Administer the test to a group of examinees. The items then need to be split into halves, e.g. using the odd-even technique.

What statistic is used? Correlate the two sets of scores using the Pearson r; then apply a further formula, the Spearman-Brown coefficient, to the result.

Applicable when the test has a large number of items.
SPLIT-HALF METHOD

• A test for a single knowledge area is split into two parts, and then both parts are given to one group of students at the same time.
SPLIT-HALF METHOD

• Split-half testing is a measure of internal consistency: how well the test components contribute to the construct that's being measured. It is most commonly used for multiple choice tests, but you can theoretically use it for any type of test, even tests with essay questions.
How to split it in half?

• first half and second half, or
• odd- and even-numbered items

If the two halves of the test provide similar results, this would suggest that the test has internal reliability.
STEPS

1. Administer the test to a large group of students (ideally, over about 30).
2. Randomly divide the test questions into two parts. For example, separate even questions from odd questions.
3. Score each half of the test for each student.
4. Find the correlation coefficient for the two halves.
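A minimal Python sketch of these steps follows (the 0/1 item scores are hypothetical). It splits the items odd-even, correlates the half scores, and then steps the half-test correlation up to full length with the Spearman-Brown formula mentioned above.

```python
# Minimal sketch of the split-half method with the odd-even technique.
from statistics import correlation  # Python 3.10+

# rows = students, columns = items scored 1 (correct) or 0 (wrong)
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 1, 1, 1],
]

odd_scores  = [sum(row[0::2]) for row in responses]  # items 1, 3, 5, 7
even_scores = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, 8

r_half = correlation(odd_scores, even_scores)
r_full = (2 * r_half) / (1 + r_half)  # Spearman-Brown step-up
print(f"half-test r = {r_half:.2f}, full-test reliability = {r_full:.2f}")
```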
Parallel Forms

Parallel-forms reliability (also called equivalent-forms reliability) uses one set of questions divided into two equivalent sets ("forms"), where both sets contain questions that measure the same construct, knowledge or skill. The two sets of questions are given to the same sample of people within a short period of time, and an estimate of reliability is calculated from the two sets.
Parallel Forms

Step 1: Give test A to a group of 50 students on a Monday.
Step 2: Give test B to the same group of students that Friday.
Step 3: Correlate the scores from test A and test B.

In order to call the forms "parallel", the observed scores must have the same means and variances. If the tests are merely different versions (without the "sameness" of observed scores), they are called alternate forms.
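A minimal sketch of these three steps (with hypothetical scores): correlate the two forms, and also compare their means and variances, since parallel forms should show the same means and variances.

```python
# Minimal sketch of parallel-forms reliability (hypothetical data).
from statistics import correlation, mean, variance

form_a = [40, 33, 45, 29, 38, 42, 31, 36]  # test A, Monday
form_b = [39, 35, 44, 30, 37, 43, 33, 35]  # test B, Friday, same students

print(f"means:     {mean(form_a):.1f} vs {mean(form_b):.1f}")
print(f"variances: {variance(form_a):.1f} vs {variance(form_b):.1f}")
print(f"parallel-forms r = {correlation(form_a, form_b):.2f}")
```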



Parallel Forms

How is the reliability done? Correlate the test results from the 1st form with the results from the 2nd form.

What statistic is used? The Pearson r.

Applicable if there are two versions of a test, e.g. an entrance exam or a licensure exam.
Parallel Forms

Advantages:
• Parallel-forms reliability can avoid some problems inherent in test-retesting.

Disadvantages:
• You have to create a large number of questions that measure the same construct.
• Proving that the two test versions are equivalent (parallel) can be a challenge.
Internal consistency

Internal consistency assesses the correlation between multiple items in a test that are intended to measure the same construct. You can calculate internal consistency without repeating the test or involving other researchers, so it's a good way of assessing reliability when you only have one data set.
Test of Internal Consistency

How is the reliability done? The procedure involves determining whether the scores for each item are answered consistently by the examinees.

What statistic is used? A statistical analysis called Cronbach's alpha, or the Kuder-Richardson formula.

Applicable, for example, to Likert-scale instruments.
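The slide names Cronbach's alpha; as an illustration, the sketch below computes it from the standard formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), using hypothetical Likert-scale responses.

```python
# Minimal sketch: Cronbach's alpha for internal consistency.
from statistics import variance

# rows = respondents, columns = items on the same Likert scale
data = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 3, 2],
    [4, 4, 5, 4],
]
k = len(data[0])                                   # number of items
item_vars = [variance(col) for col in zip(*data)]  # variance per item
total_var = variance([sum(row) for row in data])   # variance of totals

alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")           # ~0.93 for this data
```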
Inter-Rater Reliability

Inter-rater reliability refers to statistical measurements that determine how similar the data collected by different raters are. A rater is someone who is scoring or measuring a performance, behavior, or skill in a human or animal.

Examples of raters would be a job interviewer, a psychologist measuring how many times a subject scratches their head in an experiment, and a scientist observing how many times an ape picks up a toy.
Inter-Rater Reliability

How is the reliability done? The procedure is used to determine the consistency of multiple raters when using rating scales and rubrics to judge performance.

What statistic is used? A statistical analysis called Kendall's tau coefficient.

Applicable when there are multiple raters.
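As an illustration (the rubric scores are hypothetical), Kendall's tau between two raters can be computed with scipy:

```python
# Minimal sketch: inter-rater reliability via Kendall's tau.
from scipy.stats import kendalltau

rater_1 = [4, 3, 5, 2, 4, 3, 5, 1]  # rubric scores from rater 1
rater_2 = [4, 2, 5, 2, 3, 3, 4, 1]  # rubric scores from rater 2

tau, p_value = kendalltau(rater_1, rater_2)
print(f"Kendall's tau = {tau:.2f} (p = {p_value:.3f})")
```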
Phases of preparing a test

▪ Try-out phase
▪ Item analysis phase
▪ Item revision phase
Item Analysis

▪ There are two important characteristics of an item that will be of interest to the teacher:
  - Item Difficulty
  - Discrimination Index

▪ Item Difficulty, or the difficulty of an item, is defined as the number of students who are able to answer the item correctly divided by the total number of students. Thus:

   Item difficulty = number of students with the correct answer ÷ total number of students

The item difficulty is usually expressed as a percentage.


Example:
What is the item difficulty index of an item if 25 students are unable to answer it correctly while 75 answered it correctly?

Here the total number of students is 100; hence, the item difficulty index is 75/100 or 75%.
One problem with this type of difficulty
index is that it may not actually indicate
that the item is difficult or easy. A student
who does not know the subject matter will
naturally be unable to answer the item
correctly even if the question is easy. How
do we decide on the basis of this index
whether the item is too difficult or too
easy?
Range of difficulty index | Interpretation | Action
0 – 0.25 | Difficult | Revise or discard
0.26 – 0.75 | Right difficulty | Retain
0.76 and above | Easy | Revise or discard

▪ Difficult items tend to discriminate between those who know and those who do not know the answer.
▪ Easy items cannot discriminate between these two groups of students.
▪ We are therefore interested in deriving a measure that will tell us whether an item can discriminate between these two groups of students. Such a measure is called an index of discrimination.
An easy way to derive such a measure is to
measure how difficult an item is with
respect to those in the upper 27% of the
class and how difficult it is with respect to
those in the lower 27% of the class. If the
upper 27% of the class found the item easy
yet the lower 27% found it difficult, then
the item can discriminate properly
between these two groups. Thus:
Index of discrimination = DU – DL

Example: Obtain the index of discrimination of an item if the upper 27% of the class had a difficulty index of 0.60 (i.e. 60% of the upper 27% got the correct answer) while the lower 27% of the class had a difficulty index of 0.20.

DU = 0.60 and DL = 0.20, thus index of discrimination = 0.60 - 0.20 = 0.40.
▪ Theoretically, the index of discrimination can
range from -1.0 (when DU =0 and DL = 1) to 1.0
(when DU = 1 and DL = 0)
▪ When the index of discrimination is equal to -1,
then this means that all of the lower 27% of the
students got the correct answer while all of the
upper 27% got the wrong answer. In a sense,
such an index discriminates correctly between
the two groups but the item itself is highly
questionable.
▪ On the other hand, if the index
discrimination is 1.0, then this means that
all of the lower 27% failed to get the correct
answer while all of the upper 27% got the
correct answer. This is a perfectly
discriminating item and is the ideal item
that should be included in the test.
▪ As in the case of the index of difficulty, we have the following rule of thumb:

Index Range | Interpretation | Action
-1.0 to -0.50 | Can discriminate, but the item is questionable | Discard
-0.55 to 0.45 | Non-discriminating | Revise
0.46 to 1.0 | Discriminating item | Include

Example: Consider a multiple item choice type
of test with the ff. data were obtained:
Item Options

A B* C D
1
0 40 20 20 Total

0 15 5 0 Upper 27%

0 5 10 5 Lower 27%

The correct response is B. Let us compute the difficulty index and index of
discrimination.
Let us compute the difficulty index and the index of discrimination:

Difficulty index = no. of students getting the correct answer ÷ total number of students
                 = 40/100
                 = 40%, within the range of a "good item"

The discrimination index can similarly be computed:

DU = no. of students in the upper 27% with the correct response ÷ no. of students in the upper 27%
   = 15/20 = 0.75 or 75%

DL = no. of students in the lower 27% with the correct response ÷ no. of students in the lower 27%
   = 5/20 = 0.25 or 25%

Discrimination index = DU - DL = 0.75 - 0.25 = 0.50 or 50%

Thus, the item also has "good discriminating power".

It is also instructive to note that distractor A is not an effective distractor, since it was never selected by the students. Distractors C and D appear to have good appeal as distractors.
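The whole worked example, including the distractor check, takes only a few lines of Python (added here as an illustration of the computation, not part of the original lesson):

```python
# Minimal sketch reproducing the worked example: difficulty and
# discrimination indices from option counts (key = option B).
upper = {"A": 0, "B": 15, "C": 5, "D": 0}   # upper 27% (20 students)
lower = {"A": 0, "B": 5, "C": 10, "D": 5}   # lower 27% (20 students)
key = "B"
total_correct, total_students = 40, 100      # whole class

difficulty = total_correct / total_students            # 0.40
discrimination = (upper[key] / sum(upper.values())     # DU = 0.75
                  - lower[key] / sum(lower.values()))  # DL = 0.25

print(f"difficulty = {difficulty:.2f}, discrimination = {discrimination:.2f}")

# Distractor check: an option no one selects is not doing its job.
for option in ("A", "C", "D"):
    if upper[option] + lower[option] == 0:
        print(f"option {option} is an ineffective distractor")
```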
Index of Discrimination – is the difference
between the proportion of the upper group who
got an item right and the proportion of the lower
group who got the item right.
More Sophisticated Discrimination Index

▪ Item discrimination refers to the ability of an item to differentiate among students on the basis of how well they know the material being tested.
▪ A good item is one that has good discriminating ability and a sufficient level of difficulty (not too difficult nor too easy).
The item-analysis procedure for norm-referenced tests provides the following information:

1. The difficulty of an item
2. The discriminating power of an item
3. The effectiveness of each alternative
Benefits derived from Item Analysis
1. It provides useful information for class
discussion of the test.
2. It provides data which helps students improve
their learning.
3. It provides insights and skills that lead to the
preparation of better tests in the future.
Index of Difficulty

0.00 – 0.20 = very difficult
0.21 – 0.80 = moderately difficult
0.81 – 1.00 = very easy

Index of Item Discriminating Power

The discriminating power of an item is reported as a decimal fraction; maximum discriminating power is indicated by an index of 1.00. Maximum discrimination is usually found at the 50 percent level of difficulty.
Validation

▪ After performing the item analysis and revising the items which need revision, the next step is to validate the instrument.
▪ The purpose of validation is to determine the characteristics of the whole test itself, namely, the validity and reliability of the test.
▪ Validation is the process of collecting and analyzing evidence to support the meaningfulness and usefulness of the test.
Validity

▪ Validity is the extent to which a test measures what it purports to measure; it refers to the appropriateness, correctness, meaningfulness, and usefulness of the specific decisions a teacher makes based on the test results.
There are three main types of evidence that may be collected:
1. Content-related evidence of validity
2. Criterion-related evidence of validity
3. Construct-related evidence of validity
Content-related evidence of validity

▪ refers to the content and format of the instrument.
  - How appropriate is the content?
  - How comprehensive?
  - Does it logically get at the intended variable?
  - How adequately does the sample of items or questions represent the content to be assessed?
Criterion-related evidence of validity

▪ refers to the relationship between scores obtained using the instrument and scores obtained using one or more other tests (often called the criterion).
  - How strong is this relationship?
  - How well do such scores estimate present, or predict future, performance of a certain type?
Construct-related evidence of validity

▪ refers to the nature of the psychological construct or characteristic being measured by the test.
  - How well does a measure of the construct explain differences in the behavior of individuals or their performance on a certain task?
Usual procedure for determining content validity

▪ The teacher writes out objectives based on the TOS.
▪ The teacher gives the objectives and TOS to 2 experts, along with a description of the test takers.
▪ The experts look at the objectives, read over the items in the test, and place a check mark in front of each question or item that they feel does NOT measure one or more objectives.
▪ They also place a check mark in front of each objective NOT assessed by any item in the test.
▪ The teacher then rewrites any item so checked and resubmits it to the experts, and/or writes new items to cover those objectives not heretofore covered by the existing test.
▪ This continues until the experts approve all items and also agree that all of the objectives are sufficiently covered by the test.
Obtaining Evidence for Criterion-related Validity

▪ The teacher usually compares scores on the test in question with scores on some other independent criterion test which presumably already has high validity (concurrent validity).
▪ Another type of validity is called predictive validity, wherein the test scores on the instrument are correlated with scores on some later performance of the students.
Gronlund's Expectancy Table

                 Grade Point Average
Test Score | Very Good | Good | Needs Improvement
High       | 20        | 10   | 5
Average    | 10        | 25   | 5
Low        | 1         | 10   | 14
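Such a table is just a cross-tabulation of test-score levels against later grades; a minimal Python sketch (with a handful of hypothetical student records) shows the construction:

```python
# Minimal sketch: build an expectancy table by cross-tabulating
# test-score levels against later grades (hypothetical records).
from collections import Counter

records = [("High", "Very Good"), ("High", "Good"), ("Average", "Good"),
           ("Low", "Needs Improvement"), ("Average", "Very Good"),
           ("High", "Very Good"), ("Low", "Good"), ("Average", "Good")]

table = Counter(records)  # counts each (score level, grade) pair
rows = ["High", "Average", "Low"]
cols = ["Very Good", "Good", "Needs Improvement"]

print(f"{'':>8}" + "".join(f"{c:>20}" for c in cols))
for r in rows:
    print(f"{r:>8}" + "".join(f"{table[(r, c)]:>20}" for c in cols))
```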
▪ The expectancy table shows that there were 20 students getting high test scores who were subsequently rated very good in terms of their final grades;
▪ and, finally, 14 students obtained low test scores and were later graded as needing improvement.
▪ The evidence for this particular test tends to indicate that students getting high scores on it would later be graded very good; students getting average scores on it would later be rated good; and students getting low scores on the test would later be graded as needing improvement.
Types of Validity

• Content
• Face
• Predictive
• Construct
• Concurrent
• Convergent
• Divergent
Content Validity

When: the items represent the domain being measured.
Procedure: The items are compared with the objectives of the program. The items need to directly measure the objectives (for achievement tests) or the definition (for scales). A reviewer conducts the checking.
Face Validity

When: the test is presented well, free of errors, and administered well.
Procedure: The items and layout are reviewed and tried out on a small group of respondents. A manual for administration can be made as a guide for the test administration.
Predictive Validity

When: a measure should predict a future criterion. An example is an entrance exam predicting the grades of the students after the first semester.
Procedure: A correlation coefficient is obtained, where the x-variable is used as the predictor and the y-variable as the criterion.
Construct Validity

When: the components or factors of the test should contain items that are strongly correlated.
Procedure: The Pearson r can be used to correlate the items for each factor. However, there is also a technique called factor analysis to determine which items are highly correlated.
Convergent Validity

When: the components or factors of the test are hypothesized to have a positive correlation.
Procedure: Correlation is done for the factors of the test.
Divergent Validity

When: the components or factors of the test are hypothesized to have a negative correlation. An example is to correlate the scores in a test on intrinsic and extrinsic motivation.
Procedure: Correlation is done for the factors of the test.