All teachers create tests, a type of assessment. Assessment can be used to improve student learning and teacher instruction, as in the case of formative assessment. After instruction, teachers use summative assessment to find out whether students have mastered curricular aims. Most teachers create their own assessments for classroom use or use tests created by other teachers. Though writing assessment items (the fancy term for test questions) is a fundamental part of classroom teaching, from pre-K to college, it’s a skill that’s seldom taught in teacher education programs or in doctoral seminars on teaching.
The main reason is that it’s a specialized skill. And for each of the two main types of assessment items, selected response (choose an answer) and constructed response (create or construct an answer), there are various subcategories: binary choice, multiple choice, essay items, short-answer items, to name a few. And each subcategory has its own set of guidelines and rules.
“Teachers can check whether a test adequately measures curricular aims through empirical procedures and personal and/or peer judgment.”
Even if teachers from pre-K to college haven’t had formal assessment training, there are ways they can ensure that the tests they create adequately measure whether students have (or have not) mastered curricular aims. Teachers can check whether a test adequately measures curricular aims through empirical procedures and personal and/or peer judgment. Empirical procedures analyze item responses and calculate things like p-values and the item discrimination index… but which classroom teacher has time for that? (For the curious, a quick sketch follows.) In this post, we focus on ways to use judgment to determine whether the test questions you create adequately measure whether students have (or have not) mastered curricular aims. There are four main ways to do this.
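First, that promised sketch of the empirical route. This is a minimal illustration in Python; the data and function names are my own invention, but item difficulty (the p-value) and the upper-lower discrimination index are the standard statistics:

```python
# Minimal sketch with hypothetical data. `responses` is a list of
# per-student 0/1 score lists, where 1 means that student got the item correct.

def item_difficulty(responses, item):
    """The item's p-value: proportion of students who answered it correctly."""
    return sum(r[item] for r in responses) / len(responses)

def discrimination_index(responses, item, fraction=0.27):
    """Upper-lower discrimination index: p(top group) - p(bottom group),
    where groups are the top and bottom `fraction` of students by total score."""
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, int(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    return (sum(r[item] for r in upper) - sum(r[item] for r in lower)) / n

# Made-up scores for 6 students on a 3-item test
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
print(item_difficulty(responses, 0))       # 0.5  (half the class got item 0)
print(discrimination_index(responses, 0))  # 1.0  (top students got it, bottom didn't)
```

A p-value near 1.0 means nearly everyone answered correctly, and a discrimination index at or below zero is the kind of red flag we return to in #4. Now, on to the judgment-based checks.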
1. Judge the test for yourself. Here, the teacher tries to look objectively at the test he or she has created and asks the following questions:
Are instructions clear? Make sure to provide clear instructions for all parts of the test. Avoid run-on sentences and stick to short active sentences that tell students exactly how to respond to test questions. Do you want students to circle the correct response? Will you accept short answers as well as sentences? Do you want a paragraph of a few sentences and a drawing of the molecule? Let students know.
Another important part of test instructions is letting students know how much time to spend on a particular response. The main way to do this is to indicate the number of points that will be awarded for a correct response. For example:
A) Please answer the following questions using short responses:
1) What is the capital of Trinidad? (1 point)
Response: Port of Spain (OR, The capital of Trinidad is Port of Spain).
The instructions are clear, and the weight of the question, 1 point, supports a concise response. Note that the weight of a question depends on the total score for the entire test: 1 point out of 20 total points is 5%; 1 point out of 5 total points is 20%. The important thing is that the number of points awarded for a correct response must mirror the length and depth of the expected response.
B) In an essay, discuss the main reasons why there was a change in the capital city of Trinidad from the 17th century to the 19th century. (20 points)
Let’s say the test is out of a total of 40 points. This question is worth 50% of the grade. The number of points awarded, as well as the term ‘essay,’ tells students what you’re looking for in this response.
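The weighting arithmetic is trivial, but if you like to sanity-check it, here’s a quick sketch using the hypothetical numbers from the examples above:

```python
# Quick check of how much of the total grade a question carries,
# using the hypothetical point values from the examples above.

def weight(points, total):
    """Fraction of the total test score a question is worth."""
    return points / total

print(f"{weight(1, 20):.0%}")   # 5%  -> 1 point on a 20-point test
print(f"{weight(1, 5):.0%}")    # 20% -> 1 point on a 5-point test
print(f"{weight(20, 40):.0%}")  # 50% -> the 20-point essay on a 40-point test
```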
Are test questions written in grade-appropriate vocabulary? You want students to focus on responding to test items, not spend time trying to figure out what an item is asking them to do, or asking you to explain it, which cuts into test-taking time for the entire class and annoys students who are in a mental test-taking zone.
Is the content on the test current and correct? This is self-explanatory. Are you using the most current text with the most current information? Are you up to date on the latest arguments and ideas in the field? These questions are relevant in all fields, but especially so in the STEM fields, where things are always changing.
“If test items contain clues to correct responses, then even students who have not mastered curricular aims may perform well.”
Are there gaps in content? Say the curricular aim on which the test is based has three or four major learning outcomes. Did you include them all? Or did you, without realizing it, focus the test on a single learning outcome? If the test only covers some of the material, student responses will not really indicate who has mastered the curricular aim and who hasn’t. In assessment jargon, there will be weak content-related evidence of validity.
Are there unintended clues in the test items? Unintended clues include things like an item stated in the negative or in absolute terms. Students know that on a test question, anything that ‘never’ happens or ‘always’ happens is probably the wrong answer.
In questions where students choose the answer, like multiple choice or binary choice (true or false; yes or no), the length of the answer options may also create unintended clues. Since the teacher knows the correct response, he or she tends to spend more time elaborating on the correct option than on the incorrect ones. Try to make all options about the same length.
If test items contain clues to correct responses, then even students who have not mastered curricular aims may perform well. If this happens, the test can’t tell you who knows the material and who doesn’t.
Does the test contain bias? There are two main forms of testing bias. 1) Tests may contain ideas or statements that might be offensive to members of a particular gender, race, ethnicity, nativity, religion, or language group. 2) A correct response to a test item may depend on the student having information that is known or understood by persons in a particular group. For example, you craft a math problem that requires an intimate knowledge of cricket. Such a question may be biased against girls, who may not know anything about cricket, or biased in favor of students from the Caribbean, the UK, or Australia, where cricket is a popular pastime.
Of course, students cannot know every fact required to respond to a test item, and not knowing creates a teaching moment where students can find out about the things they don’t know. But your best bet is to provide enough background information in the test question itself so that students have all the information they need to respond.
2. Ask other teachers. This is a great way to improve classroom assessment, but it has its limitations. Teachers often have little time to do this sort of thing. Some teachers are a bit scared to share their tests because they’re afraid that the other teacher might… well… steal it. (I’ve seen it happen.) Or you’re a new teacher who shows your test to a more experienced teacher, and she gives feedback that essentially turns your test into her test.
You really have to know your colleagues and the culture of your institution to tell whether this option will work for you. Feel it out. It’s your call. In lieu of asking other teachers, do your own research about the best ways to craft different types of test items. The Internet is a great resource. This resource from Western Washington University tells you how to create various types of assessment items, including essay questions, multiple choice, matching questions, etc. An online search will reveal tons more.
“In lieu of asking other teachers, do your own research about the best ways to craft different types of assessment items.”
3. Ask students. Once you have reviewed test responses, ask students what they thought about the test questions. Were they confusing? What exactly was confusing? Did the exam reflect what was covered in the readings and discussions? And listen to what students have to say.
Of course, for this to work, students need to have an established habit of sharing their opinions in class, and they need to know that the instructor takes those opinions seriously. If not, students may not respond. Or they may say the test was ‘perfect,’ fearing that any kind of critique may have a negative bearing on their final grade. For this to work, students have to trust you, and you have to trust them.
4. Review the responses of students who performed well on the test. You are looking for trends that indicate something might be amiss with a test question. For example, most of the students who performed well on the overall test responded incorrectly to question 7, while students who did poorly on the test got question 7 correct. There may be something amiss with the question, because students who didn’t seem to master the curricular aim were able to select the correct response. The question may be phrased in such a way that it confused students who understood the material and studied for the test.
Or perhaps both high-performing and low-performing students got question 7 correct. It may be that question 7 contains an unintended clue to the answer.
There are many possible scenarios for trends that might occur. The main idea is that if you notice something weird about the way high-performing students responded in comparison to low-performing students, you should take another look at the test question and ask yourself some of the questions outlined in #1.
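If your item scores are already in a spreadsheet, this comparison is easy to automate. Here’s a hedged sketch, with made-up data and a helper of my own invention, that flags the two trends just described: items that high scorers miss but low scorers get, and items nearly everyone gets right:

```python
# Sketch with hypothetical data: flag items where high scorers do worse
# than low scorers (possibly confusing wording), and items nearly
# everyone answers correctly (possibly an unintended clue).

def flag_items(responses, group_frac=0.5):
    """`responses` is a list of per-student 0/1 score lists (1 = correct)."""
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, int(len(ranked) * group_frac))
    upper, lower = ranked[:n], ranked[-n:]
    flags = {}
    for item in range(len(responses[0])):
        p_all = sum(r[item] for r in responses) / len(responses)
        p_hi = sum(r[item] for r in upper) / n
        p_lo = sum(r[item] for r in lower) / n
        if p_hi < p_lo:
            flags[item] = "high scorers missed it; recheck the wording"
        elif p_all > 0.9:
            flags[item] = "almost everyone correct; check for clues"
    return flags

# Hypothetical test: the question at index 2 trips up the top students.
responses = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
print(flag_items(responses))  # {2: 'high scorers missed it; recheck the wording'}
```

Either flag just means: take another look at that question with the checklist from #1 in hand.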
When I discover that a test question was confusing to most students or poorly constructed, I give the points to all students who attempted the question. Why? Because a correct response didn’t depend on mastering curricular aims. It had more to do with guessing or luck, because the item was poorly constructed.
That’s all, folks. I’d love to hear any ideas you have for improving classroom tests in the comments below. #heidiholder #redloheducation