Let’s Play A Game

Choose any of your sections. Now, before you hand out your next objective assessment, write down a quick prediction of every student’s exam grade.

Grade them and compare the results.

Deduct points for every score you mis-predicted, one point for every letter grade you flopped. For instance, if you guessed James would score a B and he flunked, deduct three points. You may not deduct four points; you may only sock yourself in the nose.

There is no winning in this game, there is only less losing.
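
If you would rather let a script do the deducting, here is a minimal sketch of the tally in Python. The names `GRADE_ORDER`, `letter_gap`, and `game_score` are made up for illustration, and the sketch assumes plain letter grades A through F with no pluses or minuses.

```python
# Letter grades from best to worst; plain A-F is an assumption of this sketch.
GRADE_ORDER = ["A", "B", "C", "D", "F"]


def letter_gap(predicted, actual):
    """Number of letter grades between a prediction and the real result."""
    return abs(GRADE_ORDER.index(predicted) - GRADE_ORDER.index(actual))


def game_score(predictions, results):
    """Total score: minus one point per letter grade missed, summed over students.

    Both arguments map student name -> letter grade.
    Note: an A-to-F miss counts as four here; the nose-socking penalty
    from the rules is not modeled.
    """
    return -sum(letter_gap(predictions[name], results[name]) for name in results)


if __name__ == "__main__":
    # Made-up class of three; only the B-to-F miss mirrors the example in the rules.
    predictions = {"James": "B", "Student 2": "A", "Student 3": "C"}
    results = {"James": "F", "Student 2": "A", "Student 3": "B"}
    print(game_score(predictions, results))
```

Dividing the size of the total by the number of students gives the average miss per student, in letter grades.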

Of the 27 students who took Wednesday’s concept quiz,

  • I guessed 7 grades correctly.
  • I overestimated 18 grades.
  • I underestimated 2 grades.
  • My score was -30. (Off by 1.1 grades per student.)
  • James wasn’t a hypothetical student.
  • I blew it with cone surface area.

Questions of what all this means, what constitutes a good score (-30 does not, frankly), whether a good score off this metric should even matter to teachers, and, morally, what it means for the teacher to predict a failing grade, are left as exercises to the reader, who, the writer hopes, won’t be stingy with commentary.

I never said it was a fun game.

(May as well officialize it. If you teach math, consider yourself memed.)

I'm Dan and this is my blog. I'm a former high school math teacher and current head of teaching at Desmos. He / him.


  1. Ok. I have three sets of tests to grade this weekend, I’ll play all 3 (though one is Logic with a group I have not had, at least not for this sort of subject, before)

    Algebra 1 for 18 (percents, work problems, fractional eqns)
    Algebra 2 for 16 (mixed polynomial stuff, from factoring through remainder theorem).

    Since I don’t give letters, if I guess 92 and the result is 88, what do I deduct? If I guess 81 and the result is 88?

  2. Dan:

    I’m willing to take your challenge on Tuesday when I give the next exam. But I have done one better (I think).

    At the beginning of the year, after a few days of school, I predicted the final quarter one grades of my students. I have the results at school, and I promise to post them. Okay, I promise to post them if I remember to.

  3. I gave an exam yesterday and luckily I looked at this before I started grading.

    Of 41 students that took the test:

    – I guessed 16 grades correctly
    – I overestimated 21 grades
    – I underestimated 4 grades
    – My score was -33 (off by 0.8 grade per student)

    As to the moral/ethical issues around this game–I’m of multiple minds. It seems useful for teachers to have realistic knowledge of their students’ abilities, so the closer the score is to zero, the better a teacher understands his or her students. On the other hand, what does it mean that we write a test knowing that some people will probably fail it? Obviously, the responsibility is not all ours, but I tend to take more responsibility for those failures than I give to the students.

    On the other hand, I strongly believe that grades on tests are mostly a measure of students’ test-taking skills and only barely an indication of how much of the material they understand (and this is ignoring issues of grader subjectivity, which makes the results even less revealing of students’ content abilities). With this in mind, a failure on a test is simply an indication that the student needs to improve their test-taking on that particular kind of test. Many factors can be involved, including outside stresses, confidence, cultural and language issues, and more. See also the very important work that Claude Steele is doing on “Stereotype Threat” at Stanford for more on the effects of cultural differences in testing situations.

    One way I try to deal with all this is to give my students the ability to choose the percentages that they assign to homework, quizzes, and tests (within ranges; e.g., in my algebra class students must choose a test percentage–not including the final exam–between 20% and 40%). This can greatly reduce the stress and importance of tests, while still helping the students improve their test-taking skills for future classes or situations where tests may be more high stakes. Looked at in this way–as preparation for the future, but not as a measure of what students necessarily understand about the material they are studying–it is perfectly fine for teachers to know that some students are going to fail a test. The choice for how a student is graded also creates the opportunity to talk to students about what they do well, what they don’t, and why. It can be a great meta-cognitive moment. I’m not saying the method is perfect, but I like it overall.

  4. Jonathan: My own system doesn’t lend itself to this game either so I just assigned A, B, C, D, and F to the standard scale (≥ 90 = A, ≥ 80 = B, etc.) Curious how this plays out for you.

    Hal: Way to claim the leader board. More importantly, though, I dig your sliding scale for assessment. I’m not one to let tech limitations get in the way of a good idea but I am curious which gradebook program you’re using that lets you individualize weighting. That’s great.

    I also agree that a test score measures how well a student takes a test, but to a lesser degree than you do. After encountering basic, intermediate, advanced, and application problems on a given concept, a student might know the content, but be blindsided by the format. I make sure to keep the format consistent across the multiple times I assess content for just the reason you suggest. I can’t ensure they had a good night’s sleep or that everything is fine at home, but I can reduce formatting interference.

  5. Dan: I like the consistent approach that you mention; it definitely helps deal with the issue of students learning to take the test better and with reducing formatting interference, as you say. However, that is only true as long as the student stays with you. What happens when they go on to the next teacher?

    I use an Excel spreadsheet for my grades because it gives me a lot of flexibility. And, yeah, I did spend an hour or so setting it up the first time, but once I did it I could use that template for all the subsequent classes.

    One other issue came to me about the “game”. Taken another step or two further, we could be making a “game” of guessing our students’ final grades after the first day of class. This seems to me flatly amoral: the twin dangers of influencing the results through our behaviors and of the students possibly finding out your guesses are the first of several bad possible results. Of course, we already make these kinds of judgments to some degree so it’s possible that by making them explicit we could deal with them in ourselves, but I think the potential for bad is greater than the potential for good.

    I realize that the test game is qualitatively different from the first-day game because it is based on our experiential knowledge of students rather than quick first impressions, but the more obvious dangers of the first-day game make me very suspicious of the test game’s morality.

  6. “This seems to me flatly amoral…”

    You mean ‘immoral’, not ‘amoral’, don’t you?

  7. I love this. I haven’t taught in quite a few years, but this ‘game’ seems to me to be an invaluable exercise in expectations. Very important for a teacher to see by how much his or her expectations are right or wrong!

  8. OK, I like the game. I don’t have a test coming up for another week or so (just a quiz this week) but I’ll play when the next one rolls around.

    Deeper questions here: How should we understand the effects that our expectations, even unspoken, have on students? What are tests really measuring? And, casting its shadow over all of this, always: What should we do about the students who aren’t getting it, who we know aren’t getting it, and who we can even predict won’t get it?

  9. I’m curious how our IB math teacher would reply to these allegations of negative expectation. What possible good is there in predicting grades a quarter out? Let’s simmer the dogpile a bit, though. It’s the mediocre teacher who can’t recognize there are archetypal students in our midst, though more mediocre still is to cling to those archetypes and let them reify instruction.

    Perhaps it’s purely semantics but I cringe a bit when Hal says, “… what does it mean that we write a test knowing that some people will probably fail it?” I hope it doesn’t mean we should make our tests less rigorous.

    The point of this game, from my perspective, is to maximize the value of assessment. Hal and I both overestimated our students’ … well … from Hal’s vantage point, he overestimated how well his students take tests. Personally, I overestimated how much my students know, particularly regarding cone surface area.

    This is valuable.

    We may dispute just how valuable, but with finite review time, with finite opener space, it’s valuable that I know the greatest disconnect between my students, their teacher, and our content right now is with cone surface area. That sort of data makes me feel like Rambo with innumeracy standing in for the smarmy small-town sheriff.

    The issue of expectations is a monster, particularly when you predict a failing grade. I’m convinced that predicting an F only one week in advance of the assessment is benign, expectations-wise. It does raise the question, though: what are you doing about it? How are you remediating, soldier?

    Is the student frequently absent or tardy? Then have you called the parents or contacted administration?

    Is the student clueless to the finer points of probability? Then have you at least invited her in for tutoring, if not coerced her into it with an academic referral?

    Predicting failure, as the esteemed members of this conversation have pointed out, is where the game gets serious. Now what’s your play?

  10. Hal:

    Yes, predicting grades out a quarter in advance is immoral if you really believe your expectations. However, I have no idea a few days later who I had high expectations for and who I had low expectations for because I “graded” 160 kids after seeing them for two days. After all, I had an assigned seating chart, and at the end of the second day of class, I made a note of who seemed to be answering the first day’s set of homework questions, who was asking questions, who seemed to be daydreaming, etc. I always do this because I need to know who are the few kids that I will need to keep a careful eye on as the class progresses. After my first quiz, I know which kids I need to go to first when they are working on their classwork. After the second quiz, I have more information, and can adjust the list of “underperformers”. This list gets more representative of who needs help as the quarter progresses. To think that I have this down after two days is silly… I know that, and I certainly don’t put any faith in this list.

    I’m absolutely convinced that it is very important to come into a new group of students with no prior knowledge about their past experience with school. For instance, I never talk to other teachers about the kids on my roster at the beginning of the school year. Only once I realize I’m having problems with a kid do I go to their previous math teacher and talk about what happened in their class and what, if anything, can be done about the kid in my classroom. Will a call home result in him getting his homework done more often? Will I get more attention when presenting a lesson by sitting him in the front of the room? Will a referral to the principal’s office help with behavior concerns or will it make it worse?

    In a sentence, don’t assume that a kid cannot do something, but when you find he can’t, then help him get there. Except with the surface area of a cone…then you’re wasting your time!

    With all that in mind, I did just post my predicted grades and their final quarter 1 grades on my blog.

  11. Stephen Humphrey

    March 5, 2007 - 8:55 pm

    The problem with the game seems to be that it equates overestimating and underestimating. Morally, these aren’t the same thing. Especially when guessing out a whole quarter in advance, there should be a positively correlated score for underestimating, as it suggests the teacher has identified some trait about an expected under-performer and then modified the lessons to overcome that trait. Similarly, overestimating should lower the score, since it seems to suggest the teacher needs to reassess what’s happening in the classroom to underserve someone with great potential. I believe in the power of incentives, so the game should reward the teacher who tries to overcome initially low guesses.

  12. I would agree with Stephen’s argument that a teacher should be rewarded for overcoming initially low guesses so long as the guesses weren’t made by the teacher themselves.

    Stephen is using economic thinking to tackle a problem, but then there would be an economic incentive for the teacher to purposely lower their guesses of student outcomes.

    We know that black, Latino, and Native American kids score lower than others. We know that boys have lower grades than girls. We know that kids from low socio-economic status do worse than others. If teachers can overcome any of these attributes to get scores up to where we would like them to be, they should be heavily compensated (and copied).
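
Two suggestions from the comments are easy to sketch in Python: converting percentage scores to letters before scoring (the standard scale from comment 4), and Stephen’s idea of treating over- and underestimates differently. Everything named here (`to_letter`, `signed_gap`, `signed_score`, and the 70 and 60 cutoffs for C and D) is an assumption for illustration. The thread only spells out 90 for an A and 80 for a B, and Stephen doesn’t give exact point values, so this is one possible reading rather than anyone’s actual scoring system.

```python
GRADE_ORDER = ["A", "B", "C", "D", "F"]

# The A and B cutoffs come from the thread; C and D are assumed to
# continue the pattern in steps of ten.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def to_letter(percent):
    """Map a percentage score to a letter on the assumed standard scale."""
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return letter
    return "F"


def signed_gap(predicted, actual):
    """Positive if the student beat the prediction, negative if they fell short."""
    return GRADE_ORDER.index(predicted) - GRADE_ORDER.index(actual)


def signed_score(pairs):
    """Sum of signed gaps over (predicted_percent, actual_percent) pairs."""
    return sum(signed_gap(to_letter(p), to_letter(a)) for p, a in pairs)


if __name__ == "__main__":
    # The two cases asked about in comment 1: guess 92 / get 88, guess 81 / get 88.
    print(signed_score([(92, 88), (81, 88)]))
```

On this scale the 92-versus-88 guess misses by one letter and the 81-versus-88 guess misses by none. As comment 12 points out, a signed score like this can be gamed by guessing low, so treat it as a thought experiment rather than a leaderboard.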