I'm Dan and this is my blog. I'm a former high school math teacher and current head of teaching at Desmos. He / him.


  1. Question #2 and #3 are so much superior to question #1 for so many reasons.

    I’m pretty sure that #3 has two additional concepts (the number shouldn’t start with a 0, and it should be odd).

    Now if any of these questions could provide a context for why we would CARE about the relative size of numbers, that would be something else. Are these questions useful as assessment questions because they are devoid of context? Should we use a slightly different variation on them for use in our classrooms?

  2. It all depends on what you want from the question. Question 1 is actually better than the other 2 at determining if a student can determine the relative size of numbers (although, with any four-option multiple choice question they have a 25% chance of guessing the correct answer, so a far better question for this purpose would be to have them write it à la #2). If a student gets #3 wrong you would need a whole set of other diagnostics to determine where they went wrong… Do they think 02583 is a number? Did they give an even number? Did they not know that there was a smaller 5-digit number to be found? Who knows? On the other hand, #2 and #3 are better at many other things. So as with any problems, the question has to be a response to “What do you want to know and why?” Is #3 far more interesting? Will a student learn more by doing it? I think so, but 3 questions out of context are pretty hard to evaluate.

  3. I don’t think it makes sense to ask which question is superior. I think it is important to have questions that measure simple skills in addition to ones that measure more complex problem solving.

    If you have both then you can do some analysis on whether the students are having trouble with the basic concept or the critical thinking.

  4. I think the MA test has better designed questions in general, but comparing questions on one topic from each test smacks of politicization (don’t identify the states if you simply want to compare the questions). We in the math community especially should appreciate the need for thoughtful and objective analysis and resist stooping to this level.

    A test for the entire population should check for understanding at both a high level and a relatively low level. It makes sense to compare high-level questions on each test (if they exist) and to compare low-level questions on each test. Here we are comparing what was probably an easier question on the CA test with what was probably not one of the easier questions on the MA test.

    The follow-up question in the MA test seems out of place and clumsily worded. You just asked a kid to do a task that clearly indicates that they understand place value. If you want to check how well a kid can articulate their understanding of a topic, make it a separate question where it is very clear what you want them to explain. Understanding a topic and being able to articulate that understanding are not at all the same thing.

    I like the Singapore question better because it is more concise and requires students to think about two different ideas (which is often the case with their questions).

  5. I think it’s debatable whether 02583 is a number. Are zip codes not real numbers? I could see it not being considered five digits somehow, but that’s rules-lawyering to an extent that makes me feel uneasy about the question.

  6. Great comparison.
    I’m nit-picking at the Singapore question, but if they are going to bold and underline “odd” they should also bold and underline “5-digit” as it can equally trip kids up.

  7. I decided to test the claim of cherry-picking an easy problem by reading the rest of the source. Here’s everything having to do with place value in the California test.

    Which set of numbers is in order from greatest to least?
    A 147, 163, 234, 275
    B 275, 234, 163, 147
    C 275, 163, 234, 147
    D 163, 275, 234, 147

    Which number has a 4 in the tens place and a 4 in the hundreds place?
    A 6424
    B 6244
    C 4462
    D 6442

    Which digit is in the hundreds place in the number 3174?
    A 1
    B 3
    C 4
    D 7

    What does the 3 represent in the number 3051?
    A 3
    B 30
    C 300
    D 3000

    Which number has the same digit in both the ones place and the hundreds place?
    A 3308
    B 4118
    C 5977
    D 6242

    Sophie has 527 seashells in her collection. Which of these equals 527?
    A 52+2+7
    B 5+20+700
    C 500+20+7
    D 500+200+70

    @Mike: I think it is important to have questions that measure simple skills in addition to ones that measure more complex problem solving.

    If the California test has any complex problem solving, it isn’t demonstrated by the source. I grant the source itself might be cherry-picked, so if you have a superior example I’d welcome seeing it.

  8. Ah, but CA is also trying to test if kids read the instructions correctly, “… from greatest to least.” I don’t know about 3rd grade math textbooks, but my 6th grade text has all [that I could find] questions asking “from least to greatest.” Tricky.

  9. Having world-class standards doesn’t mean that the test questions are world-class. One does not necessarily lead to the other, which is something that plenty of decision-makers don’t yet understand.

  10. Thanks for posting the other questions. I don’t think the source demonstrates much higher order thinking.

    My comment about needing both stems in part from thinking about the interview with John Sweller you linked to earlier. He claimed that no one had been able to show that any curriculum improved problem solving skills, so we should just teach basic schema.

    My thought is that we hardly ever test for improvement in problem solving or higher order thinking. In order to do it, we would need to test both basic skills and complex ones, come up with a predicted correlation, and then see if it changes after trying to teach problem solving.

    I think tests usually focus on one or the other, but I do believe that Sweller is correct in saying it is hard to solve complex problems without the basic skills. Hence, we should test both and see where the shortcomings/improvements are.

  11. The item marked as being from Singapore is actually from Hong Kong, along with the other items in the AIR report.

    It’s a good report, and interesting to see that if Massachusetts were a country it would be #4 in TIMSS.

    The Massachusetts state exam mostly does a good job of asking students for higher-level thinking, and maybe this has some impact on curriculum practices in the state. But then again, kids were doing alright in MA before the testing crusade swept through in the mid-90s.

    – Bowen

  12. @Bowen, MA was doing a little better than the US through 2000 when they began revising many aspects of their education system. Since then MA has consistently and dramatically outpaced the US in increasing their NAEP scores.

  13. Different Dave

    April 30, 2012 - 1:12 pm -


    You made me think… If we’re going to be doing assessment, and we’re going to be doing it on computers, then why not ask problem #3. If they get it wrong, then give them other problems to narrow down which specific skills they’re having trouble with (or if it was just carelessness).

    It would be pretty cool, but what a brutal test to take — each time you struggle with a problem, you’ll be faced with several related problems.

  14. But, different Dave, that is the premise of the computer-based adaptive tests that are promised to us in Smarter Balanced. One can only imagine that this series of adaptive tests will pinpoint with precision, after mere hours of such tests, exactly what thinking is lacking in our students (sarcasm). And just as my students do with the NWEA tests, the testees will discover that as long as they deliberately choose the wrong answer on each question, they will soon get the computer to give up. I have students who can be out of the math test in under 10 minutes. The reading test is harder to zero out on, because you might accidentally choose the “right” answer.
    I can’t wait to find out how to teach thinking, or indeed how people learn to think, or how that occurs. I imagine in-school scenarios like MacGyver, with imminent death or dismemberment encouraging innovative ideas. He probably had to have some prior knowledge though. Darn!

  15. Seems to me that there’s no compelling argument in favour of multiple-choice questions over short-answer ones in testing any aspect of math, except for *very* basic procedural skills.

    Dave (12), no doubt Singapore has something to learn from, but here’s an interesting diagnosis of that singaporemath.com site you mention, http://math.berkeley.edu/~giventh/diagnosis.pdf

  16. A bit of a nit-pick, but I wonder: why don’t the MA and HK questions say “make the four-digit number with the smallest possible value” and “form the smallest possible 5-digit odd number”? I find the existing wording a tad distracting (given that the smallest 5-digit odd number is 10001, for example), but maybe that’s just me.

  17. Roy, that’s kind of the point. It’s not possible to form 10001 with the given digits.

    Keep in mind, folks, that the dropout rate in high school in some California public school systems is 60%.

  18. I actually found the California question difficult to read, because the numbers were so close together and there were so many of them. They just sort of jumbled together, and I’m not in the least dyslexic or dyscalculic!

  19. I agree with Ebear. What do we want from each question? What standards or objectives or goals or whatever are these questions tied to? Are they all the same?

    Also, at what level are all three full tests designed to give valid scores or ratings? (individuals, classroom, school, state?) Are the tests meant to be formative or summative? I would think this makes a difference if you expect to get diagnostic information for individual students.

    Finally, I’m curious how the MA and HK items are scored. I’ve scored open response items for KY (almost 20 years ago), and it was an interesting, although unintentional, professional development experience. It was the beginning of my obsession with looking at student work. I’ve also seen open response items on local CA tests for which students could not get the top score unless they used a method specified in the corresponding standard. The choice of how to reliably score these items is a big deal.

  20. Johanna Langill

    May 1, 2012 - 7:58 am -

    The Hong Kong test is structured so that it’s not testing how well students react to a dense paragraph, but rather the instructions.

    I can see reasons for having lots of instructions, but I found the MA item much less to the point: it seemed like it was testing reading comprehension at least as much as mathematical reasoning. While that is important and appropriate for some questions, I would be left wondering if my students had trouble reading the directions, or if they had trouble completing the task.

  21. Jason brings up a good point about the released test questions from CA possibly being cherry picked, but I think the document itself demonstrates that the questions were chosen because they were indicative of the way the standards would be assessed.

    “In selecting test questions for release, three criteria are used: (1) the questions adequately cover a selection of the academic content standards assessed on the Grade 3 Mathematics Test; (2) the questions demonstrate a range of difficulty; and (3) the questions present a variety of ways standards can be assessed.”

    Fawn also points out the “tricky” nature of the instructions. I’d also add that Mass. and Hong Kong use the word “smallest” as opposed to CA’s use of “least” and “greatest.” I wonder what the consequences of word choice may be.

    I find the language in the standards themselves quite interesting in that Mass. has placed an emphasis on multiple representations as well as understanding yet CA simply wants students to perform a specific task.

    3.N.1 Exhibit an understanding of the base ten number system by reading, modeling, writing, and interpreting whole numbers to at least 100,000; demonstrating an understanding of the values of the digits; and comparing and ordering the numbers.


    3.NS1.2 Compare and order whole numbers to 10,000.

    Keep in mind that CA Standard 3.NS1.2 isn’t considered a “key standard” in that it is a support standard for the larger 3.NS 1: Students understand the place value of whole numbers. However, if you look through the entire document, there aren’t many (if any) questions assessing a thorough understanding of the content. And in my opinion, this has had a detrimental effect on how math is actually being taught in CA.

  22. @David:

    I find the language in the standards themselves quite interesting in that Mass. has placed an emphasis on multiple representations as well as understanding yet CA simply wants students to perform a specific task.

    I read in one essay (I don’t have the link offhand, alas, but I’m pretty sure it was Wurman from the essay Dan linked to) that the fact that the CA standard is easy to read and the Mass. standard (and the Common Core standard) is hard to read makes the latter evil.

  23. Jason:

    the fact that the CA standard is easy to read and the Mass. standard (and the Common Core standard) is hard to read makes the latter evil.

    I’m not sure that “easy to read” should be the goal. The problem with the CA standards being so easy to read is that the way they are assessed is so locked-in that math has been reduced to a series of discrete skills.

    By the way, I’m not exactly sure why Wurman’s opinion matters much here. I get that he was part of the team that put together the state framework and was appointed by the governor to serve on the content standards commission. However, the California Math Framework has a lot of names in the credits, many of them preceded by “Dr.” and followed by words like “Stanford” and “UC Berkeley”.

    What do the math ed folks have to say about CCSS?

  24. In working with dyslexic students, it’s really apparent that multiple choice questions just hide their abilities. I’m not a fan of multiple choice at all. Aside from that, each question measures some level of knowledge, but after using Asian-based curriculums, I’m blown away by the higher order thinking that is developed throughout them. It’s a completely different approach than I was exposed to and have used in teaching previously. I am over the top impressed with the conciseness and the thinking required.
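
The diagnostic questions raised in comment 2 about question #3 (a leading zero, an even number, a smaller number left unfound) can be sketched in code. This is only a minimal sketch under stated assumptions: the digits 0, 2, 3, 5, and 8 are inferred from the 02583 example in the thread rather than taken from the actual test item, and the function names `smallest_odd_number` and `diagnose` are hypothetical, not part of any scoring system mentioned above.

```python
from itertools import permutations

# Assumption: the item's digits, inferred from the "02583" example in the thread.
DIGITS = "02358"


def smallest_odd_number(digits):
    """Brute-force the smallest odd number that uses every digit exactly once
    and does not start with 0. Returns None if no such number exists."""
    candidates = [
        int("".join(p))
        for p in permutations(digits)
        if p[0] != "0" and int(p[-1]) % 2 == 1
    ]
    return min(candidates, default=None)


def diagnose(answer, digits=DIGITS):
    """Return a list of likely misconceptions behind a student's written answer.
    An empty list means the answer is correct."""
    if not answer.isdigit():
        return ["not a whole number"]

    problems = []
    if sorted(answer) != sorted(digits):
        problems.append("does not use each given digit exactly once")
    if answer.startswith("0"):
        problems.append("starts with 0, so it is not a 5-digit number")
    if int(answer[-1]) % 2 == 0:
        problems.append("ends in an even digit, so it is not odd")

    # Only report "not the smallest" when the answer is otherwise valid.
    if not problems and int(answer) != smallest_odd_number(digits):
        problems.append("valid 5-digit odd number, but a smaller one exists")
    return problems


if __name__ == "__main__":
    print(smallest_odd_number(DIGITS))
    for attempt in ["02583", "20358", "20583", "20385"]:
        print(attempt, diagnose(attempt))
```

Each check corresponds to one of the follow-up questions a teacher would otherwise have to ask by hand, which is roughly the idea behind giving narrower follow-up problems after a wrong answer, as comment 13 suggests.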