The Smarter Balanced Sample Items

The Smarter Balanced Assessment Consortium:

Five swimmers compete in the 50-meter race. The finish time for each swimmer is shown in the video. Explain how the results of the race would change if the race used a clock that rounded to the nearest tenth.

You should take a tour through the Smarter Balanced Assessment Consortium’s released items, form an opinion about them, and share it here. California is a member state of SBAC, one of the two consortia charged with assessing the Common Core State Standards, so I’m comparing these against our current assessments. Without getting into how these assessments should be used (e.g. for merit pay, teacher evaluation, etc.), they compare extremely favorably to California’s current assessment portfolio. If assessment drives instruction, these assessments should drive California’s math instruction in a positive direction.

The assessment item above uses an animation to drive down its word count and language demand. It’s followed by an expansive text field where students are asked to explain their reasoning. That stands up very well next to California’s comparable grade five assessment [pdf]:

  • Elsewhere, we find number sense prized alongside calculation (here also), which is a step in a very positive direction. (i.e. Our students should know that $14.37 split between three people is between $4 and $5, but it’s a waste of our time to teach that division longhand.)
  • I’ve been assuming the assessment consortia would run roughshod over the CCSS modeling practice but on the very limited evidence of the sample items, we’re in good shape.
  • The assessments do a lot of interesting and useful things with technology. (Reducing word count, at the very least.) I only found one instance where the technology seemed to get in the way of a student’s expression of her mathematical understanding.
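The estimation claim in the first bullet is easy to check; here is a minimal sketch (my own illustration, not part of any assessment item):

```python
# Number-sense check from the first bullet: $14.37 split three ways
# lies between $4 and $5 -- bracketing confirms this without any
# longhand division.
total_dollars = 14.37
people = 3
share = total_dollars / people
print(4 < share < 5, round(share, 2))  # True 4.79
```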

I can’t really make an apples-to-apples comparison between these items and California’s current assessments because California currently has nothing like this. No constructed responses. No free responses. No explanation. It’s like comparing apples to an apple-flavored Home Run pie.

Featured Comment:

Candice Frontiera:

Next thing to explore: Technology Enhanced Item Supporting Materials [zip]. [The “Movie Files” folder is extremely interesting. –dm]



  1. I am actually sitting in a meeting at this very moment where we’re looking at these items. I am of the opinion that these questions are a significant improvement over what we’ve been using. I have concerns about how the test data will be used, and I also have concerns about the fact that the student writing components will be computer-scored. (There’s something “Brave New World”-ish in the fact that our students’ writing capabilities will be judged by artificial intelligence.)

    If somehow we could use these questions and the data they produce as tools instead of weapons, this would be a significant step forward in my opinion.

  2. Thanks for bringing these up. Intriguing! And makes me more optimistic than I have been in a while.

    Now, about the item where you thought the tech got in the way. I had a different take on it: I thought the tech provided a detour around showing understanding. That is, the easy solution (have each calculator compute the sales tax for something costing $1000) shows what I think of as good compu-techno-problem-solving habits of mind, but doesn’t necessarily show understanding of proportions or percentages.
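    The $1000 shortcut can be sketched directly (the 8.75% rate below is hypothetical, purely for illustration):

    ```python
    # On a $1000.00 purchase, the tax in dollars divided by 10 is the
    # tax rate as a percentage -- which is why entering $1000 into each
    # calculator reveals its rate without any proportional reasoning.
    def rate_from_tax_on_1000(tax_dollars):
        """Recover the percentage tax rate from the tax charged on $1000."""
        return tax_dollars / 10

    print(rate_from_tax_on_1000(87.50))  # 8.75, i.e. an 8.75% rate
    ```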

    @Mark: I tried to enter a free response, and was told by our computer overlords that it would be scored manually.

  3. I completely agree that the technology used in these test items is both useful and interesting. I’m willing to bet that a very small percentage of math teachers are currently testing students by using technology in this manner. Teachers will need to begin incorporating technology into their assessments soon in order for students to be prepared for these types of tests.

    If anyone knows of any textbook companies or other resources that have more test questions like this available, please share!

  4. Tim — the word I’m getting is that humans will perform “spot checks” but the bulk of the writing will be scored by computers. I’d love someone in the know to chime in here to clarify.

  5. I know it’s not the intent of this post, but I don’t really like the swimming question. I don’t think clocks with only tenth-of-a-second accuracy actually “round” anything. So, it would be more of a floor function going on (e.g. 10.29 would show as 10.2 on the clock).
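     The distinction here, truncation versus true rounding, shows up in a short sketch (using Decimal to sidestep floating-point noise):

     ```python
     from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

     # A clock that only displays tenths truncates (a floor on the
     # tenths digit); true rounding to the nearest tenth can differ.
     raw = Decimal("10.29")
     displayed = raw.quantize(Decimal("0.1"), rounding=ROUND_DOWN)   # what the clock shows
     rounded = raw.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)  # nearest tenth
     print(displayed, rounded)  # 10.2 10.3
     ```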

  6. Dennis Ashendorf

    November 1, 2012 - 9:46 am -

    Yes, the swimming question is good, but did you review the incredible “Field Test” example? Wonderful series of questions, but is this “reliably” doable? Wow. I was stunned.

  7. Did anyone else notice that times given don’t correspond to the order of the finishers in the video? For example, the bottom swimmer never passes the one in the lane above him, yet finishes in a shorter time.

    This is a great way to assess students, but there should still be attention to detail. A bright student could look at this and be very confused!

  8. @calcdave When I first saw the swimming question, my track and field spidey-sense perked up. In track, we would never round off to the nearest tenth. In fact, the national federation rules are that all hand times (those timed on a stopwatch to the nearest tenth) are always rounded up, so that a time of 13.21 on a stopwatch would be an official time of 13.3. It’s a small thing though, and the problem is clearly worded.
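    That hand-timing rule is a ceiling to the next tenth, which a quick sketch makes concrete:

    ```python
    from decimal import Decimal, ROUND_CEILING

    # National federation rule as described above: hand times are
    # always rounded *up* to the next tenth, never to the nearest.
    hand_time = Decimal("13.21")
    official = hand_time.quantize(Decimal("0.1"), rounding=ROUND_CEILING)
    print(official)  # 13.3
    ```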

    The grade 6-8 “calculator” problem, to me, is awkward to the point where I could see a student becoming lost in the instructions rather than working through the intent of the problem, which is to identify a correct tax rate.

    I have seen a few problems now where the mechanism used to have students explore a concept could cause students to become lost. At a conference this past weekend, John Mahoney walked us through a PARCC problem where sliders are used to manipulate the parameters of a quadratic. The sliders are not labeled, and the mechanism is required for students to display the reflection of a function.

  9. Lots of good questions… that look nothing like what I typically see in textbooks, in terms of both content and format.

    Agree with Zach about the swimmers. Also, the animation really seems to have no purpose in this question, as opposed to some of the other questions where it helps to clarify the question.

    I actually like the sales tax calculator question. The instructions are somewhat vague and you need to tinker around a bit to see how something behaves – very mathematical thinking involved.

    One or two of the questions asking the student to “explain” actually seemed to warrant an explanation: You have bags with 50, 58, & 53 lbs of sand. Explain why you can’t move sand around to have 50 lbs in each.

    It just seemed ridiculous to ask the student to “explain” in other instances. A rectangle has a perimeter of 20 ft and a length of 6 ft; find the width and explain how you got your answer. Does this just mean show some work, or do they actually want an explanation? In another case it asked the student to calculate and explain which car will cost the least to buy (factoring in cost, mileage, & repairs). In fairness, we are told the car buyer drives “at least” 200 miles per week, so you could explain your assumptions for mileage, but the request for an explanation really seems a bit forced in this and a few other questions that are essentially computations.

    On the question where you pick 3 numbers that add to more than 20, you would have a good chance of getting it right by guessing (13/20, I think).
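    That guessing-odds estimate can be checked by enumeration. The six numbers below are hypothetical stand-ins, since the item’s actual numbers aren’t reproduced here; the method is the point, not the resulting fraction.

    ```python
    from itertools import combinations
    from fractions import Fraction

    choices = [4, 5, 6, 7, 8, 9]  # hypothetical; not the item's actual numbers
    triples = list(combinations(choices, 3))
    hits = sum(1 for t in triples if sum(t) > 20)
    print(len(triples), Fraction(hits, len(triples)))  # 20 triples; 7/20 for this set
    ```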

    At first I thought some of these were avoiding multiple choice just for the sake of avoiding multiple choice. But, the alternate formats really do reduce the amount of language (math or english) on the page.

  10. I also am optimistic after checking out these items. As much potential as I see in Common Core, we all know that the assessments really drive instruction…unfortunately. My only fear is that a lack of student success on these types of items is going to lead to backlash and backsliding as it often does.

  11. An encouraging leap forward from an item/assessment perspective.

    As for tradeoffs, there’s cost. These will be significantly more expensive to grade and administer, no?

    Another might be clean measurement: the old test generally asked questions in a straightforward way. Examples from the linked PDF:

    -What is the prime factorization of 36?
    -What is 50% of 40?
    -What is the decimal equal to 3/5?

    Like them or not (I don’t), how a student answers gives you good information on her understanding of the question. For the swimming example above, there’s a bit more noise.

    If one end of the spectrum is worthwhile assessment and the other is clean measurement, let’s keep moving towards the former. Thanks for the post/analysis.

  12. These tasks are so far beyond what I was expecting – that’s a good thing. They actually support the Mathematical Practices, which is what we’ve been told they would do. It’s really all about the 4 claims that SBAC has made about what they can assess: concepts & procedures, problem solving, communicating reasoning, modeling & data analysis. See the website:

    Most assessment systems currently in place only test concepts & procedures, since this is what can easily be tested with multiple guess and machine scoring. Many of the SBAC items can be scored automatically, but not all. Those items come with rubrics and (I’ve heard) will be scored by classroom teachers. This happens every year with AP exams and used to happen with my state assessment (we used to have constructed and extended response items). Teachers who participate in these scoring sessions gain valuable insight into student thinking (duh), but it’s an expensive proposition.

  13. I am really impressed. In Massachusetts, we do have open response questions, but I do not think they are as cognitively demanding as these questions. So far, the released items from PARCC do not seem as demanding, but we have only seen a small sample from PARCC.

  14. Candice Frontiera

    November 2, 2012 - 3:28 pm -

    Next thing to explore:
    Go to this site:

    Scroll down until you see the link for “Technology Enhanced Item Supporting Materials (ZIP)”

    It has both Math and ELA videos and templates of how students will use the tools to label number lines, partition shapes, create angles, etc. The folder is a hidden gem. I totally agree that the possibilities this type of testing offers are a step in the right direction!

  15. Hi Dan:
    This is such a good question!
    These types of machine-scorable questions address students’ basic skills and understanding by applying the Cognitive Processes, which are based on a revision of Bloom’s Taxonomy by Anderson and Krathwohl (2001). From what I understand, the University of Iowa, with the goal of developing machine-scorable questions that address lower- and higher-order thinking skills, supported the creation of these types of questions. Samples of these questions, designed by Scalise and Gifford (2006), are available on Scalise’s (2012) website. These question-types were designed and funded by partnerships such as the Smarter Balanced Assessment Consortium (SBAC) (2010) and the Partnership for Assessment of Readiness for College and Careers (PARCC) (2010). As progressive as they are, while striving to measure higher-level understanding, it could be argued that they were designed using an ‘instrumental’ and assessment-driven perspective, which includes goals such as the development of a common assessment system that “will help make accountability policies better drivers of improvement” (p. 8). Politics are really hard to tease out of assessment… but I digress…
    Whatever perspective or purpose you (the general “you”) may have, you may still feel it is important for students to engage in technologically dependent assessments to address the ‘reality’ of the future of their own testing experiences.
    Many others, such as Burkhardt (2012) would argue for a more problem-based approach to assessment, such as YOURS Dan ☺, or those of the Shell Centre (Shell Centre for Mathematical Education, 2012), since teachers DO teach to the test. Burkhardt suggested that well-designed assessment includes “short items and substantial performance tasks so that teachers who teach to the test, as most teachers will, are led to deliver a balanced curriculum that reflects the standards” (p. 2).
    I do not dismiss machine-scorable items when they are well designed. I would agree with a ‘balanced approach’ – that’s one of those statements that people say that makes it hard to disagree ;)

    References worth checking out!
    Anderson, L. W., & Krathwohl, D. R. (Eds.). (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s Taxonomy of Educational Objectives. New York, NY: Longman.

    Burkhardt, H. (2012, October 3). Engineering good math tests. Education Week.

    Partnership for Assessment of Readiness for College and Careers. (2010). PARCC: Application for the race to the top comprehensive assessment systems competition. Tallahassee, FL: Florida Department of Education.

    Scalise, K. (2012). Intermediate constraint taxonomy: Open source assessment objects, from

    Scalise, K., & Gifford, B. (2006). Computer-based assessment in e-learning: A framework for constructing “intermediate constraint” questions and tasks for technology platforms. The Journal of Technology, Learning, and Assessment, 4(6), 4-45. Retrieved from

    Shell Centre for Mathematical Education. (2012). Mathematics assessment project, from

    Smarter Balanced Assessment Consortium. (2010). Smarter balanced assessment consortium: Theory of action – an excerpt from the smarter balanced race to the top application. Olympia, WA: Office of Superintendent of Public Instruction.

  16. I completely agree that assessment drives instruction. Being a newer teacher at a high-performing school, I was astonished by how much teaching revolves around test scores and how much they are the main focus of administration. I got into teaching to improve student education. I am excited about the new Common Core assessments and hope they drive better math instruction to replace the drill-and-kill I see every day. The more I learn about the new assessments, the more excited I get about changing the way math is taught. At the same time, I too am afraid of how students will perform in the beginning years and what actions will be taken. Our students are definitely not prepared for what is about to come.

  17. These tasks make me wonder if 3 Acts lessons might be well suited for assessment in addition to sense-making or introduction of concepts. Any plans on doing that, Dan?

  18. A few of those videos seemed like they were straight out of your brain if you were an ELA teacher. Reminded me a lot of what you and Dave Major did.

  19. I did a recent session at Fall CUE, a last-minute addition and sparsely attended, discussing the SBAC sample questions on both Language Arts and Mathematics. This was a blessing, as it allowed for a more seminar-type discussion of the subject.

    First, attendees brought up that there had been a state math test with constructed response questions (I believe it was CLASS) that predated the CST/CAP. New York has already used this.

    The small group I had liked this question (as do I), but I think the video is crappy and more likely to distract (the order the swimmers finish in the animation doesn’t match the times listed). My spectrum-y students will perseverate on things like that and make mistakes. I have one student who tries to take a protractor to the triangles I draw freehand on the whiteboard to figure out the “missing” angle measure, rather than just subtracting. I vote thumbs-down on tech making a difference on this one.

    So, the question I had was: is the tech worth it? The exercise I liked was Item 43051. Someone here may have commented on the calculator being confusing, but the thing that was interesting to me was that it would take a variety of answers in either fractional or decimal form, including unreduced fractions. This is something that is not as easily handled on a paper-and-pencil test (although you could allow for multiple answers with a constructed response, I suppose).

    I’m not sanguine about scoring by machine or by a roomful of temps. Frankly, I look forward to more and better formative assessment with Common Core, not these tests or frankly any high-stakes assessment. The only intriguing idea I heard at my session was that only a sample of students may be tested at any given time, but that could blow up in our faces as much as NCLB did.

    I am happier about these questions than the ones currently asked on the CST, and about what SBAC came up with for ELA, and I’m glad to be able to adjust instruction to better prepare students for this change, but it will be an adjustment.

    Problems lurking out there include the always-fun “math wars”; a preview can be seen here:

    I am trying to blog more about the implementation, as my district is apparently at the forefront of the roll-out in our state, and I’m part of a “leadership” cadre that is developing new curricula for ELA. Thanks for allowing me to see what others are seeing, doing, etc.