What’s Wrong with This Experiment?

If you’re the sort of person who helps students learn to design controlled experiments, you might offer them W. Stephen Wilson’s experiment in The Atlantic and ask for their critique.

First, Wilson’s hypothesis:

Wilson fears that students who depend on technology [calculators, specifically –dm] will fail to understand the importance of mathematical algorithms.

Next, Wilson’s experiment:

Wilson says he has some evidence for his claims. He gave his Calculus 3 college students a 10-question calculator-free arithmetic test (can you multiply 5.78 by 0.39 without pulling out your smartphone?) and divided them into two groups: those who scored an eight or above on the test and those who didn’t. By the end of the course, Wilson compared the two groups with their performance on the final exam. Most students who scored in the top 25th percentile on the final also received an eight or above on the arithmetic test. Students at the bottom 25th percentile were twice as likely to score less than eight points on the arithmetic test, demonstrating much weaker computation skills when compared to other quartiles.

I trust my readers will supply the answer key in the comments.

BTW. I’m not saying there isn’t evidence that calculator use will inhibit a student’s understanding of mathematical algorithms, or that no such evidence will ever be found. I’m just saying this study isn’t that evidence.

Featured Tweet

Got one!

Featured Comment

Scott Farrand:

The most clarifying thing that I can recall being told about testing in mathematics came from a friend in that business: you’ll find a positive correlation between student performance on almost any two math tests. So don’t get too excited when it happens, and beware of using evidence of correlation on two tests as evidence for much.

I'm Dan and this is my blog. I'm a former high school math teacher and current head of teaching at Desmos. He / him.


    • Randomly assign students within a class to groups of students who will use and not use calculators throughout the semester. Give a pre-test and a post-test to both. Use the pre-test and calculator use as covariates in your model so you control for what is uncontrolled in this study — prior achievement. If you find a significant interaction effect for pre-test and calculator use, dig around a bit. Is the proscription of calculator use helping students who scored high or low on the pre-test?

      Less preferable: look at two separate classes from the same instructor. Randomly assign one to calculator use and the other to no calculator use.

      The biggest problem with both of these is they require an honor code from students — a commitment that they’ll do what they’re supposed to do when they’re out of class. I’ll still take that flaw over Wilson’s.
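For readers who want to see the shape of that analysis, here is a minimal sketch of the randomized design described above, using only the Python standard library. It assumes a model of the form post = b0 + b1·pre + b2·calc + b3·pre·calc, where `calc` is 1 for the calculator arm and 0 otherwise. The function names (`assign_groups`, `fit_ancova`, `solve`) and all of the numbers are hypothetical, and the scores are synthetic, generated with no true calculator effect. The sketch produces point estimates only; judging whether an interaction is "significant" would need standard errors from a proper statistics package.

```python
import random


def assign_groups(student_ids, seed=0):
    """Randomly split students into calculator (1) and no-calculator (0) arms."""
    rng = random.Random(seed)
    ids = list(student_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {sid: (1 if i < half else 0) for i, sid in enumerate(ids)}


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ancova(pre, calc, post):
    """Least-squares fit of post ~ 1 + pre + calc + pre*calc via normal equations."""
    rows = [[1.0, p, c, p * c] for p, c in zip(pre, calc)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, post)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic illustration only: 200 made-up students, no true calculator effect.
rng = random.Random(1)
arms = assign_groups(range(200), seed=1)
pre = [rng.gauss(70, 10) for _ in range(200)]
calc = [arms[i] for i in range(200)]
post = [5 + 0.9 * p + rng.gauss(0, 5) for p in pre]

b0, b_pre, b_calc, b_inter = fit_ancova(pre, calc, post)
print(f"pre-test slope: {b_pre:.2f}")
print(f"calculator effect at pre-test = 0: {b_calc:.2f}")
print(f"pre-test x calculator interaction: {b_inter:.2f}")
```

Because assignment is random, prior achievement should be balanced across the two arms on average, which is exactly what Wilson's comparison lacks. The interaction coefficient is the piece to "dig around" in: it estimates how the calculator effect changes with pre-test score.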

  1. Michael Paul Goldenberg

    December 23, 2016 - 7:10 am

    He certainly didn’t control for how much calculus understanding these students brought to the table, for one thing. If the low-performing students on the non-calculator test were already weak after 2 semesters of calculus, they were fairly doomed from the start. There’s a question of his Calc. 3 assessments: heavily computation-based with calculators disallowed? That would certainly skew the results in the “desired” direction. I always get nervous when I read about these sorts of ad hoc experiments done with an agenda.

  2. Dotty McClelland

    December 23, 2016 - 7:34 am

    Correlation does not equal causation. There are very likely lurking variables that are affecting student performance other than just the use of the calculator. It is likely that students who can perform better on a no-calculator arithmetic test are also stronger math students.
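The lurking-variable point can be illustrated with a small simulation. The sketch below assumes a single hidden "math ability" that drives both an arithmetic score and a final exam score, with independent noise on each. No calculator variable appears anywhere in the model. The function names, the noise levels, and the choice of the median arithmetic score as a stand-in for Wilson's "eight out of ten" cutoff are all hypothetical, and the students are synthetic.

```python
import random


def simulate_students(n_students=2000, seed=0):
    """Return (arithmetic, final_exam) pairs driven only by a shared latent ability."""
    rng = random.Random(seed)
    students = []
    for _ in range(n_students):
        ability = rng.gauss(0, 1)               # the lurking variable
        arithmetic = ability + rng.gauss(0, 1)  # no calculator term anywhere
        final_exam = ability + rng.gauss(0, 1)
        students.append((arithmetic, final_exam))
    return students


def low_arithmetic_rates(students):
    """Share of low arithmetic scorers in the bottom and top final-exam quartiles."""
    # Hypothetical cutoff: the median arithmetic score stands in for "eight or above".
    cutoff = sorted(a for a, _ in students)[len(students) // 2]
    by_final = sorted(students, key=lambda s: s[1])
    q = len(by_final) // 4
    bottom, top = by_final[:q], by_final[-q:]

    def rate(group):
        return sum(a < cutoff for a, _ in group) / len(group)

    return rate(bottom), rate(top)


bottom_rate, top_rate = low_arithmetic_rates(simulate_students())
print(f"low arithmetic scorers in bottom final-exam quartile: {bottom_rate:.0%}")
print(f"low arithmetic scorers in top final-exam quartile:    {top_rate:.0%}")
```

Under these assumptions the bottom quartile should show a markedly higher share of low arithmetic scorers than the top quartile, which is the same pattern Wilson reports, produced without any role for calculators.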

  3. “So, how do we design a better study?”
    If you want to test how tech affects something you have to control for tech. This experiment just shows that students who are poor at arithmetic are also poor at calculus. Duh!

  4. On the face of it, the experiment says that student ability to perform arithmetic correlates positively with performance on a calculus final exam. I’m not seeing the connection to a conclusion about the value of calculators in the classroom. I suspect that the results on those tests would have been similar if the arithmetic test were given after the calculus final.

    The most clarifying thing that I can recall being told about testing in mathematics came from a friend in that business: you’ll find a positive correlation between student performance on almost any two math tests. So don’t get too excited when it happens, and beware of using evidence of correlation on two tests as evidence for much.

    • I totally agree! We use this test that my administration loves because it correlates with our students’ standardized test scores. I believe that a timed spelling test would also correlate with my students’ math test scores. And then my administrators would be telling me all about how I need to focus on spelling to raise our state test scores.

  5. Here is a similarly structured study:

    Professor W. separated his class into two groups, those with “excellent” dental hygiene and those with “poor” dental hygiene. Most students with strong results on the final exam had “excellent” hygiene, whereas most students with low results had “poor” hygiene.

    Inferences from this sort of study are left to the reader!

  6. So to paraphrase the previous comments with a slight spin, failing to “understand the importance of mathematical algorithms”, is not the same as doing poorly on a Calculus 3 final exam. I would have thought that Mr. Wilson would be able to see that fallacy. Indeed, I would argue that understanding the importance of mathematical algorithms and being able to multiply by hand are poorly correlated as well. There are well known examples of superb mathematicians who were very poor at calculations. To make Stephen Wolfram’s TED talk point, why would someone need to be good at calculations, when even your phone is better at it? Estimation, yes. Calculation, not so much. I myself am a competent mathematician who regularly sets the dinner table for the wrong number of people. On the Flanagan aptitude test in arithmetic in high school, I scored 4%. My math teacher and I had a good laugh at that one.

    • Spot on Paul. Computation is NOT mathematics. Mathematics is so much more and when mathematics people need help they turn to technology. Think ruler, compass, calculator, eraser, lined paper, etc.

  7. Doesn’t this depend upon the context of the final exam? Test takers are not necessarily “great at math”… referencing “Lockhart’s Lament”, perhaps the top 25% are just good at “following directions” and not good at math…
    As for the stats – they are very poor. I can show a correlation between age and SAT scores, or SAT scores and height…

    Sorry – I was thinking today when I woke up… How can we possibly measure “what students learn and understand” rather than only what we tested? In other words, getting away from solving problems and “doing math” and moving toward “problem solving of the highest level,” like the Moody’s or HiMCM…

  8. I need more information about what the instructor really thought and what he was trying to understand or prove. I’ll probably need to read that article, and that probably won’t be enough. I’m sure, though, that I’ll agree with all of the comments above, which seem to be good critiques of the situation. Right off the bat, I completely agree that the correlation with the high test scores is more likely due to confounding factors.

    Another thing is that the instructor should not look at the scores of the pre-test (or predictive test here) until the end of all grading. That way, he isn’t inclined to treat students differently during the semester.

    As a teacher, I would perhaps try to figure out more ways to teach those algorithms, if my desired outcome was to have more students master them.

    Last, I get concerned about critiquing work that I have very few details about. It would be better to engage in a conversation with the instructor to see what’s not included in the article.

  9. I have three observations, from *two* dramatically different perspectives.

    1) My Stewart pre-calc book has a whole chapter on polynomials: factoring, finding possible zeros, upper & lower bounds, Descartes Rule of Signs, graphing rational functions, and so on. I teach this, but recently a colleague asked me why, given that all these methods were developed for an earlier era. I have completely changed how I teach logs, particularly change of base, because I just don’t see the value. But to me, there’s still purpose in teaching this. But I haven’t had enough experience teaching pre-calc to make it something other than a lecture.

    2) I have not traditionally used calculators in the classroom much–never for tests. But I now require all my trig and precalc students to download Desmos (or use the URL), and encourage them to use it to check solutions, or to *see* solutions–particularly helpful for understanding how to find multiple solutions. In our school, we have an ideological divide between the math teachers who still teach kids how to use TI models and base their classes around it, and those of us who don’t use calculators much but use Desmos as a tool.

    3) I agree with what everyone is saying about the flawed methodology. But I can’t help but be struck by the test’s relevance to the life of a high school math teacher. We could give this test to kindergartners and get a pretty decent correlation with Algebra 2 outcomes. All the occasional outliers of really struggling kids w/great math facts & great abstract thinkers w/poor calculation skills aside, abilities predict academic success. The Great Question of high school math is how to deal with the demands that we teach kids who aren’t terribly good at math who are forced to take math.

  10. I don’t know of many who say that calculator USE will inhibit understanding of procedures &/or performance in higher math skills. I do know of many (including myself) who believe, based on broad experience, that students who have come to RELY on calculators (which is something entirely different) for the most trivial tasks seem severely handicapped in higher processing and understanding of math. Whether that is causational or correlational is not clear but I’m prepared to hold to causation until shown otherwise — it seems the most sensible default hypothesis. A good experiment demonstrating causation is difficult to carry out cleanly, for reasons you outline here.

    I don’t tell my students not to use calculators in their work. I don’t really care. But I forbid them on exams, and they know this. So those who wish to survive my courses will understand that they cannot come to RELY on those calculators and must be prepared to work without them whenever necessary.

    In the majority of college service math courses across North America students are forbidden to use calculators on exams. Regardless of whether you believe this is a good or a bad thing, it is simply reality and you have no say in the matter. Consequently if you are training your high school students to RELY on calculators, you are setting them up for near-certain failure in post-secondary math.

  11. Dan–I appreciate the stats you provided about graphing calculators in calculus 1, but is there any more recent data of which you are aware? The study you referenced is 5+ years old, and my gut tells me that Desmos use has grown during that time.

    I teach middle school math and would like to push for GC usage but would love the data to either support or challenge that push.

  12. One of the problems with any study of this sort about the use of technology is that it does not have the larger picture in mind as to why we have students study and in some cases learn mathematics. In that case the role of technology is not understood in a context and perhaps it is abused by students AND faculty. Technology, in the modern sense, is just a tool, kind of like papyrus, clay tablet, pencil, paper, AND eraser(!), calculator, super computer, decimal value vs. Roman numerals, etc. How we elect to use the tool is really what is important. Learning a large number of algorithms is not really important any more, I believe, and my background is as a PhD mathematician who has taught at liberal arts colleges, major universities, engineering schools, and military academies. I was taught how to extract square roots by hand in the 1950’s. Think about that as a useful algorithm. Do you believe doing that algorithm really taught me anything? It sure did not reinforce the concept of finding a number which when multiplied by itself would give the number in question, nor did it give me any sense about that problem or its usefulness. We need to give students reasons to do mathematics and to use calculating devices by teaching mathematics in context at ALL stages of education. My son knew all his multiplication tables in grade school in Indiana, but when I asked him about finding how much tiling we would need for an 8 ft by 7 ft bathroom he added 8 and 7. He now has a PhD and knows this from experience, NOT from algorithm learning or table memorization. So stop fretting the particulars of calculator use and concentrate on why we are teaching mathematics in the first place. Then these issues will all fade away as we see that whatever tool we can get our hands on, if used correctly, to learn problem-solving, mathematics in context, and applications of quantitative reasoning is what is really important.

    Brian Winkel, Director SIMIODE, http://www.simiode.org – look it up to find out what it is!