Category: tech contrarianism


Tracy Zager Offers You And Your Fact Fluency Game Some Advice

Thoughtful elementary math educator Tracy Zager offers app developers some best practices for their fact fluency games:

I’ve been looking around since, and the big money math fact app world is enough to send me into despair. It’s almost all awful. As I looked at them, I noticed I use three baseline criteria, and I’m unwilling to compromise on any of them.

She later awards special merits to DreamBox Learning and Bunny Times.

What Students Do (And Don’t Do) In Khan Academy, Ctd.

My analysis of Khan Academy’s eighth-grade curriculum was viewed ~20,000 times over the last ten days. Several math practice web sites have asked me to perform a similar analysis on their own products. All of this gives me hope that my doctoral work may be interesting to people outside my small crowd at Stanford.

Two follow-up notes, including the simplest way Khan Academy can improve itself:

One. Several Khan Academy employees have commented on the analysis, both here and at Hacker News.

Justin Helps, a content specialist, confirmed one of my hypotheses about Khan Academy:

One contributor to the prevalence of numerical and multiple choice responses on KA is that those were the tools readily available to us when we began writing content. Our set of tools continues to grow, but it takes time for our relatively small content team to rewrite item sets to utilize those new tools.

But as another commenter pointed out, if the Smarter Balanced Assessment Consortium can make interesting computerized items, what’s stopping Khan Academy? Which team is the bottleneck: the software developers or the content specialists? (They’re hiring!)

Two. In my mind, Khan Academy could do one simple thing to improve itself several times over:

Ask questions that computers don’t grade.

A computer graded my responses to every single question in eighth grade.

That means I was never asked, “Why?” or “How do you know?” Those are seriously important questions but computers can’t grade them and Khan Academy didn’t ask them.

At one point, I was even asked how m and b (of y = mx + b fame) affected the slope and y-intercept of a graph. It’s a fine question, but there was no place for an answer because how would the computer know if I was right?

So if a Khan Academy student is linked to a coach, make a space for an answer. Send the student’s answer to the coach. Let the coach grade or ignore it. Don’t try to do any fancy natural language processing. Just send the response along. Let the human offer feedback where computers can’t. In fact, allow all the proficiency ratings to be overridden by human coaches.

Khan Academy does loads of A/B testing, right? So A/B test this. See if teachers appreciate the clearer picture of what their students know or if they prefer the easier computerized assessment. I can see it going either way, though my own preference is clear.
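For what it's worth, here is a minimal sketch, in Python, of what "just send the response along" might look like. Every class and field name below is hypothetical, and none of it reflects Khan Academy's actual code. The point is how little machinery the feature needs: store the free-response text, forward it to a coach untouched, and let the coach's judgment override the machine's rating.

from dataclasses import dataclass, field

@dataclass
class FreeResponse:
    student: str
    exercise: str
    prompt: str        # e.g. "How do you know?"
    answer: str        # raw text; no NLP, no autograding

@dataclass
class ProficiencyRecord:
    machine_rating: str               # what the computer inferred from graded items
    coach_override: str | None = None

    @property
    def rating(self) -> str:
        # The human's judgment, when present, wins.
        return self.coach_override or self.machine_rating

@dataclass
class CoachInbox:
    responses: list[FreeResponse] = field(default_factory=list)

    def receive(self, response: FreeResponse) -> None:
        # Just pass the text along; grading it or ignoring it is the coach's call.
        self.responses.append(response)

if __name__ == "__main__":
    inbox = CoachInbox()
    inbox.receive(FreeResponse(
        student="avery",
        exercise="slope-intercept form",
        prompt="How do m and b affect the graph of y = mx + b?",
        answer="m tilts the line and b slides it up or down.",
    ))

    record = ProficiencyRecord(machine_rating="practiced")
    record.coach_override = "proficient"   # the coach read the explanation
    print(record.rating)                   # prints "proficient"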

What Students Do (And Don’t Do) In Khan Academy

tl;dr — Khan Academy claims alignment with the Common Core State Standards (CCSS) but an analysis of their eighth-grade year indicates that alignment is loose. 40% of Khan Academy exercises assessed the acts of calculating and solving whereas the Smarter Balanced Assessment Consortium’s assessment of the CCSS emphasized those acts in only 25% of their released items. 74% of Khan Academy’s exercises resulted in the production of either a number or a multiple-choice response, whereas those outputs accounted for only 25% of the SBAC assessment.

Introduction

My dissertation will examine the opportunities students have to learn math online. In order to say something about the current state of the art, I decided to complete Khan Academy’s eighth grade year and ask myself two specific questions about every exercise:

  • What am I asked to do? What are my verbs? Am I asked to solve, evaluate, calculate, analyze, or something else?
  • What do I produce? What is the end result of my work? Is my work summarized by a number, a multiple-choice response, a graph that I create, or something else?

I examined Khan Academy for several reasons. First, because they’re well-capitalized and they employ some of the best computer engineers in the world. They have the human resources to create some novel opportunities for students to learn math online. If they struggle, it is likely that other companies with equal or lesser human resources struggle also. I also examined Khan Academy because their exercise sets are publicly available online, without a login. This will energize our discussion here and make it easier for you to spotcheck my analysis.

My data collection took me three days and spanned 88 practice sets. You’re welcome to examine my data and critique my coding. In general, Khan Academy practice sets ask that you complete a certain number of exercises in a row before you’re allowed to move on. (Five, in most cases.) These exercises are randomly selected from a pool of item types. Different item types ask for different student work. Some item types ask for multiple kinds of student work. All of this is to say, you might conduct this exact same analysis and walk away with slightly different findings. I’ll present only the findings that I suspect will generalize.
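If it helps to see the bookkeeping, here is a small sketch, in Python, of the kind of tallying behind the percentages reported below. The five exercises in the snippet are invented for illustration; the real data set, all 402 exercises of it, is linked above.

from collections import Counter

# Each completed exercise received two codes: a verb and a production.
# These five rows are made up, purely to show the shape of the data.
coded_exercises = [
    {"verb": "solve", "production": "a number"},
    {"verb": "calculate", "production": "a number"},
    {"verb": "analyze", "production": "multiple-choice response"},
    {"verb": "analyze", "production": "multiple-choice response"},
    {"verb": "construct", "production": "a line"},
]

def distribution(codes):
    """Return each code's share of the total, as a percentage."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {code: round(100 * n / total, 1) for code, n in counts.most_common()}

print(distribution(e["verb"] for e in coded_exercises))
print(distribution(e["production"] for e in coded_exercises))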

After completing my analysis of Khan Academy’s exercises, I performed the same analysis on a set of 24 released questions from the Smarter Balanced Assessment Consortium’s test that will be administered this school year in 17 states.

Findings & Discussion

Khan Academy’s Verbs

[Figure: 141202_7lo]

The largest casualty is argumentation. Out of the 402 exercises I completed, I could code only three of their prompts as “argue.” (You can find all of them in “Pythagorean Theorem Proofs.”) This is far out of alignment with the Common Core State Standards, which prioritize constructing and critiquing arguments as one of the eight practice standards that cross all of K-12 mathematics.

[Figure: 141202_1lo]

Notably, 40% of Khan Academy’s eighth-grade exercises ask students to “calculate” or “solve.” These are important mathematical actions, certainly. But as with “argumentation,” I’ll demonstrate later that this emphasis is out of alignment with current national expectations for student math learning.

The most technologically advanced items were the 20% of Khan Academy’s exercises that asked students to “construct” an object. In these items, students were asked to create lines, tables, scatterplots, polygons, angles, and other mathematical structures using novel digital tools. Subjectively, these items were a welcome reprieve from the frequent calculating and solving, nearly all of which I performed with either my computer’s calculator or with Wolfram Alpha. (Also subjective: my favorite exercise asked me to construct a line.) These items also appeared frequently in the Geometry strand where students were asked to transform polygons.

[Figure: 141202_2lo]

I was interested to find that the most common student action in Khan Academy’s eighth-grade year is “analyze.” Several examples follow.

[Figure: 141202_5lo, examples of Khan Academy's "analyze" exercises]

Khan Academy’s Productions

These questions of analysis are welcome but the end result of analysis can take many forms. If you think about instances in your life when you were asked to analyze, you might recall reports you’ve written or verbal summaries you’ve delivered. In Khan Academy, 92% of the analysis questions ended in a multiple-choice response. These multiple-choice items took different forms. In some cases, you could make only one choice. In others, you could make multiple choices. Regardless, we should ask ourselves if such structured responses are the most appropriate assessment of a student’s power of analysis.

Broadening our focus from the “analysis” items to the entire set of exercises reveals that 74% of the work students do in the eighth grade of Khan Academy results in either a number or a multiple-choice response. No other pair of outcomes comes close.

[Figure: 141202_8lo]

Perhaps the biggest loss here is the fact that I constructed an equation exactly three times throughout my eighth grade year in Khan Academy. Here is one:

[Figure: 141202_6lo, one of the three exercises that asked me to construct an equation]

This is troubling. In the sixth grade, students studying the Common Core State Standards make the transition from “Number and Operations” to “Expressions and Equations.” By ninth grade, the CCSS will ask those students to use equations in earnest, particularly in the Algebra, Functions, and Modeling domains. Students need preparation solving equations, of course, but if they haven’t spent ample time constructing equations also, those advanced domains will be inaccessible.

Smarter Balanced Verbs

The Smarter Balanced released items include comparatively fewer “calculate” and “solve” prompts (they’re the least common verbs, in fact) and comparatively more “construct,” “analyze,” and “argue” prompts.

[Figure: 141202_9lo]

This lack of alignment is troubling. If one of Khan Academy’s goals is to prepare students for success in Common Core mathematics, they’re emphasizing the wrong set of skills.

Smarter Balanced Productions

Multiple-choice responses are also common in the Smarter Balanced assessment but the distribution of item types is broader. Students are asked to produce lots of different mathematical outputs including number lines, non-linear function graphs, probability spinners, corrections of student work, and other productions students won’t have seen in their work in Khan Academy.

[Figure: 141202_10lo]

SBAC also allows for the production of free-response text while Khan Academy doesn’t. When SBAC asks students to “argue,” in a majority of cases, students express their answer by just writing an argument.

[Figure: 141202_11lo]

This is quite unlike Khan Academy’s three “argue” prompts, which produced either a) a multiple-choice response or b) the rearrangement of statements and reasons in a pre-filled two-column proof.

Limitations & Future Directions & Conclusion

This brief analysis has revealed that Khan Academy students are doing two primary kinds of work (analysis and calculating) and they’re expressing that work in two primary ways (as multiple-choice responses and as numbers). Meanwhile, the SBAC assessment of the CCSS emphasizes a different set of work and asks for more diverse expression of that work.

This is an important finding, if somewhat blunt. A much more comprehensive item analysis would be necessary to determine the nuanced and important differences between two problems that this analysis codes identically. Two separate “solving” problems that result in “a number,” for example, might be of very different value to a student depending on the equations being solved and whether or not a context was involved. This analysis is blind to those differences.

We should wonder why Khan Academy emphasizes this particular work. I have no inside knowledge of Khan Academy’s operations or vision. It’s possible this kind of work is a perfect realization of their vision for math education. Perhaps they are doing exactly what they set out to do.

I find it more likely that Khan Academy’s exercise set draws an accurate map of the strengths and weaknesses of education technology in 2014. Khan Academy asks students to solve and calculate so frequently, not because those are the mathematical actions mathematicians and math teachers value most, but because those problems are easy to assign with a computer in 2014. Khan Academy asks students to submit their work as a number or a multiple-choice response, not because those are the mathematical outputs mathematicians and math teachers value most, but because numbers and multiple-choice responses are easy for computers to grade in 2014.

This makes the limitations of Khan Academy’s exercises understandable but not excusable. Khan Academy is falling short of the goal of preparing students for success on assessments of the CCSS, but that’s setting the bar low. There are arguably other, more important goals than success on a standardized test. We’d like students to enjoy math class, to become flexible thinkers and capable future workers, to develop healthy conceptions of themselves as learners, and to look ahead to their next year of math class with something other than dread. Will instruction composed principally of selecting from multiple-choice responses and filling numbers into blanks achieve those goals? If your answer is no, as mine is, and if that narrative sounds exceedingly grim to you too, it is up to you and me to pose a compelling counter-narrative for online math education, and then re-pose it over and over again.

A Response To The Founder Of Mathspace On The Costs And Benefits Of Adaptive Math Software

Mo Jebara, the founder of Mathspace, has responded to my concerns about adaptive math software in general and his in particular. Feel free to read his entire comment. I believe he has articulated several misconceptions about math education and about feedback that are prevalent in his field. I’ll excerpt those misconceptions and respond below.

Computer & Mouse v. Paper & Pencil

Jebara:

Just like learning Math requires persistence and struggle, so too is learning a new interface.

I think Mathspace has made a poor business decision to blame their user (the daughter of an earlier commenter) for misunderstanding their user interface. Business isn’t my business, though. I’ll note instead that adaptive math software here again requires students to learn a new language (computers) before they find out if they’re able to speak the language they’re trying to learn (math).

For example, here is a tutorial screen from software developed by Kenneth Tilton, a frequent commenter here who has requested feedback on his designs:

[Figure: 140921_1lo, a tutorial screen from Kenneth Tilton's software]

Writing that same expression with paper and pencil instead is more intuitive by an order of magnitude. Paper and pencil is an interface that is omnipresent and easily learned, one that costs a bare fraction of the computer Mathspace’s interface requires, one that never needs to be plugged into a wall.

None of this means we should reject adaptive math software, especially not Mathspace, the interface of which allows handwriting. But these user interface issues pile high in the “cost” column, which means the software cannot skimp on the benefits.

Misunderstanding the Status Quo

Jebara:

Does a teacher have time to sit side by side with 30 students in a classroom for every math question they attempt?

[..]

But teachers can’t watch while every student completes 10,000 lines of Math on their way to failing Algebra.

[..]

I talk to teachers every single day and they are crying out for [instant feedback software].

Existing classroom practice has its own cost and benefit columns and Jebara makes the case that classroom costs are exorbitant.

Without adaptive feedback software, to hear Jebara tell it, students are wandering in the dark from problem to problem, completely uncertain if they’re doing anything right. Teachers are beleaguered and unsure how they’ll manage to review every student’s work on every assigned problem. Thirty different students will each reveal thirty unique misconceptions on each one of thirty problems. That’s 27,000 unique responses teachers have to make in a 45-minute period. That’s ten responses per second! No wonder all these teachers are crying.

This is all Dickens-level bleak and misunderstands, I believe, the possible sources of feedback in a classroom.

There is the textbook’s answer key, of course. Some teachers also make a regular practice of posting all the answers in advance of an exercise set, so students have a sense that they’re heading in the right direction and can focus on process, not product.

Commenter Matt Bury also notes that a student’s classmates are a useful source of feedback. Since I recommended Classkick last week, several readers have tried it out in their classes. Amy Roediger writes about the feature that allows students to help other students:

… the best part was how my students embraced collaborating with each other. As the problems got progressively more challenging, they became more and more willing to pitch in and help each other.

All of these forms of feedback exist within their own webs of costs and benefits too, but the idea that without adaptive math software the teacher is the only source of feedback just isn’t accurate.

Immediate v. Delayed Feedback

Most companies in this space make the same set of assumptions:

  1. Any feedback is better than no feedback.
  2. Immediate feedback is better than delayed feedback.

Tilton has written here, “Feedback a day later is not feedback. Feedback is immediate.”

In fact, Kluger & DeNisi found in their meta-analysis of feedback interventions that feedback reduced performance in more than one third of studies. What evidence do we have that adaptive math software vendors offer students the right kind of feedback?

The immediate kind of feedback isn’t without complication either. With immediate feedback, we may find students trying answer after answer, looking for the red x to change to a green check mark, learning little more than systematic guessing.

Immediate feedback risks underdeveloping a student’s own answer-checking capabilities also. If I get 37 as my answer to 14 + 22, immediate feedback doesn’t give me any time to reflect on my knowledge that the sum of two even numbers is always even and make the correction myself. Along those lines, Cope and Simmons found that restricting feedback in a Logo-style environment led to better discussions and higher-level problem-solving strategies.

What Computers Do To Interesting Exercises

Jebara:

Can you imagine a teacher trying to provide feedback on 30 hand-drawn probability trees on their iPad in Classkick?

[..]

Can you imagine a teacher trying to provide feedback on 30 responses for a Geometric reasoning problem letting students know where they haven’t shown enough of a proof?

I can’t imagine it, but not because that’s too much grading. I can’t imagine assigning those problems because I don’t think they’re worth a class’ limited time and I don’t think they do justice to the interesting concepts they represent.

Bluntly, they’re boring. They’re boring, but that isn’t because the team at Mathspace is unimaginative or hates fun or anything. They’re boring because a) computers have a difficult time assessing interesting problems, and b) interesting problems are expensive to create.

Please don’t think I mean “interesting” week-long project-based units or something. (The costs there are enormous also.) I mean interesting exercises:

Pick any candy that has multiple colors. Now pick two candies from its bag. Create a probability tree for the candies you see in front of you. Now trade your tree with five students. Guess what candy their tree represents and then compute their probabilities.

The students are working five exercises there. But you won’t find that exercise or exercises like it on Mathspace or any other adaptive math platform for a very long time because a) they’re very hard to assess algorithmically and b) they’re more expensive to create than the kind of problem Jebara has shown us above.

I’m thinking Classkick’s student-sharing feature could be very helpful here, though.

Summary

Jebara:

So why don’t we try and automate the parts that can be automated and build great tools like Classkick to deal with the parts that can’t be automated?

My answer is pretty boring:

Because the costs outweigh the benefits.

In 2014, the benefits of that automation (students can find out instantly if they’re right or wrong) are dwarfed by the costs (see above).

That said, I can envision a future in which I use Mathspace, or some other adaptive math software. Better technology will resolve some of the problems I have outlined here. Judicious teacher use will resolve others. Math practice is important.

My concerns are with the 2014 implementations of the idea of adaptive math software and not with the idea itself. So I’m glad that Jebara and his team are tinkering at the edges of what’s possible with those ideas and willing, also, to debate them with this community of math educators.

Featured Comment

Mercy – all of them. Just read the thread if you want to be smarter.

The Scary Side Of Immediate Feedback

Mathspace is a startup that offers both handwriting recognition and immediate feedback on math exercises. Their handwriting recognition is extremely impressive but their immediate feedback just scares me.

My fear isn’t restricted to Mathspace, of course, which is only one website offering immediate feedback out of many. But Mathspace hosts a demo video on their homepage and I think you should watch it. Then you can come back and tell me my fears are unfounded or tell me how we’re going to fix this.

Here’s the problem in three frames.

First, the student solves the equation and finds x = -48. Mathspace gives the student immediate feedback that her answer is wrong.

[Figure: 140827_1lo, Mathspace marks x = -48 incorrect]

The student then changes the sign with Mathspace’s scribble move.

[Figure: 140827_2lo, the student scribbles out the sign]

Mathspace then gives the student immediate feedback that her answer is now right.

[Figure: 140827_3lo, Mathspace marks the revised answer correct]

The student thinks she knows how to solve equations. The teacher’s dashboard says the student knows how to solve equations. But quiz the student just a little bit – as Erlwanger did with a student named Benny under similar circumstances forty years ago – and you see just how superficial her knowledge of solving equations really is. She might just be swapping signs because that’s why her answers have been wrong in the past.

Everyone walks away feeling like a winner but everyone is losing and no one knows it. That’s the scary side of immediate feedback.

One possible solution.

When a student pulls a scribble move like that, throw a quick text input that asks, “Why did you change your answer?” The student who is just guessing will say something like, “Because it told me I was right.” Send that text along to the teacher to review. Don’t try to do any fancy natural language processing. The result is data that can’t be autograded, data that can’t receive immediate feedback, but better data just the same.
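Here is a minimal sketch of that intervention in Python. Every name is invented for illustration and none of this is Mathspace’s actual API; it only shows the shape of the idea: notice when a revision immediately follows a wrong mark, collect a short explanation, and pass it to the teacher untouched.

from dataclasses import dataclass

@dataclass
class Attempt:
    answer: str
    marked_correct: bool

def needs_explanation(previous: Attempt, current: Attempt) -> bool:
    # A changed answer that immediately follows a wrong mark is the case we care about.
    return (not previous.marked_correct) and previous.answer != current.answer

def forward_to_teacher(student: str, explanation: str) -> None:
    # No natural language processing, no autograding; just pass the text along.
    print(f"[teacher inbox] {student}: {explanation}")

if __name__ == "__main__":
    first = Attempt(answer="x = -48", marked_correct=False)
    second = Attempt(answer="x = 48", marked_correct=True)

    if needs_explanation(first, second):
        why = "Because it told me I was right."   # stand-in for the quick text input
        forward_to_teacher("the student", why)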

Related Awesome Quote

If you can both listen to children and accept their answers not as things to just be judged right or wrong but as pieces of information which may reveal what the child is thinking you will have taken a giant step towards becoming a master teacher rather than merely a disseminator of information.

JA Easley, Jr. & RE Zwoyer

Featured Comment

Justin Lanier:

I would want to emphasize that the issue is that Mathspace (and tech folks generally) tries to give immediate, “personalized” feedback in a fast, slick, cheap, low/no-labor kind of way. And, not surprising, ends up giving crappy feedback.

Daniel Tu-Hoa, a senior vice president at Mathspace, responds:

[T]eachers can see every step a student writes, so they can, as you suggest, then go and ask the student: “why did you change your answer here?” For us, technology isn’t intended to replace the teacher, but to empower teachers by giving them access to better information to inform their teaching.

2014 Sep 4. I’ve illustrated here a false positive – the adaptive system incorrectly thinks the student understands mathematics. Fawn Nguyen illustrates another side of bad feedback: false negatives.