A Response To The Founder Of Mathspace On The Costs And Benefits Of Adaptive Math Software

Mo Jebara, the founder of Mathspace, has responded to my concerns about adaptive math software in general and his in particular. Feel free to read his entire comment. I believe he has articulated several misconceptions about math education and about feedback that are prevalent in his field. I’ll excerpt those misconceptions and respond below.

Computer & Mouse v. Paper & Pencil

Jebara:

Just like learning Math requires persistence and struggle, so too is learning a new interface.

I think Mathspace has made a poor business decision to blame their user (the daughter of an earlier commenter) for misunderstanding their user interface. Business isn’t my business, though. I’ll note instead that adaptive math software here again requires students to learn a new language (computers) before they find out if they’re able to speak the language they’re trying to learn (math).

For example, here is a tutorial screen from software developed by Kenneth Tilton, a frequent commenter here who has requested feedback on his designs:

[Image: tutorial screen from Tilton’s math typing tutorial]

Writing that same expression with paper and pencil instead is more intuitive by an order of magnitude. Paper and pencil is an interface that is omnipresent and easily learned, one that costs a bare fraction of the computer Mathspace’s interface requires, one that never needs to be plugged into a wall.

None of this means we should reject adaptive math software, especially not Mathspace, the interface of which allows handwriting. But these user interface issues pile high in the “cost” column, which means the software cannot skimp on the benefits.

Misunderstanding the Status Quo

Jebara:

Does a teacher have time to sit side by side with 30 students in a classroom for every math question they attempt?

[..]

But teachers can’t watch while every student completes 10,000 lines of Math on their way to failing Algebra.

[..]

I talk to teachers every single day and they are crying out for [instant feedback software].

Existing classroom practice has its own cost and benefit columns and Jebara makes the case that classroom costs are exorbitant.

Without adaptive feedback software, to hear Jebara tell it, students are wandering in the dark from problem to problem, completely uncertain if they’re doing anything right. Teachers are beleaguered and unsure how they’ll manage to review every student’s work on every assigned problem. Thirty different students will reveal thirty unique misconceptions for each one of thirty problems. That’s 27,000 unique responses teachers have to make in a 45 minute period. That’s ten responses per second! No wonder all these teachers are crying.

This is all Dickens-level bleak and misunderstands, I believe, the possible sources of feedback in a classroom.

There is the textbook’s answer key, of course. Some teachers also make a regular practice of posting all the answers in advance of an exercise set, so students have a sense that they’re heading in the right direction and focus on process not product.

Commenter Matt Bury also notes that a student’s classmates are a useful source of feedback. Since I recommended Classkick last week, several readers have tried it out in their classes. Amy Roediger writes about the feature that allows students to help other students:

… the best part was how my students embraced collaborating with each other. As the problems got progressively more challenging, they became more and more willing to pitch in and help each other.

All of these forms of feedback exist within their own webs of costs and benefits too, but the idea that without adaptive math software the teacher is the only source of feedback just isn’t accurate.

Immediate v. Delayed Feedback

Most companies in this space make the same set of assumptions:

  1. Any feedback is better than no feedback.
  2. Immediate feedback is better than delayed feedback.

Tilton has written here, “Feedback a day later is not feedback. Feedback is immediate.”

In fact, Kluger & DeNisi found in their meta-analysis of feedback interventions that feedback reduced performance in more than one third of studies. What evidence do we have that adaptive math software vendors offer students the right kind of feedback?

The immediate kind of feedback isn’t without complication either. With immediate feedback, we may find students trying answer after answer, waiting for the red x to change to a green check mark, learning little more than systematic guessing.

Immediate feedback risks underdeveloping a student’s own answer-checking capabilities also. If I get 37 as my answer to 14 + 22, immediate feedback doesn’t give me any time to reflect on my knowledge that the sum of two even numbers is always even and make the correction myself. Along those lines, Cope and Simmons found that restricting feedback in a Logo-style environment led to better discussions and higher-level problem-solving strategies.
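
The self-check described above can be sketched in a few lines of Python. This is only an illustration of the reasoning a student might do unprompted; the function name `passes_parity_check` is hypothetical and does not come from Mathspace or any other product mentioned here.

```python
def passes_parity_check(a: int, b: int, claimed_sum: int) -> bool:
    """Return True if claimed_sum has the parity that a + b must have.

    Two addends with the same parity (both even or both odd) give an
    even sum; addends with different parities give an odd sum.
    """
    sum_must_be_even = (a % 2) == (b % 2)
    claimed_is_even = (claimed_sum % 2) == 0
    return claimed_is_even == sum_must_be_even


# The example from the text: 37 cannot be the sum of two even numbers.
print(passes_parity_check(14, 22, 37))  # False
print(passes_parity_check(14, 22, 36))  # True
```

The check is necessary but not sufficient: 38 would also pass it while still being wrong. It rules out some answers without revealing the right one, which is the kind of reflection instant right/wrong marking can preempt.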

What Computers Do To Interesting Exercises

Jebara:

Can you imagine a teacher trying to provide feedback on 30 hand-drawn probability trees on their iPad in Classkick?

[..]

Can you imagine a teacher trying to provide feedback on 30 responses for a Geometric reasoning problem letting students know where they haven’t shown enough of a proof?

I can’t imagine it, but not because that’s too much grading. I can’t imagine assigning those problems because I don’t think they’re worth a class’ limited time and I don’t think they do justice to the interesting concepts they represent.

Bluntly, they’re boring. They’re boring, but that isn’t because the team at Mathspace is unimaginative or hates fun or anything. They’re boring because a) computers have a difficult time assessing interesting problems, and b) interesting problems are expensive to create.

Please don’t think I mean “interesting” week-long project-based units or something. (The costs there are enormous also.) I mean interesting exercises:

Pick any candy that has multiple colors. Now pick two candies from its bag. Create a probability tree for the candies you see in front of you. Now trade your tree with five students. Guess what candy their tree represents and then compute their probabilities.

The students are working five exercises there. But you won’t find that exercise or exercises like it on Mathspace or any other adaptive math platform for a very long time because a) they’re very hard to assess algorithmically and b) they’re more expensive to create than the kind of problem Jebara has shown us above.

I’m thinking Classkick’s student-sharing feature could be very helpful here, though.

Summary

Jebara:

So why don’t we try and automate the parts that can be automated and build great tools like Classkick to deal with the parts that can’t be automated?

My answer is pretty boring:

Because the costs outweigh the benefits.

In 2014, the benefits of that automation (students can find out instantly if they’re right or wrong) are dwarfed by the costs (see above).

That said, I can envision a future in which I use Mathspace, or some other adaptive math software. Better technology will resolve some of the problems I have outlined here. Judicious teacher use will resolve others. Math practice is important.

My concerns are with the 2014 implementations of the idea of adaptive math software and not with the idea itself. So I’m glad that Jebara and his team are tinkering at the edges of what’s possible with those ideas and willing, also, to debate them with this community of math educators.

Featured Comment

Mercy – all of them. Just read the thread if you want to be smarter.

About 

I’m Dan and this is my blog. I’m a former high school math teacher and current head of teaching at Desmos. More here.

36 Comments

  1. Incidentally, I have one student in my geometry class who went through algebra during the summer solely with adaptive software. She said she could ‘do it on computer’ but ‘not on paper’.

    I think there’s more going on here than the usual ‘rely too much on instant feedback’. She seemed to be literally accessing the work of algebra steps on paper like it was a different skill that didn’t transfer.

  2. Two things:
    1: Where does the “adaptive” come into this Mathspace?
    2: I have written many computer programs, with user interfaces, including a help-you-to-make-sense-of-algebra package (details in my blog – software).
    I cannot for the life of me see how what is given as the input sequence could ever yield the expression.

  3. So much to comment on here, but I want to focus on the assertion that teachers are “crying out” for feedback software. This is part-true. My experience is that teachers will accept a software solution IF it provides a clear, efficient improvement over existing classroom practice. Unfortunately, few companies seem to understand this.

    In the past few years our district moved towards a hybrid learning model, and I had the opportunity to preview many canned math software solutions. There were certainly a lot of neat graphs to be seen, with red and green progress bars and helpful videos. But I haven’t seen a product yet which serves as an effective complement to classroom practice. Instead, the products often work as stand-alones, and students sense the disconnect.

    For my classroom, I don’t need software to generate a whole bunch of problems for students to do…it just becomes more busy work. Rather, I’d like software which would allow me to register feedback as my students work. For example, if my students are all working at the board or on their desks, and I know there are common mistakes being made, could I click on their names on my iPad and register this? I don’t need more problems….I need better feedback tracking.

    thanks for fighting the good fight here.

  4. Not sure Mo was blaming the user when he said, as do I, yeah, there’s a learning curve. Please note that we say that and then, when something confuses our users, we do our best to fix it.

    As you noted, both MS and my app are under active development, so it is quite easy for us to adapt to user input. (But you are right, that is not the usual case.)

    For example, that worst case example you selected from my math typing tutorial could be made a lot simpler if I add a cube root icon to the virtual keyboard (and a keyboard shortcut to that). I have seen one competitor who has that and thought it a good idea. I’ll add it to my do list, but the point of that exercise was to show how to enter something outside the ordinary, so I would just change the example to the fourth root. :)

    As for typing math being harder, yep. Otoh, I have ready to go code that would let a student type in an equation and then with one keystroke copy the whole thing to a new line with both sides in a numerator and blinking carets in both denominators ready for the student to divide on both sides. That trumps paper and pencil pretty well, and makes it easier for kids to work in small steps instead of trying to do it in their heads to avoid all that writing/typing.

    I hesitate on the above precisely because I want students succeeding with my app to succeed on paper (as well as on tests where no help is available) and an edit trick like that might be too much help.

    I think the point Mo and I are making is that kids today are failing Algebra at an extraordinary rate while enjoying the ease of pencil and paper. Maybe a little effort put into getting comfortable with math on a keyboard would be worth it if significantly better results can be had.

    I doubt we will agree on whether the learning curve is worth it since we disagree so enthusiastically on the value of feedback. We might be disagreeing on feedback because I sense we disagree on whether students should be fluent in what I will call “old skool” Algebra — the Algebra I learned back in the sixties. For that students need to do a lot of problems, and if they are doing a lot of problems and using the feedback software a lot the learning curve cost is rapidly amortized down to nothing.

    That said, the studies I have seen on some software suggest it did not get heavy use; it was almost an ancillary activity. Not sure how that could work, but then it also says to me the software was not much liked by teachers and/or students.

    The funny thing is, software like mine and MS offers a benefit to those of you who want to engage students with activities other than pure maths yet at the same time not cheat them by leaving them incompetent at pure maths: have them learn the pure maths on software in homework time and do activities/discussions in class.

    Speaking of discussions, kids explaining Algebra to other kids might be good for the explainer — the best way to learn is to teach — but I am not sure it is good for the struggling student: even trained teachers make the mistake of telling kids how to do things instead of making them think their way through to a solution.

    back to work…

  5. Thanks Dan for addressing the arguments in such clear, allusive detail.

    I think the argument that learners are not learning as much as we’d like because they aren’t using automated feedback software is making a lot of assumptions. In all the research I’ve read on elearning and computer assisted learning, one message that comes across more often than not is that the differences in effect sizes between experimental groups using media/software/modes/etc. vs. control groups are substantially smaller than the differences in effect sizes achieved between teachers’ choices of learning and teaching techniques, strategies, methods, and approaches.

    No software will make you a better teacher and no software will help learners to learn if it isn’t appropriately incorporated into learning and teaching practice. More often than not, in practice, in the real world, adaptive learning software gets used as a substitute for homework and classroom exercises (exercises are not tasks, see NB below), which are often little more than “busy work” given under the (radical behaviourist) belief that mindless repetition somehow improves learning. See Alfie Kohn’s article “Rethinking Homework”: http://www.alfiekohn.org/teaching/rethinkinghomework.htm

    From a more John Dewey-esque point of view, “learning by doing” is often misconstrued to mean that we learn from doing things without considering that we often perform actions and complete exercises mindlessly. Dewey and many more educators and researchers that came after him claimed that we learn not so much from doing but more from thinking about what we’ve done. There’s a growing body of research that indicates that the simple act of telling another person how we succeeded or failed at a task helps to develop our understanding of the skills and knowledge required by the task. Counter-intuitively, it actually helps the speaker/writer more than the listener/reader, as does the act of assessing and giving feedback about others’ work. In other words, *giving feedback* is more productive than receiving it.

    NB: I make a distinction between exercise and task in that with exercises, learners are simply following instructions and/or predetermined “rote” routines, whereas in a task, learners have a higher degree of discretion on how they complete the task, i.e. they’re more like real world problems/tasks.

  6. @Matt: you are absolutely right about the research so far, but have you seen any research involving software that checks (and helps with) each step of the problem instead of just the answer?

    I was a math tutor and the fact is students usually have the most trouble with the first step, because that is where the new skill is needed to unlock a solution. Once that is sorted out they just have to mop up simplifying.

    This is part of why having the answer key when problems involve multiple steps is better than nothing but not as supportive as step-by-step help.

    The research I would refer everyone to is Bloom et al, 1984: http://www.ascd.org/ASCD/pdf/journals/ed_lead/el_198405_bloom.pdf Bloom mentions a two sigma improvement in students provided with individual tutors (or 2-3 students per tutor).

    A tutor working with a topic new to a student certainly provides step-by-step assistance. Later they ramp things up by not reacting until the student says they are done. [Aside: great idea! Surprised I never thought of that before.]

    Bloom’s article assumes we cannot provide individual tutoring to students because of the cost and addresses how the effect might be approximated. Mathspace and I are saying we can approximate private tutoring with software. Will we be as good as a human? No, but by how much will we miss? Even half of two-sigma would be wonderful. And we will be available to the student 24×7 for those willing to make the effort, thus I suspect do better than half.

    Given that some students try four or five times to pass Algebra before giving up (see esp. the comments on http://blogs.kcrw.com/whichwayla/2013/05/how-algebra-ruined-my-chances-of-getting-a-college-education) — the effort is there, I think, as long as they sense they have a fighting chance thanks to expert software that talks back to them making Algebra interactive.

  7. Bob Lochel:

    But I haven’t seen a product yet which serves as an effective complement to classroom practice. Instead, the products often work as stand-alones, and students sense the disconnect.

    Keen insight.

    Kenneth Tilton:

    Bloom’s article assumes we cannot provide individual tutoring to students because of the cost and addresses how the effect might be approximated. Mathspace and I are saying we can approximate private tutoring with software. Will we be as good as a human? No, but by how much will we miss? Even half of two-sigma would be wonderful. And we will be available to the student 24×7 for those willing to make the effort, thus I suspect do better than half.

    As we saw in Kluger & DeNisi, the scale here doesn’t stretch from 0 to positive infinity but from negative infinity to positive infinity with 1/3 of all feedback interventions taking up space in the negative interval. How have you made yourself sure you’re working in the positive interval?

  8. Quick note on immediate vs delayed feedback. Recalling off the top of my head, in a meta-analysis of studies found in whatsisname’s book ‘The Art and Science of Teaching’, the sweet spot for feedback was at the end of a written assessment. Quicker feedback was more effective up to that point. Feedback that was given immediately after each question nose-dived into being less effective.

    It’s not hard to imagine why. Immediate per-question feedback leads to the kind of guess-and-check response that shuts down deeper thinking through the problem. Simply waiting until the test is over resolves that, but gets that feedback to the student while the thought process is freshest in their mind.

  9. “How have you made yourself sure you’re working in the positive interval?”

    Well, tangentially, the results are there. This thing was sold in the early nineties. Reviews (teachers and periodicals) are excerpted in the landing page carousel. But everyone can dig up hot quotes.

    I like to say that I did not design my app, I just did whatever was technically tractable to emulate a successful private tutor (which I had been). And as Bloom et al demonstrated, those work to the order of a two sigma improvement (the mean of the tutored group doing better than about 98% of the control).

    It is like VisiCalc, the original spreadsheet: they knew exactly what the thing had to do because they were just automating something already being done with pencil, eraser and a good memory (for the formulas for each cell).

    Not everyone in the feedback space is trying to emulate a private tutor. The first thing I look for is, do they check each step? Can they give an intelligent hint midway through a problem? If they offer video help, does the software know enough about what is going on to pick out the right video automatically? If an example is offered, does it replicate the entire problem or the step the student has reached (and does the problem isolate just what needs to be done at the next step)?

    There is nothing easy about achieving all that, but I like to ask, How can software help a student if it is not at least as smart as the student?

    The years of development required to create such a thing cost more time and money than reasonable people are willing to invest. But nothing less will produce interesting ed tech. Put another way, teaching is much harder than non-teachers realize. That tells us how much investment will be needed to create effective teaching software.

    Hell, ed tech incubators want folks to ship in three months. My math editor alone took that long and more. (Slick editing, btw, is another thing I look for, because kids will not long endure things like the Microsoft Equation editor.)

    I had a laugh when I looked at the video linked to from my landing page. It takes more than two minutes for me to get to the software! Instead I am talking about how I tutored and how we all learn best, period.

    If we can emulate a private Algebra tutor, then we can expect to approach the improvement Bloom’s students identified. Past results — all of which unfortunately reached me years after I ceased distribution, when teachers wrote to ask if the software could still be had — suggest we can.

  10. @josh Indeed, in my intro video I model an indifferent student who at one point just throws variants of -15/3 at the system (to no avail because his problem is a failure to reverse the inequality) and before that made a careless error he could quickly correct. In my full in-person demo I like to wind up with two more careless errors the student then quickly corrects — but what will happen when they get to their actual assessment?

    That’s why my app also has a conventional ladder of missions aka levels to be passed. Those offer zero help and zero second chances on a problem, precisely to wean kids off the help.

    One nice thing about the immediate feedback is that students weak even in the arithmetic fundamentals can move on to Algebra and have some success — if and only if they get those number facts under control.

    Meanwhile, have you ever asked a sixteen year-old to do a sheet of simple arithmetic? :)

  11. @Kenneth, I think what matters most is that whatever feedback learners get, it’s appropriate for that learner’s needs at that time. For that to reliably be the case, who or whatever is deciding when and what feedback to give has to understand the learner’s thought processes going on at that particular moment. Sometimes the best option is to stand back and let her/him think it through for her/himself and sometimes the best option is to ask an appropriate question to encourage her/him to think along certain lines… or to simply ask why s/he arrived at that formula or answer.

    For the foreseeable future, computers will not be able to perform the incredible feat of mind reading in order to ascertain the appropriate feedback strategy to take. See: http://en.wikipedia.org/wiki/Mentalization

    The best computers can manage at the moment is low-level error highlighting (like spell-checking and grammar checking in MS Word). From what I understand, this kind of feedback, at best, only helps learners to get to the right answer in that instance. There’s no conclusive evidence that those corrections are carried forward to future problems/tasks, e.g. How often do you forget to spell the same words over and over again?

    The danger with low-level feedback is that it distracts learners from thinking more about the task/problem itself and making sense of how Math works and what gives it meaning and purpose to learners. In effect, low-level feedback can act as an inhibitor to learning rather than an aid. I think this may be why 1/3 of the effects were negative in some research.

    I hope this makes sense.

  12. FWIW, just checked and it wasn’t in Art and Science, but in Marzano’s “Classroom Instruction That Works”.

    I’m not sure why, if immediate feedback is significantly less effective at improving learning than slightly-delayed feedback, we would think that arithmetic is the exception to the rule? I mean, I’m all for kids knowing their number facts, but do we have evidence that hyper-immediate feedback somehow does work best even though it doesn’t everywhere else?

    (I can imagine a reason why it might, but I can imagine all kinds of things. Evidence would be great.)

  13. @josh: Wow, what a resource! I knew there was a reason I was still following this thread! Thanks.

    For others, here is the best link I found: http://katiedevine.files.wordpress.com/2011/12/classroom-instruction-that-works_pdf.pdf

    Josh wrote:
    “I’m not sure why, if immediate feedback is significantly less effective at improving learning than slightly-delayed feedback, we would think that arithmetic is the exception to the rule?”

    That is not what Marzano reported (on page 98), though that was my first impression as well. In the first chart he shows a small negative effect for right/wrong feedback.

    When I saw “right/wrong”, I mistakenly assumed the context was one in which the student got a second chance. Nope. That is the separate “repeat until correct”, which shows a big positive effect.

    I think this answers your follow-up concerns.

    [Aside: as for why instant correction without a second chance would have a negative effect, I will hazard a guess: it is a little depressing/disheartening to see those red X’s popping up with no second chance. I put it down to “broken will” dragging down the performance for the rest of the assessment (in tennis they call it “tanking”), and for my target audience of struggling students with weak number facts and low interest that is a huge problem.]

    Meanwhile, you just saved me a bit of work. I had been thinking of showing kids their global (or local or intra-class) rank but had had my doubts given that the strugglers would be at the bottom of the charts.

    Marzano says forget norm-based feedback (also p 98):
    “When feedback is norm-referenced, it informs students about where they stand in relationship to other students. This tells students nothing about their learning. Criterion-referenced feedback tells students where they stand relative to a specific target of knowledge or skill. In fact, research has consistently indicated that criterion-referenced feedback has a more powerful effect on student learning than norm-referenced feedback”

    Thanks again for a great reference. It makes the point I think the Anti-Feedback Crowd misses: not all feedback is the same. One cannot beat MS or my app with the Bennyware stick, without showing that our apps work essentially like the Bennyware.

    I would then note that one person took Mathspace up on their expanded access offer and no one has actually looked at my software. Absent that, it would seem that the AFC is not the best group to turn to for information.

  14. @Matt: I think this is the weak link in your well-constructed argument: “For [the delivery of appropriate feedback] to reliably be the case, who or whatever is deciding when and what feedback to give has to understand the learner’s thought processes going on at that particular moment. ”

    That model of the teacher as the owner and dispenser of learning will *never* work. The student must initiate remediation (or original learning) or the exercise is robbed of exactly the energy it needs: the student’s innate desire to succeed.

    My bet is on the student working things out for themselves if they are provided with tools they can use to do so, such as instant correction, Socratic hints, solved similar examples, narrowly targeted videos automatically selected, a public forum where they can ask questions, and as a last resort their teacher.

    Yes, there will be students not into the work at all. This is where teachers like Dan come in, bringing math to life, giving students a reason to care, modelling for students the excitement and power of mathematics to make sense of the world.

  15. Kenneth, I think you looked at the wrong chart? On that same page, see Figure 8.4 (not 8.3).

    To quote the summary:
    “Notice that feedback immediately after a test item has a relatively low average effect size of .19, and providing students with feedback immediately after a test has the largest effect size (.72).”

  16. Also, sorry, but if you’re characterizing any of the teachers here (Dan included) as “anti-feedback” then you’re seriously missing the point.

  17. @Kenneth, re: “That model of the teacher as the owner and dispenser of learning will *never* work.” — You quoted from my comment, “For [the delivery of appropriate feedback] to reliably be the case, who or whatever is deciding when and what feedback to give has to understand the learner’s thought processes going on at that particular moment. ”

    Where in that or anywhere else did I say that the feedback could only come from a teacher? In fact, I deliberately wrote, “who or whatever” to leave this open to options, since in previous comments on this post and the related previous posts, I’ve emphatically recommended that the teacher shouldn’t be the only source of feedback and that in fact, when learners give each other feedback, i.e. the act of giving feedback, tends to result in higher learning gains than receiving it. What’s more, learners know each other, often better than their teachers do, and so are in a better position to perform the incredible feat of mentalising with each other.

    If we’re going to talk about Bloom’s (revised) Taxonomy, I think it should be revised again. There are 3 stages before the well known 6:

    9) synthesising/creating
    |
    8) criticising
    |
    7) analysing
    |
    6) applying
    |
    5) understanding
    |
    4) remembering
    |
    3) harmonising/aligning (with each other)
    |
    2) mentalising/mind-reading
    |
    1) connecting (with each other)

    As Lev Vygotsky noted, learning first occurs interpersonally during interactions with caregivers/teachers/peers, then it occurs intrapersonally within the learner. Computers cannot make that connection; therefore, under a computer’s direction, a learner is unlikely to be engaged in any meaningful and memorable way.

    To put it another way, imagine that we can develop amazingly impressive adaptive learning algorithms. Computers operate at a human level that is orders of magnitude worse than a human who is severely autistic. How appropriate to learners’ needs at any particular moment during learning activities do you think computers can be?

  18. @josh: I looked at both charts. I think everyone should look at that entire document and just throw away CCSS! The big takeaway is that one has to read these things very carefully, know exactly what they are measuring and how well. The second chart *sounds* like it is talking about summative tests, no second chances, and no step-by-step checking. That would be a horse of a different color, would you agree?

    @matt: It is funny, I said “algebra” too loudly at my morning breakfast spot and ended up talking about my software with a stranger. Turns out he and his friend were learning German over breakfast on their iPads, at https://www.duolingo.com/ They loved it. He showed me all his German badges and his bank account of linguets (the web site currency).

    I think the moral is, some things lend themselves to automated learning and others do not. And I would restrict that even further: some *aspects* of mathematics are better mastered in a machine-assisted setting, others are def better as social/group activities.

    My app does have a built-in forum. I doubt it will get much use, but my new friend said DuoLingo has something similar where one attempts to tell stories in one’s new language and others react.

  19. @Dan, you mention the meta-analysis of Kluger & DeNisi, which found some forms of feedback which reduce performance. You ask, “What evidence do we have that adaptive math software vendors offer students the right kind of feedback?”

    I think we can actually answer that, rather than pose it as a question. In his Formative Assessment & Standards-Based Grading, Marzano summarizes this effect this way: “Kluger and DeNisi found that negative feedback has an ES [Effect Size] of negative 0.14. This translates into a predicted decrease in student achievement of 6 percentile points. In general, negative feedback is that which does not let students know how they can get better.”
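    The step from an effect size of negative 0.14 to "6 percentile points" can be checked with a short sketch. It assumes the usual normal-curve conversion for a student who starts at the 50th percentile; the quoted summary does not say which method Marzano used, and the function name `percentile_change` is made up for illustration.

```python
from statistics import NormalDist

def percentile_change(effect_size: float) -> float:
    """Percentile-point shift for a student starting at the 50th percentile.

    Assumption: scores are normally distributed, so an effect size (in
    standard deviations) maps to a percentile via the standard normal CDF.
    """
    return (NormalDist().cdf(effect_size) - 0.5) * 100

change = percentile_change(-0.14)
print(round(change, 1))  # a drop of between 5 and 6 percentile points
```

    Under that assumption the drop rounds to the 6 percentile points quoted above.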

    Unfortunately, my exposure to Kluger & DeNisi’s work is limited to the summary found in Marzano’s book. But as far as I know, that last sentence should provide the lens for evaluating feedback from adaptive software: does it provide a way to tell students how they can get better? Or does it at least tell the teacher that he/she should target a small group of students and teach them in person how they can get better?

    If the software just says whether you got an answer right or wrong, the feedback may actually be doing harm. More realistically, if the software provides instruction in the form of a confusing hint or boring video that students tend not to engage with, then the software is functionally the same as a program that just tells you right/wrong, and again, its feedback is likely harmful. In these cases, it all comes down to whether the teacher can use the data stream to provide effective reteaching and error analysis. It can be done, and when it is done, it’s beneficial.

  20. Here’s why I suspect that, as some studies have shown, delayed feedback is more effective than immediate feedback:

    1. Spaced learning effect
    2. A good rule of thumb is that effective feedback increases task-involvement while reducing ego-involvement. As time passes, it becomes easier for a student to emotionally distance themselves from the assessment.

    I’m sorry that I don’t have citations on hand for my “as some studies have shown” but I can dig them up later.

  21. @Kenneth, Duolingo is primitive compared to the language learning software I’ve seen and used. The two market leaders, Auralog’s Tell Me More and Rosetta Stone, have been evaluated and the results are very poor:

    “The most striking finding was severe participant attrition, which was likely due to a variety of technological problems as well as the lack of sufficient support for autonomous learning in the workplace. This lack of compliance with self-study suggests that despite the logistical ease of providing language learning software, more resource-intensive types of language training are more likely to be effective.”

    Nielson, K. B., “SELF-STUDY WITH LANGUAGE LEARNING SOFTWARE IN THE WORKPLACE: WHAT HAPPENS?”, Language Learning & Technology, October 2011, Volume 15, Number 3, pp. 110–129, http://llt.msu.edu/issues/october2011/nielson.pdf

  22. Let’s find out under which conditions and in what form feedback is most favorable. I would like to suggest that Kluger & DeNisi might not be the most up-to-date review (1998). Hattie and Timperley is more current (e.g. http://growthmindseteaz.org/files/Power_of_Feedback_JHattie.pdf). There probably are more. The problem with every review is the vast differences between the underpinning studies. I see the discussion as a false dichotomy: both teacher and computer feedback can exist together. Teacher feedback *can* do without computers but imo there are some useful affordances (that still need to be pinpointed better). Computer feedback *can’t* do without an appropriate ‘frame’. Sure, students could use software with feedback on their own but it needs to be set in a teaching progression.

  23. It occurs to me that arguing from research has the same problem as arguing from analogy. With the latter, we first must prove that the thing being debated works the same as the analogue.

    With research, one must first establish that the things being tested are the same and the conditions are the same. E.g., the language study cited by @Matt was about language study in the workplace (hunh?) and noted only lack of use by unmotivated users. Sounds like the software never stood a chance and the problem was political.

    And even if one finds what seems to be research mirroring the right conditions, debaters are free to draw different lessons from the outcome; we are inclined to find in the research confirmation of what we already believe.

    @Chris: thanks for the link to the Hattie research. Duly bookmarked.

  24. In that case, it’s probably worth pointing out that in the real world there will always be a multitude of differing influences affecting each learner’s performance and it’s impossible to single out one influence as entirely responsible; a point that I’ve heard Hattie emphatically make himself on several occasions (BTW, thanks to everyone for the links!).

    So the best we can manage in the real world are correlations among a variety of influential conditions and activities which in many cases are interdependent. In other words, feedback can only be effective if certain other conditions are met.

    In addition, effect sizes naturally correlate with learners’ initial state of performance. It’s much easier to make substantial learning gains (with minimal interventions) with learners who were previously poorly supported than learners who are already working in optimum conditions (AKA “chasing the long tail”).

    The role of anecdotal evidence in research is important, but such evidence should never be taken as conclusive, i.e. don’t base your learning and teaching practice on it. Anecdotes serve as interesting insights that may or may not warrant more thorough and systematic investigation, i.e. research. I certainly wouldn’t base a business model on them.

    For your amusement: “Correlation isn’t causation” http://www.fastcodesign.com/3030529/infographic-of-the-day/hilarious-graphs-prove-that-correlation-isnt-causation

  25. I love this discussion; it really lets me get my tech on. Unfortunately, doing so is a good way to miss the forest for the trees, or whatever your favorite idiom might be.

    Here’s what I mean. The problem is more or less simple, “How do I create a means of providing instant feedback for the student?” Or is it? As soon as I ask that question, my mind immediately travels to the nuts and bolts. What is the algorithm? What is the most effective feedback? How does guessing fit into the feedback loop?

    Whoops, forgot something, didn’t we? Isn’t the real question “How do I foster the student’s skill and power in mathematical reasoning?” The better question leaves room for better answers. Eventually, you come upon the best answer, which I am pretty sure is that the student learns to create meaning and build understanding through working with the problem, not by watching the red x turn green. When a student experiences this “Aha!” moment, the rewards are instant and lasting.

    Still, the operative question is whether to treat ourselves like Pavlov and Skinner, or like Vygotsky and Bruner. I don’t know about you, but I want to see my children and my students grow in math confidence, and to be able to explain their process. Then, if they get the answer wrong, I don’t care! They have gotten the meaning right.

  26. re Ron Fischman
    “Then, if they get the answer wrong, I don’t care! They have gotten the meaning right.”
    I just hope they don’t become engineers!

    On a more serious matter, I have spent several years (not all the time) developing an algebra program (not “Computer Algebra”) which has the following:
    1: Input as text, editable display in algebraic form
    2: Numerous algebraic operations
    3: Direct, meaningful feedback, explaining why an operation is not going to work, in terms of the algebraic structure, and
    4: Useful evaluation and graphing features.

    Details via the Software page on my wordpress site.

    I would love someone to have a look at it to see if it meets ANY of the desirable aims of “instant feedback” that are apparently not possessed by other software.
    It is free!

  27. Hey, @Howard. Nice to meet another algebra app developer. We should form an association!

    I downloaded the algebra app and made some headway but got stuck on the “Solve” option. I picked “subtract from both sides”, it prompted me for a term, I clicked on a term, then nothing happened.

    Shoot me a note at ken at tiltontec dot com if you want to give me a hint on that.

  28. This post offered incredible insight into adaptive math software and the pros and cons of using it in my classroom. I often struggle with whether or not to utilize adaptive math software in my classroom and if so, how much of the time. I believe that software shouldn’t take the place of regular classroom time filled with hands-on activities and paper-based problems. If anything, I want to use software as a supplement to my normal class. I think we run into many dangers as teachers when we allow the computer to be the teacher.

    I love when you said, “With immediate feedback, we may find students trying answer after answer, looking for the red x change to a green check mark, learning little more than systematic guessing.” I have found this result whenever I have used adaptive math software in the classroom. Many of my students, particularly students who typically struggle with math, go through the motions and end up guessing answers until they get it right. With no human body looking over them or helping them step-by-step through a problem, it becomes a guessing game. This fact only hurts the student’s learning experience and creates more work for the teacher who needs to go back and ensure that students actually understand the concepts.

    I also agree with many others on this comment thread that I don’t need more problems as a teacher, I need strong feedback effectiveness, which comes through additional training and resources. Generating a multitude of problems for my students to practice is never an issue and takes very little time with all the resources at hand. The issue is giving strong and consistent feedback to all students in a way that will help them learn and progress. Yes, I would love instant feedback, but it is more important to me that the feedback has quality. If adaptive software could improve the quality of my feedback, I would be all over it. However, that isn’t something I have seen or experienced so far. I feel like I end up cleaning up the mess after my students use adaptive software. Time after time I am explaining the concepts and steps of the problem that they completely skipped when doing problems on adaptive software.

    If anyone has insight into more effective feedback systems or software, I am all ears.

  29. @Hannah: first, a point of information: not all automated math software is adaptive. I recently threw in an adaptive option in a newer practice area still under development, but I did that just to avoid resistance from folks who mistakenly think the computer should decide when the student should step up their game. I know from experience that the student should make that decision, which is why even now in this upcoming module /the student/ chooses between easy, average, hard, or adaptive.

    On the whole, there is nothing “adaptive” about my application. But I suspect you were using “adaptive” in a casual sense and just meant “automated”. That’s fine.

    If you really want “more effective..software”, you and everyone else in the burgeoning Anti-Feedback Group need to accept one simple idea: different products work different ways.

    Happy exploring! Or not. :)

  30. @Hannah

    “If adaptive software could improve the quality of my feedback, I would be all over it.”

    I’m very curious to know more on your dream classroom software:
    What professional development or non-software resources have improved the quality of your feedback up to this point?
    Would it be useful to see a list of potential misconceptions a particular student may hold? Conversation starters?

  31. FYI, Khan Academy’s new iPad app has handwriting recognition now. I haven’t tried it myself, but it looks pretty good in this video:

  32. I think they are using the same third-party math recognition library as Mathspace.

    Results are indeed quite impressive, though just now it took a while to get a “q” (that’s the letter after “p”) recognized as such instead of 9. But I am an old guy not even accustomed to tablets, and only played for about two minutes. Use it heads down for an hour and I imagine it gets dead easy.

    Unfortunately I am maybe 20% along in their Algebra I and I had to multiply a binomial times a trinomial. I punted.