Archive for the 'tech contrarianism' Category

SRI's report on Khan Academy usage, released earlier this month, has the potential to make us all a lot wiser. They studied Khan Academy use at nine sites over two years, recording field notes, survey results, usage logs, and achievement measures, all well-specified in an 82-page implementation report and summarized in a shorter briefing. Their report has sharpened some of my concerns about the use of Khan Academy in math classrooms while blunting others.

First, there is irony to be found in SRI's reporting of usage rather than efficacy. The Gates Foundation underwrote the SRI report, and while Gates endorses value-added models of quality for teachers, it doesn't extend the same scrutiny to its portfolio company here. After reading SRI's report, though, I'm convinced this exploratory study was the right study to run. SRI found enormous variation in Khan Academy use across the nine sites. We gain a great deal of insight through their study of that variation and we'd be much poorer had they chosen to study one model exclusively.

SRI found some results that are favorable to the work of Khan Academy. Others are unfavorable, and still others seem to contradict each other. You can find many of the favorable results summarized at Khan Academy's blog. I intend to summarize, instead, the concerns and questions the SRI report raises.

It isn't clear which students benefit from Khan Academy.

Over the two years of the study, 74% of teachers (63 teachers in SY 2011-12 and 60 teachers in SY 2012-13) said Khan Academy was "very effective" at meeting the learning needs of "students whose academic work is ahead of most students their age." Meanwhile, only 25% of teachers gave Khan Academy the same rating for students who are behind most students their age.

One teacher reports that "the same students who struggled in her classroom before the introduction of Khan Academy also struggled to make progress in Khan Academy." She goes on to say that those students "were less engaged and less productive with their time on Khan Academy [than their peers]."

Participating teachers don't seem to have a great deal of hope that Khan Academy can close an achievement gap directly, though they seem to think it enhances the learning opportunities of advanced learners.

But that hypothesis is contradicted by the surveys from Site 1, a site which SRI states "had some of the highest test scores in the state [of California], even when compared with other advantaged districts." In question after question regarding Khan Academy's impact on student learning, Site 1 teachers issued a lower rating than the other less-advantaged sites in the study. For example, 21% of Site 1 teachers reported that Khan Academy had "no impact" on "students' learning and understanding of the material." 0% of the teachers from the less-advantaged sites shared that rating.

SRI writes: “Whatever the reason, teachers in sites other than Site 1 clearly found greater value in their use of Khan Academy to support their overall instruction.” SRI is strangely incurious about that reason. Until further revelation there, we should file this report alongside notices of Udacity's struggles in serving the needs of lower-achieving students in their pilot course with San Jose State University in 2013. The two sets of struggles are likely related.

Khan Academy use is negatively associated with math interest.

I'm going to jump quickly to clarify that a) Khan Academy use was positively associated with anxiety reduction, self-concept, and self-efficacy, b) all of these non-achievement measures are measures of correlation, not causation, and c) the negative association with interest isn't statistically significant.

But I'm calling out this statistically-insignificant, non-causal negative association between Khan Academy and interest in math because that measure matters enormously to me (as someone who has a lot of interest in math) and its direction downward should concern us all. It's very possible to get very good at something while simultaneously wishing to have nothing to do with that thing ever again. We need to protect against that possibility.

Teachers don't use the videos.

While Khan Academy's videos get lots of views outside of formal school environments, "more than half the teachers in SY 2011-12 and nearly three-quarters in SY 2012-13 reported on the survey that they rarely or never used Khan Academy videos to support their instruction."

One teacher explains: "Kids like to get the interaction with me. Sal is great at explaining things, but you can’t stop and ask questions, which is something these kids thrive on."

Khan Academy seems to understand this and has recently tried to shift focus from its videos to its exercises. In a recent interview with EdSurge, Sal Khan explains this shift as a return to roots. "The original platform was a focus on interactive exercises," he says, "and the videos were a complement to that."

Elizabeth Slavitt, Khan Academy's math content lead, shifts focus in a similar direction. "For us, our goal isn’t necessarily that Khan introduces new concepts to students. We want to give practice."

Khan Academy is shifting its goal posts here, but we should all welcome that shift. In Sal Khan's TED talk, in his 60 Minutes interview, and in my own experience working with their implementation team, Khan Academy's expressed intent was for students to learn new concepts by watching the video lectures first. Only 10% of the teachers in SY 2012-13 agreed and said that "Khan Academy played a role in introducing new concepts." Khan Academy seems to have received this signal and has aligned its rhetoric to reflect reality.

The exercises are Khan Academy's core classroom feature, but teachers don't check to see how well students perform them.

73% of teachers in SY 2012-13 said "Khan Academy played its greatest role by providing students with practice opportunities." Over both years of the study, SRI found that 85% of all the time students spent on Khan Academy was spent on exercises.

Given this endorsement of exercises, SRI's strangest finding is that 59% of SY 2012-13 teachers checked Khan Academy reports on those exercises "once a month or less or not at all." If teachers find the exercises valuable but don't check to see how well students are performing them, what's their value? Students have a word for work their teachers assign and don't check. Are Khan Academy's exercises more than busywork?

SRI quotes one teacher who says the exercises are valuable as a self-assessment tool for students. Another teacher cites the immediate feedback students receive from the exercises as the "most important benefit of using Khan Academy." But at Site 2, SRI found "the teachers did not use the Khan Academy reports to monitor progress," electing instead to use their own assessments of student achievement.

SRI's report is remarkably incurious about this difference between the value teachers perceive of a) the exercises and b) the reports on the exercises, leaving me to speculate:

Students are working on individualized material, exercises that aren't above their level of expertise. They find out immediately how well they're doing so they get stuck less often on those exercises. This makes for less challenging classroom management for teachers. That's valuable. But in the same way that teachers prefer their own lectures to Khan's videos, they prefer their own assessments to Khan's reports.

One hypothesis here is that teachers are simply clinging to their tenured positions, refusing to give way to the obvious superiority of computers. My alternative hypothesis is that teachers simply know better, that computers aren't a natural medium for lots of math, that teacher lectures and assessments have lots of advantages over Khan Academy's lectures and assessments. In particular, handwritten student work reveals much about student learning that Khan Academy's structured inputs and colored boxes conceal.

My hypothesis that teachers don't trust Khan Academy's assessment of student mastery is, of course, extremely easy to test. Just ask all the participating teachers something like, "When Khan Academy indicates a student has attained mastery on a given concept, how does your assessment of the student's mastery typically compare?"

Which it turns out SRI already did.


Unfortunately, SRI didn't report those results. At the time of this posting SRI hasn't responded to my request for comment.


It isn't surprising to me that teachers would prefer their own lectures to Khan Academy's. Their lectures can be more conversational, more timely, and better tailored to their students' specific questions. I'm happy those videos exist for the sake of students who lack access to capable math teachers but that doesn't describe the majority of students in formal school environments.

I'm relieved, then, to read Elizabeth Slavitt's claim that Khan Academy doesn't intend any longer for its video lectures to introduce new concepts to students. Slavitt's statement dials down my anxiety about Khan Academy considerably.

SRI minimizes Khan Academy's maximal claims to a "world-class education," but Khan Academy clearly has a lot of potential as self-paced math practice software. It's troubling that so many teachers don't bother to check that software's results, but Khan Academy is well-resourced and lately they've expanded their pool of collaborators to include more math teachers, along with the Illustrative Mathematics team. Some of the resulting Common Core exercises are quite effective and I expect more fruits from that partnership in the future.

But math practice software is a crowded field and, for totally subjective reasons, not one that interests me all that much. I wish Khan Academy well but going forward I suspect I'll have as much to say about them as I do about Cognitive Tutor, TenMarks, ALEKS, ST Math, and others, which is to say, not all that much.

BTW. Read other smart takes on SRI's report:

I took machine-graded learning to task earlier this week for obscuring interesting student misconceptions. Kristen DiCerbo at Pearson's Research and Innovation Network picked up my post and argued I was too pessimistic about machine-graded systems, posing this scenario:

Students in the class are sitting at individual computers working through a game that introduces basic algebra courses. Ms. Reynolds looks at the alert on her tablet and sees four students with the “letters misconception” sign. She taps “work sample” and the tablet brings up their work on a problem. She notes that all four seem to be thinking that there are rules for determining which number a letter stands for in an algebraic expression. She taps the four of them on the shoulder and brings them over to a small table while bringing up a discussion prompt. She proceeds to walk them through discussion of examples that lead them to conclude the value of the letters change across problems and are not determined by rules like “c = 3 because c is the third letter of the alphabet.”

My guess is we're decades, not years, away from this kind of classroom. If it's possible at all. Three items in this scenario seem implausible:

  • That four students in a classroom might assume "c = 3 because c is the third letter of the alphabet." I taught Algebra for six years and never saw this conception of variables. (Okay, this isn't a big deal.)
  • That a teacher has the mental bandwidth to manage a classroom of thirty students and keep an eye on her iPad's Misconception Monitor. Not long ago I begged people on Twitter to tell me how they were using learning dashboards in the classroom. Everyone said they were too demanding. They used them at home for planning purposes. This isn't because teachers are incapable but because the job demands too much attention.
  • That the machine grading is that good. The system DiCerbo proposes is scanning and analyzing handwritten student work in real time, weighing it against a database of misconceptions, and pairing those up with a scripted discussion. Like I said: decades, if ever.

This also means you have to anticipate all the misconceptions in advance, which is tough under the best of circumstances. Take Pennies. Even though I've taught it several times, I still couldn't anticipate all the interesting misconceptions.

The Desmos crew and I had students using smaller circles full of pennies to predict how many pennies fit in a 22-inch circle.


But I can see now we messed that up. We sent students straight from filling circles with pennies to plotting them and fitting a graph. We closed off some very interesting right and wrong ways to think about those circles of pennies.

Some examples from reader Karlene Steelman via e-mail:

They tried finding a pattern with the smaller circles that were given, they added up the 1 inch circle 22 times, they combine the 6, 5, 4, 3, 2, 1, and 1 circles to equal 22 inches, they figured out the area of several circles and set up proportions between the area and the number of pennies, etc. It was wonderful for them to discuss the merits and drawbacks of the different methods.

Adding the 1-inch circle 22 times! I never saw that coming. Our system closed off that path before students had the chance even to express their preference for it.
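Here is a rough sketch of how far apart two of those strategies land. It assumes, as a simplification, that the number of pennies grows in proportion to the circle's area and ignores the gaps at the edge. The penny count for the 1-inch circle is a placeholder I made up for illustration, not data from the lesson, and the function names are hypothetical.

```python
# Placeholder count for the 1-inch circle. This is an assumption for
# illustration only, not a measurement from the lesson.
PENNIES_IN_ONE_INCH_CIRCLE = 1


def linear_estimate(diameter_in, pennies_per_unit=PENNIES_IN_ONE_INCH_CIRCLE):
    """'Add the 1-inch circle 22 times': scale with the diameter."""
    return pennies_per_unit * diameter_in


def area_estimate(diameter_in, pennies_per_unit=PENNIES_IN_ONE_INCH_CIRCLE):
    """Proportion between area and pennies: scale with the diameter squared."""
    return pennies_per_unit * diameter_in ** 2


linear_22 = linear_estimate(22)
area_22 = area_estimate(22)
print(linear_22, area_22)
```

Whatever the real count for the 1-inch circle turns out to be, the two estimates differ by a factor of 22, which is exactly the kind of disagreement worth putting in front of a class.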

So everyone has a different, difficult job to do here, with different criteria for success. The measure of the machine-graded system is whether it makes those student ideas invisible or visible. The measure of the teacher is whether she knows what to do with them or not. Only the teacher's job is possible now.

Featured Comments

Sue Hellman:

This doesn’t even touch the students who get questions RIGHT for the wrong reasons.

Dave Major:

Dashboards of the traditional 'spawn of Satan & Clippy the Excel assistant' sort throw way too much extremely specific information straight to the surface for my liking (and brain). That information is almost always things that are easy for machines (read. programmers) to work out, and likely hard or time consuming yet dubiously useful for humans to do. I wonder how many teachers, when frozen in time mid-lesson and placed in the brain deli slicer would be thinking "Jimmy has 89% of this task correct and Sally has only highlighted four sentences on this page."

Can you help me shuffle my thoughts on teacher data dashboards?

The Current State of Teacher Data Dashboards

Generalizing from my own experience and from my reading, teacher data dashboards seem to suffer in three ways:

  • They confuse easy data with good data. It's easy to record and report the amount of time a student had a particular webpage open, for instance, but that number isn't indicative of all that much.
  • They aren't pedagogically useful. They'll tell you that a student got a question wrong or that the student spent seven minutes per problem but they won't tell you why or what to do next beyond "Tell the student to rewind the lecture video and really watch it this time."
  • They're overwhelming. If you've never managed a classroom with more than 30 students, if you're a newly-minted-MBA-turned-edtech-startup-CEO for instance, you might have the wrong idea about teachers and the demands on their time and attention. Teaching a classroom full of students isn't like sitting in front of a Bloomberg terminal with a latte. The same volume of statistics, histograms, and line graphs that might thrill a financial analyst with few other demands on her attention might overwhelm a teacher who's trying to ensure her students aren't setting their desks on fire.

If you have examples of dashboards that contradict me here, I'd love to see screenshots.

We Tried To Build A Better Data Dashboard

With the teacher dashboard on our pennies lesson, the Desmos team and I tried to fix those three problems.


We attempted to first do no harm.

We probably left some good data on the table, but at no point did we say, "Your student knows how to model with quadratic equations." That kind of knowledge is really difficult to autograde. We weren't going to risk assigning a false positive or a false negative to a student, so we left that assessment to the teacher.

We tailored the dashboard to the lesson.

We created filters that will be mostly useless for any other lesson we might design later.


We filtered students in ways we thought would lead to rich teacher-student interactions. For example:

  • If a student changed her pennies model (say from a linear to a quadratic or vice versa) we thought that was worth mentioning to a teacher.
  • We made it easy to find out which students filled up large circles with pennies and which students found some cheap and easy data by filling up a small circle.
  • We made it easy to find out which students had the closest initial guesses.
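To make the idea concrete, here is a minimal sketch of what lesson-specific filters like these might look like in code. This is not the Desmos implementation. Every name, field, and threshold below is hypothetical, and the sample students and the "answer" are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StudentWork:
    # Hypothetical record of one student's work on the pennies lesson.
    name: str
    initial_guess: int
    models_chosen: list = field(default_factory=list)     # e.g. ["linear", "quadratic"]
    circle_diameters: list = field(default_factory=list)  # inches, one entry per circle filled


def changed_model(work):
    """True if the student picked more than one kind of model."""
    return len(set(work.models_chosen)) > 1


def filled_large_circle(work, threshold_in=4):
    """True if any filled circle is at least threshold_in inches wide.

    The 4-inch cutoff is an assumption, not a value from the lesson.
    """
    return any(d >= threshold_in for d in work.circle_diameters)


def closest_guesses(works, answer, n=3):
    """Return the n students whose initial guess was nearest to answer."""
    return sorted(works, key=lambda w: abs(w.initial_guess - answer))[:n]


# Invented sample data.
works = [
    StudentWork("A", 300, ["linear", "quadratic"], [1, 2]),
    StudentWork("B", 900, ["quadratic"], [5, 6]),
    StudentWork("C", 650, ["linear"], [0, 1]),
]

changed = [w.name for w in works if changed_model(w)]
large = [w.name for w in works if filled_large_circle(w)]
nearest = [w.name for w in closest_guesses(works, answer=700, n=1)]  # 700 is a made-up answer
print(changed, large, nearest)
```

Notice that none of these functions would be any use for a different lesson. Each one encodes a decision about what is worth a teacher's attention in this particular task.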

These filters don't design themselves. They require an understanding of pedagogy and a willingness to commit developer-hours to material that won't scale or see significant reuse outside of one lesson. That commitment is really, really uncommon for edtech startups. It's one reason why the math edublogosphere gets so swoony about Desmos.


Contrast that with filters from Khan Academy, which read, "Struggling," "Needs Practice," "Practiced," "Level One," "Level Two," and "Mastered." Broadly applicable, but generic.

We suggested teacher action.

For each of those filters, we gave teachers a brief suggestion for action. For students who changed models, we suggested teachers ask:

Why did you change your model? Why are you happy with your final choice instead of your first choice?

For students who filled up large circles, we suggested teachers say something like:

A lot of you filled small circles with pennies but these students filled large circles with pennies. That's harder and it's super useful to have a wide range of data when we go to fit our model.

For students who filled up small circles, we suggested teachers say something like:

Big data help us come up with a model, but so do small data. A zero-inch circle is really easy to draw and fill with pennies so don't forget to collect it.

Even with this kind of concise, focused development, one teacher, Mike Bosma, still found our dashboard difficult to use in class:

While the students were working, I was mostly circulating around the classroom helping with technology issues (frozen browsers) and also clarifying what need to be done (my students did not read directions very well). I was hoping to be checking the dashboard as students went so I could help those students who were struggling. The data from the dashboard were helpful more so after the period for me. As I stated above, I was very busy during the period managing the technology and keeping students on track so I was not able to engage with what they were doing most of the time.

So we'd like to hear from you. Have you used the pennies task in class? Have you used the dashboard? What works? What doesn't? What would make a dashboard useful – actually usable – for you?

Featured Comments

Tom Woodward, arguing that these platforms are tougher to customize than the usual paper-and-pencil lesson plan:

The other piece I worry about is the relatively unattainable nature of some of the skills needed for building interesting/useful digital content for most teachers. I really want to provision content for teachers and then be able to give them access to changing/building their own content. While many are happy consuming what’s given, there are people who will want to make it their own or it will spark new ideas. I hate the idea that the next step would be out of reach of most of that subset.

And there’s Eric Scholz looking for exactly that kind of customization:

I would add a “bank” of variables at the top of the page that teachers [choose] from when building their lesson plan the night before. This would allow for a variety of objectives for the lesson.

Bob Lochel, being helpful:

While many adaptive systems propose to help students along the way, they are often mis-interpreted as summative assessments, through their similarities to traditional grading terms and mechanisms.

Tom Woodward, also being helpful:

There could/should be some value to a dashboard that guides formative synchronous action but it’d have to be really low on cognitive demand.

Posted without comment. (Comments tomorrow.)

A study published earlier this year on teacher data dashboards, summarized by Matthew Di Carlo:

Teachers in these meetings were quite candid in expressing their opinions about and experiences with Dashboard. One factor that arose with relative frequency was an expressed concern that the Benchmark tests lacked some validity because they often tested material the teachers had yet to cover in class. A second factor that was supported across the focus group discussions was a perceived lack of instructional time to act on information a teacher might gain from Dashboard data. In particular, teachers expressed frustration with the lack of time to re-teach topics and concepts to students that had been identified on Dashboard as in need of re-teaching. A third concern was a lack of training in how to use Dashboard effectively and efficiently. A fourth common barrier to Dashboard use cited by teachers was a lack of time for Dashboard-related data analysis.

Khan Academy intern Josh Netterfield, in June 2013, on Khan Academy's coach reports:

Currently over 70,000 teachers actively use KA in their classrooms, but few actually use coach reports. Already we’ve seen how the right kind of insights can transform classrooms, but some of the data has historically been quite difficult to navigate.

Stanford's 2011 analysis of Khan Academy [pdf]:

Generally speaking, the student data available on the Khan dashboard was impressive, but it also was challenging at times for the teacher to figure out how best to synthesize and use all the data – a key future needed if teachers are to maximize the potential of blended learning

Screenshots from a video of Khan Academy's recent redesign of their coach reports:


Dead On

Karen Head, on her "First-Year Composition 2.0" MOOC:

Too often we found our pedagogical choices hindered by the course-delivery platform we were required to use, when we felt that the platform should serve the pedagogical requirements. Too many decisions about platform functionality seem to be arbitrary, or made by people who may be excellent programmers but, I suspect, have never been teachers.

Related: What Silicon Valley Gets Wrong About Math Education Again And Again

[via Jonathan Rees]

2013 Sep 18. Karen Head comments:

Just to remind everyone of the context of my statement. We asked that certain parameters in the coding be changed (like the one governing how much we could penalize students for not doing an assignment) and were given the answer that the penalty number was “hard coded” into the program. The tech support person couldn’t understand why it was a big deal to us. To be fair, I couldn’t be made to understand why it was a big deal to change the parameter from a fixed number of 20 to a range of 0-100, but I seem to remember from my basic undergrad programming class that it isn’t a big deal to do this. Of course, in the end, I’m just an English teacher. :-)
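For readers wondering what that change looks like in the abstract, here is a minimal sketch of a penalty that defaults to 20 but accepts any value from 0 to 100. It is an assumption-laden illustration, not the MOOC platform's code: the function name is hypothetical, and treating the penalty as a percentage of the score is my guess. A real platform may have reasons the change is harder than it looks.

```python
DEFAULT_PENALTY_PERCENT = 20  # the fixed value described in the comment above


def apply_penalty(score, penalty_percent=DEFAULT_PENALTY_PERCENT):
    """Reduce score by penalty_percent, which must lie between 0 and 100.

    Treating the penalty as a percentage is an assumption for this sketch.
    """
    if not 0 <= penalty_percent <= 100:
        raise ValueError("penalty_percent must be between 0 and 100")
    return score * (100 - penalty_percent) / 100


default_result = apply_penalty(80)      # uses the old fixed value of 20
custom_result = apply_penalty(80, 50)   # instructor-chosen value
print(default_result, custom_result)
```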
