Fresh off our success decoding airline flight tables, I promised my students we were going to crack the secrets of grocery stores wide open, but to do so we’d need a lot of data. If we each sampled three data points, that’d suffice.

I gave them the weekend. I gave them a week’s worth of homework credit. I let them work with a partner.

33% of the class submitted data. That’s a pity. Even worse is the difference between the data I personally gathered last September (blue diamonds) and the noise my students submitted (pink squares), some of which was almost laughably fabricated. (As in, I *laughed* when I saw it.)

For perspective, I have one really exceptional outlier out of the thirty-six transactions I recorded: the person who took six minutes to purchase twenty items. It was a disastrous exchange featuring a price check and a ripped register tape. It was so bad that people happily fled *that* customer’s line for longer ones.

Out of my students’ thirty-two data points, they observed *six* transactions that were *even more* abnormal than that one, including one incredible checker who managed to ring up one hundred items in just eighty-one seconds.

Exactly one hundred items, right?
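To put those two transactions on the same scale, here is a minimal sketch that converts each to seconds per item. It uses only the two figures quoted above; the labels and the `seconds_per_item` helper are names made up for illustration.

```python
# Figures quoted in the post; the dictionary labels are illustrative only.
transactions = {
    "my outlier": (20, 6 * 60),    # 20 items in six minutes
    "student report": (100, 81),   # 100 items in eighty-one seconds
}


def seconds_per_item(items, seconds):
    """Average checkout time spent on each item."""
    return seconds / items


for label, (items, seconds) in transactions.items():
    print(f"{label}: {seconds_per_item(items, seconds):.2f} s/item")
```

By this measure my disaster of a transaction averaged 18 seconds per item, while the reported checker averaged under one second per item.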

## 20 Comments

## Rob

March 26, 2010 - 3:39 pm

On the bright side, this seems like a perfect chance for a stats class to examine the evidence for whether Billy is faking his homework.

## josh g.

March 26, 2010 - 4:32 pm

On that topic, here’s a fun source:

“The Devil Is in the Digits: Evidence That Iran’s Election Was Rigged”

http://www.washingtonpost.com/wp-dyn/content/article/2009/06/20/AR2009062000004.html

## Rath

March 26, 2010 - 4:48 pm

When my students do something like this… I can only assume that it’s because they assume everyone else is of the same intelligence or lower.

After all, they already know everything and can’t be taught anything more; especially from a MaTh TeAcHeR.

## Dave

March 26, 2010 - 6:09 pm

I always find it really frustrating when I’ve spent time coming up with something interesting (rather than just textbook work) and my students are so apathetic. Eventually the frustration passes and we move on to something else that can be interesting.

I know that part of your assignment was to have students collect the data, but if you’re still interested in the data, we could all collect some and share the results.

I’m glad that I’m not the only one who thinks about this every time I go to the grocery store.

Keep up the great work.

## Dan Meyer

March 26, 2010 - 7:02 pm

Yeah. They got a kick out of that.

## Andreas

March 27, 2010 - 4:44 am

While there is a good chance the 100 items in 81 seconds is not a real observation, it made me think: what if somebody bought 100 candy bars, or folders, or other items sometimes sold in large quantities? Obviously the cashier would not scan each one individually.

## Michael Paul Goldenberg

March 27, 2010 - 4:48 am

Of course, such behavior isn’t restricted to math class: when I taught lit & comp at U of Florida in the ’70s, the very first set of papers I received on the first literature assignment I gave (choice of DAY OF THE LOCUST or MISS LONELYHEARTS, both by the brilliant Nathanael West) contained three glaring cases of plagiarism out of the first five I looked at (a bit of a coincidence only in that two were from brothers, the third from a pal of theirs, and I assume they sat near one another so the papers were collected or turned in close to the same time. That they were at the top suggests they were also turned in last!)

This was all pre-‘Net, and there wasn’t a lot of critical literature published on West at the time, so it took me less than 30 minutes at the library to track down the sources from which these nit-wits had stolen. Did they really think I wouldn’t notice the rather sophisticated writing and content from college freshmen and sophomores who weren’t lit majors? I guess they did.

Never considered giving less interesting authors and books, of course. And neither will you stop giving good assignments. Sometimes you just have to kill ’em with kindness. ;)

## Marc Brown

March 27, 2010 - 7:12 am

I have this problem every week! I am a 3-5 grade teacher of all subjects. It is so deflating when I tweak a lesson from WCYDWT or an article I read, present this great assignment and watch them complain. I feel it’s not their fault though; it’s the fact that they have never been asked to think and don’t understand what it looks like. My coach says, “Is this interesting to you or them?”

Keep eating that elephant!!

## Mr. K.

March 27, 2010 - 9:19 am

I’m thinking there’s a WCYDWT aspect.

Collect the data. In fact, give everyone who didn’t do it a second chance to turn in something tomorrow.

Then, the next day, have them simulate coin flips. Give them a worksheet with two rows of, say, 50 boxes. For the first row, they’re supposed to enter a random series of heads and tails, that they get to make up out of their head. In the second row, they actually need to flip a coin to get the values.

Then, on the same worksheet, they make histograms for runs of heads/tails (i.e., 3 heads in a row is a run of 3, 5 tails in a row is a run of 5).
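For anyone who wants to check the worksheets by machine, the run histogram can be sketched in a few lines of Python. This is only an illustration: `run_lengths` and `run_histogram` are hypothetical helper names, and the 50 flips are simulated with an arbitrary seed as a stand-in for a student's real row.

```python
import random
from collections import Counter
from itertools import groupby


def run_lengths(flips):
    """Length of each run of identical consecutive outcomes, in order."""
    return [len(list(group)) for _, group in groupby(flips)]


def run_histogram(flips):
    """Map each run length to the number of runs of that length."""
    return Counter(run_lengths(flips))


# Simulated stand-in for the row of 50 real flips; the seed is arbitrary.
rng = random.Random(0)
flipped = [rng.choice("HT") for _ in range(50)]

hist = run_histogram(flipped)
for length in sorted(hist):
    print(f"run of {length}: {'#' * hist[length]}")
```

Running the same function on the made-up row and the flipped row gives two histograms that can be compared side by side.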

Let them Gallery walk the results.

Then give them the same chart above, except with each student’s/team’s data split out. Instead of having them do the regression you originally wanted, have them try to determine the set of rules for determining if people faked data rather than really collecting it. With luck, you’ll end up in the same place, but with a far cooler CSI type of motivation behind it.

## Mr. K.

March 27, 2010 - 9:53 am

Addendum: coin flip reference.

## Dan Meyer

March 27, 2010 - 9:55 am

“Who cheated?” is a really great WCYDWT question, leaning as it does on a student’s intuitive sense of right and wrong more than a student’s concrete knowledge of mathematics, which we build towards. “Develop rules for deciding who cheated” is a really excellent outcome of a WCYDWT activity.

Strong stuff.

## Derrick

March 27, 2010 - 7:50 pm

Dan,

Just started reading your blog a few weeks ago. I teach writing at a community college, but I still get inspired by your math. I’m curious about how you handled this situation in class. I encounter laziness/plagiarism all the time, and when it is a class-wide situation like this I like to make fun of my students (in an educational way). If this was my class I would project that chart up for everyone to see and mock the obvious cheaters for a good five minutes. Of course I have the luxury of teaching legal adults whose parents can’t come complaining.

## Dan Meyer

March 28, 2010 - 7:57 am

In this particular instance, I threw out all their data and we ran the regression with mine. I calculated the value of the linear activity to be greater than the value of whatever lesson in morality I could come up with.

I may try to have it both ways, though, and discuss the above graph in class this week.

## Steven Kimmi

March 28, 2010 - 6:25 pm

I’m wondering if the few who turned in data really spent any time spying on checkout lines. I’ve got to think doing this assignment would bring about a whole bunch of odd feelings in the students, the customers, and the cashiers. I’m wondering if you had a couple of groups: ones who said, “There’s no way I’m doing that!” and ones who said, “Okay, meet me there… (at the store) this makes me feel kind of creepy,” and the friend replies, “Yeah, everyone’s looking at us, let’s go home.”

While I love the lesson and the assignment, I’m wondering if the situation has too many stressors in it?

## Jane

March 29, 2010 - 7:48 am

Yes, the data is most likely fabricated. But, one of the frustrating things about dealing with real data is that sometimes there are outliers and you have to figure out what to do about them.

What to do about them is NOT a subject for anyone who has had less than two or three college level statistics or econometrics classes.

If you are going to teach linear regression… I’d suggest getting a well-behaved data set and reusing it year after year. That way, you can focus on the lesson and you don’t have to deal with the data anomalies.

FWIW, 32 data points is barely enough for a central limit theorem to kick in, assuming independence and normality. If the data points are not collected from different checkers, the independence assumption may not hold.

Yes, I have a Ph.D. with a field in econometrics.

## Ben Wildeboer

March 29, 2010 - 11:53 am

“Math gives your intuition a certain vocabulary” – Dan Meyer

The poorly faked data points to the fact that students don’t have that vocabulary. Students may not be as intimately familiar with the check-out process as us weekly grocery shoppers, but they know the drill. They should know the drill well enough to be able to “mime” out how long checking out 10 or 100 items takes (à la Dan the Record Breaker). Perhaps the faked data isn’t so useful as an opening to outliers or linear regression; perhaps it’s more valuable as an authentic assessment of students’ mathematical reasoning ability.

Personally, I disagree with Jane and think you *should* tackle the topic of which are outliers. While statistically speaking it’s a tricky and complicated business, there is value in looking at data and being able to determine where something went wrong.

I’d also bash ’em for faking data, period. Not cool.

## Kevin Young

March 29, 2010 - 3:28 pm

I really like the idea of the students gathering some data in the field, but apparently turning them loose was not sufficient. Maybe the problem could be solved by doing some in-class data collection training: some students act as shoppers, others act as cashiers that “scan” several classroom objects, the remaining students collect data in groups of two. Results are compared for consistency and accuracy and you discuss challenges of good data collection. However, it may need to go beyond training–you may need to make arrangements with a grocery store and turn it into a field trip of sorts. Another option would be to ask a store if you could set up a video camera or two and have the students collect the data from the video footage–that should prove the most accurate.

## Dan Meyer

March 29, 2010 - 3:39 pm

Certainly. So we talked for several minutes about ways to mitigate that weirdness, all the way from making sure you introduce yourself to the cashier to covertly recording the transaction with your cell phone’s video camera and analyzing the data later.

Still, I won’t say I was surprised by the low submission count.

I like Ben’s note that students should be able to “mime” out the approximate times and I like how Kevin takes it further by having the students set up roles as checker and shopper and gather data on themselves. I imagine there’s a large degree of uncertainty here, where students who know they’re being timed will work faster, but it couldn’t have hurt to model exactly the kind of data collection I was assigning more explicitly.

## Meghan

March 30, 2010 - 8:55 am

Maybe it’s because I work with college students, but I found that allowing the students to design and determine the experiment provided a lot more intrinsic incentive to gather good data. On the down side: I now know the average number of piercings the typical college student has…

## John Gonder

April 19, 2010 - 10:45 am

Great discussion – although I’ve been behind the one with 12 items taking 600 seconds – 300 to find the glasses, and 300 to make out the check. Obviously the answer is documentation – 3 iPhone pics, time start, basket, time end – and a discussion of the peer-reviewed journal submission process.

Keep up the good work –