Distraction: I (heart) Testing

“Test data is the educator’s report card, and we need to be held accountable to it, if only as a small attempt to rectify the imbalance between how poor teaching affects life outcomes of the teacher, versus how it affects life outcomes of the student.”

Dude is so on the money.

About 
I'm Dan and this is my blog. I'm a former high school math teacher and current head of teaching at Desmos. He / him.

19 Comments

  1. If we’re talking about tests within the classroom, right on. If we’re talking about standardized tests, I call B.S.

    When students aren’t held accountable for test performance, why should they take it seriously? And if students don’t take it seriously, how can standardized test data be an indication of the quality of teaching going on in the classroom? Until students have consequences for high and low performance on standardized tests, those results only give, at best, a vague indication of what’s happening in the classroom and say very little about how good or poor a job a teacher is doing. Unless we’re holding teachers accountable for getting their kids to take a test seriously that has absolutely no impact on their lives.

    I’d like to have the kind of test data you’re talking about. I’d like to have that much faith in standardized tests. But I don’t have either of those things.

  2. Man, Todd, I’m gonna drop a post around here right soon that you’ll just hate. I hope you bring this kind of heat then.

    Anyway, for whatever it’s worth, back when merit pay was an issue in California, I was right alongside you. I’d still say the same: don’t tie my pay to the results of a test that’s of no account to the students who take it.

    But this idea, pitched by you and so many others, that since testing costs some cash and consumes some time and doesn’t deliver perfect results, we should dispose of testing entirely, has a really painful ring to me.

    From the linked post: “Not only do we not need less testing, we may in fact need more testing, and we certainly need better testing.”

    I can get behind better testing. And that would incl. your beef that there aren’t any consequences for non-performance.

    Until that’s resolved, though, I’m just going to offer a grade-level increase for any student who scores advanced. Because these results aren’t useless to me.

  3. We are talking about the CSTs on the STAR test, right? You’re going to offer a grade-level increase when? When you get those results in September of the following school year? And how can you say that they aren’t useless to you when they don’t drive your instruction for the current year’s kids? Are you writing about the standardized tests that actually exist or the ones you hope for? Is something different in math, then? ‘Cause I don’t see those results until September of the following school year.

    I don’t pitch the ideas that you’re suggesting I do. I don’t care if testing costs cash and consumes time, but by your own ratio, that time damn well better be properly spent; the cash, too. Do you really think that’s the case currently?

    I advocate for better testing, perhaps more frequent testing once that better test exists, and for the test to mean something to the students. Until those things happen, though, I can’t judge a student’s grade by a single test that he may or may not have taken in all seriousness given the other assignments on his plate. I’m shocked that you can.

    Did you really mean that you’d raise a student’s grade just because of one test taken in April? Would you also lower the grade of students who perform poorly? I’d respect your consistency if you did that.

  4. I get the results in September and then I’ll file a few grade change forms with the registrar. And I hate to disappoint, but I never make this particular incentive punitive. The kid who scores Below Basic keeps her grade, which was most likely low anyway. To score advanced on the STAR means a student understood the concepts how I taught them and was then able to transmute and synthesize them into the form given on the STAR test.

    I feel completely justified in raising a student’s grade for hitting the second-highest marker on Bloom’s taxonomy. I don’t, however, feel like I should decrease a student’s grade for an inability to synthesize. So I don’t.

    Two quick ones:

    1. Let’s say I received my class’ results and found good marks in every strand but Quadratics, where my class absolutely tanked. This happened.  I realized I shortcut some of the harder concepts ’cause, damn, they’re difficult, unsexy, and un-fun to teach.  I stepped my game up the next year and watched scores increase.  So I’m clear, you wouldn’t argue this information is useless, would you?
    2. The CSTs and, especially, the released questions are a great benchmark for me, a great shield for anyone who would call my rigor and depth into question. I refer to them on a weekly basis, at least.

    Not rebuttals to anything in particular you’ve said. Just some backpatting for these assessments, which need a lot of work, but aren’t as inconsequential as you argue.

  5. If you admit that there’s a translation involved in learning a concept in your class and demonstrating it on the test, then that test data is flawed because it necessarily implies that your students successfully synthesized that content. It should simply test the student’s knowledge of the concept, as purely as possible. If they didn’t synthesize correctly, they didn’t do well on the test. But that doesn’t mean that you didn’t teach effectively, right? It just means that the kids couldn’t translate into STAR language. That’s exactly why the test data only means so much. It does mean something, but it ain’t golden.

    The primary reason I argue they are inconsequential is because they don’t mean anything to the students. On the original post you quoted, I mentioned that we failed as a school system to get kids motivated to take these tests, but I don’t feel like it’s a classroom teacher’s job to think of ways to make that test mean something when it doesn’t. Again, kids who fail and kids who ace it, nothing happens to them. You seem to have come up with a way around that, but there could be some problems.

    As for grade changes in September of the following school year, what if that affects their course placement (they fail your class, ace the STAR, so earn a passing grade, but are repeating your course already because your grade change came too late, so now they transfer classes 4 or 5 weeks into the school year)? What about students you don’t see again (they transferred schools)? What about it circumventing the grade as a marker of student ability to perform their job as a student during the regular school year, not just during one testing phase of the second semester? That’s also a part of what your grade means, not just content knowledge. It’s an interesting idea and I might give it a shot with my juniors this year. I don’t particularly have a problem with it, now that I’ve thought about it all day, but I’m not entirely certain it’s legal to change a grade the following school year (you could get around it by selecting “handed in missing work” or something, but that’s a cheat). I’ll check with my administrators on that tomorrow.

    Are you telling me that your school gives you the test results from last year’s students? I only get my current roster’s test results, not the students I had last school year. I have to hunt those down and don’t get those until November or so, and only then after I ask and ask and ask and ask. And if my students are only being tested in April, that doesn’t help too much. That’s a long time to wait in order to find out if what I did was effective. And even worse to find out I was ineffective.

    Was it my teaching that caused the increased results on this year’s test data or did I just get lucky with a group of astute kids this year? Students tested in September AND in April might tell me something. That way, I have baseline data and can see the growth. Measuring progress against last year’s test results doesn’t help either; that’s an apple to orange comparison when the standards change (English/Language Arts has a set of standards for 9/10 and a set for 11/12).

  6. Todd,

    You’ve made the point here and elsewhere that the tests don’t matter for the kids. I think that’s demonstrably false. CST results determine whether students progress to Algebra I in 8th grade, whether they need to repeat this class in 9th or head to Geometry, and so forth. They determine whether ELLs are eligible for reclassification, or not (and as we know, unclassified ELLs enrolled in SEI classes are NOT taking classes recognized as acceptable for ultimate matriculation at UC and CSU schools). At my school, they are a part of determining scheduling in one of 18 strands/ tracks/ readiness placements. Moreover, they affect the perception of quality that a school or district inheres, a perception that can have myriad effects throughout the hierarchy of schools.

    I think it’s an abdication of responsibility to suggest that as teachers, we are not responsible for communicating the importance of these exams. It’s part of the job description to foster excellence in all academic endeavors, even the ones we wish were better.

  7. So you both know, I’m speaking only of student performance in high schools. I don’t know if my ideology about all this holds out at the lower levels. I do not advocate for no testing, but for better and more meaningful testing.

    How can CST test data be part of planning scheduling? Those results aren’t received until August at best, September more typically. The schedule is usually set in place by the beginning of August, mid-August in a bad year. I don’t doubt you, TMAO, I just wonder how your school makes that possible. Do you have massive schedule changes once those test results come in after the school year has begun?

    TMAO, assuming that you’re right (I don’t know that CST results are used for reclassification in ELL; I haven’t seen that happen at my school, though I can’t say for sure), that’s a very small pool of students for whom the CST matters. What about high school students that are not ELLs and not enrolled in Algebra I?

    Communicating the importance of these exams is fine and I tell my students how these tests matter to the school and to me, but there is no importance for students (excepting those students you mention, again a small percentage of the overall population). That’s the point you are not getting. Everything we do to make these tests matter to students is a lie because those results do not impact their journey through high school or on to college.

    Dan, checking with my administrators and having a fairly heated argument over it (during which I took your side, by the way), no one among all department chairs thought it a good idea, several called it unethical, and the administrators believe it’s actually illegal to make grade changes for the reasons you suggest. Grade changes after the year is over for something that was never considered a part of your classroom grade opens doors for lots of parent complaints and doesn’t seem wholly honest. Do you mention this on your green sheet? That might take care of my honesty concern. Do you have any students that take advantage of this generosity and bank on doing well on the CSTs, blowing off your homework and tests in the meantime?

    Arguments that came up against the idea of letting CST results impact grades: class grades are not a demonstration of student mastery of standards on a single test. They are a demonstration of student mastery of standards over time in your class. Grades are an indication of the type of student a person is during the 180 days of the school year, not the 3 days of STAR testing. Grades are not standards based. What of those students who catch wind and blow off your class yet somehow manage to score proficient? That D suddenly becomes a C or that F becomes a D? And to my idea of consistency I suggested last time, if your grade isn’t good enough compared to CST results (the student who scored proficient or advanced should get a better letter grade than the one you assigned), then your grades are also too good in some cases and the student’s grade should be appropriately lowered.

    We are now thinking of gift certificates and iPod giveaways to encourage strong performance on the CST this year. I’ll be among a crew of teachers going from class to class explaining what the CSTs are and what the results mean, showing classes average scores, comparing our scores to other schools in our district, showing individual students where they scored and what that communicates to the world about their skills.

    I don’t mind lying about the significance of these test results in order for students to try harder, but I want to call a spade a spade. It’s a lie because the results don’t mean anything to students unless they have an intrinsic reason to be broken up about doing poorly. If we can start using those test results to impact their lives, my students will be the first to know how they are being used and what they can do to be on the winning side of the bargain.

  8. Todd,

    ELLs are in excess of 25% of all students in California. Within a generation it’ll be… what? 50%? I don’t think that’s insignificant. Here’s your reclassification link, for what it’s worth: http://www.cde.ca.gov/sp/el/rd/

    CST results are available in early August. They tend not to be released until later, but they are there. I usually see my previous year’s results in mid-August, broken down by student. At my site (6-8) admin spend three pretty rough weeks scheduling, using CST, CELDT, local assessment and teacher recommendation to craft a schedule. At your school do you not have differentiated course offerings? If not (at least partly) through CST performance, how do kids classify for these things? The kid in Honors English who rocks a FBB stays there? Likewise the kid in Trig or Calculus AB?

  9. Not every student in that 25-50% population of ELLs is being redesignated any given year. Carve out the percentage that would really be impacted by the CST results in their redesignation process and you have a much smaller chunk. I’ll ask the ELL chair about using CST results at my school.

    Admission to honors classes is by teacher recommendation and an entrance essay. In social science, AP courses are open enrollment; in English, it’s teacher recommendation; I’m not sure the process for science and math, but I believe it’s teacher recommendation and course history.

    How are you seeing that breakdown of test results so quickly? Where is “there” that you are getting those results from? At my school, teachers do not have access to test data until just about September. The system that holds all that is called the Cruncher and we only have access to this year’s students in the Cruncher and we can only access the Cruncher from our classroom. We have no idea how instruction impacted last year’s kids unless those results are hunted down, which I’ve done for the last 2 or 3 years. This is the first year our administration handed out those results to all staff members, but I don’t think we got it until late September, early October. And even then, very little training for making sense out of the data (I know what it means, but lots of other teachers do not).

    Kids classify for course offerings based on prerequisites; a few courses require teacher recommendation. My school offers lots of AP classes and that’s one of our selling points, the number of AP offerings we have. And out of all those AP classes, I’ve been told that we removed 1 student from an AP class who tanked on the CST. That’s the only way CST affects course placement and I believe that student was more of an example than a pattern. Out of 2500 students, 1 kid’s schedule was affected. Students in honors English classes are there because of teacher recommendation and an entrance essay, not CST results. We schedule those classes long before the test results come our way.

    Does that bring anything to light? The CST data doesn’t impact our kids at all. So how does performance on the CST tell us much of anything when there’s no incentive to take the test, no punishment for failure and no reward for success?

  10. My last school implemented something like STAR Bucks, or STAR points, or somesuch. Kids scored a certain number of points for jumping classifications (FBB -> BB) or for scoring Advanced. There were some rallies, some raffles, and individual teachers had currency exchanges for extra credit, bathroom passes, pencils, whatever.

    I don’t have hard data but I’m pretty sure it got a few kids interested in testing who wouldn’t have been. Considering the administrative and financial cost, however, I don’t think the return on investment was all that high.

    My grade inflation scheme fares a little better by that measure since it costs me neither time nor money. I’ve vetted it through my administration, who approved and cautioned me not to announce it until STAR season for the same reason you cite, that kids might sandbag until the STAR and then start praying. Frankly, I haven’t met a kid yet with that kind of risk addiction, but I play along.

    These case studies you bring up, though, they’re such outliers it’s hard for me to imagine them. Like the F student who pulls an Advanced, suckering a D out of the system? I anticipate all my A students to pull down Advanced or Proficient, results which require no action on my part. The most likely event is a couple of B students might score Advanced and, imo, get a deserved grade bump. If you’re attached to any of the other objections you list, most of which are philosophical disputes, I’d have some questions to ask you.

    Listen, though. The motivation for my grade incentive isn’t to reward students for doing well on the STAR. It’s my buy-in. It’s one of several ways I tell them that, hey, these tests matter to me, and I want them to matter to you.

    Nothing did more to improve my testing experience (both in terms of class scores and their accuracy) than when I decided to abandon my resentment and invest myself as much into testing as I had the rest of my practice.

  11. And now I’m on board.

    For the past 2 years, from the beginning of the year, I’ve been telling administrators that we need to start talking to the kids about STAR now; it’s never too early; let’s discuss the scores with parents at back-to-school-night; let’s mention that if we meet our API, we’ll revoke the no-hat policy, but let’s mention that now so students are thinking about it all year. It’s January and we’re just now getting started in pimping this test.

    We’re looking into gift certificates and an iPod giveaway, but you’re right about the ROI on that one. Plus, I don’t know how long we’ll be able to keep that up nor how long it will actually interest the kids. But if that’s all it takes to see a big leap in scores, what’s that say about the test? The system that demanded we give these tests needs to come up with a way to hold the kids accountable for doing well (or you may choose to call that giving kids an incentive for strong performance; six of one, half a dozen of another). That message needs to be loud and clear.

    In the meantime, however, school sites need to get creative and figure out ways to let students know that it’s an important test for us and get those carrots out in front of the students. But then that means these tests are more a measure of how well we motivate our kids, not how well we teach them standards. And you (heart) these tests? To me, these are just another flaw in a horribly flawed system. Yeah, the tests are here and we have to make the best of them, but let’s not gloss over the fact that these tests don’t quite test what they should. If motivation equals better performance, the tests don’t measure skill; they measure motivation.

  12. Todd says, “But then that means these tests are more a measure of how well we motivate our kids, not how well we teach them standards.”

    Yeah, I can’t get down with you on that one at all. Sounds good, I suppose, but it jibes only with how my cynicism wants the world to be, not how it actually is.

    I put in ten minutes a week to show my kids some interesting multiple-choice problems. I talk positively about the CSTs. I suppress the cynicism in front of my kids. I know the days of kids bubbling doodles or profanity or their names into their scantrons are over for me.

    I agree that motivation is unfortunately factored into testing, but a majority measurement? That all depends on the teacher, I guess.

  13. The words you quoted from me represent exactly the way the world is as far as I can tell. At my school, the CSTs are used for nothing (apart from, perhaps, ELL redesignation). Without the school using the CSTs for something, they don’t matter to the students. Try assigning 10 pages of homework, but make sure your students know it won’t go down in the grade book and you won’t discuss it and their score on that assignment won’t be known until after you’ve filed the final grades for the school year. In fact, you won’t even look at the assignment until August and students can just drop it into a box you have waiting for them. What percentage of your students would even attempt the assignment, let alone spend the time needed to do a good job on it? If you collected that assignment later, do you really think it would give a good indication of how strong your teaching is and how skilled your students are? I don’t think it would. That’s the CSTs at my school.

  14. Your choice of hypothetical here leads me to believe we aren’t going to meet anywhere near halfway on this one, Todd. The CSTs aren’t a 10-page homework assignment. They’re an in-class essay at worst, since, like you say, there aren’t many consequences for failure and they don’t have to take it home.

    More realistically, they’re an in-class coloring assignment because you can do as half-assed a job as you want and still stay invisible. It’s a situation that’s eager for a small amount of give-a-damn to bring every student on board. We’re on diff. planets on that assertion, though.

  15. What’s the assertion that we’re on different planets about? I don’t follow.

    You’ve assessed my hypothetical perfectly. Yet you still believe that this coloring assignment can lead to reliable data about attainment of standards? You really think that all it takes is a teacher pep talk to get those kids to color seriously? Maybe that’s why we are so far off from each other on this. It takes more than words, to quote that thankfully long-lost 90s hair band.

    I believe that if students don’t have any tangible reason to do well on something, they won’t (except a small handful). I see evidence of that every day. So do you. Look at your grade book and see the amount of effort put in to work that *does* have meaning. Or maybe 100% of your students faithfully attempt every single assignment. I envy you if that’s the case. Do you honestly think the amount of effort is going to increase if we take that meaning away, as is the case for the CSTs at my school?

  16. Hey Dan (and Todd-san, of course…),

    Todd and I have had these discussions before, and many others besides. Dan, I have one question for you:

    I teach Japanese. Tell me in your own words how the CST’s reflect ANYTHING about the quality of my classroom instruction.

    Matt

  17. I’m obviously missing something here.

    A: Nothing. There aren’t any content standards for Japanese.

    (whispered to seatmate: Is the answer “Nothing”?)

  18. Robert the "Nerd"

    January 18, 2007 - 11:58 pm

    Better Late Than NEVER????

    This discussion has centered around an assumption that is, in my view, flawed—the view that kids require sufficient external motivation to perform well on standardized tests. In other words, the tests have to “matter” to the kids, whatever that means. I disagree.

    For the past several years I have worked with very low performing students, who each year show marked improvements on their standardized tests (CST’s for Algebra 1). My perspective is formed by many hours of reading Alfie Kohn (Punished by Rewards, and No Contest) and I find the following to be true:

    One. Nothing is worse than being bored. Something to do is better than nothing, and the CST tests are something.

    Two. Kids will do what they feel they are prepared for and they feel they are able to do well. For example, kids like doing math and math stops sucking when it starts making sense. They don’t need to know how it will benefit them, and they don’t need to be bribed. They just need to feel good about their prospects for success and they will strive to succeed because it feels good. Likewise, tests are cool when they stop being a two-hour beatdown for kids, and the kids feel they might do well. It is our job to nurture their confidence and they will perform.

    I know this seems a bit Pollyanna, but in my experience I have found it to be true.

    Also, connecting to comments seen in other parts of this blog I just want to say that merit pay is the worst idea EVER, even though it seems unfair that some of us bust our asses while others sit on them. Merit pay will cause a wholesale flight from inner city/poor schools because transience is a huge problem (40% turn-over in a year) and poor kids present more challenges educationally than their wealthier counterparts—challenges that lie outside the sphere of control for the teacher yet adversely affect student performance.

    Eagerly awaiting a retort.