
Peter Berger: The fog of standards-based grading

Editor’s note: This commentary is by Peter Berger, an English teacher at Weathersfield School, who writes “Poor Elijah’s Almanack.” The column appears in several publications, including the Times Argus, the Rutland Herald and the Stowe Reporter.

Testing contractors always insist that their newest next-generation assessments represent a significant improvement over their previous newest assessments, which is why we should give them even more public school money than we’ve been giving them.

In the same way, experts who promote rubric-based scoring like to pretend that their latest benchmarks, descriptors and grids have overcome the inherent subjectivity and imprecision that have chronically plagued their earlier rubrics and the pseudo-objective numbers they spit out.

The problem is that rubrics still require scorers to discern whether a student’s writing has a “general purpose” or an “evident purpose.” Is the focus “maintained throughout,” or is it “strong”? Is it “intentionally organized” or “well-organized and coherent”?

You choose, and then tell me your score is data-worthy.

New England’s NECAP science assessment requires scorers to decide whether a student’s answer reflects a “thorough,” “general,” “limited” or “minimal” understanding. A “general” answer includes an entirely unspecified number of “errors and omissions,” while a “limited” response includes “several errors and omissions.” That’s one full level on a four-level scale, a 25 percent scoring variation between “I don’t know how many” and “several.”

Can you count to “several”? I can’t.

New York’s rubrics split hairs between answers that “develop ideas clearly and fully” and those that “develop ideas clearly and consistently.” Try distinguishing between “precise and engaging” language and “fluent and original” language, “partial control” and “emerging control” of punctuation, or errors that “hinder comprehension” and those that “make comprehension difficult.”

Does the answer “establish a connection” or “an integral connection”? Does it employ “appropriate sentence patterns” or “effective sentence structure”? Are the details “in depth” or “elaborated”?

The SBAC Common Core assessment expects scorers to differentiate between “uneven, cursory support/evidence” and “minimal support/evidence.” Is the vocabulary “clearly appropriate” or “generally appropriate”? Are sources used “adequately” or “effectively”? Are there “frequent errors” that “may obscure the meaning” or “frequent and severe” errors that “often obscure” the meaning?

I’m glad we straightened all that out with statistical precision.

Assessment officials periodically double-check these rubric coin-flips, but even then a score is considered reliable as long as another rater gives the same sample an “adjacent” score. In other words, if I score a paper a 3.0, I’m right as long as a double-checker scores it either 4.0 or 2.0.

That’s the equivalent of saying that my hunch that I’m in Chicago is correct as long as the guy standing next to me thinks we’re either in Albany or Omaha.

Inexplicably buoyed by standardized testing rubrics’ perennial failures, experts have unleashed rubrics on classrooms in the form of proficiencies and standards-based grading. Boosters argue that traditional letter grades and percentage numbers are meaningless. “What’s an 85?” they sniff.

That it means a student got 85 percent of the material right seems to escape them, as does the related understanding that 85 percent correct constitutes “good” work and thereby earns a B. They’re just as quick to scoff at A’s, C’s, D’s and F’s that outstanding, satisfactory, poor and failing work have customarily earned.

They prefer their 4-3-2-1 scale where work either “exceeds expectations,” “meets expectations,” is “making progress toward expectations,” or “does not meet expectations.” While they’re busy railing against letter grades’ purported imprecision, advocates fail to note that work which earns a 2 because it’s “making progress toward expectations” by definition also “does not meet” those expectations, and therefore should simultaneously earn a 1.

Along with its ambiguous scale, standards-based grading dissects a piece of work according to dozens of separate standards. That analytic approach to scoring has consistently proven less reliable than holistically assessing a piece’s overall quality and content. While standards-based grading claims to promote consensus, the more you subdivide assessment into distinct criteria, the less likely teachers are to award the same grade to the same work.

Analytic standards also ignore the reality that older students’ classes and assignments deal less with teaching and assessing discrete skills, and more with work that incorporates all their skills and knowledge.

The standards themselves are commonly borrowed from state and national blueprints like the Common Core. While some are commonsensical, like “spell correctly,” I doubt many parents are panting to discover how well their seventh-grader can “verify the preliminary determination of the meaning of a word or phrase,” or “select from simple/compound/complex sentences.”

I’m unsure how to objectively measure a student’s ability to “acknowledge information expressed by others.” Does his acknowledgment exceed the standard, meet the standard, or just demonstrate progress toward meeting the standard, which again should also mean it doesn’t meet the standard?

Social studies standards are similarly perplexing and dwell more on whether students can “solicit and respond to feedback” and “evaluate the credibility of differing accounts” than on how much they actually know about history. This ill-advised bias reflects reformers’ decades-long disdain for content knowledge, decades marked by a catastrophic decline in how much students know.

Standards are year-end objectives. It’s clearly meaningless to say that a first-grader is operating below expectations because he isn’t performing like an eighth-grader. It’s equally meaningless to score an eighth-grader’s writing substandard in October because it isn’t as good as it ought to be by June.

Standards-based systems commonly count only the three best or most recent assessments in each standard. This means a student who’s assigned 10 stories is assessed only on his scores on the last three. A student who can identify the Civil War’s causes and effects meets the overall “important events” history standard, even if he learned nothing when his class studied the Revolution. That hardly qualifies as a comprehensive assessment of reading achievement, or an accurate gauge of how much history he knows.

It’s certainly not an improvement over percentages and letter grades that reflect the entire body of a student’s work.

Compounding the absurdity, and the irony of a system that boasts “standards” in its name, many standards-based systems make 1 the lowest grade you can earn, even if you don’t hand anything in. Zeros are illegal. My school’s standards-based grading program won’t even let me type a zero.

I don’t claim to be perfectly scientific when I’m grading an essay. But the letters and numbers I used to give, despite their limitations, were more informative and translatable for parents and more reflective of my students’ work than the standards-based grades I’m required to give today.

Standards-based grading doesn’t foster learning. Its fog of details obscures achievement. It distracts us from our real problems.

Changing the way you measure a short board can’t make it longer. No assessment system can make our students achieve more.

Obscuring what they’re achieving certainly doesn’t help.
