Posted at 5:00 AM ET, 01/06/2011

A principal on standardized vs. teacher-written tests

By Valerie Strauss

My guest is George Wood, principal of Federal Hocking High School in Stewart, Ohio, and executive director of the non-profit Forum for Education and Democracy, a collaboration of educators from around the country.

By George Wood
In a recent Washington Post piece, Education Secretary Arne Duncan repeated a refrain we have heard far too often about the subject of testing.

In a nutshell, Duncan admits that No Child Left Behind and its reliance on standardized, fill-in-the-bubble, multiple guess tests both dumb down and narrow instruction. Surprise, surprise.

The secretary goes on to promise that new consortiums working on assessments will produce “a new test” he claims will “measure what children know across the full range of college and career-ready standards, and measures other skills, such as critical-thinking ability.”

Allow me to express a bit of doubt. For starters, I hope he doesn’t define “new test” as “one test” because that will never accomplish what he claims he wants: an assessment that measures a broad spectrum of student abilities. Further, unless these new tests are uncoupled from the high stakes they currently invoke, such as punishments for schools and teachers, they will be just another standardized, easily scored exam that tells us little about what is really going on in our classrooms.

I am thinking about all of this today because it is the first of two days of our semester performance assessments at our school. At the end of each semester, teachers engage students in extensive, half-day performances of what they have learned. The idea is to have them show what they know through performance rather than filling in bubbles or choosing the right answer from a list of choices.

Here are several examples:

We give our American Government students two opinion pieces that take opposing viewpoints on the recently passed health-care legislation. They are first asked to read the pieces using a literacy strategy called text-marking, and they are assessed on how well they have read. Then, using resources they accessed in class and information they glean from work in the media center, they are asked to draft their own position paper and submit it as an opinion piece for the local newspaper. Finally, they will engage in a Socratic Seminar discussing both the pieces they read and their own writing. A team of teachers using the rubrics students have used all semester long will evaluate each piece of the work.

Chemistry students begin class with a paper-and-pencil activity in which they balance chemical equations and solve for missing quantities. They then move to the lab where they will find a station with the requisite materials to conduct an experiment. The task: complete the experiment and produce a full lab report that includes the hypothesis tested, the results gained, and implications for further research.

And in physical education our students spend the first part of the period using a web site to determine their metabolic levels and basic data. Then they select a candy bar from their teacher’s desk. Now, the kicker: they are asked to design and then perform a 60-minute exercise program based on what they have learned this semester that will work off those calories!

I could go on, but you get the drift.

We are not perfect at this work. That’s why we take time after the assessment days are over to review how the assessments went and, most important, what we learned about our students’ abilities. What we find impacts our teaching next time around.

Compare this kind of assessment to our students’ experience with the Ohio Graduation Test, which is used for both state and federal accountability reporting. Each test -- reading, math, writing, science and social studies -- takes the same amount of time. But rather than ask students to do something with what they know, these tests ask them to regurgitate what they have heard. In fairness, there is some writing on the tests, but most of it still involves selecting the right answer from a list of givens.

Add to this shallowness the fact that our faculty never gets to see full student results or the writing samples and how they were graded. It does little to inform teaching except to let teachers know they should spend more time on The Boxer Rebellion, photosynthesis, or two-step equations, or similar inferences based on aggregate scores of the entire class.

Duncan applauds No Child Left Behind for disaggregating data—but bad data disaggregated is still bad data.

And let’s be honest. All this so-called “data-driven decision-making” talk should really be called what it is: test-driven decision making. Ohio’s school report cards consist of 26 “data” points, and 24 of them (92%) are test scores.

By the end of this week we will have mountains of information on our students, their achievement, and our teaching. All schools could have the same information. The New York Performance Assessment Consortium has demonstrated time and time again that performance assessments, teacher-designed and evaluated, have led to higher rates of student success both in and after school.

I wish I could believe that the new Congress, in some yet to be found bi-partisan spirit, would end the reliance on standardized tests, much like the higher-achieving nations we point to with admiration. But when Duncan continues to talk about one standardized, high-stakes evaluation, I have more than a few doubts.

In the meantime, our school will continue to use what we learn about our kids through performance assessments to improve instruction and prepare them for the years after high school -- years where what you can do will count for a lot more than what you can memorize.


By Valerie Strauss  | January 6, 2011; 5:00 AM ET


I like the thinking here and the life skills learned. We cannot totally disregard bubble testing until we get SATs and other tests away from standard scoring.

Testing for students is like learning for students...they are not all the same. Having said that, I do like the application part. Years back I offered the idea for a small rural high school to research graveyards in that county. As morbid as it might sound, using those family names, research for the information surrounding the family, size of the graveyard, location, surrounding environment, and you get the photo. It just seemed like a great way to tie state history, math, writing skills, grammar, and sciences together. I could only hear crickets chirp...

Posted by: jbeeler | January 6, 2011 7:50 AM

Well-intentioned ideas - all of them.

I do have some reservations.

In the government class how can anyone be sure who did the work? Was it all done in school, under teacher supervision, or taken home to be done by someone else?

In the chem class, was the work done individually or in cooperative groups? If done in groups, how do teachers then evaluate the individuals from those groups?

I like the phys ed assignment, especially getting the candy bar from the teacher's desk. As an alternative, how about having kids figure out their weight, body mass index, metabolic rates, etc., at the beginning of the semester, allowing them to develop their own exercise programs to address these data, and then assessing them at the end of the semester/year by comparing the two sets of information?

Ohio could well be onto something with this approach, but in reality there must be reliability/validity and, above all, someone has to consider the costs of correcting/determining the results of these assessments.

The government and chem assessments cannot be slid through a machine and scored instantly. Someone will have to correct the government essays and someone will also need to examine the chem lab results. Both could prove time-consuming and considerably more expensive than the existing state assessments.

Good luck with these as I believe most anything would be an improvement over our ubiquitous reliance on the bubble-jobies.

Posted by: phoss1 | January 6, 2011 8:43 AM

Dr. Wood,
You're brilliant.
Are you going to be speaking at the Save Our Schools rally in Washington DC at the end of July?

Posted by: tutucker | January 6, 2011 12:14 PM

As always, I appreciate the comments on my thoughts; they help me get better at my work. Of course, the posts are limited to fewer words than it takes to explain all the ideas. But it might help phoss1 to know that all the work on these assessments is done during class time. Sometimes students do a reading before class, but the actual work is done in front of the teachers.

Of course the reliability/validity question is always out there. That is why the New York Performance Assessment Consortium moderates all their scoring with other schools.

Further, those 'gold standard' assessments for AP courses are all read and scored by teachers, who share their scoring with other teachers. So it can be done.

By the way, the scoring of the writing samples on the standardized tests in Ohio is done by poorly paid high school graduates with a template. Only a few of them are reviewed and the teachers, students, and parents never see how they were scored so they cannot protest or challenge. That surely is not being accountable.

Posted by: DocWood | January 6, 2011 12:46 PM

When Duncan was the CEO of the Chicago Public Schools, he used to have students with cognitive disabilities take a modified standardized test. The problem with the test was that it was written at about the level of a high school freshman, and most of these students couldn't read at a 2nd grade level.

In addition, he threatened to close a school with a high absentee rate (Vaughn Technical). The problem with that was that many of the students there were handicapped and had medical problems (and cognitive disabilities). It's normal for these types of students to be ill more than the normal students.

Posted by: educationlover54 | January 6, 2011 12:51 PM

For those of you who don't know: cognitively disabled is the new term for mental retardation.

Posted by: educationlover54 | January 6, 2011 12:53 PM

I don't get it. When Duncan was the CEO of CPS everyone considered him nuts, including parents. His name was always mentioned in a derogatory way such as "I have no respect at all for Arne Duncan." Why didn't Obama know about this?

Posted by: educationlover54 | January 6, 2011 12:58 PM

Thanks for the great article DocWood.

Posted by: educationlover54 | January 6, 2011 1:03 PM

Great assessments. Now take the next step and do a correlational study with students' scores on the standardized test. My guess is you will find a high positive correlation, which doesn't invalidate anything you are doing, but should give pause to all the critics of the cheaper and more practical use of standardized tests for large populations.

Posted by: patrickmattimore1 | January 6, 2011 6:58 PM

"My guess is you will find a high positive correlation, which doesn't invalidate anything you are doing, but should give pause to all the critics of the cheaper and more practical use of standardized tests for large populations. "


The SAT, for example, is highly reliable in predicting the need for college remediation, a high score on an AP test is reliable in predicting college success--and even the lowly California state test is good at predicting college success.

All of them are more reliable than grades (based on tests created by teachers) at predicting academic knowledge and ability.

"Filling out bubbles" is no more demeaning than tests, which after all, are "answering one size fits all questions on demand."

Posted by: Cal_Lanier | January 6, 2011 9:19 PM

Again, I appreciate and read the comments. But I must disagree with Mr. Mattimore and Lanier. First, there is no correlation between the performance assessment scores and the standardized test scores. That is because they measure different things--the difference between doing and learning on one hand and memorizing on the other.

As for the predictive value of the SAT, wrong again. Actually, study after study, the most recent being this, point out that high school grades ARE THE SINGLE BEST PREDICTOR OF COLLEGE SUCCESS. The mythology of standardized testing goes on and on and people keep saying stuff like "the SAT predicts college success" without any evidence.

Please, do what we ask our kids to do, provide evidence!!!

G Wood

Posted by: DocWood | January 7, 2011 5:49 PM

I appreciate your response. However, all I'm really asking is that you take the time to do a correlational study between what you are finding with your excellent tests and the standardized ones. The fact that you believe that the two tests are measuring different abilities is interesting, but I suspect that the correlations will be very high. Just like providing metal and wood baseball bats to a group and asking them to hit pitches under the two conditions, my guess is that it won't make much difference in terms of variance from the mean with few exceptions.
If you have already got some of that correlational research and it suggests low positive correlations or no correlations then that would be interesting to know.
SAT Subject tests and the SAT Reasoning test are also testing very different things yet the correlations between the tests are remarkably high. That doesn't necessarily suggest which admission test (or tests) a college should require, if any, but it is a valuable piece of information.

Posted by: patrickmattimore1 | January 8, 2011 4:40 AM

Talk about rolling the clock back a century--I teach a foreign language. A century or two ago students learned grammar and reading, which are easy to assess with a multiple choice test. In the 21st century we focus on production--speaking and writing--which must be tested using a rubric, listening to students speak on a recorder and reading what they've written. These are very labor intensive to score and analyze. Even the National Spanish Exam only tests grammar, vocabulary, reading, and listening comprehension--speaking and writing are too labor intensive to grade.

Arne Duncan, oddly inspired by Peter the Great, is dragging us kicking and screaming back into the 19th century.

Posted by: pattipeg1 | January 8, 2011 9:31 AM

"point out that high school grades ARE THE SINGLE BEST PREDICTOR OF COLLEGE SUCCESS"

What, you think shouting about it changes anything? "College success" has no definition, so they decided to make it "first quarter grades".

But a student's first quarter *classes* are entirely determined by whether or not they are deemed remedial, and the SAT/ACT are used nearly everywhere to determine remediation status. Grades are, of course, useless--if grades were honest, we wouldn't have kids in remediation to begin with.

Evidence: Every public university system in the country uses SAT/ACT scores to determine remedial status. This means that SAT/ACT scores are a one for one correlation with remediation.

I suspect you have no idea what I am talking about, which means you need to up your game before you play.

Patrick is saying something different, btw. But I agree with his point as well.

Posted by: Cal_Lanier | January 9, 2011 12:51 AM

© 2011 The Washington Post Company