National CrossTalk Fall 1999
Can a Thermometer Cure a Fever?
The role of testing in educational reform

By Rebecca Zwick

IS STANDARDIZED TESTING a gateway or a gatekeeper, a road to equal opportunity or a means of maintaining white male privilege? American public opinion always has been sharply divided. But you’d never know it from the last presidential campaign, in which the candidates vied for the position of most enthusiastic test booster.

In one of the debates, George W. Bush scolded Al Gore for allegedly favoring only “voluntary” testing of America’s students. “You can’t have voluntary testing,” Bush insisted. “You must have mandatory testing. You must say that if you receive money, you must show us whether or not children are learning to read and write and add and subtract…Testing is the cornerstone of reform.”

Now, President Bush has a blueprint for education reform based on this principle. Called “No Child Left Behind,” the plan, which forms the basis for bills pending in the Senate and the House, says that “schools must have clear measurable goals focused on basic skills and essential knowledge. Requiring annual state assessments in math and reading in grades 3–8 will ensure that the goals are being met for every child, every year.”

But of course testing alone ensures no such thing. Because tests are very visible and can be put into place quickly, they often are instituted as the first step in educational reform, before changes in curriculum standards and instruction are put into place. And testing can divert resources that could otherwise be used to implement these crucial changes.

Many states have made testing a centerpiece of their education programs in recent years, only to find that improvements in student learning did not obediently follow the implementation of the new assessments. Alaska, Arizona, Illinois and Massachusetts are among the states reporting failure rates of 50 percent or more on some components of their statewide exams.

Some schools have resorted to extraordinary means in order to demonstrate score increases, including an Oregon elementary school that was acclaimed as the state’s most improved school earlier this year. The school was found to have tested only 55 percent of its third graders with the standard state reading exam, mainly as a result of exempting students with limited English proficiency. Although the school was evidently playing by the rules, its participation rate was far lower than the state average of 90 percent.

Outright cheating by school personnel on standardized tests has been reported in at least a dozen states in recent years. The massive test cheating scandal in New York City, which allegedly involved more than 50 educators, is still in the news two years after it was brought to light.

Testing proponents argue that despite its rocky start, the standards-based reform movement, which emphasizes accountability through testing, will ultimately boost student achievement. The new approach, they claim, just needs some time to work. In the meantime, what could be bad about monitoring student learning? One reply comes from an unlikely test critic—Greg Anrig, who was the third president of Educational Testing Service. Anrig used to say that testing grade school kids on a frequent basis is like repeatedly pulling up carrots to see how they’re growing. Testing, in other words, can interrupt the very process it is intended to assess.

The amount of classroom time spent on testing has escalated dramatically in recent years. In California, which leads the nation in terms of hours devoted to standardized testing, according to a recent Education Week survey, students in grades two through 11 spend an average of six to eight hours per year on tests. And the amount of time devoted to assessment is likely to increase nationwide: In addition to annual state testing of students in grades three through eight, the Bush plan declares that “a sample of students in each state will be assessed annually with the National Assessment of Educational Progress (NAEP) fourth and eighth grade assessment in reading and math.”

In order to receive their full share of federal education dollars, states will have to demonstrate progress by “disadvantaged” students on the states’ own tests, and these gains will have to be “confirmed” by the NAEP results. (The House version of the bill gives states the option of confirming their results with other tests that meet “widely recognized professional and technical standards.”) It’s no wonder that Bush’s reform package is referred to in some government circles as “No Child Left Untested.”

And of course, it is not merely the testing time itself that is lost when new assessment programs are added. Teachers, parents and researchers all have bemoaned the “teaching to the test” phenomenon, in which test preparation drills crowd out instruction on more complex and important material.

In a national survey of public school teachers conducted by Education Week in 2000, nearly 70 percent of teachers said that state standards have caused instruction to focus “far too much” or “somewhat too much” on tests. One teacher quoted at a National Education Association convention last year vividly described the current testing frenzy as an “education-eating bacteria” that is overtaking our schools.

According to some critics, teaching to the test is the primary explanation for the “Texas Miracle”—the large score gains on statewide tests for both minority and white students in Bush’s home state. To see if these increases were reflected in other measures of achievement, researchers at the Rand Corporation compared scores on the Texas Assessment of Academic Skills (TAAS) to results for Texas and for the nation on the National Assessment of Educational Progress—the very test that is to be used to confirm state gains, according to “No Child Left Behind.”

The researchers, Stephen P. Klein, Laura S. Hamilton, Daniel F. McCaffrey and Brian M. Stecher, focused on changes in fourth grade math and reading achievement and eighth grade math achievement during the 1990s. (Data were not available for an analysis of eighth grade reading.) TAAS and NAEP gains were compared in terms of “standardized differences,” obtained by dividing the change in the average score by the standard deviation, an index of the variability of the scores. Although NAEP results confirmed that school achievement in Texas improved, only in fourth grade math were the Texas gains substantially greater than those for the nation as a whole.
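To make the "standardized difference" concrete, here is a minimal Python sketch of the calculation as described above: the change in the average score divided by the standard deviation. The function name and all of the numbers are hypothetical and are not taken from the Rand report. The sketch also assumes a single standard deviation is supplied for each test, since the description does not say which year's figure the researchers used.

```python
def standardized_difference(mean_before, mean_after, std_dev):
    """Return the change in the average score in standard deviation units."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (mean_after - mean_before) / std_dev


# Hypothetical figures for illustration only; they are not from the Rand report.
state_test_gain = standardized_difference(mean_before=70.0, mean_after=76.0, std_dev=12.0)
national_test_gain = standardized_difference(mean_before=210.0, mean_after=213.0, std_dev=30.0)

print(f"state test gain:    {state_test_gain:.2f} SD")
print(f"national test gain: {national_test_gain:.2f} SD")
```

Because both gains are expressed in standard deviation units, they can be compared directly even though the two tests are reported on different score scales.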

More significantly, the score gains on the TAAS dwarfed the NAEP increases, especially for minority students. For example, between 1994 and 1998, the increase in fourth grade reading achievement for African American students on the TAAS was about three times as large as the gain on NAEP. And while the gap between minorities and whites on the TAAS shrank between 1994 and 1998, this decrease was not paralleled by the NAEP results. (A report just released by the National Education Goals Panel, a bipartisan group of governors and legislators, shows that the Texas score gap on NAEP held steady during the 1990s, lending support to the Rand conclusions.)

What is the reason for the discrepancies between NAEP and TAAS? The Rand researchers speculated that “many schools are devoting a great deal of class time to highly specific TAAS preparation. It is also plausible that the schools with relatively large percentages of minority and poor students may be doing this more than other schools.” The authors reasoned that the preparation must have been quite narrow in scope because, “if TAAS scores were affected by test preparation, then the effects did not appear to generalize to the NAEP exams.”

Just as it occupies classroom time, testing, of course, drains financial resources as well. A question that is all too rarely asked is, “Could the money expended to add more testing be put to use in a more effective way?” According to one state testing director, the cost of assessing a child is roughly $15 per year, including test development, administration, scoring, analysis and reporting. Not a huge sum, perhaps, but under the Bush plan, that’s $15 per year for every third through eighth grader in the United States. One testing expert anticipates that the Bush plan will add $150 million to the states’ expenditures on K–12 testing, currently estimated to be about $400 million. How else could we spend that money? What if it were used to increase teachers’ salaries and improve their continuing education opportunities; or to beef up course offerings and tutoring programs for students; or to repair decaying school buildings and expand libraries and computing facilities? Can promoters of increased assessment make the case that adding tests is a more effective use of resources?
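The cost figures quoted above can be put side by side with a few lines of arithmetic. This is only a rough sketch: the per-child cost and the spending totals come from different sources, so the implied number of additional yearly assessments is a back-of-the-envelope consistency check, not a figure reported by either one. The variable names are illustrative.

```python
# Figures quoted in the paragraph above.
COST_PER_CHILD = 15              # dollars per child per year (one state testing director)
ADDED_SPENDING = 150_000_000     # dollars the Bush plan is expected to add (one testing expert)
CURRENT_SPENDING = 400_000_000   # current estimate of state K-12 testing expenditures

projected_total = CURRENT_SPENDING + ADDED_SPENDING
percent_increase = 100 * ADDED_SPENDING / CURRENT_SPENDING

# Assumption: the $15 estimate applies uniformly to every added assessment.
implied_new_assessments = ADDED_SPENDING // COST_PER_CHILD

print(f"projected total spending: ${projected_total:,}")
print(f"increase over current spending: {percent_increase:.1f}%")
print(f"implied additional yearly assessments: {implied_new_assessments:,}")
```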

On the subject of the “No Child Left Behind” proposal, Democratic Senator Barbara Mikulski of Maryland remarked, “We’re worried that no child be left out of the appropriations process.” Education reform requires a commitment of resources to the improvement of teaching and learning, especially in poor communities. Testing should follow rather than precede these changes.

Thermometers don’t cure fevers, and testing does not fix school problems. Testing is not the cornerstone of educational reform. Learning is.

© 2000 The National Center for Public Policy and Higher Education
