A few weeks back the Los Angeles Times began publishing a series of articles on value-added testing in public schools. In a value-added system, a student’s past performance on tests is used to project his or her future results. The divergence between that prediction and the student’s actual performance after a year is the “value” that the educator added or subtracted. The series made available the evaluations of 6,000 elementary school teachers, profiled some of those teachers and generally stirred the pot regarding the ongoing national debate about the value and use of standardized testing in schools.
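The arithmetic behind a value-added score can be sketched in a few lines. This is a hypothetical illustration only: the projection here is a simple average of past scores, standing in for the actual statistical model the LA Times' analysts used, and all names and numbers are invented for the example.

```python
def predict_score(past_scores):
    """Project a student's next score from past results.
    (A plain average here; the real system uses a richer statistical model.)"""
    return sum(past_scores) / len(past_scores)

def value_added(past_scores, actual_score):
    """The 'value' the educator added (positive) or subtracted (negative):
    actual performance minus the projection from past performance."""
    return actual_score - predict_score(past_scores)

# A teacher's rating aggregates these gaps across his or her students.
students = [
    ([72, 75, 78], 85),   # scored above the projection
    ([90, 88, 92], 84),   # fell short of the projection
]
gaps = [value_added(past, actual) for past, actual in students]
teacher_rating = sum(gaps) / len(gaps)
print(round(teacher_rating, 2))
```

The point of the sketch is only that the rating measures divergence from a prediction, not raw scores: a teacher whose low-scoring students beat their projections rates well, while one whose high scorers slip rates poorly.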
Educational testing is at the core of this debate about the character and quality of the learning process in our schools. Both sides are in favor of testing. Both sides have used testing to further their agendas in the past. The divergence of opinion occurs when it comes to deciding how tests are used, what materials students are tested on and how the tests themselves influence choices of what gets taught in the classroom. In order to understand how we got to this present-day dispute, it’s useful to look at the history of testing.
Standardized exams have a long history spread over numerous cultures. In 7th century imperial China, government job applicants wrote essays about Confucian philosophy and composed poetry. Students in ancient Greece were rated on the quality of their debates on the mysteries of life. In Europe, the invention of the printing press and modern paper manufacturing fueled the growth of written exams.
At the beginning of the 20th century, educators began researching testing methods that took shortcuts around the old essay system. French psychologist Alfred Binet developed what would become known as the I.Q. test in 1905. As the U.S. entered World War I, the army saw the need to screen the intellectual ability of its recruits. The Committee on Psychological Examination of Recruits was given the task of developing a group intelligence test, which resulted in the US Army Alpha and Beta tests. Though these tests had little impact on the war, they laid the groundwork for future standardized tests, including the Scholastic Aptitude Test (SAT), which was first given in 1926.
For many Americans, caught up in the wonders of this new era, when it seemed as though there was no problem that science couldn’t solve, these tests were seen as efficient tools to help build a society based on merit, not birth or race or wealth.
Nonetheless, “modern testing” was being used by some to further their own agendas. Testing “expert” H.H. Goddard identified as “feeble-minded” 83% of Jews, 80% of Hungarians, 79% of Italians and 87% of Russians among a small group of immigrants assessed at Ellis Island during the early years of the 20th century. The fact that many of them couldn’t read or understand English was discounted as irrelevant. Goddard also advocated using intelligence tests to identify people unsuited for human propagation.
The Army’s standardized tests were also used to “prove” the prevailing perception that non-white races were less intelligent. Questions on the Alpha test, like those asking what Crisco is or identifying the city where the Pierce Arrow car was made, had nothing to do with intelligence, but, rather, had everything to do with ensuring that recruits were from white middle-class backgrounds.
Psychologist Lewis M. Terman, who would go on to spend his later years promoting the enforcement of compulsory sterilization laws in California, boldly predicted in 1916:
“It is safe to predict, that in the near future intelligence tests will bring tens of thousands of …high-grade defectives under the surveillance and protection of society….The time is probably not far distant when intelligence tests will become a recognized and widely used instrument for determining vocational fitness….When thousands of children who have been tested by the Binet scale have been followed out into the industrial world, and their success in various occupations noted, we shall know fairly definitely…the minimum ‘intelligence quotient’ necessary for success in each leading occupation”.
As the above quote indicates, there was more at play here than the reinforcement of crude sociopolitical nativism or racism. A significant part of the popular “science” of the era was Taylorism, named for Frederick Winslow Taylor, whose theories of scientific efficiency in industrial management held sway throughout the business world. The thinking here was that these dutifully sorted products of the education system would revolutionize production, leading to an era of unlimited prosperity, at least for the tycoons of the day. “Scientific efficiency” had boosters in all corners of the ideological arena: Lenin and Mussolini were also big fans of Taylor’s writings.
Alas, Taylorism failed to consider the desires of the people who were actually doing the work in the factories of the day, and today Taylor is mostly remembered for his use of the stopwatch to measure workplace tasks. While the people of the era were sufficiently enamored with the notion of scientific progress to elect an engineer as President in 1928, there were limits on just how much “science” they were going to incorporate into their lives. That’s why Charlie Chaplin’s “Modern Times,” with its depiction of a man meshed with a machine and reduced to a machinelike jerkiness of movement, wasn’t particularly amusing to audiences in industrial Pittsburgh.
Not everything about standardized testing has proven to be bad. You could counter the above anecdotes by pointing out that tests have helped eliminate much blatant unfairness. They’ve shown that discrimination is expensive. For example, they exposed the myth of male intellectual superiority when researcher Cyril Burt announced in 1912 that boys and girls were, in fact, intellectually equal. After World War II, when colleges started competing on their students’ average SAT scores, they found that the easiest way to get more bright students was to stop discriminating against women. Similarly, this competition for brains also induced Ivy League colleges to finally stop discriminating against Jews, who, ironically, have traditionally been among the highest-scoring ethnic groups.
In 1988, Congress created the National Assessment Governing Board. It set out new standards for the National Assessment of Educational Progress, a test that has been given to a sampling of students since 1970. In 2002, President Bush signed the No Child Left Behind law. For the first time, it required annual testing of all public school children in certain grades and required states to use results to help rate schools.
For better or worse, testing is now part of the fabric of American education. The questions that remain have to do with the purpose and use of the scores that students generate when they take those tests. Traditionalist reformers insist on using test results for “evaluating” schools and teachers, with a variety of punitive measures to be applied to those who don’t make the grade. Progressive reformers are quick to question the tests’ objectivity and to point out that the environment students come from can skew the results.
And then there are the square pegs of the education system, the successful individuals who defied the formal definitions of achievement in education–and most likely would have lowered their teachers’ efficiency ratings–such as Bill Gates, Paul Allen, Ted Turner, and Steve Jobs.
The single largest predictor of student test scores is poverty. Not how much teachers are paid. Not how many kids are in the classroom. Not the teaching styles or systems used. The incidence of poverty, the depth of poverty, the duration of poverty, the timing of poverty (the age of the child), community characteristics (the concentration of poverty and crime in the neighborhood), school characteristics and the impact poverty has on the child’s social network (parents, relatives and neighbors) all count. Nothing else does.
And guess what? The United States has the highest rate of child poverty (21.9%) among the developed nations.
Over the past five years, with No Child Left Behind as the law of the land, scores for minority students increased, but so did those of white students, leaving the achievement gap stubbornly wide. Black and Hispanic elementary, middle and high school students all score much higher on federal tests than they did three decades ago. However, most of those gains were not made in recent years, but during the desegregation efforts of the 1970s and 1980s. That was well before the 2001 passage of the No Child law, the official description of which is “An Act to Close the Achievement Gap.”
So what’s the point of all this testing? Tests tell us where we are and point towards where we have to go. That’s a good thing. Tests have also been used to punish and sort people into classes. That’s a bad thing.
One other thing: in a society obsessed with ratings and status, tests are being used to send the message that we’re failing our children. And that’s all about people manipulating data to further their political agendas, whether it’s increasing funding or breaking teachers’ unions. Comparing American students to their peers in other countries is not a completely straightforward process. Generally speaking, there are ample comparisons that suggest that the United States is far from last in the world (although there is room to get better). Can we do better on tests? You betcha. As long as we keep the whole testing business in perspective.
Coming Next! Part Three of this series will talk about San Diego as Ground Zero in the current debate and the broader question of what makes for a great education.