No Country Left Behind

By Rodger W. Bybee, Elizabeth Stage

International comparisons of student achievement tell U.S. educators where they must focus their efforts to create the schools the country needs.

“Economic Time Bomb: U.S. Teens Are Among Worst at Math,” blared the December 7, 2004, Wall Street Journal headline over a story about the disheartening results of the latest international assessment of student achievement. The New York Times and Washington Post also carried major stories, albeit with slightly more temperate headlines. All the stories agreed that the results of the Program for International Student Assessment (PISA), which tested 15-year-olds from 41 countries, are cause for grave concern. On the math section, the United States ranked 24th out of 29 member nations of the Organization for Economic Cooperation and Development (OECD), falling below Poland, Hungary, and Spain in the three years since the previous assessment. For a country that prides itself on its scientific and technological prowess, this seems disastrous. But is the situation as bad as the PISA test results indicate?

The following week, the release of the results of another assessment, the Trends in International Mathematics and Science Study (TIMSS), which evaluates students in the fourth and eighth grades, told a somewhat different story. U.S. student performance had improved in the four years since the previous assessment. Of perhaps even greater significance, the troubling gap in math performance between white and black students had contracted. With the proportion of racial and ethnic minorities in the United States growing steadily, the nation’s leaders are concerned that some groups have significantly lower test scores than the white majority. The gap was closing in the 1970s and 1980s, but little progress was seen in the 1990s.

U.S. science and math education are critically important to the nation’s future, and the results of these tests do have important lessons that we must take into account. But if we’re going to be influenced by these tests, we must ensure that we properly understand what the tests are measuring and what the results mean.

The newspapers are correct to highlight these international studies, because they contain information that the United States needs to know. The United States views itself as a leader in an increasingly global economy, and its prowess in science and mathematics is essential to that leadership. Policymakers, business leaders, educators, and parents look to these studies to gain some perspective on the quality of educational systems throughout the world. They understand that science and technology are driving rapid change and that the training of the next generation will be a key factor in how successful a country will be in the future. The point is not to win some imaginary math Olympics this year, but to find out which countries are attaining success in educating young people, so that all nations can benefit from what they are doing right and integrate effective practices and policies into their education systems.

Understanding the tests

If we are to learn the relevant lessons from TIMSS and PISA, we must begin by understanding how they differ from one another. TIMSS examines student performance and the background characteristics of students, teachers, and schools. Assessment items, which are developed through a consensus of representatives to the International Association for the Evaluation of Educational Achievement (IEA), are designed to link directly to the curricula of the participating countries. The TIMSS report thus specifies what students are expected to learn and how well they are learning it.

The recent TIMSS report is the third in a series. In 1995, half a million students in more than 40 countries were tested in mathematics and science at grades 4, 8, and 12; in 1999, TIMSS assessed eighth-grade students in mathematics and science. In 2003, students from an expanded group of 49 countries were tested in grade 4, grade 8, or both. It is now possible to compare fourth-graders from 1995 and 2003 and to compare eighth-graders from 1995, 1999, and 2003. For the 2003 assessment, IEA revised the TIMSS framework to incorporate the curricular changes in the participating countries, so that it accurately reflects what educators in all these countries are currently including in their science and math curricula.

Box 1 displays several items from the 2003 TIMSS assessment. We have noted the item, correct answer, and percentage of U.S. students receiving full credit. In general, TIMSS items are straightforward questions that require the recall of factual information that students are expected to learn in their math and science classes.

Box 1
TRENDS IN INTERNATIONAL MATHEMATICS AND SCIENCE STUDY 2003
Sample questions, grades 4 and 8

Grade 4 math

There are 600 balls in a box, and 1/3 of the balls are red. How many red balls are in the box?

Answer: 200 red balls

35 percent of U.S. students received full credit.

Grade 4 earth science

Kate sees a full moon. About how much time will go by before the next full moon?

a) one week b) two weeks c) one month d) one year

Answer: c) one month

42 percent of U.S. students received full credit.

Grade 8 math

If n is a negative integer, which of these is the largest number?

a) 3 + n b) 3 x n c) 3 – n d) 3 ÷ n

Answer: c) 3 – n

48 percent of U.S. students received full credit.

Grade 8 math

If 4(x+ 5) = 80, then x=

Answer: 15

58 percent of U.S. students received full credit.

Grade 8 math

Which of these is the LEAST amount of time?

a) 1 day b) 20 hours c) 1800 minutes d) 90,000 seconds

Answer: b) 20 hours

48 percent of U.S. students received full credit.

Grade 8 environmental science

The burning of fossil fuels has increased the carbon dioxide content of the atmosphere. What is a possible effect that the increased amount of carbon dioxide is likely to have on our planet?

a) A warmer climate b) A cooler climate c) Lower relative humidity d) More ozone in the atmosphere

Answer: a) A warmer climate

56 percent of U.S. students received full credit.
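The arithmetic behind two of the grade 8 math items above can be checked in a few lines. This is only an illustrative sketch and is not part of the TIMSS materials; the variable names are our own.

```python
# "LEAST amount of time" item: convert every option to seconds and compare.
options_in_seconds = {
    "1 day": 24 * 60 * 60,        # 86,400 seconds
    "20 hours": 20 * 60 * 60,     # 72,000 seconds
    "1800 minutes": 1800 * 60,    # 108,000 seconds
    "90,000 seconds": 90_000,
}
shortest = min(options_in_seconds, key=options_in_seconds.get)

# Linear equation item: 4(x + 5) = 80, so x + 5 = 20 and x = 15.
x = 80 // 4 - 5

print(shortest, x)
```

Run as written, the comparison picks out 20 hours and the equation yields x = 15, in agreement with the answers listed in the box.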

PISA, developed by the OECD, has a different purpose. It measures literacy in reading, mathematics, and science in 15-year-olds. In mathematics, PISA assesses how well young adults can recognize and interpret mathematical problems in their world, translate problems into a mathematical context, and use mathematical knowledge and procedures to solve problems. Scientific literacy reflects students’ ability to use scientific knowledge, to recognize scientific questions, and to relate scientific data to claims and conclusions. Students are also expected to communicate solutions effectively. PISA is not directly tied to the school curriculum but was conceived and designed to assess the practical outcomes of education systems. In other words, the PISA assessment aims to determine whether students have not only the knowledge they need but also the ability to use it to solve problems. Tests are administered to 15-year-olds, because that is typically the last year of compulsory schooling in participating countries. PISA is administered every three years in all three subject areas, and each administration includes a more comprehensive assessment of one of the three areas. The most recent version emphasized math and also added a separate section on problem-solving that was independent of any particular content area. In Box 2, we provide two sample items from the 2003 PISA program. The correct answers and percent correct for U.S. students are shown.

Box 2
PROGRAM FOR INTERNATIONAL STUDENT ASSESSMENT 2003
Sample questions for 15-year-olds

Mark (from Sydney, Australia) and Hans (from Berlin, Germany) often communicate with each other using “chat” on the Internet. They have to log on to the Internet at the same time to be able to chat.

To find a suitable time to chat, Mark looked up a chart of world times and found the following: When it is 12 midnight in Greenwich, England, it is 1:00 AM in Berlin and 10:00 AM in Sydney.

Item 1: At 7:00 PM in Sydney, what time is it in Berlin?

To get full credit on the item, a student needed to answer 10 AM or 10:00.

Percent Correct
United States: 45.7
OECD: 53.7

Item 2: Mark and Hans are not able to chat between 9:00 AM and 4:30 PM their local time, as they have to go to school. Also, from 11:00 PM till 7:00 AM their local time they won’t be able to chat because they will be sleeping.

When would be a good time for Mark and Hans to chat? Write the local times for Sydney and Berlin.

To get full credit on the item, a student needed to supply any time or interval of time satisfying the 9-hour time difference, such as the following:

Sydney: 4:30 PM – 6:00 PM; Berlin: 7:30 AM – 9:00 AM
or
Sydney: 7:00 AM – 8:00 AM; Berlin: 10:00 PM – 11:00 PM

Percent Correct
United States: 28.0
OECD: 28.8
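The reasoning that the two items in Box 2 call for can be sketched in code. This is a minimal illustration, not part of the PISA materials. The function names, the half-hour grid, and the choice to treat the first minute of school or sleep as unavailable are our own assumptions. The offsets come from the chart in the item (Berlin is 1 hour and Sydney 10 hours ahead of Greenwich).

```python
# Offsets from Greenwich, read from the chart in the item.
BERLIN_OFFSET = 1
SYDNEY_OFFSET = 10

def sydney_to_berlin(hour):
    """Convert a Sydney clock time to Berlin time (24-hour clock, 16.5 = 4:30 PM)."""
    return (hour - (SYDNEY_OFFSET - BERLIN_OFFSET)) % 24

def is_free(hour):
    """True when a boy is neither at school (9:00 AM to 4:30 PM) nor asleep (11:00 PM to 7:00 AM)."""
    at_school = 9 <= hour < 16.5
    asleep = hour >= 23 or hour < 7
    return not (at_school or asleep)

# Item 1: 7:00 PM in Sydney is hour 19 on a 24-hour clock.
berlin_time = sydney_to_berlin(19)

# Item 2: Sydney start times, in half-hour steps, at which both boys are free.
half_hours = [h / 2 for h in range(48)]
good_sydney_times = [h for h in half_hours
                     if is_free(h) and is_free(sydney_to_berlin(h))]

print(berlin_time, good_sydney_times)
```

Under these assumptions the search finds Sydney start times from 7:00 to 7:30 AM and from 4:30 to 5:30 PM, which fall inside the two sample intervals given in the box.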

TIMSS and PISA can be viewed as complementary measures. TIMSS focuses tightly on specific skills that are part of the curriculum. PISA measures much more general skills that are considered desirable outcomes of a solid education but are not discrete curricular items. Educators want to have success in both domains, but they recognize that different policies, programs, and practices are relevant for each. U.S. educators need to carefully tease out the implications of the results of the two tests.

The U.S. report card

The news for the United States is both not so good and pretty bad. In the not so good category, TIMSS 2003 reports that U.S. fourth-graders have maintained their science and math performance since 1995. Overall scores were stable, and the percentage scoring in the High and Advanced categories remained the same. During the period, several other countries improved their performance and now rank higher than the United States.

One encouraging development is that African-American students did raise their scores. Their scores are still significantly lower than those of their white peers, but indications that the gap is narrowing are encouraging.

TIMSS assessed eighth-graders in five areas of math: number, algebra, measurement, geometry, and data. U.S. students showed considerable improvement in algebra and data, possibly reflecting additional emphasis on these curricular areas by states and school districts.

In science, TIMSS assessed achievement in chemistry, earth science, environmental science, life science, and physics. U.S. students increased their performance in earth science and physics, which may well be a reflection of the balanced emphasis on life, earth, and physical science in the recently developed national science education standards. The combination of the earlier poor performance of U.S. students in physical science assessment and the guidance found in the recently developed standards for science education might have motivated a shift in the upper elementary- and middle-school curriculum in many schools.

Although white students continued to outperform black and Hispanic students in science, it is encouraging that both groups narrowed the gap in the science assessment. We speculate that an increased emphasis on carefully defining student outcomes through national and state standards and subsequent state and local efforts to improve the content knowledge of teachers are at least partly responsible. In addition, the No Child Left Behind (NCLB) legislation requires schools to report test results separately for each racial and ethnic group, which forces schools to pay more attention to underperforming minority students.

The PISA results tell a less encouraging story. The test assessed performance on a number of broad mathematics topics: space and shape, change and relationships, quantity, and uncertainty. U.S. performance in mathematics literacy and problem-solving was lower than the average for OECD countries. U.S. scores were roughly the same as they were in 2000, whereas some other countries improved their performance and moved ahead of the United States in the rankings. Roughly two-thirds of OECD countries outperformed the United States. The United States had more students at the lowest levels of performance and fewer students at the highest levels than the OECD average percentages. The narrowing of the performance gap among U.S. subgroups that was apparent in the TIMSS results was not found in the PISA scores. In both mathematical literacy and problem-solving, white students scored significantly above the OECD average, whereas Hispanic and African-American students were significantly below the OECD average.

How do we explain these disparities? Is it because by assessing the application of concepts rather than the recall of facts, the PISA test requires higher levels of understanding? Is it because U.S. schools emphasize the acquisition of information at the expense of problem-solving and the application of knowledge? Is it because the United States, in its effort to close the achievement gap, has emphasized basic knowledge to help underachievers rather than ensuring that all students learn challenging material? Is it because PISA items require more critical reading and reasoning from evidence? Our answer is (e): all of these. Each of these explanations accounts for some of the shortcomings in student performance, and each must be considered as we look for ways to improve U.S. math and science education.

No child left behind

Most of the discussion of education reform in the United States revolves around NCLB, and it is tempting to view the TIMSS and PISA results as a measure of the legislation’s success. But that would be a mistake. NCLB became law in 2001, much too late to have had any significant influence on the educational system or on an assessment administered in 2003. Further, NCLB does not mandate any changes for science until 2005–2006, when states must have new science standards in place. The first NCLB-mandated assessment of student achievement in science will occur in the 2007–2008 school year. The new international assessments cannot tell us anything about the effects of NCLB, but they can provide insights into what NCLB should be expected to improve and how it should be implemented to be most effective.

Closing the achievement gap is a primary NCLB goal, and the international tests confirm that the gap is real. But the TIMSS results also reveal that it has been possible to narrow the gap among eighth-graders. To what can we attribute that success? Are the focus on literacy for all students and the standards movement of the past decade now achieving results? What should we pay more attention to? How can we best maintain the progress of all students? The TIMSS results suggest that the country is on a good course in improving student knowledge of basic facts and procedures and should persevere with its current strategy. The PISA results indicate that the country has not been closing the enormous gap in student achievement to meet the higher demands of critical evaluation and the application of knowledge to problem-solving. The lesson is that in addition to ensuring that all students know the basics, schools must find ways to develop students’ ability to think critically and creatively about how to use their knowledge when confronted by an unfamiliar problem. But are U.S. teachers up to the challenge?

Few question the need for highly qualified teachers, and we support the federal effort to ensure that teachers meet a rigorous standard of content knowledge. But what content knowledge is fundamental to being highly qualified? And how, when, and where can teachers best gain this knowledge? Several studies indicate that lack of teacher content knowledge is a limiting factor in raising student achievement and that more than just “straight” content knowledge is needed. Teachers must also understand major conceptual ideas in mathematics and science, how those concepts are related to one another, and how they are applied in solving problems. The content that teachers need to know may be different from that required for professional mathematicians, scientists, architects, or engineers. We think that the ideas of Liping Ma of the Carnegie Foundation for the Advancement of Teaching about emphasizing profound understanding of fundamental mathematics should be extended to an emphasis on profound scientific ideas. In other words, teacher education and assessments of teacher qualifications should not stop at an emphasis on facts and procedures but need to include the critical evaluation of information and the application of procedures in context. Put simply, teachers need to do well on PISA standards as well as TIMSS standards.

According to the Council of Chief State School Officers, there has been a rapid increase in the demand for mathematics and science courses by middle- and high-school students during the past several years. Since 1990, there has been an 11 percent increase in eighth-grade students taking algebra. This may partially explain the increases in scores for eighth-grade students on the 2003 TIMSS. At the high-school level, the picture is more complicated. Although the demand for more challenging courses is also present in grades 9 through 12, and the number of teachers assigned to teach these classes has increased, these developments have been accompanied by the use of part-time teachers or teachers with more than one subject assignment. Indeed, in all high-school mathematics and science courses, the percentage of teachers with certification in those disciplines has decreased. The lack of qualified high-school math and science teachers is certainly one factor that should be considered in trying to understand the disappointing performance of 15-year-olds on the PISA assessment. Policymakers and those in charge of teacher professional development programs should pay special attention to what can be done to improve the quality of high-school math and science teaching.

The TIMSS study reports that there are substantial differences in how the participating countries define requirements for teaching. U.S. teachers, for example, are required to complete a practicum, have a university degree, and perform satisfactorily during a probationary teaching period, but unlike their peers in many high-achieving countries, they are not required to pass an examination or complete one- to two-year supervised inductions to teaching. Policymakers and education officials need to consider the wisdom of adding these requirements for U.S. teachers.

Implementing challenging curricula

There is a clear, and we think compelling, need for continued development and implementation of new and innovative curriculum materials. In particular, U.S. educators would benefit from research-based programs that incorporate contemporary evidence about student learning with the insights of effective teachers to develop curricula and programs that enhance student achievement. It appears that most U.S. math and science classes have not implemented challenging instructional materials for all students, but one cannot say for sure because it is difficult to monitor curricular implementation. Unlike the vast majority of countries participating in international assessments, the United States has no official national curriculum.

When TIMSS researchers had to describe the U.S. science curriculum, they had to examine the most popular science textbooks and deduce from them a de facto national curriculum. From what they could determine, it appears that although the textbooks provide a solid conceptual knowledge of math and science, they put relatively little emphasis on educational experiences that would facilitate the development of problem-solving, reasoning, and critical thinking. U.S. students are introduced to more topics than their peers in high-achieving countries, but this broad, thin coverage does not seem to result in good understanding. Michigan State University curriculum expert William H. Schmidt, who was among the first to identify the overcrowded U.S. curriculum as a problem, points out that even U.S. efforts to create an alternative curriculum do not go far enough. He notes that although National Science Foundation–funded curricula are generally more coherent and more oriented toward problem-solving than what is found in popular textbooks, they are still quite broad by the standards of the rest of the world. We recommend that U.S. educators continue to use the insights that can be gained from research, the results of the National Assessment of Educational Progress, and international comparisons to design and implement more challenging and effective curricula.

The National Research Council report How People Learn synthesizes recent research in the cognitive sciences and applies it to education. The report argues that curricula must be organized in such a way that teachers can work with the current understanding that students bring into their classrooms, so that new knowledge is meaningfully integrated. Further, it finds that teachers must teach some selected subject matter such as mathematics and science in depth to give students a firm foundation of fundamental concepts that will instill meaning into factual knowledge. Likewise, accountability assessments must test deep understanding rather than surface knowledge.

Taken together, these observations generate some reasonable proposals about what is needed in curriculum organization and implementation. This nation needs continuous support for innovative science and mathematics programs, research supporting and identifying the next generation of innovations that enhance learning, and continuous support for the professional development of teachers.

Assessing the tests

Although there is no doubt that we have gained valuable insights from the current wave of state, national, and international assessments, the tests themselves are beginning to emerge as a potential problem. Some students, teachers, and parents have begun to complain that we are now spending so much time assessing student achievement that we are squeezing out valuable instructional time. The complaint is not unreasonable, and policymakers and educators need research to measure the beneficial and detrimental effects of large-scale assessments on the educational system. Do assessments such as TIMSS and PISA support productive changes? Do they complement state and local initiatives? Do classroom teachers perceive any relationship between large-scale assessments and the formative assessments they use? NCLB currently requires assessing all students in reading and mathematics annually from grades 3 through 8, and once more in high school. In 2007, science assessment will be required at the elementary-, middle-, and high-school levels. Is this enough? Too much?

The tests are growing not only in number but in power. About half the states now have (or are phasing in) exit exams for high-school graduation, and some want to make passing the exam a requirement for receiving a diploma. The testing required for NCLB is becoming increasingly important for schools and school districts, affecting funding and management control. Just as curricula were subjected to research in the 1960s and 1970s, it seems that this “age of assessment and accountability” also should be the object of substantial, continuing, and cumulative knowledge based on the best research practices. If we are to continue improving education, no aspect of the system can be exempt from critical evaluation.

The results of the TIMSS and PISA assessments reinforce several important themes that are gaining prominence in education. The first is that most U.S. students do not receive lessons that portray math and science as dynamic disciplines that encourage conjecture, investigation, theorizing, and application. Rather, most lessons characterize math and science as static bodies of factual knowledge and procedures. Although it is more challenging for students to grasp the dynamic dimensions of math and science, it is also more stimulating. If we want students to learn more, we have to make them want to learn more. In this school of dreams, make it exciting and they will learn.

The second is that science has to join reading, writing, and math as a keystone of the core curriculum. This will happen in an official sense in 2007–2008 when science becomes part of the testing regimen required by NCLB, but it will become a real part of the culture only when people realize that science education is not an accumulation of facts but an initiation into a way of perceiving the world and of solving problems. When the nation develops that understanding of science, it will demand a dynamic curriculum, and the results will be reflected in future PISA assessments.

The third critical theme is that the United States must redouble its efforts to close the achievement gap. The progress reflected in the TIMSS results is an indication that the gap is not inevitable, and we believe that the NCLB emphasis on improving the performance of all subgroups will help keep the focus on closing the gap. But the PISA results are a sobering reminder of how far the country has to travel. It is not sufficient to close the gap in basic skills; the goal must be to challenge all students to fulfill their potential to develop the advanced analytic and problem-solving skills that should be the ultimate goal of education.

During the past decade, the science and math education communities have developed the standards that can serve as the foundation for revitalized K-12 education. The international assessments should provide the motivation to transform these standards into effective curricula and to give teachers the knowledge and skills they need to inspire and direct their students to higher achievement. It would be a tragic mistake to view these results as a sign that the reform movement that has grown in recent years is a failure. They must be seen as useful indicators of where reform is already yielding results and where the nation’s education system must move next to finish the job of building a 21st-century education system.
