Anna Fisher was leading an undergraduate seminar on the subject of attention and distractibility in young children when she noticed that the walls of her classroom were bare. That got her thinking about kindergarten classrooms, which are typically decorated with cheerful posters, multicolored maps, charts and artwork. What effect, she wondered, does all that visual stimulation have on children, who are far more susceptible to distraction than her students at Carnegie Mellon University? Do the decorations affect youngsters' ability to learn?

To find out, Fisher's graduate student Karrie Godwin designed an experiment involving kindergartners at Carnegie Mellon's Children's School, a campus laboratory school. Two groups of 12 kindergartners listened to three stories about science in a room that was alternately decorated with posters and other materials Godwin had purchased or stripped bare. Researchers videotaped the students and later noted how much attention each child was paying. At the end of each reading, the children were asked questions about what they had heard. Those in the bare classroom were more likely to pay attention and scored higher on comprehension tests.

Hundreds of experiments like Fisher's are part of an effort to bring more rigorous science to U.S. classrooms. The movement started with former president George W. Bush's No Child Left Behind Act and has continued under President Barack Obama. In 2002 the Department of Education established the Institute of Education Sciences (IES) to encourage researchers to pursue what was described as “scientifically valid research,” especially randomized controlled trials, which advocates of IES considered the gold standard. The government also created the What Works Clearinghouse to provide a database of results for classroom educators on everything from reviews of particular curricula to evidence-based teaching techniques.

Now researchers are using emerging technology and new methods of data analysis to create experiments that would have been impossible to carry out even 10 years ago. Video cameras track eye movements to see where students are directing their attention; skin sensors report whether students are engaged or bored. Economists have figured out how to crunch data to mimic randomized trials—which are often difficult and expensive to implement in schools.

Much of the new research goes beyond the simple metric of standardized tests to study learning in progress. “I am interested in measuring what really matters,” said Paulo Blikstein, an assistant professor at the Stanford Graduate School of Education. “We have been developing new technologies and new data-collection methods to capture the process.” How well students complete a task is just part of the experiment; researchers also record students' eye gaze, galvanic skin response and exchanges with fellow students, among other things. Blikstein calls this approach “multimodal learning analytics.”

The new methodology is already challenging widely held beliefs by finding that teachers cannot be judged solely on the basis of their academic credentials, that classroom size is not always paramount and that students may actually be more engaged if they struggle to complete a classroom assignment. Although these studies have not come up with the “silver bullet” to cure all that ails American schools, the findings are beginning to fill in some blanks in that hugely complex puzzle called education.

Looking for Patterns
Provocative questions are yielding some of the most surprising results. In a series of experiments with middle school and high school students, Blikstein is trying to understand the best ways to teach math and science by going beyond relatively primitive tools like multiple-choice tests to assess students' knowledge. “A lot of what happens in engineering and science is the failure,” he says. “You try something, it doesn't work, then you reevaluate your ideas; you go back and try it again with a new set of ideas.” That is one of the processes he hopes to capture with these new tools: “We bring kids to the lab, and we run studies where we tell them to build some kind of engineering or science project.” The researchers put sensors in the lab and sometimes on the kids themselves. Then they collect the data and analyze them to look for patterns. “There are a lot of counterintuitive things in how people learn,” Blikstein notes. “We like to reveal that an intuition we have is sometimes wrong.”

“Discovery” learning, in which students discover facts for themselves rather than receiving them directly from an instructor, has been in vogue lately; Blikstein and his colleagues at FabLab@School, a network of educational workshops Blikstein created in 2009, are trying to get at the heart of how much or how little instruction students really need. Parents may not like to see their kids frustrated in school, but Blikstein says that “there are levels of frustration and failure that are very productive, are very good ways to learn.” In one set of studies, he and his colleagues tried to find out whether students learned more about a science topic if they saw a lecture first or did an exploratory activity first. Seeing the lecture first is called “tell and practice,” he says. “First you're told, then you practice.” Students were divided into two groups: one started with the lecture, and the other started with the exploratory activity. The researchers repeated the experiment in several studies and found fairly consistent results: students who practiced first performed 25 percent better than students who listened to a lecture first. “The idea here is that if you have a lecture first and you haven't explored the problem by yourself a little bit, you don't even know what questions the lecturing is answering,” Blikstein says.

The new tools and methods of data analysis are making education research more efficient and precise. Jordan Matsudaira, a management and policy professor at Cornell University, has helped resurrect an old research tool and has employed it to look at the usefulness of summer school and the effect of funding from Title I, a federal program targeted at schools with a certain percentage of low-income students. The method, known as regression-discontinuity analysis, compares two groups of students on either side of a particular threshold. For example, in the study on summer school, Matsudaira compared students whose test scores were just above the level that made them eligible for summer school with those who were just below it to see if the extra schooling improved students' test scores. The design is used to mimic randomized controlled trials.

His conclusion: summer school could be a more cost-effective way of raising test scores than reducing class size.
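The core comparison behind regression-discontinuity analysis can be sketched in a few lines of code. Everything below is invented for illustration (the scores, the cutoff, a simulated 3-point effect); it is not Matsudaira's data or his analysis in detail, just the basic idea of comparing students on either side of a threshold:

```python
# Illustrative regression-discontinuity sketch with simulated data.
# All numbers here are hypothetical, chosen only to show the method.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
pretest = rng.uniform(0, 100, n)     # score that determines eligibility
cutoff = 50.0
summer_school = pretest < cutoff     # students below the cutoff attend

# Follow-up score depends smoothly on the pretest, plus a simulated
# 3-point boost from summer school and random noise.
posttest = 0.8 * pretest + 3.0 * summer_school + rng.normal(0, 5, n)

# Look only at students within a narrow band around the cutoff.
bandwidth = 5.0
below = (pretest >= cutoff - bandwidth) & (pretest < cutoff)
above = (pretest >= cutoff) & (pretest < cutoff + bandwidth)

# Fit a line on each side and compare the two predictions at the cutoff
# itself; the jump estimates the effect of summer school, because students
# right at the threshold are effectively randomly assigned.
fit_below = np.polyfit(pretest[below], posttest[below], 1)
fit_above = np.polyfit(pretest[above], posttest[above], 1)
effect = np.polyval(fit_below, cutoff) - np.polyval(fit_above, cutoff)
print(f"estimated summer-school effect: {effect:.1f} points")
```

In a real study the choice of bandwidth matters: too wide, and the two groups differ in more than just treatment; too narrow, and there are too few students to detect an effect.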

In the Title I study, Matsudaira compared schools that fell just above the limit required to get the federal funds with those just below it. He found that the money did not make much of a difference in the academic achievement of the students most likely to be affected. But the study also illustrated some of the limits of the research design. It is possible that schools with a much higher percentage of poor students might derive a greater benefit from the extra money. It is also possible that schools so close to the threshold would use the money for one-time expenditures rather than long-term investments because they cannot be certain that their population would remain the same and that they would continue to be eligible for the federal aid in the future.

Other researchers are mining data to track the progress of many students over time. Ryan Baker, an associate professor at Teachers College, Columbia University, and president of the International Educational Data Mining Society, recalls that when he was working on his Ph.D. in the early 2000s, he got up every morning at 6 a.m. to drive out to a school where he would spend the entire day on his feet, taking notes on a clipboard. Fast-forward a decade, and Baker's work routine looks very different. He and his colleagues recently completed a seven-year longitudinal study, funded by the National Science Foundation, looking at log files of how thousands of middle school students used a Web-based math-tutoring program called ASSISTments. The researchers then tracked whether the students went to college and, if they did, how selective the college was and what they majored in to see whether they could make connections between students' use of the software and their later academic achievements.

“Big data allows us to look over long periods, and it allows us to look in very fine detail,” Baker says. He and his colleagues were particularly interested in seeing what happened to students who were “gaming” the system—trying to get through a particular set of problems without following all the steps. “Whether you are intentionally misusing the educational software to get through that learning is a better predictor of whether you'll go to college than how much you show up to class,” he says. It turns out that gaming the easier problems was not as harmful as gaming the harder problems. Students who gamed the easier problems could have simply been bored, whereas students who gamed the harder problems might not have understood the material. Baker thinks this kind of information could ultimately help teachers and guidance counselors figure out not only which students are at risk of academic problems but also why they are at risk and what can be done to help them.
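The idea of flagging “gaming” in tutoring-software logs can be illustrated with a toy heuristic. The thresholds, field names and example logs below are all hypothetical; Baker's actual detectors are machine-learned models trained against classroom observations, not simple rules like this:

```python
# Toy heuristic for flagging "gaming" behavior in tutoring-software logs.
# All thresholds and log formats here are invented for illustration.
# A step looks suspicious if the student answered implausibly fast or
# clicked through a hint without pausing to read it.

FAST_ANSWER_SECS = 2.0   # assumed: too quick to have read the problem
FAST_HINT_SECS = 1.0     # assumed: too quick to have read the hint

def gaming_score(log):
    """Fraction of a student's logged actions that look like gaming.

    `log` is a list of (action, seconds_taken) pairs, where action is
    'answer' or 'hint' -- a simplified stand-in for real log-file fields.
    """
    if not log:
        return 0.0
    suspicious = sum(
        1 for action, secs in log
        if (action == "answer" and secs < FAST_ANSWER_SECS)
        or (action == "hint" and secs < FAST_HINT_SECS)
    )
    return suspicious / len(log)

# A student who burns through hints in rapid succession scores high;
# one who takes time on each step scores zero.
gamer = [("hint", 0.4), ("hint", 0.5), ("answer", 1.0), ("answer", 12.0)]
worker = [("answer", 35.0), ("hint", 8.0), ("answer", 20.0)]
print(gaming_score(gamer), gaming_score(worker))
```

Even a crude score like this hints at why context matters: the same rapid clicking means something different on an easy problem (boredom) than on a hard one (not understanding the material), which is the distinction Baker's group found predictive.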

Building an Evidence Base
The new studies are helping to build an evidence base that has long been missing in education. Grover Whitehurst, founding director of IES, recalls that when he started in 2002, just after No Child Left Behind took effect, the superintendent of a predominantly minority district asked him to suggest a math curriculum that had been proved effective for his students. “I said, ‘There isn't any,'” Whitehurst says. “He couldn't believe that he was being required by law to base everything he did on scientifically based research, and there was none.” That superintendent was far from alone, points out Whitehurst, who is now director of the Brown Center on Education Policy and a senior fellow at the Brookings Institution. “There was very little research that actually spoke to the needs of policy makers and educators. It was mostly research written by academics and schools of education to be read by academics and schools of education. That was about as far as it went.”

Many researchers would disagree with that harsh assessment. Yet the criticism pushed the community to examine and explain its methods and mission. In the early years of IES, Whitehurst and others frequently compared education science with drug studies, indicating that people who study schools should test curricula or learning practices the way a pharmaceutical researcher might test a new drug. Strategies and curricula that passed that test would go into the What Works Clearinghouse.

John Easton, current director of IES and a former educational researcher at the University of Chicago, believes the clearinghouse is particularly useful as a way for the government to vet products that school districts might feel pressured to buy. “I think it's a really valuable source, a trusted source where you can go and find out if there is any evidence that this commercial product works,” he says. The clearinghouse now contains more than 500 reports that summarize current findings on such topics as math instruction for young children, elementary school writing and helping students with the college application process. It has also reviewed hundreds of thousands of reports to aid in distinguishing the best-quality research from weaker work, including studies on such subjects as the effectiveness of charter schools and merit pay for teachers, which have informed the ongoing debate about these issues.

One of the most important contributions of the government's emphasis on rigorous science, Whitehurst says, has been a dramatic change in the definition of a high-quality teacher. In the past, quality was defined by credentials such as a specific degree or certification. Now, he asserts, “it's about effectiveness in the classroom, measured by observations and measured by the ability of a teacher to increase test scores.” Whereas there is still significant controversy over how to assess an individual teacher's effectiveness, Whitehurst believes that change in approach was driven by the research community, especially economists “who came to this topic because all of a sudden there were resources—data resources and research support resources.”

Many researchers have complained that the IES's emphasis on randomized controlled trials has disregarded other potentially useful methodologies. Case studies of school districts, for example, could describe learning practices in action the way business schools use case studies of companies. “The current picture is really an ecosystem of methodologies, which makes sense because education is a complex phenomenon if ever there was one—complex in the scientific sense,” says Anthony Kelly, a professor of educational psychology at George Mason University. Easton says he still believes randomized controlled trials are an important part of that process but not necessarily as “the culminating event.” He thinks trials might also be useful early in the process of developing an educational intervention to see whether something is working and worth more investigation.

From Lab to Classroom
Getting this new science into schools remains a challenge. “The thing with education research, as with many other fields, is that these are typically long trajectories of work,” says Joan Ferrini-Mundy, assistant director of the Directorate for Education and Human Resources at the NSF. “It is very unlikely that any single study in any short period will have an impact.” There is also a long-standing barrier between the lab and the classroom. In the past, many researchers felt it was not their job to find real-world applications for their work. And educators for the most part believed that the expertise they gained in the classroom generally trumped anything the researchers could tell them.

The What Works Clearinghouse was supposed to help bridge that gap, but in 2010 the Government Accountability Office found that only 42 percent of school districts it surveyed had heard of it. The GAO survey also found that only about 34 percent of districts had accessed the clearinghouse Web site at least once and that even fewer used it frequently. In an updated report in December 2013, the GAO said dissemination remained problematic. The need is more urgent now, with the implementation of the Common Core state standards. Publishers are aggressively pushing curricula that claim to be aligned with the new standards, but district purchasing officers cannot just go to the clearinghouse and search for tested Common Core curricula. Instead they have to search for studies on the particular curricula they are considering—and not all of them are in the database.

Easton and others have acknowledged the need for a better pipeline to schools. As part of the solution, the clearinghouse has published 18 “practice guides” that lay out what is known about subjects such as teaching students who are learning English or teaching math to young children. Each is compiled by a panel that brings together researchers, teachers and school administrators. The practice guides may also direct future research, says psychology professor Sharon Carver, a member of the early math panel and director of Carnegie Mellon's Children's School. She urges her graduate students to read the guides that relate to their field and look for areas that need more exploration.

Each research question is an attempt to fit in another piece of a very large puzzle. “I don't think you can look at education from the point of view of whether it works or doesn't work, as if it's a lightbulb,” says Joseph Merlino, president of the 21st Century Partnership for STEM Education, a nonprofit in suburban Philadelphia. “I don't think human knowledge is like that.... In a mechanical age, we are used to thinking of things mechanically. Does it work? Can you fix it? I don't think you can fix education any more than you can fix your tomato plant. You cultivate it. You nurture it.”

Merlino's organization administered a five-year, IES-funded randomized controlled study of the effectiveness of applying four principles of cognitive science to middle school science instruction. A total of 180 schools in Pennsylvania and Arizona were randomly assigned modified or unmodified curricula. One part of the study was based on cognitive science research about how people learn from diagrams. Merlino says the researchers learned that some of the things that graphic artists might put into a diagram to make it jazzy—such as lots of colors—actually detract from learning. The researchers also found that students need instruction in reading diagrams. That is the kind of result that could be integrated into the design of a new textbook. Teachers could also take time to explain the meaning of different symbols in a diagram, such as arrows or cutaways.

Making educators an important part of the research process could also get results into the classroom. Teachers often feel that the expertise they have gained from their experience is ignored and that they instead get a new, supposedly evidence-based curriculum every few years without much explanation of why the new one is so much better than the old. And in the past, researchers have not generally felt that it was their role to explain their work to teachers. That is changing, says Nora Newcombe, a professor of psychology at Temple University and principal investigator of the Spatial Intelligence and Learning Center. “I think people are really waking up to the idea that if you take federal tax dollars, you are supposed to be sharing your knowledge.”

The exchange of knowledge can go both ways. In the Pennsylvania and Arizona science curriculum study, teachers were involved in the initial design of the experiments. “They were more like master teachers,” Newcombe says. “They taught, and they gave us feedback,” she adds. Because the study took place in actual schools rather than a lab, the researchers trained the classroom teachers as the work proceeded.

Other researchers point to the model of Finland, where educational theories, research methodologies and practice are all important parts of teacher education, according to Pasi Sahlberg, who in 2011 wrote Finnish Lessons, an account of how the country rebuilt its education system and rose to the top of international math and literacy rankings. In some ways, the comparison to American schools is unfair because Finland is a more homogeneous country. But Newcombe thinks that U.S. teacher training should include the most recent developments in cognitive science. In many teacher education programs, students “are taught a psychology that is not just 10 but more like 40 years out-of-date,” she says. That basic grounding could help teachers assess the importance of new research and find ways to incorporate it into their classrooms. “You can't really write a script for everything that happens in the classroom,” Newcombe says. “If you have some principles in your mind for what you do in those on-the-fly moments, you can do a better job.”