The solution involved a database technique called federation, in which the system virtually pools data from the various sites for a given analysis session. Once the analysis is complete, the researcher can download the results, but the pooled dataset disappears: It is never saved to the researcher’s computer, and the records at the original sites are left unaltered, allaying ethical and data privacy concerns.
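To make the idea concrete, here is a minimal sketch of a federated analysis session in Python. It is not iCARE’s actual software: the site names, the records, and the `federated_session` and `rate_by_age` helpers are all hypothetical, and a real system would query each site over a secure connection rather than read from an in-memory dictionary.

```python
from contextlib import contextmanager
from statistics import mean

# Hypothetical site registries (invented records, for illustration only).
SITES = {
    "site_a": [{"maternal_age": 29, "autism": 0}, {"maternal_age": 41, "autism": 1}],
    "site_b": [{"maternal_age": 35, "autism": 0}, {"maternal_age": 41, "autism": 0}],
}

@contextmanager
def federated_session(sites):
    """Yield a temporary pooled view of all sites; discard it when the session ends."""
    # Copy each row so the records held at the original sites are never modified.
    pooled = [dict(row, site=name) for name, rows in sites.items() for row in rows]
    try:
        yield pooled
    finally:
        pooled.clear()  # the pooled dataset does not outlive the session

def rate_by_age(rows):
    """Share of records flagged for autism at each maternal age."""
    by_age = {}
    for row in rows:
        by_age.setdefault(row["maternal_age"], []).append(row["autism"])
    return {age: mean(flags) for age, flags in sorted(by_age.items())}

with federated_session(SITES) as pooled:
    result = rate_by_age(pooled)  # only this summary leaves the session
    pooled_ref = pooled           # kept only to show the pool is emptied afterward

print(result)
print(len(pooled_ref))
```

Only the summary in `result` survives the `with` block. The pooled rows are cleared when the session closes, and the site records are copied rather than edited, which mirrors the two properties described above.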
Another challenge is harmonizing the different datasets, says Ezra Susser, professor of epidemiology and psychiatry at Columbia University in New York, who played a key role in bringing together the iCARE team. “Even a simple variable like birth weight or gestational week can have different meanings across registries,” he says.
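A rough sketch of what harmonization can look like in code follows, assuming two hypothetical registries that record birth weight in different units. The registry names, field names and units are invented for illustration; the article does not say how any particular registry encodes its variables.

```python
# Hypothetical per-registry rules: (field name, multiplier to convert to grams).
BIRTH_WEIGHT_RULES = {
    "registry_a": ("birth_weight_g", 1.0),      # assumed to store grams
    "registry_b": ("birth_weight_kg", 1000.0),  # assumed to store kilograms
}

def harmonize_birth_weight(record, registry):
    """Return birth weight in grams, whatever the source registry's convention."""
    if registry not in BIRTH_WEIGHT_RULES:
        raise ValueError(f"no harmonization rule for {registry!r}")
    field, to_grams = BIRTH_WEIGHT_RULES[registry]
    return float(record[field]) * to_grams

print(harmonize_birth_weight({"birth_weight_g": 3400}, "registry_a"))
print(harmonize_birth_weight({"birth_weight_kg": 3.5}, "registry_b"))
```

Keeping the rules in one table, rather than scattering conversions through the analysis code, makes each registry’s convention explicit and causes an unknown registry to fail loudly instead of being pooled silently.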
Although none of the analyses are published yet, iCARE offers a much more finely grained look at the many variables than was previously possible, Schendel says. For example, studies of the effects of parental age on autism have been largely limited to broad age categories, but iCARE’s pooled analysis includes enough individuals to examine the risk of autism for particular maternal and paternal ages. The analysis is already uncovering surprising distributions of risk across certain ages, which would never have emerged from analyses using single databases, she says.
Although the group’s initial grant support is winding down, the infrastructure will remain operational for five more years.
Reichenberg is leading a new project built on the iCARE infrastructure that will include many of the same researchers. Funded last year with a network grant from the Autism Centers of Excellence program at the National Institutes of Health, that effort will take a multigenerational look at potential risk factors for autism, such as whether exposure to various medications during pregnancy is associated with autism in the child.
Participants in the collaboration say that once the publications start to flow, the research community will take notice. “One of the steps we need to take in epidemiology is what was done in genetics, to create these repositories so that you can combine samples and get much, much larger numbers,” says Susser. “It’s where the future needs to go.”