To truly, deeply understand how the human body works, and how diseases arise, you would need an extraordinary amount of information. You would have to know the identity of every cell type in every tissue; exactly which genes, proteins and other molecules are active in each type; what processes control that activity; exactly where the cells are located; how the cells normally interact with one another; and what happens to the body’s functioning when a cell’s genetic makeup or other features change, among other details. Building such a rich, complex knowledge base may seem impossible. And yet a broad international consortium of research groups has taken the first steps toward building exactly that. They call it the Human Cell Atlas.

The consortium had its inaugural planning meeting in October 2016 and continues to organize. The Chan Zuckerberg Initiative is on board as well. In June 2017 it announced that it was providing financial and engineering support to build an open data-coordination platform to organize the findings, so that they will be readily sharable by researchers in the project and beyond.

The atlas, which will combine information from existing and future research projects, has been made feasible by a host of technological advances, including better tools for isolating individual cells, for profiling the proteins in a single cell at any given time (proteins are the major workhorses in the body), and for quickly and inexpensively sequencing DNA and RNA. It will integrate research exploring all the “omes”: the genome (the full set of genes), the transcriptome (the RNA made from the genes), the proteome (the proteins), the metabolome (small molecules, such as sugars, fatty acids and amino acids, involved in or generated by cellular processes), and the fluxome (metabolic reactions whose rates can vary under different conditions). These findings will also be mapped to different subregions of cells. The integrated results should lead to a tool that will simulate all the types and states of cells in our body and provide new understandings of disease processes and how to intervene in them.

One of the most advanced projects underlying the Human Cell Atlas is the Human Protein Atlas, involving researchers from multiple countries, including Sweden, Denmark, South Korea, China and India. That project, which is continuously updated, offers a glimpse of the kind of work that goes into building the Human Cell Atlas and the value it will bring.

Participants in the Human Protein Atlas have classified a large majority of the protein-coding genes in humans using a combination of genomics, transcriptomics, proteomics and antibody-based profiling, which identifies where each protein is located. Since the program’s inception in 2003, approximately 100 person-years of software development have gone into keeping track of the data and organizing them for systems-level analyses. More than 10 million images have been generated and annotated by certified pathologists, and the protein atlas includes a high-resolution map of the locations of more than 12,000 proteins in 30 subcellular compartments, or organelles, of various cells. All the findings are available to the research community without restriction. Users can query the database to explore the proteins in any major organ or tissue, or they can focus on proteins with specific properties (such as those that participate in basic cell maintenance or that occur only in specific tissues). The data can also help to model the plethora of dynamic, interacting components that enable life and can be used to explore ideas for new therapies.

Completing the Human Cell Atlas will not be easy, but it will be an immeasurably valuable tool for improving and personalizing health care.