A staggering amount of information is conveyed in one patient’s medical records: vaccinations, allergies, records of mysterious aches and pains that were never linked to a cause. Then, of course, there are the lists of prescribed medications over the years. Even in the digital era, today’s doctors typically record that medical hodgepodge in their own idiosyncratic or customized ways. And that was true even before genetic analyses entered the picture.

The situation has become even more opaque with the introduction of that new massive data set. Now a clutch of scientists and researchers are grappling with how to reconcile expanding, nonstandardized patient data and record-keeping ahead of the first stages of the White House’s Precision Medicine Initiative, an effort to offer more individualized care. The research group’s task during the past few months has been to prescribe how to draw together health information and DNA data points for one million patients and their medical records.

Few models exist for this audacious task. Still, the advisory group taking on this challenge is attempting to forge ahead with minds from the public sector, hospitals and also private organizations like the Bill & Melinda Gates Foundation and Google X. So what have they come up with?

On September 17 the group of almost two dozen individuals charged with drawing up the blueprints issued its final report on what that effort would entail. Its recommendations will help define what can be learned from this massive study and what success would look like five and 10 years out.

In its report the team suggests that one million participants could be recruited within about four years, either through individuals volunteering directly or with the help of health care providers like Kaiser Permanente that could ask patients to participate. The Precision Medicine Initiative could also draw from subject pools enrolled in prior studies with robust health records, provided those subjects also enroll in this new data study. Taking steps to make sure the study sample is diverse (by race, sex, age and health status) will be a key priority for making this project work and getting strong data, the group notes.

The advisory group also fleshed out what volunteers could expect. Patients will have access to their own information, as well as to aggregate data, via a central cloud. Volunteers signing up to be part of this project would agree to be recontacted, take a baseline health exam, share their electronic health care records and provide a biospecimen, most likely blood. Initially, the database hub would include information from electronic medical records, health insurance organizations, participant surveys, mobile health technologies and biologic investigations.

The report was scant on details about how to overcome significant hurdles regarding privacy, data storage and the research cohort itself beyond noting that the data should be centrally located and safeguarded against unintended release. One way to do that, the committee said, is to seek an exemption under the Freedom of Information Act for release of genomic and related data held by the federal government. The National Institutes of Health should also pursue legislation that would penalize unauthorized identification or contacting of participants. When it comes to sensor technologies used to record health details such as movement or sleep patterns, the initiative should only work with companies that maintain strong security and privacy measures and agree not to see or use information generated from this cohort unless participants expressly agree to it, they wrote.

Even with these recommendations, “there are a lot of things that still need further investigation,” NIH Director Francis Collins said after his advisory committee approved the report. In the short term, Josephine Briggs, the current director of NIH’s National Center for Complementary and Integrative Health, will serve as the acting director of the precision medicine effort, charged with implementing these recommendations and overseeing patient recruitment that is slated to start in 2016.

Beyond these short-term obstacles, researchers are seeking out a bold new world where patients consistently have the correct drug at the correct time for whatever ails them. Nevertheless, certain formidable realities about precision medicine continue to go unaddressed. For one thing, it is currently difficult to find drugs tailored to an individual patient’s DNA and particular needs. Drug discovery is expensive, and spending millions of dollars to develop pharmaceuticals that may only help a tiny fraction of patients is, from a business perspective, often out of the question.

Also, although the price for whole genome sequencing has dropped in the past decade from about $22 million to as little as $1,000 per human genome, there are still far, far more questions than answers about what a specific mutation may mean when it comes to disease. A mutation in a particular gene such as BRCA1 or BRCA2 may indicate an increased likelihood of developing breast or ovarian cancer, but it is no guarantee of disease. There is also the training obstacle: Technicians must somehow achieve uniformity in reading and interpreting genetic sequence results.

Eventually, the report notes, the massive data hub could include self-reports on diet, details on medication use derived from insurance billing codes and even passively collected information gathered by wearable sensors and mobile phone apps.

The long-term goal of the president’s initiative is to build up a massive health care database that researchers can harvest again and again for trends in medical information, genome analysis and other biomedical data as specific as the composition of our microbiomes. With this new blueprint from the advisory committee, supporters of the Precision Medicine Initiative are at least getting their first glimpses of how tailored medicine truly aspires to become and how much is left to learn.