Researchers are getting closer to understanding how complex DNA variations cause disease.
A new report suggests that an innovative method of analyzing DNA will allow medical researchers to predict disease risk for far more people than can be done with conventional medical tests.

Figuring out the elements that make up our genomes — and identifying the ones that cause disease — is kind of like beaming an alien to Earth and asking him to explain how a piano works. The alien might start with some obvious deductions, like noticing that the black keys are different from the white keys. After exploring the outside features, the alien would find all sorts of new things to consider when he opened the piano and saw all the wires.
Exploring genomes has been a similar experience. At first, scientists noticed groupings of DNA and realized that they functioned as units or genes. Later, they found that single pieces of DNA within those genes might vary, and those little changes could cause disease. Despite everything we’ve learned about how genomes work, we still haven’t even opened the piano — there are many surprises yet to come about the DNA encoded in every living thing on this planet.
And that’s why the recent paper is so important. For years, scientists have tried to link rare and common diseases back to the individual genes that cause them. And since there are many diseases caused by a single gene, that approach has been effective in some cases. But diseases that are more common and complex, such as diabetes, have resisted a straightforward genetic explanation.
Now scientists at Harvard and a leading genome center in Cambridge, Mass., have come up with a new method that can calculate genetic risk for developing certain common diseases based not on single genes (the norm so far) but on multiple genes acting in concert. That’s a lot more challenging than it sounds. For one thing, the number of possible gene combinations grows rapidly as more genes are considered, so analyzing them takes far more computing power than analyzing single genes. For another, scoring risk predictions based on many factors requires a different approach from what’s been the standard in genome analysis. Researchers had to develop new algorithms to mine large troves of genomic data, analyze the information, and pick out clinically relevant signals linked to many different genes.
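To make the idea of scoring risk across many genes more concrete: a polygenic score is commonly computed as a weighted sum of the risk variants a person carries. The short Python sketch below assumes that general form. The variant names, the weights, and the function name `polygenic_score` are hypothetical placeholders for illustration, not values or methods taken from the paper.

```python
from typing import Dict

# Hypothetical per-variant weights (effect sizes), invented for illustration.
# Real scores use weights estimated from large genetic association studies.
WEIGHTS: Dict[str, float] = {
    "variant_a": 0.12,
    "variant_b": -0.05,
    "variant_c": 0.30,
}


def polygenic_score(genotype: Dict[str, int], weights: Dict[str, float]) -> float:
    """Return the weighted sum of risk-allele counts (0, 1, or 2 per variant)."""
    total = 0.0
    for variant, weight in weights.items():
        # Simplifying assumption: a variant missing from the genotype counts as 0.
        count = genotype.get(variant, 0)
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {variant} must be 0, 1, or 2")
        total += weight * count
    return total


# One made-up person: two copies of variant_a, one of variant_b, none of variant_c.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = polygenic_score(person, WEIGHTS)
```

In practice the sum would run over millions of variants rather than three, which is part of why the computation is demanding.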
The work appears to have paid off. The team looked at several diseases, including breast cancer, coronary artery disease, and Type 2 diabetes. For each condition, they began with data from 400,000 people and analyzed millions of locations across each person’s genome. The multifactor risk predictions that emerged from this approach may be able to pinpoint millions of people at risk for developing these diseases with far greater accuracy than current single-factor genetic tests or other medical risk assessments, paving the way for early intervention or prevention protocols.
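One simple way to turn a set of risk scores into a list of people worth a closer look is to rank everyone and flag the highest-scoring fraction. The sketch below shows that ranking step only. The function name `flag_high_risk`, the cutoff, and the scores are all hypothetical, and the paper's actual procedure may differ.

```python
from typing import Dict, List


def flag_high_risk(scores: Dict[str, float], top_fraction: float) -> List[str]:
    """Return the IDs of people whose scores fall in the top fraction of the group."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be greater than 0 and at most 1")
    # Rank people from highest score to lowest.
    ranked = sorted(scores, key=lambda person_id: scores[person_id], reverse=True)
    # Always flag at least one person when the group is non-empty.
    n_flagged = max(1, int(len(ranked) * top_fraction))
    return ranked[:n_flagged]


# Made-up scores for a group of five people.
cohort = {"p1": 0.19, "p2": -0.02, "p3": 0.71, "p4": 0.33, "p5": 0.05}
flagged = flag_high_risk(cohort, top_fraction=0.2)
```

With five people and a cutoff of 0.2, only the single highest scorer is flagged.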
According to the scientists involved, these multigene risk scores could be predictive even for people who show no other warning signs for these diseases.
“If they came into my clinical practice, I wouldn’t be able to pick them out as high risk with our standard metrics,” said Massachusetts General Hospital cardiologist Amit Khera, one of the scientists in the study, in a statement about patients with high risk scores for coronary artery disease. “There’s a real need to identify these cases so we can target screening and treatments more effectively, and this approach gives us a potential way forward.”
Read the full paper, “Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations,” from Amit Khera et al.