Exome Sequencing Deciphers Rare Diseases


Leading Edge
Analysis
Two years ago, NIH’s Undiagnosed Diseases Program began delivering genomics to the clinic on an unprecedented scale. Now, with 128 exomes sequenced and 39 rare diseases diagnosed, the program’s success is paving the way for widespread personal genomics while pioneering new techniques for reining in the “tsunami” of genomics data.

In 2009, a healthy Colombian couple watched their two sons suffer from a mysterious neurological illness, featuring seizures, tremors, and other complications. When the younger son succumbed to the illness at age 13, the family decided to try a new approach. They enrolled their elder son, who will be referred to here as Carlos, in NIH’s Undiagnosed Diseases Program, a trans-institute initiative dedicated to deciphering the cause of rare, mysterious health conditions.

The Undiagnosed Diseases Program at NIH started in May 2008 as a pilot program with initial funding of $280,000. Since then, it has grown to include 75 physicians and scientists from almost every institution at NIH, including endocrinologists, immunologists, oncologists, and cardiologists. The team now works together on 300 cases with $3.5 million in funding each year.

What sets the program apart from other large clinical projects is its state-of-the-art genetic analyses. The team’s Illumina platforms can sequence the entire exome (that is, the 180,000 exons in the human genome) of a patient and his family in only 9 weeks. In addition, high-resolution microarrays can genotype a million single-nucleotide polymorphisms (SNPs) for each family member, providing information about the remaining 99% of the genome: the introns and noncoding regions. In this way, the Undiagnosed Diseases Program is leading the charge in bringing genomics to the clinic.

So why is this type of clinical genomics not routinely available outside the NIH? Although the price of sequencing has dropped to roughly $15,000 per genome or about $3000 for an exome, data storage and the ensuing analysis remain too costly, too laborious, and frankly too inefficient to put into common practice in the clinic. Sequencing is cheap and fast, but gleaning diagnostic information from the data is still quite laborious and costly.

But recent successes by the Undiagnosed Diseases Program are shifting this trend. First, the program is learning how to identify patients who might benefit most from genomic analysis. Second, the team has developed tools for quickly narrowing down the flood of genomics data to a handful of candidate genes for functional tests. With these new strategies, the program has made 39 diagnoses, including the team’s first reported discovery of a new disease, which was published last month in The New England Journal of Medicine.

Through the program, “scientists and clinicians will get an understanding of the difficulties involved in next-gen sequencing, as well as the benefits; of what cases are appropriate for genomic study and of what filters should be applied to the data,” says William Gahl, the program’s director and clinical director of the National Human Genome Research Institute.

Family Matters First

Before Carlos and his family arrived at the NIH Clinical Center, their first major hurdle was acceptance into the program. The NIH has received 4600 inquiries for the program and 1700 applications, but it has enrolled only 380 cases. In general, the program selects patients whose maladies appear genetic in origin, based on their family history or a sign that their symptoms have a shared underlying cause.

Typically, the NIH team first attempts to diagnose illnesses by looking for known genetic markers and using standard molecular and biochemical tests that are commercially available. However, if they have no good leads on the disease and therefore no candidate genes to search for, they’ll look to a patient’s whole exome. An entire exome yields an overload of information, however, and must be carefully edited. Exome sequences typically vary at about
Cell 144, March 4, 2011 ©2011 Elsevier Inc. 635
20,000 places, explains Thomas Markello, a clinician involved with the program at NHGRI. This vast variation makes it nearly impossible to find 1–2 rare alleles underlying a mysterious disease. In other words, the nucleotide haystack is just too big.

After some trial-and-error, the team has developed a strategy to quickly shrink the haystack: they eliminate many variants by comparing the patient’s DNA with genomic information from the family. This added information enhances the signal-to-noise ratio and allows them to reduce the number of candidate genes. “Data reduction is pretty much everything,” Markello says.

“It’s simply not worth doing whole exome sequencing on a single individual,” says Gahl, “when the clinical manifestations don’t point to a specific group of genes.” With family information, the team can apply classical pedigree tools reminiscent of population genetics in the 1970s to filter out mutations that don’t follow predicted modes of Mendelian inheritance. And DNA from healthy family members allows them to eliminate harmless variations that run in the family but don’t lead to the disease.

“Family data makes a fundamental difference,” says David Adams, an NHGRI clinician who worked on Carlos’s case. “You don’t just need family history; we’ve learned you need family DNA to succeed.”

So for Carlos’s case, the geneticists sequenced his exome, as well as those of his parents and his deceased brother. In general, the team gets 88% of the complete exome with 99.9% confidence. The exome data from Carlos and his family initially generated 120,000 variants for program geneticists to sift through. Remarkably, the team then narrowed this list to three candidate genes using a series of software programs developed internally at the NIH.

The geneticists first align the patient’s exome with a reference sequence (typically the one generated in the Human Genome Project) and exclude unlikely candidate mutations according to a kill-list of variants present in more than 1% of exomes and genomes stored in databases from two separate projects, NIH’s ClinSeq and the 1000 Genomes Project. Mutations are then ranked by how severely they alter the coding sequence. For example, a mutation encoding a stop codon is considered more severe than one with no effect on the resulting amino acid.

Another in-house software program then eliminates mutations if their inheritance pattern doesn’t match that of the disease. For example, Carlos’s disease appeared to follow a recessive mode of inheritance, and thus, his team kept only the variants that behaved accordingly. Finally, the team exports the results to their program VarSifter, which lists the candidate variants in order from least to most likely.

Currently, some of this software is available to the scientific community, and according to NHGRI’s Nancy Hansen, the entire suite may be released once it’s optimized so that clinicians and scientists might follow the lead of the Undiagnosed Diseases Program.

Success Stories Accumulating

Although the program’s strategy is still evolving, this “whittling down” approach is producing results. At the end of 2010, the program identified the root of Carlos’s neurological symptoms, and a report on the teen’s diagnosis is pending publication. Data from the Undiagnosed Diseases Program team also helped Manfred Boehm at the National Heart, Lung and Blood Institute quickly locate the mutation behind a mysterious vascular calcification disorder. The team handed him a list of 100 candidate genes, and within 1 month he had pinpointed mutations in the NT5E gene as the causal culprit.

NT5E encodes a membrane-bound nucleotidase involved in extracellular ATP metabolism. Boehm’s study identified, in three families, numerous mutations that destroy the activity of the nucleotidase. By targeting this enzyme, clinicians may soon be able to treat a disease that didn’t even have a name a few months ago. The study was published last month in The New England Journal of Medicine.

In contrast, Boehm says, a couple of years ago, NIH geneticists handed him a list of 2000 genes potentially underlying a known disorder, and a year passed before he could identify the causal mutation.

Thus far, the program has diagnosed 39 cases, 3 of which are neurological and muscular diseases discovered through exome analysis. Another 3 were discovered with SNP analysis, whereas the remaining cases were diagnosed with commercial tests, including the identification of the rare disease congenital disorder of glycosylation type 2B.

The Undiagnosed Diseases team also counts concise lists of candidate mutations as partial successes. These “short lists” are handed off to bench scientists for functional studies on how the mutations contribute to disease. Right now, these collaborations are all within the NIH. But program leaders say they’ll work with basic researchers outside of the NIH once they’ve established a portal for collaboration.

Marjan Huizing, a metabolic disorders researcher at NHGRI, is not involved with the program but says she’s “piggybacking” off their exomic techniques. She studies endocytic trafficking defects that lead to albinism, bleeding, and infections.

Traditional candidate gene approaches have frustrated her team. They spent 12 years screening patients for candidate genes and never found the right one. So she decided to follow the program’s lead and analyze the exomes of three patients. Right off the bat, she identified the causal mutation underlying two of the cases. For the third, more difficult case, she’s planning to sequence the patient’s parents.

In the meantime, geneticists are keeping an eye on the program for clues about the nature of rare diseases and whether sequencing alone can identify “medically actionable alleles,” says Harvard genomicist George Church. “If that turns out to be true, it’s a paradigm shift.”

Although the clinicians at the Undiagnosed Diseases Program echo Church’s curiosity, they keep the focus tightly on their patients. The thrill of a diagnosis through genomics is a triumph for scientists and clinicians, but it means even more to patients and their loved ones, who have sought explanations for years.

Still, it is a bittersweet success. Carlos now has a name to call his malady, but there remains no cure or powerful drug
to combat it. “Even when we get lucky, the best we can do is offer hope to the patient that down the road, someone might use the diagnosis to develop a treatment,” says John Gallin, director of the NIH Clinical Center. “Our patients have referred to this as the House of Hope, but from the perspective of a care provider, I wish I could do even more.”
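
The filtering steps described in this article (a kill-list of common variants, a ranking by severity, and an inheritance filter) can be illustrated with a minimal Python sketch. This is not the NIH software or VarSifter: the `Variant` record, the severity scores, the family member labels, and the simple homozygous-recessive test are all assumptions made for illustration, and real pipelines also handle cases such as compound heterozygosity and sequencing error.

```python
from dataclasses import dataclass

# Illustrative severity scores (an assumption, not the NIH ranking):
# a premature stop codon outranks a change that leaves the amino acid intact.
SEVERITY = {"stop_gained": 3, "missense": 2, "synonymous": 0}


@dataclass
class Variant:
    gene: str               # hypothetical gene label
    effect: str             # key into SEVERITY
    population_freq: float  # fraction of reference exomes carrying the variant
    genotypes: dict         # family member -> copies of the variant allele (0, 1, 2)


def is_rare(variant, max_freq=0.01):
    """Drop kill-list variants, i.e. those seen in more than 1% of reference exomes."""
    return variant.population_freq <= max_freq


def fits_recessive(variant, affected, parents):
    """Simplified recessive model: every affected member carries two copies
    and each healthy parent carries exactly one."""
    affected_ok = all(variant.genotypes.get(m, 0) == 2 for m in affected)
    parents_ok = all(variant.genotypes.get(m, 0) == 1 for m in parents)
    return affected_ok and parents_ok


def shortlist(variants, affected, parents):
    """Apply both filters, then rank the survivors from most to least severe."""
    kept = [v for v in variants
            if is_rare(v) and fits_recessive(v, affected, parents)]
    return sorted(kept, key=lambda v: SEVERITY.get(v.effect, 1), reverse=True)


if __name__ == "__main__":
    recessive = {"patient": 2, "brother": 2, "mother": 1, "father": 1}
    variants = [
        Variant("GENE_A", "stop_gained", 0.0, recessive),
        Variant("GENE_B", "missense", 0.05, recessive),  # too common
        Variant("GENE_C", "missense", 0.001,
                {"patient": 1, "brother": 1, "mother": 1, "father": 0}),
    ]
    for v in shortlist(variants, ["patient", "brother"], ["mother", "father"]):
        print(v.gene, v.effect)
```

Sorting most severe first is a presentation choice in this sketch; the article notes that VarSifter lists candidates from least to most likely.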
Amy Maxmen
New York, NY, USA
DOI 10.1016/j.cell.2011.02.033