Quality control of genotype and DNA handling

Shared by: sammyc2007
Posted: 3/29/2008
Genome-Wide Association Studies in Breast and Prostate Cancer: Cancer Genetic Markers of Susceptibility (CGEMS)
Stephen Chanock, M.D.
Chief of Laboratory of Translational Genomics, DCEG, NCI
Director, Core Genotyping Facility, DCEG, NCI
October 20, 2007

Identifying Genetic Markers for Prostate & Breast Cancer
Genome-Wide Analysis
Public Health Problem: Prostate (1 in 8 Men), Breast (1 in 9 Women)
Analyze Long-Term Studies: NCI PLCO Study, Nurses' Health Study
Initial Study, Follow-up #1, Follow-up #2
Establish Loci
Fine Mapping
Functional Studies
Validate Plausible Variants
Possible Clinical Testing
http://cgems.cancer.gov

CGEMS: Nurses' Health Study
Longitudinal study of 121,700 women enrolled in 1976
CGEMS case-control set derived from 32,826 participants who provided a blood sample between 1989 & 1990
Followed for incident disease until May 2004
Post-menopausal invasive breast cancer
Capture rate estimated to be 90%
Controls matched on age, blood collection, ethnicity (self-described Caucasians) and use of hormones

Final Selection for Association Analysis
Starting Sample Set: 2,494
1,183 cases & 1,185 controls
93 dups, 5 triplicates
23 QC samples
Repeat if <94%

Summary of selection of cases and controls
                                          cases    controls
Initially attempted                       1,183    1,185
- low completion rate                     30       29
- unclear identity                        5        13
- admixed origin                          3        1
= Used in scan for association analysis   1,145    1,142

Fingerprints:
1. Identifiler(TM) Kit is a multiplex assay that amplifies 15 STR markers and the amelogenin locus for gender determination
   Robust PCR reaction requires less than 1 ng of DNA
   Matching probabilities that exceed one in one billion
   Experience with nearly 200,000 samples at CGF
2. Check with BPC3 SNPs

Dye     Marker       Chromosome                  Possible Alleles
6-FAM   D8S1179      8                           8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
6-FAM   D21S11       21q11.2-q21                 24, 24.2, 25, 26, 27, 28, 28.2, 29, 29.2, 30, 30.2, 31, 31.2, 32, 32.2, 33, 33.2, 34, 34.2, 35, 35.2, 36, 37, 38
6-FAM   D7S820       7q11.21-22                  6, 7, 8, 9, 10, 11, 12, 13, 14, 15
6-FAM   CSF1PO       5q33.3-34                   6, 7, 8, 9, 10, 11, 12, 13, 14, 15
VIC     D3S1358      3p                          12, 13, 14, 15, 16, 17, 18, 19
VIC     TH01         11p15.5                     4, 5, 6, 7, 8, 9, 9.3, 10, 11, 13.3
VIC     D13S317      13q22-31                    8, 9, 10, 11, 12, 13, 14, 15
VIC     D16S539      16q24-qter                  5, 8, 9, 10, 11, 12, 13, 14, 15
VIC     D2S1338      2q35-37.1                   15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28
NED     D19S433      19q12-13.1                  9, 10, 11, 12, 12.2, 13, 13.2, 14, 14.2, 15, 15.2, 16, 16.2, 17, 17.2
NED     vWA          12p12-pter                  11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
NED     TPOX         2p23-2pter                  6, 7, 8, 9, 10, 11, 12, 13
NED     D18S51       18q21.3                     7, 9, 10, 10.2, 11, 12, 13, 13.2, 14, 14.2, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27
PET     D5S818       5q21-31                     7, 8, 9, 10, 11, 12, 13, 14, 15, 16
PET     FGA          4q28                        17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 26.2, 27, 28, 29, 30, 30.2, 31.2, 32.2, 33.2, 42.2, 43.2, 44.2, 45.2, 46.2, 47.2, 48.2, 50.2, 51.2
PET     Amelogenin   X: p22.1-22.3; Y: p11.2     X, Y
LIZ     SIZE STANDARD

SNP Call Rates in NHS
Attempted                        555,352
Failed (no or <90% call rate)     -8,706
MAF < 1%                         -18,473
Analysis SNPs                    528,173 (95.1%)

Table 1. Summary of completion rate for NHS samples
Sample                      Completion rate for Scan 1 study   Scan 1 case   Scan 1 control
528,173 SNPs (retained)     99.754 %                           99.756 %      99.773 %
546,646 SNPs (attempted)    95.754 %                           95.704 %      95.799 %

Triangular plot for admixture vectors of cases and controls in NHS using STRUCTURE program
Based on 2 sets of SNPs (7,050 & 7,061) with r2 < 0.01
[Plot vertices: Asia, Europe, Africa]

Assessment of Discordance Rates
Participants: 142 duplicate pairs
Mean discordance rate: prostate 2.0 x 10^-4, breast 1.5 x 10^-4
CEPH-CGEMS: 74 duplicate pairs
CEPH-HapMap: 28 individuals (with 24 duplicates)
Mean discordance rate 1.4 x 10^-3
Mean discordance rate 2 x 10^-4

Concordance Rates: 99.985%
NHS: 50,820,003 of 50,827,468 comparisons

Overall SNP & DNA Success Rates: PLCO & NHS
Criterion: SNP call rate > 90%
Criterion: DNA completion rate > 94%
Number of SNPs    attempted   failed   success rate
PLCO*             561,494     1,490    0.973
NHS               555,352     8,706    0.984
Number of individuals: attempted 4696, failed 66, success rate 0.986
[Diagram: attempted SNPs by attempted DNAs. SNP x: 2.1%. DNA y: 1.4%]
Working set of data: proportion of missing data 0.002

NHS GWAS: Deviation from Hardy-Weinberg Proportions
Figure 1. Log-scale quantile plot of p-values for deviation from Hardy-Weinberg proportions
[Axes: log10(p-value) against log10(quantile)]
p<0.05: 29,318 (5.55%)
p<0.001: 2,880 (0.55%)
~38% map to CNV regions

CGEMS Breast Cancer Scan in NHS
Log quantile plot of p-values for all SNPs (528,173 SNPs)
[Axes: log10(p-value) against log10(quantile); series: corrected, uncorrected]

Log quantile plot of p-values for the 550 SNPs with lowest p values (0.1%)
[Panels: Breast cancer GWA scan, Prostate cancer GWA scan; axes: log10(p-value) against log10(quantile); series: corrected, uncorrected]
Incidence density sampling

GWAS in NHS Breast Cancer
[Fig 1a: log10(p-value) by position, pter to qter, for chromosomes 1 to 22. Labels on the plot: TLR6 (38.5M), rs12505080 (37M), RELN, rs10510126 (125M), FGFR2 (123M), ATP10A, rs11150911]

Heterogeneity in Signal: BCAC & CGEMS (NHS)
BCAC* best hits                           CGEMS
SNP          GENE              CHR   TYPED IN CGEMS? (p)   BEST IN 100kb REGION   p          RANK IN SCAN
rs1219648    FGFR2             10    2.00E-06                                                1
rs889312     MAP3K1            5     no                    rs726501               0.012      6,340
rs3817198    LSP1              11    0.51                                                    269,442
rs2107425    H19               11    no                    rs217228               0.030      14,344
rs13281615                     8     no                    rs10098985             0.017      12,048
rs981782                       5     no                    rs4866929              7.30E-05   33
rs30099                        5     0.88                                                    462,781
rs4666451                      2     no                    rs12710697             0.046      24,741
rs3803662    TNRC9/LOC643714   16    0.05                                                    27,025
*Easton Nature 2007: 3 Stage Design, >28,000 cases/26,000 controls
CGEMS (http://cgems.cancer.gov): 1145 cases/1142 controls

Figure 2. FGFR2 SNPs in NHS GWAS: Intron 2
[Axes: position 123.2 to 123.4 against log10(p-value); gene track: FGFR2]
Pooled: p = 1.1 x 10^-10
Cases/controls: 2,921/3,214
Hunter et al Nat Genet 2007

General Strategy for Breast Cancer GWAS
Initial Study: 1150 cases/1150 controls, 540,000 Tag SNPs
Follow-up Study #1: 4000 cases/4000 controls, >33,000 SNPs
Follow-up Study #2: 5000 cases/5000 controls, at least 7,600 SNPs
Fine Mapping: 10 ±5 loci
http://cgems.cancer.gov

Two Strategies for Staged Follow-up
[Diagram, two panels, each: 550k GWAS, then 28000 F/U #1, then 150 F/U #2. Labels: Cone of Truth, Truth]

Aggressive Prostate Cancer
High priority to examine non-aggressive vs aggressive
Cohort based studies (screening)
• Bias towards early cases
Enrich primary scan with >55% aggressive cases
• Aggressive defined as:
• Gleason score ≥ 7 OR Stage C/D
• Follow-up studies in cohorts
• Comparable distributions for early/advanced
Approximately same ratio overall for follow-up studies

Multiple potential genetic models
[Diagram of transitions between No cancer, Non-aggressive cancer and Aggressive cancer]
Simple process: predisposition gene
Independent processes: non-aggressive specific gene, aggressive specific gene
Multistep process: initiation gene, progression gene
Accelerator gene
For each SNP type, the mode of expression may be recessive, dominant, additive, multiplicative and even overdominant.
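The modes of expression named above are commonly tested by recoding each genotype before fitting a model. The following is a minimal sketch of such a recoding, not the method used in CGEMS; the function name `encode_genotype` and the mode labels are hypothetical, and treating "multiplicative" as the same 0/1/2 count (to be used on a log-odds scale) is an assumption.

```python
def encode_genotype(risk_allele_count: int, mode: str) -> int:
    """Recode a count of risk alleles (0, 1 or 2) under a mode of expression.

    Hypothetical helper for illustration only.
    """
    if risk_allele_count not in (0, 1, 2):
        raise ValueError("risk_allele_count must be 0, 1 or 2")

    if mode == "recessive":
        # Effect only when both copies carry the risk allele.
        return 1 if risk_allele_count == 2 else 0
    if mode == "dominant":
        # One copy is enough to show the effect.
        return 1 if risk_allele_count >= 1 else 0
    if mode in ("additive", "multiplicative"):
        # Per-allele coding. Assumption: the two modes share this coding and
        # differ only in the scale of the model it is entered into
        # (risk scale versus log-odds scale).
        return risk_allele_count
    if mode == "overdominant":
        # Heterozygotes differ from both homozygote classes.
        return 1 if risk_allele_count == 1 else 0
    raise ValueError(f"unknown mode: {mode}")


# Coding of the three genotypes under each mode.
for mode in ("recessive", "dominant", "additive", "overdominant"):
    print(mode, [encode_genotype(g, mode) for g in (0, 1, 2)])
```

A test with more degrees of freedom avoids choosing one coding in advance, at some cost in power when a single mode is in fact correct.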
Inclusion of PLCO prostate cancer patients
[Timeline, 1994 to 2002, starting from 0 cases in each group]
Aggressive Cancer: 737 (Oct 2003)
Non-aggressive Cancer: 624 (Oct 2001)
1994: 28,521 eligible participants
Matching with controls was performed for 737 aggressive cases and 493 randomly selected non-aggressive cases.
Non-aggressive: stage <= 2 (non invasive) and Gleason score <= 6
Aggressive: stage > 2 (invasive) or Gleason score > 6

Log-Log Quantile Plot for p-values for the 4 Statistical Tests Used
[Axes: Log10(p value) against Log10(quantile)]
Series: Sing. Sampl. No cov; Incid. Den. Sampl. No cov; Sing. Sampl. with cov; Incid. Den. Sampl. with cov
http://cgems.cancer.gov

CGEMS Prostate Cancer GWS
[Plot of Log10(p-value) along chromosomes 1 to 22 and X, p and q arms, incidence density sampling. Labeled region: 8q24]

General Strategy for Prostate GWAS
Initial Study: 1150 cases/1150 controls, 540,000 Tag SNPs (PLCO)
Follow-up Study #1: 3700 cases/3900 controls, >28,000 SNPs (ACS/ATBC/HPFS/FrCC/PHS)
Follow-up Study #2: 5500 cases/5500 controls, at least 7,600 SNPs (MEC/EPIC/JHU/SwCaP)
Fine Mapping: 10 ±5 loci
Genotype, Haplotype, Sequence
Determine Causal Variant(s)

Breakdown of Agnostic 26,890 SNPs in Prostate Cancer Follow-up #1
Single SNP Analysis: identify "best" 30,000 (i.e. lowest p values)
  Incidence Density Sampling
  Covariates (age in 5 years and center)
  Score test (4 df)
Two-SNP analysis
  Stratified for each notable SNP
  Inclusion of those with improved p values
FILTER: Selection was sequential such that any SNP with r2 > 0.8 with a selected SNP was not selected.
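The sequential r2 filter described above can be sketched as a greedy pass over SNPs ranked by p-value. This is an illustrative sketch only, not the CGEMS pipeline; the function `select_snps`, the SNP names and the pairwise r2 values are all hypothetical.

```python
from typing import Callable, List


def select_snps(ranked_snps: List[str],
                r2: Callable[[str, str], float],
                threshold: float = 0.8) -> List[str]:
    """Walk SNPs from best to worst p-value and keep a SNP only if its r2
    with every already selected SNP does not exceed the threshold."""
    selected: List[str] = []
    for snp in ranked_snps:
        if all(r2(snp, kept) <= threshold for kept in selected):
            selected.append(snp)
    return selected


# Hypothetical pairwise r2 values; unlisted pairs are treated as unlinked.
toy_r2 = {
    frozenset(("snpA", "snpB")): 0.95,
    frozenset(("snpA", "snpC")): 0.10,
    frozenset(("snpC", "snpD")): 0.85,
}


def lookup_r2(a: str, b: str) -> float:
    return toy_r2.get(frozenset((a, b)), 0.0)


# SNPs listed from lowest to highest p-value (hypothetical ranking).
ranked = ["snpA", "snpB", "snpC", "snpD"]
print(select_snps(ranked, lookup_r2))  # ['snpA', 'snpC']
```

Because the pass is sequential, the result depends on the ranking: a SNP is dropped only when a better-ranked SNP in high r2 with it has already been kept.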
Tally of the 26,890 agnostic SNPs:
Single SNP = 24,988 with α < 0.063
Two SNP = 1,902 (7.6%)

iSelect Composition for Prostate Scan
Agnostic 1 SNP                               86.5%
Agnostic 2 SNP                               6.6%
Population Stratification Monitors           5.2%
8q24                                         0.5%
Illumina 550 SNPs + Candidate Coverage       1.2%
Of 28,880 bead types designed, approximately 1300 (4.5%) failed design and/or manufacturing

Prostate Cancer & 8q24
Admixture mapping identifies 8q24 as a prostate cancer risk locus in African-American men. Freedman et al. PNAS 2006;103:14068-73.
DECODE: rs1447295 + microsatellite. Amundadottir et al. Nat Genet. 2006;38:652-58.
rs1447295 replication in BPC3 & 3 other studies
Commonly amplified in prostate tumors
"Gene-poor region"
GWAS: multiple signals. Yeager et al Nat Genet 39:645-649, 2007
[Plot axes: -log10, r2]

Replication Studies in CGEMS Prostate Cancer GWAS
rs6983267
Study   Subjects             Predisposing allele frequency   P-value
        Cases    Controls    Cases    Cont.
PLCO    1157     1172        0.55     0.49                   2.4 x 10^-5
ACS     1151     1150        0.55     0.50                   3.2 x 10^-3
ATBC    896      894         0.57     0.51                   1.9 x 10^-3
FPCC    459      455         0.56     0.51                   1.2 x 10^-1
HPFS    636      625         0.57     0.51                   1.0 x 10^-2
ALL     4299     4296        0.56     0.50                   9.4 x 10^-13

rs1447295
Study   Predisposing allele frequency   P-value
        Cases    Cont.
PLCO    0.14     0.10                   9.8 x 10^-5
ACS     0.12     0.08                   2.7 x 10^-5
ATBC    0.21     0.17                   2.9 x 10^-2
FPCC    0.12     0.07                   4.4 x 10^-3
HPFS    0.13     0.09                   2.7 x 10^-3
ALL     0.15     0.11                   1.5 x 10^-14

Estimated Odds Ratios (Overall)
                rs6983267   rs1447295
Heterozygotes   1.26        1.43
Homozygotes     1.58        2.23

Ancestral Recombination Graph Analysis* of the 8q24 region identifying independent regions of association flanking a hotspot of recombination.
*Minichiello & Durbin AJHG 2006

Population Attributable Risk of Prostate Cancer with 8q24 Loci in Caucasians
                ALL     ACS     ATBC    FPCC    HPFS    PLCO
Joint PAR       0.284   0.255   0.251   0.306   0.249   0.347
PAR rs1447295   0.085   0.094   0.052   0.096   0.085   0.086
PAR rs6983267   0.209   0.192   0.157   0.091   0.180   0.276
rs6983267 G: 21%
rs1447295 A: 7%
• Suggests that both SNPs contribute substantially to the population burden of prostate cancer.
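The slides do not state how the population attributable risk (PAR) values were computed. As a rough illustration of the idea, the sketch below applies the standard genotype-based formula PAR = 1 - 1/sum(f_g * RR_g), assuming Hardy-Weinberg genotype frequencies in controls and using odds ratios as stand-ins for relative risks. The function name `genotype_par` is hypothetical, and reading 1.26 and 1.58 as the heterozygote and homozygote odds ratios for rs6983267 is an assumption; the result need not reproduce the table above.

```python
def genotype_par(risk_allele_freq: float, or_het: float, or_hom: float) -> float:
    """Population attributable risk for one biallelic SNP.

    Assumptions (not taken from the slides): genotype frequencies follow
    Hardy-Weinberg proportions, and odds ratios approximate relative risks.
    """
    p = risk_allele_freq
    q = 1.0 - p
    # Average relative risk in the population, with non-carriers as reference.
    mean_rr = q * q * 1.0 + 2.0 * p * q * or_het + p * p * or_hom
    return 1.0 - 1.0 / mean_rr


# rs6983267: control allele frequency 0.50 (ALL row), odds ratios 1.26 and 1.58.
par = genotype_par(0.50, 1.26, 1.58)
print(round(par, 3))
```

With no effect (both odds ratios equal to 1) the function returns 0, which is a quick sanity check on the formula.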
Follow-up to GWAS Studies
Fine Mapping of Notable Regions
Genotyping & Sequencing
Bio-informatics (exclude common CNV)
Analysis of Population Genetics
Functional Determination of Causal Variant(s)
Design Issue for Analysis in Clinical Studies
Population-based studies
Sequence of Clinical Studies
Validation in Follow-up Studies
Clinical Implementation

CGEMS: caBIG Posting
Pre-computed Analysis: No Restrictions
Raw Genotype, Case/control, Age (in 5 yrs), Family Hx (+/-): Registered Access, SF424, Data Use Certificate
http://cgems.cancer.gov/data

Access to CGEMS Analyses through caBIG Portal
Pre-computed Analyses with Methods PDF
Prostate 1A Scan   300,000 SNPs   Oct 2006
Prostate 1 Scan    530,000 SNPs   Feb 2007
Breast 1 Scan      530,000 SNPs   April 2007
Registered Access (individual and institutional access)
Signed SF424 (modified) with Abstract
Data Use Certificate

Association Tests 8q24
Scan 1A, ~300,000 SNPs
http://cgems.cancer.gov
Available 10/06

Committed Studies CGEMS
Prostate Cancer: PLCO (GWAS), ACS, HPFS, PHS, ATBC, CeRePP, EPIC, MEC
Breast Cancer: NHS (GWAS), PLCO, WHI, Polish C/C, ACS, EPIC, MEC

Acknowledgements
CGEMS & DCEG: Gilles Thomas, Kevin Jacobs, Meredith Yeager, Robert Hoover, Joseph Fraumeni, Daniela Gerhard, Zhaoming Wang, Xiang Deng, Nick Orr, Robert Welch, Richard Hayes, Sholom Wacholder, Nilanjan Chatterjee, Kai Yu, Margaret Tucker, Marianne Rivera-Silva
HSPH: David Hunter, Peter Kraft, David Cox, Sue Hankinson
CeRePP, France: Olivier Cussenot, Geraldine Cancel-Tassin, Antoine Valeri
ACS: Michael Thun, Heather Feigelson, Eugenia Calle
Wellcome Trust, UK: Mark Minichiello
NPHI, Finland: Jarmo Virtamo
NCICB: Ken Buetow, Carl Schaefer, Subhah Madhavan, Liming Yang
Wash. U., St Louis: Gerald Andriole
