Special Technical Feature
MICROARRAY BASED DNA PROFILING
Queensland Institute of Medical Research, Herston, QLD 4029
*Corresponding author: email@example.com
Microarrays were initially developed to analyse gene expression. The technology has since been adapted to a variety of applications, one of which is the analysis of the genome to study DNA dosage and sequence content. Advances in microarray fabrication and completion of large-scale genome sequencing projects have enabled the rapid development of array-based methods for genome-wide assessment of DNA alterations.

There are many different types of DNA changes that can occur, including amplification or deletion of anywhere from a single base pair to thousands of base pairs, single nucleotide substitutions or polymorphisms (SNPs), and loss of heterozygosity (LOH). A broad spectrum of human diseases is associated with genomic changes. Some of these diseases have a complex genetic component in which a number of transcripts or DNA alterations contribute to disease. Identification of these genomic regions will improve our understanding of the biology associated with these diseases and is important in determining disease susceptibility.

Comparative Genomic Hybridisation to Microarrays
Comparative genomic hybridisation (CGH) was the first efficient approach to study DNA alterations and has been used to gain information regarding DNA copy number gains and losses in a variety of diseases since the early 1990s (1). This technique uses labelled test and normal reference DNA, which are hybridised to normal chromosome metaphase spreads. Regions of chromosome gain or loss caused by duplications, amplifications or deletions are detected as a ratio between the labelled test and normal reference DNA along the length of the chromosomes. Although groundbreaking, traditional CGH is now considered labour intensive, and its poor resolution (~20 Mb) leaves it unable to detect specific transcripts within areas of gain and loss across the genome that contribute to disease.

In the late 1990s, array CGH was developed, which combines traditional CGH with microarrays and improves the resolution of copy number detection. During array CGH, DNA from test and reference populations is labelled and hybridised to microarrays that have been printed with either genomic fragments, such as bacterial artificial chromosome (BAC) clones from key genomic regions (2), or cDNA clones, which are also used for gene expression analysis (3). After hybridisation, the microarrays are scanned and the ratio of test to reference DNA is used to calculate copy number variation between the samples. DNA microarrays were then produced to span the human genome and consisted of 2400 BAC clones, which represented the sequenced human genome at an average resolution of ~1.4 Mb (4). These were followed by tiling resolution BAC microarrays containing >30,000 probes (5). However, these BAC arrays still had disadvantages, including the inability to detect SNPs or LOH.

The most recent array-based tools for the analysis of genomic changes are oligo array-based techniques for genotyping thousands of SNPs simultaneously (6). The data from these oligo SNP arrays can then be manipulated to perform high resolution analysis of DNA copy number variation (Fig. 1) (7) and LOH (8). Recent advances in array technology and genome information from projects such as HapMap have allowed the development of whole genome SNP arrays, which are discussed below.
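The test-to-reference ratio used in array CGH can be illustrated with a minimal Python sketch. It is not taken from the cited studies: the intensity values are hypothetical, the function names are invented for illustration, and the ±0.3 log2 cut-offs are an assumed convention rather than a published standard. Real analyses also involve background correction, normalisation and segmentation, all of which are omitted here.

```python
import math

def log2_ratios(test, reference):
    """Return log2(test / reference) for each probe (inputs: positive intensities)."""
    if len(test) != len(reference):
        raise ValueError("test and reference must have the same number of probes")
    return [math.log2(t / r) for t, r in zip(test, reference)]

def call_copy_number(log_ratio, gain=0.3, loss=-0.3):
    """Classify one probe; the +/-0.3 cut-offs are illustrative assumptions."""
    if log_ratio >= gain:
        return "gain"
    if log_ratio <= loss:
        return "loss"
    return "no change"

# Hypothetical scanned intensities for four probes (e.g. BAC clones).
test_signal = [1000.0, 2000.0, 500.0, 1000.0]
reference_signal = [1000.0, 1000.0, 1000.0, 1000.0]

ratios = log2_ratios(test_signal, reference_signal)
calls = [call_copy_number(r) for r in ratios]

for r, c in zip(ratios, calls):
    print(f"log2 ratio {r:+.2f}: {c}")
```

A log2 ratio of 0 means the test and reference signals are equal, which is why 0 is read as "no change" on copy number plots.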
Fig. 1. Example of how SNP data can be used to study copy number variation. The genomic profile of chromosome 8 is shown. The upper plot shows the log ratio, where 0 = no change, and the lower plot the AF (allele frequency). Two regions of amplification are detected as there is an increase in the log ratio and loss of heterozygosity in the allele frequency.
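The two tracks described in the Fig. 1 caption can be sketched in a few lines of Python. This is an illustrative simplification, not the method used to produce the figure: the allele intensities and expected totals are hypothetical, the function names are invented, and the 0.25 to 0.75 band used to count a SNP as heterozygous is an assumption. Vendor software applies more elaborate normalisation and clustering.

```python
import math

def snp_tracks(a_signal, b_signal, expected_total):
    """Per-SNP log ratio and allele frequency (AF) from two allele intensities."""
    log_ratio = []
    allele_freq = []
    for a, b, ref in zip(a_signal, b_signal, expected_total):
        total = a + b
        log_ratio.append(math.log2(total / ref))  # 0 = no copy number change
        allele_freq.append(b / total)             # ~0.5 at a heterozygous SNP
    return log_ratio, allele_freq

def summarise_region(log_ratio, allele_freq, het_band=(0.25, 0.75)):
    """Mean log ratio and fraction of SNPs whose AF falls in the assumed het band."""
    mean_lr = sum(log_ratio) / len(log_ratio)
    het = sum(1 for af in allele_freq if het_band[0] <= af <= het_band[1])
    return mean_lr, het / len(allele_freq)

expected = [1000.0] * 4  # hypothetical reference total intensity per SNP

# Hypothetical unaltered region: a mix of heterozygous and homozygous SNPs.
normal = summarise_region(*snp_tracks([500.0, 1000.0, 0.0, 500.0],
                                      [500.0, 0.0, 1000.0, 500.0], expected))

# Hypothetical amplified region: raised total signal, no AF values near 0.5.
amplified = summarise_region(*snp_tracks([2000.0, 0.0, 2000.0, 1800.0],
                                         [0.0, 2000.0, 0.0, 200.0], expected))

print("normal    (mean log ratio, het fraction):", normal)
print("amplified (mean log ratio, het fraction):", amplified)
```

In this toy data, the amplified region shows both a raised mean log ratio and no SNPs in the heterozygous band, which is the combination the caption describes.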
Vol 38 No 3 December 2007 AUSTRALIAN BIOCHEMIST Page 11
The International HapMap Project
SNPs are DNA variations that occur in single nucleotides and demonstrate significant heterozygosity in human populations. SNPs may fall within coding regions of genes, where they may change the amino acid sequence of the protein, or in non-coding or intergenic regions. SNPs are the most common form of genetic variation in the human genome and there are thought to be about 10 million SNPs. Alleles of SNPs that are close together tend to be inherited together in haplotype blocks. Thus assaying a single SNP, known as a tag SNP, will provide data on the neighbouring SNPs and will improve the power of association studies. The International HapMap Project (www.hapmap.org) has aided the development of whole genome SNP genotyping by describing these common genetic variation patterns to enable identification of a cohort of tag SNPs. More than 2.2 million common SNPs in four ethnic groups have been genotyped as part of the International HapMap Project and it has been shown that about 300,000 tag SNPs will provide about 70% coverage of the genome in Caucasians and Asians.

SNP Genotyping Arrays
The microarrays currently available to the researcher allow for large-scale genome SNP genotyping, high density copy number and LOH analysis. The resolution of SNP arrays is increasing exponentially and the largest arrays currently available enable genotyping of more than 1 million SNPs in a single experiment. These arrays are the Affymetrix™ Genome-Wide Human SNP Array (www.affymetrix.com) and the Illumina™ Human1M BeadChip (www.illumina.com). Both of these arrays are manufactured in a similar fashion to their gene expression counterparts, but contain oligo probes for genotyping. The assays are simple to run; for the Affymetrix arrays, genomic DNA is digested, ligated and amplified by PCR. The DNA is then labelled and hybridised to the arrays and the fluorescence intensities are measured for each allele of each SNP. In contrast, for Illumina BeadChips, the DNA is amplified, fragmented and hybridised to the BeadChips, which contain allele-specific oligo probes. Once hybridised, a single base extension of the oligos on the BeadChip, using the hybridised DNA as a template, incorporates detectable labels on the BeadChip, which are used to determine the genotype after scanning. About half of the SNPs targeted by these arrays are tag SNP markers derived from the human HapMap Project. While the remaining SNPs on the Affymetrix array are mainly an unbiased selection of SNPs, those on the Illumina arrays are focused on SNPs near or in genes. Both chips also contain a selection of copy number variant probes that are spaced along the genome.

SNP Genotyping Array Studies
Genome-wide association studies utilise SNP arrays to genotype genetic markers across case and control populations. The regions in the genome that show statistical significance potentially harbour a candidate disease locus. This type of study has allowed the localisation and identification of genes linked to diseases. Recently, a number of excellent papers have been published that utilise whole genome SNP arrays to improve our understanding of many diseases, including lymphoblastic leukaemia (9), celiac disease (10), mental retardation (11), diabetes (12) and breast cancer (13-15).

Australian researchers have been involved in several genome-wide association studies, two of which were published this year in Nature and Science. The first, published in Nature, was a genome-wide association study that identified novel breast cancer susceptibility loci (13), while the second, published in Science, analysed why humans show a variation in vulnerability to HIV-1 infection (16).

Microarray-based approaches to study genomic abnormalities have significantly advanced in recent years. It is now possible to determine copy number variation, SNP genotypes and LOH at a high resolution. The development of these robust, large-scale genome approaches utilising SNP genotyping microarrays will help researchers gain insight into the biology of human diseases and will identify many loci of disease susceptibility.

References
1. Kallioniemi, A., Kallioniemi, O.P., Sudar, D., et al. (1992) Science 258, 818-821.
2. Pinkel, D., Segraves, R., Sudar, D., et al. (1998) Nat. Genet. 20, 207-211.
3. Pollack, J.R., Perou, C.M., Alizadeh, A.A., et al. (1999) Nat. Genet. 23, 41-46.
4. Snijders, A.M., Nowak, N., Segraves, R., et al. (2001) Nat. Genet. 29, 263-264.
5. Ishkanian, A.S., Malloff, C.A., Watson, S.K., et al. (2004) Nat. Genet. 36, 299-303.
6. Kennedy, G.C., Matsuzaki, H., Dong, S., et al. (2003) Nat. Biotechnol. 21, 1233-1237.
7. Bignell, G.R., Huang, J., Greshock, J., et al. (2004) Genome Res. 14, 287-295.
8. Rauch, A., Ruschendorf, F., Huang, J., et al. (2004) J. Med. Genet. 41, 916-922.
9. Mullighan, C.G., Goorha, S., Radtke, I., et al. (2007) Nature 446, 758-764.
10. van Heel, D.A., Franke, L., Hunt, K.A., et al. (2007) Nat. Genet. 39, 827-829.
11. Friedman, J.M., Baross, A., Delaney, A.D., et al. (2006) Am. J. Hum. Genet. 79, 500-513.
12. The Wellcome Trust Case Control Consortium (2007) Nature 447, 661-678.
13. Easton, D.F., Pooley, K.A., Dunning, A.M., et al. (2007) Nature 447, 1087-1093.
14. Hunter, D.J., Kraft, P., Jacobs, K.B., et al. (2007) Nat. Genet. 39, 870-874.
15. Stacey, S.N., Manolescu, A., Sulem, P., et al. (2007) Nat. Genet. 39, 865-869.
16. Fellay, J., Shianna, K.V., Ge, D., et al. (2007) Science 317,