Supplementary Table 1 - Nature Publishing Group science

Supplementary Table 1: Pseudogene candidates in the human genome


                                                                         Transcript
                                                                         Sequence                           Resequencing  Resequencing
Mouse Ortholog             Rat Ortholog                                  Evidence for     Disruptions in    Evidence:     Evidence:
(Ensembl gene)             (Ensembl gene)                 Chr.   Strand  Disruption       Human             Human         Chimp         NCBI34 start   NCBI34 stop
ENSMUSG00000028558         ENSRNOG00000010585              1        -        no                  5           Pseudogene    Pseudogene     51386816       51409646
ENSMUSG00000033794         ENSRNOG00000002071              1        +        no                 8§           Pseudogene    Pseudogene     92003670       92005119
ENSMUSG00000038237         ENSRNOG00000017664              1        +        no                 3§           Pseudogene    Pseudogene    144397851      144398686
ENSMUSG00000039925         ENSRNOG00000003507              1        -        no                 1§           Pseudogene    Pseudogene    155995011      155995917
ENSMUSG00000020600         ENSRNOG00000006119              2        +        no                  2           Pseudogene    Pseudogene     20570700       20571334
ENSMUSG00000032491         ENSRNOG00000020936              3        +       yes                 1§           Pseudogene    Pseudogene     47014387       47015231
ENSMUSG00000034978         ENSRNOG00000001668              3        +        no                 1§           Pseudogene       Gene        99351667       99352567
ENSMUSG00000038624         ENSRNOG00000000410              6        +        no                  4           Pseudogene    Pseudogene    118000469      118011792
ENSMUSG00000020010         ENSRNOG00000016219              6        -       yes                  1           Pseudogene       Gene       133024638      133036401
ENSMUSG00000036586         ENSRNOG00000001251              7        -        no                  1               Gene         Gene        2259113        2259968
ENSMUSG00000039651         ENSRNOG00000011397              8        +        no                 4§           Pseudogene    Pseudogene     17337366       17339379
ENSMUSG00000034188         ENSRNOG00000013753              8        -        no                  2           Pseudogene    Pseudogene    145446126      145448672
ENSMUSG00000028314         ENSRNOG00000005721              9        +        no                 3§           Pseudogene    Pseudogene    102417423      102419452
ENSMUSG00000042162         ENSRNOG00000013367             10        +        no                 1§           Pseudogene       Gene        45037648       45038574
ENSMUSG00000042597         ENSRNOG00000018575             11        -        no                 2§           Pseudogene    Pseudogene     4646380        4647326
ENSMUSG00000037195         ENSRNOG00000017387             11        -       yes                 2§           Pseudogene    Pseudogene     6137158        6138097
ENSMUSG00000016370         ENSRNOG00000013437             11        -        no                 2§           Pseudogene    Pseudogene     32097908       32216210
ENSMUSG00000041318         ENSRNOG00000021049             11        +        no                 2§           Pseudogene    Pseudogene     59291647       59292407
ENSMUSG00000024871         ENSRNOG00000018029             11        -        no                 1§           Pseudogene       Gene        67155671       67159052
ENSMUSG00000025887         ENSRNOG00000007702             11        -       yes                  2           Pseudogene    Pseudogene    104295161      104305886
ENSMUSG00000030351         ENSRNOG00000005188             12        -       yes                  3           Pseudogene    Pseudogene     3283215        3316629
ENSMUSG00000029658         ENSRNOG00000000904             13        +        no                  2           Pseudogene    Pseudogene     29471237       29487838
ENSMUSG00000010398         ENSRNOG00000009703             14        +        no                 2§           Pseudogene    Pseudogene     21161480       21162402
ENSMUSG00000035148         ENSRNOG00000006905             14        -        no                 1§            Allelic null    Gene        29942212       29943210
ENSMUSG00000021019         ENSRNOG00000006986             14        -       yes                  2           Pseudogene    Pseudogene     33414125       33428564
ENSMUSG00000030543         ENSRNOG00000014925             15        +        no                  2           Pseudogene    Pseudogene     88049357       88051330
ENSMUSG00000018570         ENSRNOG00000015975             17        +        no                 1§           Pseudogene    Pseudogene     7434336        7438637
ENSMUSG00000027444         ENSRNOG00000004858             20        +        no                  1           Pseudogene    Pseudogene     23451346       23455435
ENSMUSG00000039041         ENSRNOG00000006991             20        +       yes                  1           Pseudogene    Pseudogene     61517030       61522219
ENSMUSG00000005899         ENSRNOG00000001875             22        -       yes                 13           Pseudogene    Pseudogene     19281453       19304975
ENSMUSG00000031210         ENSRNOG00000012995              X        +        no                  1           Pseudogene    Pseudogene     64524466       64527302
ENSMUSG00000018595         ENSRNOG00000002391              X        -        no                 1§           Pseudogene       Gene       101734012      101751761
ENSMUSG00000036179         ENSRNOG00000007593              X        +        no                 4§           Pseudogene    Pseudogene    129043980      129044562
ENSMUSG00000023309         ENSRNOG00000007868              X        -        no                 4§           Pseudogene    Pseudogene    129237670      129238569

Evidence supporting the 34 candidate pseudogenes discussed in the main text. The evidence comprises
the mouse and rat orthologues; whether an mRNA or EST transcript with a disrupted frame exists
(here, "no" means that no such evidence is available); and the number of disruptions of the gene in
the human orthologous sequence (where § indicates disruptive elements that occur only in the final
or only exon). Conclusions from resequencing studies on whether each candidate is a pseudogene or a
gene in human and chimpanzee are also shown. The pseudogene prediction on chromosome 7 was shown,
from resequencing evidence, to be a gene, and is the only instance of an inaccurate prediction
arising from sequence error in the human genome assembly. Abbreviations: Chr., human chromosome;
%Coverage, the proportion of the mouse orthologous sequence aligned with the human sequence; and
%Identity, the percentage sequence identity of this alignment.
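For readers who want to work with the rows programmatically, the column layout described in the legend can be sketched as a small record type. This is only an illustrative sketch: the class name `CandidateRow`, its field names, and the helper `parse_row` are hypothetical and are not part of the original table. It assumes whitespace-separated rows and that "Allelic null" is the only two-word value.

```python
from dataclasses import dataclass


@dataclass
class CandidateRow:
    # Field names are hypothetical labels for the table's columns.
    mouse_ortholog: str
    rat_ortholog: str
    chromosome: str
    strand: str
    disrupted_transcript: bool  # "yes" column; False means no evidence available
    disruptions: int            # disruption count with any § marker stripped
    final_or_only_exon: bool    # True when the count carried a § marker
    human_call: str             # "Pseudogene", "Gene" or "Allelic null"
    chimp_call: str
    start: int                  # NCBI34 start coordinate
    stop: int                   # NCBI34 stop coordinate


def parse_row(line: str) -> CandidateRow:
    tokens = line.split()
    mouse, rat, chrom, strand, transcript, count = tokens[:6]
    start, stop = tokens[-2:]
    chimp_call = tokens[-3]
    # The human call may span two tokens ("Allelic null"), so take everything
    # between the fixed leading and trailing fields.
    human_call = " ".join(tokens[6:-3])
    return CandidateRow(
        mouse_ortholog=mouse,
        rat_ortholog=rat,
        chromosome=chrom,
        strand=strand,
        disrupted_transcript=(transcript == "yes"),
        disruptions=int(count.rstrip("§")),
        final_or_only_exon=count.endswith("§"),
        human_call=human_call,
        chimp_call=chimp_call,
        start=int(start),
        stop=int(stop),
    )


row = parse_row(
    "ENSMUSG00000035148 ENSRNOG00000006905 14 - no 1§ Allelic null Gene 29942212 29943210"
)
print(row.human_call, row.disruptions, row.final_or_only_exon)
```

Splitting the § marker from the count keeps the number of disruptions usable as an integer while preserving the information that the disruptions fall only in the final or only exon.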
Annotation (rows in the same order as the table above)                %Identity %Coverage
calreticulin                                                            73.2      100
phospholipid/glycerol acyltransferase family member                     62.9      100
olfactory receptor                                                      62.8      95.6
olfactory receptor                                                      75.9      97.1
amino acid transporter                                                  61.4      78.8
p75-like apoptosis-inducing death domain protein PLAIDD                 70.2      100
olfactory receptor                                                      71.3      99.7
similar to glycoprotein hormone receptor                                79.5      84.6
vascular non-inflammatory molecule vanin                                81.6      98.8
GRIFIN, galectin-related inter-fiber protein; lens-specific protein     77.8      99.3
disintegrin and metalloprotease domain 25; testase 2                    53.3      100
serine/threonine kinase 22B (spermiogenesis associated)                 66.8      100
p53-binding protein-3/topoisomerase 1-binding RING finger               48.6      100
olfactory receptor                                                      70.6      96.6
olfactory receptor                                                      74.1      99.7
olfactory receptor                                                      85.2      99.4
thioesterase                                                            69.8       99
olfactory receptor                                                      79.8      84.1
double C2 gamma                                                         82.5      97.2
caspase 12                                                              55.4      98.3
similar to platelet-endothelial tetraspan antigen 3 (PETA-3)            64.9      87.8
similarity to U5 snRNP-specific 40 kDa protein                          55.3      82.8
olfactory receptor                                                      79.8      98.1
G protein-coupled receptor 33                                           71.8      98.2
possible SAM-dependent methyltransferase                                67.5       62
mesoderm posterior 2, involved in somite segmentation                   63.6      100
protein phosphatase inhibitor 2                                         47.2      100
Sertoli cell cystatin                                                   65.3      95.4
membrane glycoprotein                                                   93.9      100
unknown                                                                  68       100
G protein-coupled (possible glucocorticoid-induced) receptor             69       100
glycine receptor subunit                                                95.4      100
olfactory receptor                                                      41.6      96.1
olfactory receptor                                                      74.3      96.8

Mean                                                                    69.4      95.5
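The final row (69.4 and 95.5) is consistent with the column means of the 34 entries above. A minimal check, with the values transcribed from the table in row order (the variable names are illustrative only):

```python
from statistics import mean

# %Identity and %Coverage values, transcribed in table order.
identity = [
    73.2, 62.9, 62.8, 75.9, 61.4, 70.2, 71.3, 79.5, 81.6, 77.8,
    53.3, 66.8, 48.6, 70.6, 74.1, 85.2, 69.8, 79.8, 82.5, 55.4,
    64.9, 55.3, 79.8, 71.8, 67.5, 63.6, 47.2, 65.3, 93.9, 68.0,
    69.0, 95.4, 41.6, 74.3,
]
coverage = [
    100, 100, 95.6, 97.1, 78.8, 100, 99.7, 84.6, 98.8, 99.3,
    100, 100, 100, 96.6, 99.7, 99.4, 99, 84.1, 97.2, 98.3,
    87.8, 82.8, 98.1, 98.2, 62, 100, 100, 95.4, 100, 100,
    100, 100, 96.1, 96.8,
]

# One value per candidate pseudogene.
assert len(identity) == len(coverage) == 34

mean_identity = round(mean(identity), 1)
mean_coverage = round(mean(coverage), 1)
print(mean_identity, mean_coverage)
```

Rounded to one decimal place, this gives 69.4 and 95.5, matching the last row of the table.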

				