The UCSC (University of California,
  Santa Cruz) Genome Browser
       “the golden path”
       genome.ucsc.edu




                                 1
                            Jim Kent
                           • Produced the first assembly of the
                             human genome as a graduate student,
                             using his program GigAssembler
•   Catalog of software includes:
     • blat - Fast alignment of similar sequences.
     • autoSql - Generate SQL and C code, from a specification file, for
       storing a structure in a database and loading it back into memory.
     • ameme - Find motifs in DNA sequence.
     • 40 other command-line programs for the Genome Browser.
     • The Intronerator - to look at C. elegans genes and splicing patterns.
     • cis-Site Seeker - Look for regulatory regions in RNA or DNA
       sequences

                                                                          2
UCSC Genome Gateway Structure
                                        Custom tracks
                           Genome
                           browser

                                            Table browser

Your                       Database         Gene sorter
             BLAT
sequence
           in silico PCR                    Proteome
                                            browser

            Downloadable data    Public MySQL       Your query
                  files             server


                                                            3
The UCSC Home page: genome.ucsc.edu

                             navigate

         navigate




                                    4
UCSC Genome Browser Gateway
   - start page, basic search




                                5
6
                                           Overview of the whole
                                           Genome Browser page

                  }   Genome viewer section

                                Groups of data
                           Mapping and Sequencing Tracks

                           Genes and Gene Prediction Tracks

                           mRNA and EST Tracks
                           Expression and Regulation

                           Comparative Genomics

                           Variation and Repeats
                           ENCODE Regions and Genes
                           ENCODE Transcript Levels
                           ENCODE Chromatin Immunoprecipitation,
                           Chromosome, Chromatin and DNA Structure,
                           Variation
Lecture/Lab 7.2                                                 7
8
Configure Tracks – Spliced ESTs,
Microarray Expression, Repeats, etc




                                      9
                  Known
                  Genes




   Spliced ESTs
    By UCSC




Simple Repeats

                          10
“Known Gene” details page for the Clock gene

                  Gene Description
                  Links to Tools/DBs
                  UniProt Description
                  Links to output
                  Sequence


                  Microarray data




                  mRNA secondary structure

                  Protein domains/structure

                  Homologs
                  Gene Ontology ™ (GO)
                  mRNA descriptions
                                                             11
                              pathways
Proteome Browser


                      Genome
                      Browser

                   Superfamily
                   Domain DB




                           12
Genome Gateway Help/User’s Guide




                                   13
BLAT – BLAST-Like Alignment Tool




                               14
In Silico PCR




                15
“Gene Sorter” and “Table Browser”

• Query the database by filtering and cross-referencing
  all of its data tables to output sequence, genomic
  positions, or text data.
• What is in all the tables?
  – genome.ucsc.edu/goldenPath/gbdDescriptions.html




                                                 16
                Gene Sorter
 • “display a sorted table of genes that are
   related to one another”




• EXAMPLE 1: Make a list of genes of
  membrane proteins that are highly expressed
  in pancreatic islet cells to possibly explore the
  role of autoimmunity in Type 1 Diabetes.

                                                  17
Gene Sorter - Configure




                          18
Gene Sorter - Filter




                       19
Gene Sorter - Output

           Sequence – genomic, protein or mRNA
           Text – tab-delimited




                                        20
         Gene Sorter - To Try Now
• EXAMPLE 2: Find genes expressed
  predominantly in the mouse adrenal gland
  that have human ‘homologs’. Get the
  sequence data and examine the expression
  of the human orthologs.
• Enter any gene to start.
• In configure menu: (a) Expand tissue selection of GNF Atlas 2 to
  “median of replicas”, (b) click on human homologs
• In filter menu: (a) set adrenal gland minimum box to 2.5, (b) look
  at results and set maximum box of other commonly expressed
  tissues to 0.5
• Complete solution in notes 7.2 UCSC.
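
The filter step above can be pictured as a simple threshold test. The following is a minimal sketch, not the Gene Sorter's own implementation: the gene names, tissue names other than the adrenal gland, and expression values are all made up for illustration, and only the 2.5 minimum and 0.5 maximum come from the slide.

```python
# Hypothetical expression table: gene -> {tissue: expression value}.
# All names and numbers here are invented; they are not GNF Atlas 2 data.
EXPRESSION = {
    "geneA": {"adrenal gland": 3.1, "liver": 0.2, "brain": -0.4},
    "geneB": {"adrenal gland": 2.8, "liver": 1.9, "brain": 0.1},
    "geneC": {"adrenal gland": 0.7, "liver": 0.3, "brain": 0.2},
}


def filter_genes(expression, target, minimum, other_maximum):
    """Keep genes whose target-tissue value is >= minimum and whose
    value in every other tissue is <= other_maximum."""
    kept = []
    for gene, tissues in expression.items():
        # Genes with no measurement for the target tissue are dropped.
        if tissues.get(target, float("-inf")) < minimum:
            continue
        others = [value for tissue, value in tissues.items() if tissue != target]
        if all(value <= other_maximum for value in others):
            kept.append(gene)
    return sorted(kept)


# Thresholds from the slide: adrenal gland minimum 2.5, other tissues maximum 0.5.
selected = filter_genes(EXPRESSION, "adrenal gland", 2.5, 0.5)
print(selected)
```

In this toy table only geneA passes: geneB is high in the adrenal gland but also in liver, and geneC is below the adrenal minimum.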
                                                                  21
                        Table Browser


Groups as
in Browser

                                                                    Tracks within
                                                                    Group


                                 Filter fields in Table and connecting Tables
                      Intersect non-connecting Tables by position




                                           RESET!



                                                                                    22
Table Browser – table schema




                               23
     Table Browser – Example
• EXAMPLE 3: Find CpG islands in known
  genes on the last part of chromosome 22 of
  the human genome. Obtain the gene
  sequences as one FASTA record per region.
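
Intersecting two tables by position amounts to an interval-overlap test. The sketch below illustrates that idea only; it is not how the Table Browser is implemented. The coordinates and feature names are invented, and the intervals are assumed to be half-open (start included, end excluded).

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def intersect_by_position(features, regions):
    """Return the names of features that overlap any region on the same chromosome."""
    hits = []
    for chrom, start, end, name in features:
        for r_chrom, r_start, r_end, _ in regions:
            if chrom == r_chrom and overlaps(start, end, r_start, r_end):
                hits.append(name)
                break  # one overlapping region is enough to keep the feature
    return hits


# Made-up records: (chromosome, start, end, name).
cpg_islands = [
    ("chr22", 100, 200, "cpg1"),
    ("chr22", 500, 600, "cpg2"),
    ("chr21", 150, 180, "cpg3"),
]
genes = [
    ("chr22", 150, 400, "geneX"),
    ("chr22", 700, 900, "geneY"),
]

cpg_in_genes = intersect_by_position(cpg_islands, genes)
print(cpg_in_genes)
```

The nested loop is fine for a handful of records; a real data set of this kind would normally use sorted intervals or an index instead.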




                                   Change to


                                               24
    Table Browser – CpG Example




Click on ‘intersection’

                      Set group to ‘Expression
                      and Regulation’ and track
                      to ‘CpG Islands’




                                          25
Table Browser – CpG Example




                              26
     Table Browser – CpG Example




Copy and paste
sequences, or
set up an ‘output file’ in
the Table Browser




                                   27
  Table Browser – Example To Try
• EXAMPLE 4: Find trinucleotide repeats of
  more than 10 copies within mRNA sequence
  on human chromosome 4. How many are
  there? How many are linked to known
  disease genes?
• Hints
   • Period = 3, copies > 10.
   • Intersect tables and custom track.
   • Tables: knownGene, simpleRepeats, spDisease
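
The first hint (period = 3, copies > 10) can be expressed as a record filter. This is a rough sketch over invented records: the field names `period` and `copies` are assumptions chosen to match the hint, not the actual column names of the repeats table, and the example does not answer the exercise.

```python
from collections import namedtuple

# Hypothetical record layout; the real table's columns may differ.
Repeat = namedtuple("Repeat", ["chrom", "start", "end", "period", "copies"])

# Made-up records for illustration only.
repeats = [
    Repeat("chr4", 1000, 1045, 3, 15.0),  # trinucleotide, enough copies
    Repeat("chr4", 2000, 2024, 3, 8.0),   # trinucleotide, too few copies
    Repeat("chr4", 3000, 3060, 4, 15.0),  # wrong period
    Repeat("chr1", 4000, 4036, 3, 12.0),  # wrong chromosome
]


def trinucleotide_repeats(records, chrom, min_copies=10):
    """Keep repeats on `chrom` with period 3 and more than `min_copies` copies."""
    return [
        r for r in records
        if r.chrom == chrom and r.period == 3 and r.copies > min_copies
    ]


hits = trinucleotide_repeats(repeats, "chr4")
print(len(hits))
```

The remaining hints (intersecting with gene and disease tables) would then be applied to the records that survive this filter.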




                                                   28
                 VisiGene
- in situ mRNA and protein images in mice and frogs




                                                     29
      Data Downloads
- from download link on homepage




             ...




                                   30
Example: simpleRepeats table




                               31
           Public MySQL Server
See the Data and Downloads FAQ:
           Direct MySQL access to data
http://genome.ucsc.edu/FAQ/FAQdownloads#download29


Command from local MySQL client:
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A
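
For scripted use, the same command line can be assembled as an argument list. The sketch below only builds and prints the command and does not connect to anything. The user, host and `-A` flag are taken from the slide; the optional database name (`hg19`) and query are assumptions added for illustration, passed with the standard `mysql` client options `-D` and `-e`.

```python
import shlex


def mysql_command(database=None, query=None,
                  user="genome", host="genome-mysql.cse.ucsc.edu"):
    """Build the argument list for the mysql client shown on the slide."""
    argv = ["mysql", f"--user={user}", f"--host={host}", "-A"]
    if database:
        argv += ["-D", database]  # select a database (assumed example name below)
    if query:
        argv += ["-e", query]     # run one statement and exit
    return argv


# Reproduces the slide's command exactly.
print(shlex.join(mysql_command()))

# Hypothetical non-interactive use: list the tables of an assumed database.
print(shlex.join(mysql_command(database="hg19", query="SHOW TABLES")))
```

The list form can be handed to `subprocess.run` on a machine with a MySQL client installed; `shlex.join` is used here only to display it as a shell command.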




                                                          32

				