
Using Web-Services:
NCBI EUtils, online BLAST
BCHB524
2009
Lecture 14



11/02/2009    BCHB524 - 2009 - Edwards
Outline
    NCBI EUtils
            …from a script, via the internet


    NCBI Blast
            …from a script, via the internet


    Exercises



NCBI Entrez
    Powerful web-portal for NCBI's online databases
            Nucleotide
            Protein
            PubMed
            Gene
            Structure
            Taxonomy
            OMIM
            etc…


NCBI Entrez
    We can do a lot using a web-browser
            Look up a specific record
                nucleotide, protein, mRNA, EST, PubMed, structure,…
            Search for matches to a gene or disease name
            Download sequence and other data associated
             with a nucleotide or protein
    Sometimes we need to automate the process
            Use Entrez to select and return the items of
             interest, rather than download, parse, and select.
NCBI EUtils
    Used to automate the use of Entrez capabilities.
    Google: Entrez Programming Utilities
            http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
    See also, Chapter 8 of the BioPython tutorial
    Play nice with the Entrez resources!
            For any series of more than 100 requests, run at weekends or outside US peak hours
            Supply your email address
            Use history for large requests
            …otherwise you or your computer could be banned!
            Biopython automates many of the requirements...
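One way to stay within the request limits is to pause between calls. The following is a minimal sketch, assuming a limit of roughly three requests per second. The `Throttle` class and its default interval are illustrative names and values, not part of EUtils or BioPython.

```python
import time


class Throttle:
    """Enforce a minimum delay between successive requests."""

    def __init__(self, min_interval=0.34, clock=time.monotonic, sleep=time.sleep):
        # min_interval=0.34 s is an assumed value (about three requests per second).
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Block until min_interval seconds have passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call `throttle.wait()` immediately before each request in a loop. The clock and sleep functions are parameters only so that the class can be exercised without real delays.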
NCBI EUtils
    No need to use Python, BioPython
    Can form URLs and parse XML directly.
    EInfo
    PubMed Info
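Forming an EUtils URL and parsing the XML reply needs only the standard library. The sketch below assumes the current EUtils base address; the helper names and the abbreviated sample reply are hand-written for illustration and are not real server output.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed base address of the EUtils web-services.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def eutils_url(tool, **params):
    """Build the URL for one EUtils call, e.g. tool='einfo'."""
    url = EUTILS_BASE + tool + ".fcgi"
    query = urlencode(params)
    return url + "?" + query if query else url


def database_names(einfo_xml):
    """Extract the database names from an EInfo XML reply."""
    root = ET.fromstring(einfo_xml)
    return [node.text for node in root.iter("DbName")]


# Abbreviated, hand-written sample in the shape of an EInfo reply.
SAMPLE_EINFO = """<eInfoResult>
  <DbList>
    <DbName>pubmed</DbName>
    <DbName>protein</DbName>
    <DbName>nucleotide</DbName>
  </DbList>
</eInfoResult>"""

print(eutils_url("einfo"))
print(eutils_url("einfo", db="pubmed"))
print(database_names(SAMPLE_EINFO))
```

To query the live service, pass the URL to `urllib.request.urlopen` and feed the bytes it returns to `database_names`.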


BioPython and Entrez EUtils
    As you might expect, BioPython provides
     some nice tools to simplify this process




BioPython and Entrez EUtils
    "Thin" wrapper around EUtils web-services
            Use EUtils argument names
                db for database name, for example
    Use Entrez.read to make a simple dictionary
     from the XML results.
            Could also parse XML directly (ElementTree), or
             get results in GenBank format (for sequence)
    Use result.keys() to "discover" structure of
     returned results.
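A minimal sketch of the thin-wrapper pattern, assuming BioPython is installed. `einfo_record` and `outline` are hypothetical helper names; `outline` simply walks the nested dictionary and list structure that `Entrez.read` returns, which is one way to "discover" what a result contains.

```python
def einfo_record(email, db=None):
    """Call EInfo through BioPython and return the parsed result."""
    from Bio import Entrez  # imported here so the helper below works without BioPython

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.einfo(db=db) if db else Entrez.einfo()
    try:
        return Entrez.read(handle)
    finally:
        handle.close()


def outline(result, indent=0):
    """Return lines listing the keys of a nested dict/list result."""
    lines = []
    pad = "  " * indent
    if isinstance(result, dict):
        for key in result.keys():
            lines.append(pad + str(key))
            lines.extend(outline(result[key], indent + 1))
    elif isinstance(result, list):
        lines.append(pad + "[list of %d]" % len(result))
        if result:
            # Describe only the first item; the others usually share its shape.
            lines.extend(outline(result[0], indent + 1))
    return lines
```

For example, `print("\n".join(outline(einfo_record("you@example.org", db="pubmed"))))` prints an indented tree of keys, where the email address is a placeholder for your own.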
EUtils Web-Services
    EInfo
            Discover database names and fields
    ESearch
            Search within a particular database
            Returns "primary ids"
    EFetch
            Download database entries
    Others:
            ELink, EPost, ESummary, EGQuery
Using ESearch
    By default only get back some of the ids:
            Use retmax to get back more…
            Meaning of returned id is database specific…
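A minimal sketch of an ESearch call with `retmax`, assuming BioPython is installed. `search_ids` is a hypothetical helper name; it returns both the total match count and the ids actually sent back, so the difference between the two is visible.

```python
def search_ids(term, email, db="pubmed", retmax=20):
    """Run ESearch and return (total_count, returned_ids)."""
    from Bio import Entrez  # requires BioPython

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.esearch(db=db, term=term, retmax=retmax)
    try:
        result = Entrez.read(handle)
    finally:
        handle.close()
    # "Count" is the number of matches; "IdList" holds at most retmax ids.
    return int(result["Count"]), list(result["IdList"])
```

If the count is larger than the length of the id list, raise `retmax` or page through the results.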




Using EFetch
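A minimal sketch of an EFetch call, assuming BioPython is installed and that FASTA text is the desired output. `fetch_fasta` is a hypothetical helper name; the ids would typically come from an earlier ESearch.

```python
def fetch_fasta(ids, email, db="protein"):
    """Download the entries with the given ids as FASTA text."""
    from Bio import Entrez  # requires BioPython

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.efetch(db=db, id=",".join(ids), rettype="fasta", retmode="text")
    try:
        return handle.read()
    finally:
        handle.close()
```

The returned text can be written to a file or parsed further, for example with `Bio.SeqIO`.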




ESearch and EFetch together
    Entrez provides a more efficient way to
     combine ESearch and EFetch
            After esearch, Entrez already knows the ids you
             want!
            Sending the ids back with efetch makes Entrez
             work much harder
    Use the history mechanism to "remind"
     Entrez that it already knows the ids
    Access large result sets in "chunks".
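The history mechanism and chunked access can be sketched as follows, assuming BioPython is installed. `chunk_starts` and `fetch_all` are hypothetical helper names, and the chunk size of 100 is an arbitrary choice.

```python
def chunk_starts(total, size):
    """Return the starting offsets needed to cover `total` items in chunks."""
    return list(range(0, total, size))


def fetch_all(term, email, db="protein", chunk_size=100, rettype="fasta"):
    """ESearch once with history, then EFetch the results chunk by chunk."""
    from Bio import Entrez  # requires BioPython

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.esearch(db=db, term=term, usehistory="y")
    try:
        search = Entrez.read(handle)
    finally:
        handle.close()

    total = int(search["Count"])
    chunks = []
    for start in chunk_starts(total, chunk_size):
        # WebEnv and QueryKey "remind" Entrez of the stored ids,
        # so the ids themselves are never sent back.
        handle = Entrez.efetch(db=db, rettype=rettype, retmode="text",
                               retstart=start, retmax=chunk_size,
                               webenv=search["WebEnv"],
                               query_key=search["QueryKey"])
        try:
            chunks.append(handle.read())
        finally:
            handle.close()
    return "".join(chunks)
```

Only the search term travels to the server once; every later request carries just the history tokens and an offset.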
ESearch and EFetch together




NCBI Blast
    NCBI provides a very powerful blast search service on the web
    We can access this infrastructure as a web-service
    BioPython makes this easy!
            Ch. 7 in Tutorial


NCBI Blast
    Lots of parameters…
    Essentially mirrors blast options
    You need to know how to use blast first!



NCBI Blast
    Required parameters:
            Blast program, Blast database, Sequence
            Returns XML format results, by default.
    Save results to a file, for parsing…
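A minimal sketch of running an online BLAST search and saving the XML, assuming a BioPython release that provides `Bio.Blast.NCBIWWW`. `blast_to_file` is a hypothetical helper name, and the `blastp` and `nr` defaults are illustrative choices.

```python
def blast_to_file(sequence, out_path, program="blastp", database="nr"):
    """Run an online BLAST search and save the XML results to out_path."""
    from Bio.Blast import NCBIWWW  # requires BioPython

    # The three required parameters: program, database, sequence.
    handle = NCBIWWW.qblast(program, database, sequence)
    try:
        xml_text = handle.read()
    finally:
        handle.close()
    # Saving the results means they can be re-parsed without searching again.
    with open(out_path, "w") as out:
        out.write(xml_text)
    return out_path
```

Online searches can take a while, so keeping the saved file around avoids repeating the request while the parsing code is being developed.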




NCBI Blast Parsing
    Results need to be parsed in order to be useful…
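A minimal sketch of pulling the significant hits out of BLAST XML, using only the standard library's ElementTree rather than BioPython's own BLAST parser. `significant_hits`, the e-value cutoff, and the abbreviated sample document are all hand-written for illustration; the sample is not real BLAST output.

```python
import xml.etree.ElementTree as ET

# Abbreviated, hand-written sample in the shape of BLAST XML output.
SAMPLE_BLAST_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>example|1</Hit_id>
          <Hit_def>hypothetical protein A [Homo sapiens]</Hit_def>
          <Hit_hsps>
            <Hsp><Hsp_evalue>1e-50</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_id>example|2</Hit_id>
          <Hit_def>hypothetical protein B [Mus musculus]</Hit_def>
          <Hit_hsps>
            <Hsp><Hsp_evalue>0.5</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def significant_hits(blast_xml, max_evalue=1e-5):
    """Return (id, description, best e-value) for hits at or below max_evalue."""
    root = ET.fromstring(blast_xml)
    hits = []
    for hit in root.iter("Hit"):
        evalues = [float(node.text) for node in hit.iter("Hsp_evalue")]
        if evalues and min(evalues) <= max_evalue:
            hits.append((hit.findtext("Hit_id"),
                         hit.findtext("Hit_def"),
                         min(evalues)))
    return hits


for hit_id, description, evalue in significant_hits(SAMPLE_BLAST_XML):
    print(hit_id, description, evalue)
```

For results saved to a file, read the file's contents and pass the text to `significant_hits`.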




Lab exercises
    Try each of the examples shown in these slides.

    Read chapters 7 & 8 of the BioPython tutorial
            Try each of the examples, especially sec. 8.13


    Write a program using NCBI's EUtils to print 20
     PubMed abstracts referencing the BRCA1 gene.
            Change your program to print the 20 most recent PubMed
             abstracts that reference the BRCA1 gene.


Lab exercises
    Write a program using NCBI's EUtils to find
     and retrieve the RefSeq human BRCA1
     proteins from NCBI.
            Use Query:
             "Homo sapiens"[Organism] AND BRCA1[Gene Name] AND REFSEQ

            Change your program to search these protein(s)
             using the NCBI blast web-service against all
             RefSeq proteins
            Change your program to filter the returned results
             appropriately to find a list of RefSeq mouse
             BRCA1 orthologs.

								