IntAct Molecular Interaction Database




APO-SYS 2008
25th June 2008
Samuel Kerrien (skerrien@ebi.ac.uk)
Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
Outline
   1. Lecture: The IntAct Database (45 min)
      1. History of the project
      2. The data we are dealing with
      3. Additional resources (UniProt, Newt, OLS)
      4. Data Standards for molecular interaction
      5. IntAct Applications allowing you to:
          1. Browse - Search
          2. Visualize – HierarchView


   2. Practical Session (60 min)

      Things to remember for the practical session

Genomics and Proteomics

  DNA: Genomics
  RNA: Transcriptomics
  Protein: Proteomics
  Small Molecules: Metabolomics

  Functional Genomics/Proteomics




                 What data are we dealing with ?
Why are we interested in interactions?

1.   As a means of precisely understanding a protein's role
     inside a specific cell type

2.   Guilt by association: it may be the only means of
     predicting a protein's function

3.   As building blocks for Systems Biology




IntAct goals & achievements
 1. Define a standard for the representation and
    annotation of protein interaction data
    - Curation manual available from home page
    - Member of the International Molecular interaction Exchange consortium (IMEx)

 2. Provide a public repository
    http://www.ebi.ac.uk/intact
    ftp://ftp.ebi.ac.uk/pub/databases/intact

 3. Populate the repository with experimental data from
    project partners and curated literature data
    3,300+ distinct publications, 169,000+ binary interactions,
    63,000+ proteins imported from UniProt

 4. Provide modular analysis tools
    search & advanced search, hierarchView, pay-as-you-go, MiNe…

 5. Provide portable versions of the software to allow
    installation of local IntAct nodes.
    Known installations: AstraZeneca, GSK, MERCK, MINT,
                         Proteome Center of Shanghai

Statistics





                      IntAct at a glance
Interactome coverage
  • Only a fraction of all published interactions is
   captured in interaction databases
  • An end is not in sight, the interaction space is still
   vastly under-sampled




                                               Christian Kohler
Public data

  •   All data is manually curated by expert curators
  •   Curation manual rigorously followed
  •   All curated data is reviewed by a senior curator
  •   Topic-centric datasets available (e.g. Apoptosis)
  •   All data is made available on the FTP site:
      ftp://ftp.ebi.ac.uk/pub/databases/intact

      (!) data updated every week
      (!) formats available:


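   How to model an interaction

   [Diagram: a Publication holds Experiments (Experiment1, Experiment2); each
    Experiment holds Interactions (Interaction1 to Interaction4); each Interaction
    has Participants (Participant1 to Participant3), which point to the
    interacting Proteins (Protein1, Protein2). A Participant carries Roles,
    Features and Preparations.]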
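
The containment hierarchy in the diagram can be sketched with a few Python dataclasses. This is only an illustration under assumptions: the class and field names follow the labels on the slide, but they are not the actual IntAct classes, and the helper `proteins_in` is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Protein:
    accession: str  # e.g. a UniProtKB accession


@dataclass
class Participant:
    interactor: Protein
    roles: List[str] = field(default_factory=list)         # e.g. "bait", "prey"
    features: List[str] = field(default_factory=list)
    preparations: List[str] = field(default_factory=list)


@dataclass
class Interaction:
    label: str
    participants: List[Participant] = field(default_factory=list)


@dataclass
class Experiment:
    label: str
    interactions: List[Interaction] = field(default_factory=list)


@dataclass
class Publication:
    identifier: str
    experiments: List[Experiment] = field(default_factory=list)


def proteins_in(publication: Publication) -> List[str]:
    """Walk publication -> experiment -> interaction -> participant -> protein."""
    seen: List[str] = []
    for experiment in publication.experiments:
        for interaction in experiment.interactions:
            for participant in interaction.participants:
                accession = participant.interactor.accession
                if accession not in seen:
                    seen.append(accession)
    return seen


# Placeholder labels taken from the diagram, not real identifiers.
p1, p2 = Protein("Protein1"), Protein("Protein2")
pub = Publication("Publication", [
    Experiment("Experiment1", [
        Interaction("Interaction1", [
            Participant(p1, roles=["bait"]),
            Participant(p2, roles=["prey"]),
        ]),
    ]),
])
```

The same walk, read in reverse, is the navigation path used later in the applications: protein -> interaction -> experiments -> publication.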



Data model
 •     Support for detailed features
       i.e. definition of interacting interface



            Interacting domains




     Overlay of Ranges on sequence:




Controlled vocabularies
 •   Why do we use them ?

     e.g. more than 20 ways to write:
                     yeast two hybrid, Y2H, 2H, two-hybrid, …


 •   Full integration of PSI-MI ontology

 •   Over 1,200 terms, fully defined and cross-referenced
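
A minimal sketch of why a controlled vocabulary helps: the free-text spellings quoted above all collapse onto one term identifier. The synonym table is hypothetical and lists only the spellings from the slide; `MI:0018` is assumed here to be the PSI-MI identifier for "two hybrid" and should be checked in the Ontology Lookup Service before being relied on.

```python
# Hypothetical synonym table: free-text spellings mapped to one CV term.
# MI:0018 is assumed to be the PSI-MI identifier for "two hybrid".
SYNONYMS = {
    "yeast two hybrid": "MI:0018",
    "y2h": "MI:0018",
    "2h": "MI:0018",
    "two-hybrid": "MI:0018",
}


def normalise(method_text):
    """Return the CV term identifier for a free-text method name, or None."""
    return SYNONYMS.get(method_text.strip().lower())
```

Once every record carries the identifier rather than the spelling, searching and filtering by method becomes a simple equality test.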




 Controlled vocabularies
     •   These controlled vocabularies are hierarchical and
         vary in size and complexity.

     [Figures: Interactor types; Interaction detection methods]




How to deal with complexes
 •   Some experimental protocols generate complex data,
     e.g. tandem affinity purification (TAP)

 •   One may want to convert these complexes into sets of
     binary interactions; 2 algorithms are available:
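
The slide text does not name the two algorithms (they were shown as a figure). The sketch below assumes they are the two expansion models commonly used for this purpose, usually called spoke and matrix; the function names are illustrative, not IntAct code.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Spoke model: pair the bait with every prey it pulled down."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(members):
    """Matrix model: pair every member of the complex with every other member."""
    return list(combinations(members, 2))
```

For a complex of n members, the spoke model yields n - 1 binary pairs while the matrix model yields n(n - 1)/2, so the choice of model strongly affects how many binary interactions a single TAP experiment contributes.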




Other useful databases

•   You will need to know a little more about the following
    databases to do the practical part of this session :

    •   Ontology Lookup Service

    •   UniProtKB (Universal Protein Resource)

    •   Newt (NCBI taxonomy)




Ontology Lookup Service
http://www.ebi.ac.uk/ontology-lookup

 •   Makes available OBO controlled vocabularies
 •   Web site allows for searching and browsing their hierarchy
 •   Each term has a definition as well as literature references




UniProt Knowledge Base
http://www.ebi.uniprot.org/

 •   Swiss-Prot: manual annotation (~300,000 proteins)
 •   TrEMBL: automatic annotation (~3,300,000 proteins)
 •   Interactions in IntAct make use of splice variants




UniProt Knowledge Base: summary
 •   Master protein: P60953
 •   Splice variants / isoforms: P60953-1, P60953-2
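
The relation between an isoform identifier and its master protein can be sketched as a small string operation, assuming isoform identifiers always take the form "master accession, hyphen, isoform number" as in the examples above. The function name is hypothetical.

```python
def master_accession(accession):
    """Return the master accession for an identifier such as 'P60953-2'.

    A plain master accession ('P60953') is returned unchanged.
    """
    master, separator, suffix = accession.partition("-")
    if separator and not suffix.isdigit():
        raise ValueError(f"unexpected isoform suffix in {accession!r}")
    return master
```

Grouping interactions by `master_accession` gives a per-protein view, while keeping the full identifier preserves the splice-variant detail.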




UniProt Knowledge Base
 •   IntAct exports interaction data to UniProt.
 •   Only interactions detected by specific methods are
     exported: mostly physical methods, hence higher quality interactions.




Newt
 •   Web Interface to the NCBI taxonomy








IntAct Applications
  •   Now we are going to see :
      •   IntAct home page
      •   How to search data
      •   Building complex queries
      •   How to navigate from
          protein -> interaction -> experiments -> publication

      •   How to visualize interaction networks using:
          •       IntAct tools
          •       A third party application: Cytoscape




    IntAct – Home Page
http://www.ebi.ac.uk/intact




Software demonstration




•   Web application: binary search
    •   Simple, yet powerful search engine
    •   Binary interaction centric
    •   Advanced search – how to build complex queries
    •   Entry point to other applications


Browsing – binary search
First search from the home page…

[Screenshot callouts: links to UniProtKB, Newt, PubMed and OLS; details of interaction]


PSIMITAB columns

Standard columns (15):
• ID(s) interactor A & B
• Alt. ID(s) interactor A & B
• Alias(es) interactor A & B
• Interaction detection method(s)
• Publication 1st author(s)
• Publication Identifier(s)
• Taxid interactor A & B
• Interaction type(s)
• Source database(s)
• Interaction identifier(s)
• Confidence value(s)

IntAct specific columns (+11):
• Experimental role(s) of interactors
• Biological role(s) of interactors
• Properties (CrossReference) of interactors
• Type(s) of interactors
• HostOrganism(s)
• Expansion method(s)
• Dataset name(s)
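
A minimal sketch of reading the 15 standard columns from one PSIMITAB line. It assumes the usual conventions of the format: tab-separated columns in the order listed above, "|" between multiple values and "-" for an empty column. The key names and the sample line are invented for illustration; only the accession P60953 comes from this lecture.

```python
STANDARD_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_mitab_line(line):
    """Split one PSIMITAB line into a dict of value lists (standard columns only)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(STANDARD_COLUMNS):
        raise ValueError(f"expected at least 15 columns, got {len(fields)}")
    record = {}
    for name, raw in zip(STANDARD_COLUMNS, fields):
        # "-" marks an empty column; "|" separates multiple values.
        record[name] = [] if raw == "-" else raw.split("|")
    return record


# Invented sample line; the second interactor is a placeholder, not a real accession.
sample = "\t".join([
    "uniprotkb:P60953", "uniprotkb:EXAMPLE", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "-", "pubmed:0", "taxid:9606", "taxid:9606",
    "-", "-", "-", "-",
])
```

The split on "|" is deliberately naive: it does not handle a "|" inside a quoted value, which a production parser would need to do. Extra IntAct-specific columns after the first 15 are ignored here.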

Browsing – binary search
First search from the home page…

 •   Using the IntAct query language, one can also build
     complex queries
 •   List of terms one can query on:




Browsing – binary search
How to build complex queries…

 •   Advanced search gives access to more options…








Software demonstration




•   Web application: detailed search
    •   IntAct original search interface
    •   More detailed information about experiment,
        interaction, interactor…
    •   Entry point to other applications



                                   Browsing – binary search
First search from the home page…




[Screenshot callout: details of interaction]


                        Browsing – binary search
Viewing details of an interaction…




                           Browsing - search
Search result for 'RAD1'




                              Browsing - search
Binary view of o60671_human




                                                      Protein selected
                                    Proteins known to interact with o60671_human



                       Browsing - search
Details of a Protein




                            Browsing - search
Binary view of rad1_yeast




                  Experiment view
                  Interaction between rad1_yeast and sahh_yeast




                                 Browsing - search
Details of an Interaction Type




                                   •   All CVs can be clicked, giving access to:
                                       •      Comprehensive definition
                                       •      Cross references

                                   Browsing - search
An interaction involving a feature








    Software demonstration




•   Web application: hierarchView
    •   2D visualization of molecular interaction networks
    •   Interactive expansion of the network
    •   Highlight of proteins in context of their
        GO/InterPro annotations
    •   Download of network in PSI-MI XML
    •   Can be combined with third party software
        (e.g. Cytoscape)
                               Visualizing - hierarchView
From search to hierarchView…








Visualizing - hierarchView
Description of the user interface

  •   2D interaction network
  •   Search box: supports a list of interactors
  •   Add interactions to the current network
  •   Network expansion around all selected interactors
  •   Mouse click behaviour
  •   Download current network (currently PSI-MI 1.0 & 2.5)
  •   1..n selected proteins
  •   Protein's annotations:
      - count of proteins sharing a term
      - selection for highlight
      - display of GO hierarchy



                                    Visualizing - hierarchView
Expansion of the existing network








                             Visualizing - hierarchView
Highlight of GO annotation




GO term highlight:
  •   Select single term
  •   Select term and children
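
The difference between the two highlight modes can be sketched as follows. The term hierarchy, term names and protein labels are made up for illustration (they are not real GO identifiers), and the functions are not hierarchView code.

```python
# Hypothetical term hierarchy (parent -> children) and protein annotations.
CHILDREN = {
    "term:root": ["term:a", "term:b"],
    "term:a": ["term:a1"],
}
ANNOTATIONS = {
    "P1": {"term:a1"},
    "P2": {"term:b"},
    "P3": {"term:a"},
}


def term_and_children(term, children):
    """Collect a term together with all of its descendants."""
    selected, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in selected:
            selected.add(current)
            stack.extend(children.get(current, []))
    return selected


def highlighted(terms, annotations):
    """Return the proteins annotated with at least one of the selected terms."""
    return sorted(p for p, annotated in annotations.items() if annotated & terms)
```

Selecting the single term `term:a` highlights only P3, whereas selecting the term and its children also picks up P1, which is annotated with the more specific `term:a1`.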








                             Visualizing - hierarchView
Download of the current network




Data aggregation:
  •   Gavin et al. (2002)
  •   Gavin et al. (2006)
  •   Uetz et al.
  •   Ho et al.
  •   Ito et al.




Engineering 1850
• Nuts and bolts fit perfectly together, but only if they
 originate from the same factory
• Standardisation proposal in 1864 by William Sellers
• It took until after WWII for it to be generally accepted, though …

Proteomics 2003
• Proteomics data are perfectly compatible, but only if they are
 from the same lab / database / software
• "Publish and vanish" by data producers
• Collecting all publicly available data requires huge effort
• Urgent need for standardisation

   PSI-MI XML format
• Community standard for Molecular Interactions

• XML schema and detailed controlled vocabularies

• Jointly developed by major data providers: BIND,
 CellZome, DIP, GSK, HPRD, Hybrigenics, IntAct,
 MINT, MIPS, Serono, U. Bielefeld, U. Bordeaux, U.
 Cambridge, and others

• Version 1.0 published in February 2004
 The HUPO PSI Molecular Interaction Format - A community standard for the
 representation of protein interaction data.
 Henning Hermjakob et al., Nature Biotechnology 2004, 22, 177-183.

• Version 2.5 published in October 2007

 PSI-MI XML benefits

• Collecting and combining data from different
 sources has become easier,

• standardized annotation through PSI-MI
 ontologies,

• tools from different organizations can be chained,
 e.g. analysis of IntAct data in Cytoscape.


Home page: http://www.psidev.info


Data & tools - interoperability
Loading IntAct data into Cytoscape
http://www.cytoscape.org

 •   Step 1: get the URL to the data or download a file
     http://www.ebi.ac.uk/intact/graph2mif/getXML?ac=EBI-14752&depth=1&strict=false
 •   Step 2: install Cytoscape
     http://www.cytoscape.org/
 •   Step 3: load data into Cytoscape (PSI-MI Import Plugin)
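
The Step 1 URL can be assembled programmatically, as in the sketch below. It only rebuilds the URL shown on the slide and makes no network request. The parameter names come from that URL; their meaning (interactor accession, expansion depth, strict mode) is an assumption, and the service may no longer be available at this address.

```python
from urllib.parse import urlencode

# Base address taken from the slide.
BASE_URL = "http://www.ebi.ac.uk/intact/graph2mif/getXML"


def graph2mif_url(ac, depth=1, strict=False):
    """Build a download URL for the network around one IntAct accession."""
    query = urlencode({
        "ac": ac,
        "depth": depth,
        "strict": str(strict).lower(),  # the slide uses lowercase 'false'
    })
    return f"{BASE_URL}?{query}"
```

The resulting string is what one would paste into Cytoscape's PSI-MI import dialog, or fetch with any HTTP client to save the file locally.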






                                                            Analyzing - interoperability
Loading IntAct data into Cytoscape
                                 http://www.cytoscape.org




 •   Combine high quality data and a powerful visualization tool
 •   Cytoscape can deal with large networks
 •   Multiple graph layout algorithms integrated
 •   Many plugins available:
     •   Data loading (PSI-MI, cPath, BioPAX, SBML)
     •   Data mining from literature
     •   Analysis and manipulation of interaction networks


 The IMEx consortium

• International Molecular-Interaction Exchange
 consortium


• DIP, IntAct, MINT will regularly exchange user-
 submitted data in PSI-MI format from beginning of
 2006 onwards to provide a network of stable,
 comprehensive resources for molecular interaction data


• Manual annotation is a very expensive task: why
 should we annotate the same datasets many times?!



MIMIx




Summary
IntAct provides a freely available, open-source
database and toolkit for the analysis of interaction
data, with support for the PSI-MI standards.

Home page: http://www.ebi.ac.uk/intact
Download: http://sourceforge.net/projects/intact

All data is publicly available and released weekly.
Data: ftp://ftp.ebi.ac.uk/pub/databases/intact

Any problems? Send a mail to our support team!
IntAct home page > Contact

IntAct team
 •   Henning Hermjakob
 •   Sandra Orchard
 •   Jyoti Khadake
 •   Luisa Montecchi
 •   Dave Thorneycroft
 •   Cathy Derow
 •   Catherine Leroy
 •   Bruno Aranda
 •   Bernd Roechert (SIB)

PSI participants (in particular)
 •   Gary Bader, MSKCC
 •   Lukasz Salwinski, DIP
 •   John Salama, BIND






IntAct is funded by the European Commission under FELICS, contract number 021902 (RII3)
Overview of PSI-MI XML 2.5 Schema




PSI-MI 2.5 Standards




Bird’s eye view of PSI-MI XML 2.5

 •   Top level structure unchanged compared to PSI-MI 1.0
 •   Use of Id/Ref on main objects
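
The Id/Ref mechanism can be illustrated with a small sketch: an object is declared once with an id, and other objects point to it by reference instead of repeating it. The fragment below is a simplified assumption modelled on PSI-MI XML 2.5 element names (`interactor`, `participant`, `interactorRef`); it omits the namespace and most required elements, so it is not a valid PSI-MI file, and the real schema should be consulted for actual parsing.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment; NOT a valid PSI-MI 2.5 document.
FRAGMENT = """
<entry>
  <interactorList>
    <interactor id="1"><names><shortLabel>protein1</shortLabel></names></interactor>
    <interactor id="2"><names><shortLabel>protein2</shortLabel></names></interactor>
  </interactorList>
  <interactionList>
    <interaction id="3">
      <participantList>
        <participant id="4"><interactorRef>1</interactorRef></participant>
        <participant id="5"><interactorRef>2</interactorRef></participant>
      </participantList>
    </interaction>
  </interactionList>
</entry>
"""


def resolve_participants(xml_text):
    """Map each interaction id to the labels of the interactors it references."""
    root = ET.fromstring(xml_text.strip())
    # Index every declared interactor by its id.
    labels = {
        interactor.get("id"): interactor.findtext("names/shortLabel")
        for interactor in root.iter("interactor")
    }
    resolved = {}
    for interaction in root.iter("interaction"):
        refs = [ref.text for ref in interaction.iter("interactorRef")]
        resolved[interaction.get("id")] = [labels[ref] for ref in refs]
    return resolved
```

Declaring each interactor once and referring to it by id keeps files compact when the same protein takes part in many interactions.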




Main objects - Experiment

[Schema diagram callouts: Id; Literature references; Controlled by Ontologies; Confidence measures]

Main objects - Interactor

[Schema diagram callouts: Id; Reference to a public database; Generic interactor]




Main objects - Interaction

[Schema diagram callouts: Id; Copyright; Experiment; Controlled by Ontology; Confidence value; Kinetics parameters]

Basics – Controlled Vocabularies
  •   Why ?
      •   Ensure data consistency
      •   Provide a reliable means for searching & filtering data
  •   How ?
      •   By providing a reference to an ontology term




Main objects - Participant

[Schema diagram callouts: Id; Interactor; Building of Complex; e.g. enzyme target; e.g. bait, prey; Delivery method, expression level…; Interactor used experimentally]


