Compound Library Annotation

Document Sample
Compound Library Annotation Powered By Docstoc
					        Compound Library Annotation
                  Miklós Vargyas




                                      1
May, 2005
                                                                                               Slide 2


                                         Compound Library Annotation


                         Overview of the Screen Package
                                  Virtual screening
                                  Optimized dissimilarity metrics
                                  Clustering


                         Library Annotation – a real-life application of the Screen tool
                                  Approach #1:
                                      • use command line applications
                                  Approach #2:
                                      • API programming




Compound Library Annotation — UGM 2005
                                                                                           2
                                                                                                                                 Slide 3


                                         Overview of the Screen Package
                                           0100010100011101010000110000101000010011000010100000000100100000
                                           0001101110011101111110100000100010000110110110000000100110100000
                                           0100010100110100010000000010000000010010000000100100001000101000
                                           0101110100110101010111111000010000011111100010000100001000101000
                                           0001000100010100010100100000000000001010000010000100000100000000
                                           0100010100010100000000000000101000010010000000000100000000000000
                                           0101010101111100111110100000000000011010100011100100001100101000
                                           0100010100011000010000011000000000010001000000110000000001100000
                                           0000000100000000010000100000000000001010100000000100000100100000




                                           0101110100110101010111111000010000011111100010000100001000101000
                  queries
                                                      hypothesis fingerprint



                                                                    metric
                                           0000000100001101000000101010000000000110000010000100001000001000
                                           0100010110010010010110011010011100111101000000110000000110001000
                                           0100010100011101010000110000101000010011000010100000000100100000
                                           0001101110011101111110100000100010000110110110000000100110100000
                                           0100010100110100010000000010000000010010000000100100001000101000
                                           0100011100011101000100001011101100110110010010001101001100001000
                                           0101110100110101010111111000010000011111100010000100001000101000
                                           0100010100111101010000100010000000010010000010100100001000101000
                                                                                                              Virtual hits
                                           0001000100010100010100100000000000001010000010000100000100000000
                                           0100010100010011000000000000000000010100000010000000000000000000
                                           0100010100010100000000000000101000010010000000000100000000000000
                                           0101010101111100111110100000000000011010100011100100001100101000
                                           0100010100011000010000011000000000010001000000110000000001100000
                                           0000000100000000010000100000000000001010100000000100000100100000
                                           0100010100010100000000100000000000010000000000000100001000011000
                                           0001000100001100010010100000010100101011100010000100001000101000
               targets                     0100011100010100010000100001001110010010000010001100000000101000
                                           0101010100010100010100100000000000010010000010010100100100010000


                                                         target fingerprints
Compound Library Annotation — UGM 2005
                                                                                                                             3
                                                                               Slide 4


                                                       Need for Optimization


                                         0.57




                            0.47                0.55




Compound Library Annotation — UGM 2005
                                                                           4
                                                                                                                                              Slide 5


                                                                                  Optimized Metrics

                   D                   ( x, y )  1 
                     scaled , asymmetric                                                s min( x , y )
                                                                                          i i             i   i


                                                        x   s min( x , y )   1    y   s min( x , y )    s min( x , y )
                     Tanimoto

                                                         i i      i i     i   i                   i i             i i   i   i   i i   i   i



                    0,1                asymmetry factor

                  si  N                   scaling factor



                                                                wi xi  yi                  wi 1   xi  yi 
                                                                                                                                 2
                   DEuclideanasymmetric( x, y) 
                    weighted ,                                                        2

                                                               xi  yi                          xi  yi




                    0,1                asymmetry factor

                  wi  0,1 weights

Compound Library Annotation — UGM 2005
                                                                                                                                          5
                                                                            Slide 6


                          Improved Similarity by Optimization


                                         0.57




                            0.47                0.55



                                                          0.20


                                                                 0.06

                                                       0.28




Compound Library Annotation — UGM 2005
                                                                        6
                                                                                                              Slide 7


                                                       Enrichment by Optimization


                                   10000
                  Number of Hits




                                   1000

                                    100

                                     10

                                      1
                                           1   2   3   4   5   6   7   8   9 10 11 12 13 14 15 16 17 18
                                                                   Number of Active Hits

                                                   Tanimoto          Euclidean      Optimized   Ideal



Compound Library Annotation — UGM 2005
                                                                                                          7
                                                                                  Slide 8


                                                                     Clustering

                         8 active compound sets
                                  •      ACE inhibitors
                                  •      angiotensin 2 antagonists
                                  •      D2 antagonists
                                  •      delta antagonists
                                  •      FTP antagonists
                                  •      mGluR1 antagonists
                                  •      Thrombin inhibitors
                                  •      5-HT3-antagonists




Compound Library Annotation — UGM 2005
                                                                              8
                                                          Slide 9


                                         Ward Centroids




Compound Library Annotation — UGM 2005
                                                      9
                                                           Slide 10


                 Maximum Common Substructure Clustering




Compound Library Annotation — UGM 2005
                                                      10
                                                                                               Slide 11


                                         Compound Library Annotation

                         Annotate library: predicted activity in some therapeutic areas




                                                               ActACE=0.5   Actß2=0.98




                                                               ActACE=0.78 Actß2=0.45




Compound Library Annotation — UGM 2005
                                                                                          11
                                                                                                         Slide 12


                            Similarity Based Activity Prediction

                         Use sets of known actives to predict activity of compounds




                                                                                           ActACE=0.55

                        0101110100110101010111111000010000011111100010000100001000101000




                                                                                           Actß2=0.98




                        0101110100110101010111111000010000011111100010000100001000101000




Compound Library Annotation — UGM 2005
                                                                                                    12
                                                                               Slide 13


               Approach #1: Off the Shelf ChemAxon Tools


                         Parameter setting
                                  Pharmacophore fingerprint
                                  Tanimoto dissimilarity metric
                                  Median Pharmacophore Hypothesis




                         screenmd library.sdf ace.sdf \
                                         –o SDF annotated-library.sdf \
                                         -k PF –M Tanimoto –H Median




Compound Library Annotation — UGM 2005
                                                                          13
                                                                 Slide 14


                                         Single Active Family




Compound Library Annotation — UGM 2005
                                                            14
                                                                               Slide 15


                                         Multiple Active Families

               screenmd library.sdf ace.sdf \
                     -o SDF lib-ace.sdf -k PF –M Tanimoto –H Median
               screenmd lib-ace.sdf beta2.sdf \
                     -o SDF lib-ace+beta2.sdf -k PF –M Tanimoto –H Median
               screenmd lib-ace+beta2.sdf delta.sdf \
                     -o SDF lib-ace+beta2+delta.sdf -k PF –M Tanimoto \
                     –H Median
               screenmd lib-ace+beta2+delta.sdf D2.sdf \
                     -o SDF lib-ace+beta2+delta+D2.sdf -k PF –M Tanimoto \
                     –H Median
               ...
               ...


Compound Library Annotation — UGM 2005
                                                                          15
                                                                   Slide 16


                                         Annotated Library File




Compound Library Annotation — UGM 2005
                                                              16
                                                                                                     Slide 17


                   Approach #2: Using ChemAxon JChem API



                         API programming – custom solution
                                   PharmacophoreFingerprint and the MolecularDescriptor class
                                       hierarchy
                                   Tanimoto dissimilarity calculation
                                   Median Hypothesis calculation
                                   Description generation for structure in SDfile
                                   Writing structures in SDfile




Compound Library Annotation — UGM 2005
                                                                                                17
                                                                                                   Slide 18


                         MolecularDescriptor class hierarchy

                                                       Molecular Descriptor




              Pharmacophore
                                                                                       BCUT
                   Fingerprint

                                         Chemical
                                                                              CUSTOM
                                         Fingerprint
                                                             MACCS



Compound Library Annotation — UGM 2005
                                                                                              18
                                                                                    Slide 19


                                         MolecularDescriptor Sets

                                         Molecular Descriptor
                                                 Set




                           Molecular          Molecular          Molecular
                          Descriptor 1       Descriptor 2       Descriptor 3
                            (e.g. CFp)        (e.g. PFp)        (e.g. logP)




Compound Library Annotation — UGM 2005
                                                                               19
                                                                                   Slide 20


                                             Dissimilarity Calculation

              MDSet s1 =
                               MDSet.newInstance( new String[]{“CF”,”PF”,”LogP”} )
              MDSet s2 =
                               MDSet.newInstance( new String[]{“CF”,”PF”,”LogP”} )


              . . . Generate s1 and s2 somehow . . .


              System.out.println( “dissimilarity(s1,s2) = “ +
                               s1.getDissimilarity( s2 ) );




Compound Library Annotation — UGM 2005
                                                                              20
                                                                              Slide 21


                                Tanimoto Dissimilarity Calculation

              MDSet s1 =
                               MDSet.newInstance( new String[]{“PF”} )
              MDSet s2 =
                               MDSet.newInstance( new String[]{“PF“} )


              . . . Generate s1 and s2 somehow . . .


              PharmacophoreFingerprint pf1 = s1.getDescriptor(0);
              PharmacophoreFingerprint pf2 = s2.getDescriptor(0);


              System.out.println( “Tanimoto(pf1,pf2) = “ +
                               pf1.getTanimoto( pf2 ) );


Compound Library Annotation — UGM 2005
                                                                         21
                                                                              Slide 22


                                         Median Hypothesis Calculation

              MDSet s1 =
                               MDSet.newInstance( new String[]{“PF”} )
              MDSet s2 =
                               MDSet.newInstance( new String[]{“PF“} )
              . . . Generate s1 and s2 somehow . . .


              MDHypothesisGenerator medianHypoGenerator =
                     MDHypothesisCreator.create( "Median" );
              medianHypoGenerator.add( s1 );
              medianHypoGenerator.add( s2 );


              MDSet hypothesis = medianHypoGenerator.generate();


Compound Library Annotation — UGM 2005
                                                                         22
                                                                                    Slide 23


                     Reading Descriptors from Structure File



              MDFileReader inputReader
                               = new MDFileReader( “library.sdf”,
                                      MDSet.newInstance( new String[]{"PF"} ) );


              MDSet mdRead = inputReader.next();




Compound Library Annotation — UGM 2005
                                                                               23
                                                                                       Slide 24


                                            Writing structures in SDfile

              MolExporter outputWriter =
                               new MolExporter( new PrintStream(
                                         new BufferedOutputStream(
                                         new FileOutputStream( fileName ))), "sdf");


              Molecule m = getAMolecule();
              outputWriter.write( m );




Compound Library Annotation — UGM 2005
                                                                                  24
                                                                               Slide 25


                                                LibAnnot class


                                         Full source code avaialable at
                                         http://www.chemaxon.com/




Compound Library Annotation — UGM 2005
                                                                          25
                                                                        Slide 26


                                                      Future plans

                  • New MolecularDescriptors (e.g. 3D Pharmacophore)
                  • Non-hierarchical MCS clustering, better GUI
                  • Library diversity estimation




Compound Library Annotation — UGM 2005
                                                                   26
                                                                           Slide 27


                                                         Summary


               • Screen+JKlustor for optimized virtual screening and hit
                 set profiling


               • Library annotation by screenmd


               • Library annotation by custom program




Compound Library Annotation — UGM 2005
                                                                      27
                                                                         Slide 28


                                         Acknowledgements and Credits


               • JKlustor developed by Ferenc Csizmadia et al
               • Optimizer developed by Zsuzsa Szabó
               • PMapper developed by Szilárd Dóránt, Nóra Máté
               • Pharmacophore definitions by György Pirok




Compound Library Annotation — UGM 2005
                                                                    28
                                                  Slide 29


              Thank you for your attention




                         Máramaros köz 3/a
                         Budapest, 1037
                         Hungary

                         info@chemaxon.com
                         www.chemaxon.com

Compound Library Annotation — UGM 2005
                                             29