
In Silico Mutagenicity DEREK

      Predicting Genotoxicity and Carcinogenicity:
                Mutants and Modeling

                  Constantine Kreatsoulas, Ph.D.

                        Bristol-Myers Squibb*




*Current address: 126 E. Lincoln Ave., Rahway, NJ 07065
                  Predictive Genotoxicity
Goal: Improve current predictive toxicology capabilities for mutagenicity and
   carcinogenicity through customizing and augmenting current predictive software

1. Modeling & Informatics:
     Enhancing current predictive software.
           – Bias model to minimize false negatives (and indeterminates).
     Provide support to discovery groups to eliminate mutagenic liabilities.

     Create a central data repository and populate it with literature data as well as
          institutional data.
     Deliver a predictive mutagenicity package in a format that can be supported as a
          standard system.
     Allow for novel models to be added as they are developed

3. Use:
     Prioritization of synthesis & testing candidates.

     Identification of substructures responsible for an observed mutagenic liability and
          suggested synthetic alternatives.
     Regulatory and due diligence support (what will the FDA see?).
     Descriptive Relationships

                  is most similar to the:




  Which Descriptor? Color? Shape? Function?


What’s the important piece?

How do we know?
   First SAR Modeling Attempts
“Are you sure, Stan, that a pointy head and a long
         beak is what makes them fly?”




                                      Gilles Klopman -- MULTICASE
Analysis of Commercial Computational
         Toxicology Software

QSAR-BASED
  Collects Molecular Fragments and Descriptors
  Calculates Values of Chemical Descriptors
  Compares to Known Compounds
  Reports Probability of Being a Member of a Toxic Class Using
   Multifactorial Statistical Analysis
  Identifies Structural Liabilities
  Unvalidated Structural Relationships

EXPERT RULE-BASED
  Inspects Molecules for Known Structural Liabilities
  Identifies Structural Liabilities
  Prepares Summary Report of Findings
  Validated Structural Relationships with Known Toxic Mechanisms
  Provides References & Predicted Mechanisms

ADAPT/TOPKAT  MultiCASE/LeadScope  DEREK
      Strengths and Weaknesses of
 Virtual Toxicology Commercial Software

QSAR-BASED
  Provide Relative Dose and Liability Prediction
  Easy to Determine if Compound is Well Represented in Training Set via
    Similarity Search
  Can Be Biased to Minimize False Positives and/or False Negatives
  Challenging to Systematically Improve Model: No Linearity
  Difficult to Train General Model: Excellent Predictiveness for Single
    Event; Problematic for Multiple Events
  Good For Specific Models

EXPERT RULE-BASED
  Chemically Intuitive Results
  Good Initial Filter for Known Liabilities: Lacks Specificity
  Only Predicts Presence of Identified Fragments
  Cannot Discriminate within a Structural Sub-Class
  Retrospective in Nature
  Cannot Extrapolate Prediction to New Chemotypes
  Good For General Models

ADAPT/TOPKAT  MultiCASE/LeadScope  DEREK
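
One way to picture "biased to minimize false negatives" for a QSAR model that outputs a score is to choose the cutoff on a validation set so that sensitivity stays at or above a target, accepting the loss in specificity. The Python sketch below illustrates that idea only; `pick_cutoff`, the scores, and the labels are hypothetical and are not taken from any of the packages named above.

```python
from typing import Sequence


def pick_cutoff(scores: Sequence[float], labels: Sequence[bool],
                min_sensitivity: float) -> float:
    """Return the highest score cutoff whose sensitivity meets the target.

    A compound is called positive when its score is >= the cutoff, so the
    highest qualifying cutoff gives the fewest positive calls (and therefore
    the fewest false positives) among cutoffs that reach the target.
    """
    n_pos = sum(1 for y in labels if y)
    if n_pos == 0:
        raise ValueError("validation set has no experimental positives")
    for cutoff in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
        if true_pos / n_pos >= min_sensitivity:
            return cutoff
    raise ValueError("min_sensitivity must be at most 1.0")


# Hypothetical validation scores and Ames outcomes (True = mutagenic).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]

balanced = pick_cutoff(scores, labels, min_sensitivity=0.6)
sentinel = pick_cutoff(scores, labels, min_sensitivity=1.0)
print(balanced, sentinel)
```

Demanding full sensitivity pushes the cutoff down, which is the trade a sentinel filter makes: more false positives in exchange for no missed mutagens on the validation set.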
Requirements for a Sentinel Filter

Identify ALL compounds having mutagenic liability
    Identify strengths & weaknesses of models

    Identify strategy for maintaining & improving the model

    User friendly & intuitive

    Provide support information for model




Chemists’ and toxicologists’ needs are not always equivalent
    Chemistry:
       – Suggest synthetic alternatives; do not limit chemical space
       – Repository of prior knowledge (both institutional and external)

    Toxicology:
       – Prioritization of synthesis and in vitro testing candidates
       – Regulatory and due diligence support; overprediction is acceptable
                      2002 Validation
                     Expanded Data Set

  Class        % Sample   # Compounds (Ames Pos / Ames Neg)
  BMS            23%        416  (65 / 351)
  Drugs          29%        534  (107 / 427)
  Other          48%        875  (398 / 477)

  Ames Pos:      31%        570
  Ames Neg:      69%       1255

  All Data: 1825 Compounds    Drugs: 534 Compounds    BMS: 416 Compounds

             •~5% of BMS space covered by validation compounds.
             •~10% of drug space covered by validation compounds.
                     Which program works best?
                       A combination of two

[Figure: horizontal bar chart, axis scale 0 to 800, with one group of bars for each of
Concordance, Indeterminate, False (-), and False (+).
Series: DEREK (DK), TOPKAT (TPK), MultiCASE (MC), Parallel DK/MC, Parallel DK/TPK,
Parallel MC/TPK, Parallel DK/MC/TPK, Sequential DK/MC, Sequential MC/TPK,
Sequential DK/MC & DK/TPK, Sequential DK/MC/TPK, Random.]
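
The slides do not define how the "parallel" and "sequential" combinations were built. The sketch below assumes one plausible reading: in parallel, every program is run and any positive call flags the compound; in sequence, the programs are consulted in order and the first determinate call is accepted. Both functions and the three-valued `Call` type are hypothetical illustrations, not the scheme actually used.

```python
from typing import Optional, Sequence

# True = predicted mutagenic, False = predicted non-mutagenic,
# None = indeterminate (the program gives no call).
Call = Optional[bool]


def parallel(calls: Sequence[Call]) -> Call:
    """Assumed 'parallel' scheme: any positive call flags the compound."""
    if any(c is True for c in calls):
        return True
    if all(c is False for c in calls):
        return False
    return None  # no positive call and at least one indeterminate


def sequential(calls: Sequence[Call]) -> Call:
    """Assumed 'sequential' scheme: accept the first determinate call."""
    for c in calls:
        if c is not None:
            return c
    return None


# Hypothetical calls for one compound, listed in the order DK, MC.
dk_mc = [False, True]
print(parallel(dk_mc), sequential(dk_mc))
```

Under these assumptions the parallel scheme can only add positives, so it trades false positives for fewer false negatives, while the sequential scheme mainly reduces the number of indeterminate results.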
                          Improving the System:
                  Correction of the 2° Amine Alert
   20/51 (39%) compounds triggering DR005 were positive in the Ames assay (S9)

  Substructure       Name                            # Pos    # Occurrences
  R'-NH-C(=O)-R      Secondary Amides                  0          10
  R'-NH-S(=O)2-R     Secondary Sulfonamide             0           8
  Ar-NH-R            Secondary Aniline Derivatives    19          30
  R'-NH-R            Secondary Amines                  1           5

  Derek Rule 005 Addendum
     Exclude Secondary Amides
     Exclude Secondary Sulfonamides
  Modified DR 005 Correctly Predicts 20/35 Compounds (57% concordance).
  Reduced False Positives from 31 to 15.
  Additional Rules and QSARs can be Developed to Improve the Accuracy of
   this Rule Even Further.
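
The effect of the addendum can be checked from the numbers on the slide. The sketch below takes the stated total of 51 compounds for the original rule as given and recomputes the modified rule from the two retained subclasses; the variable names are illustrative.

```python
# (Ames positives, occurrences) per subclass, as listed on the slide.
subclasses = {
    "secondary amides": (0, 10),
    "secondary sulfonamides": (0, 8),
    "secondary aniline derivatives": (19, 30),
    "secondary amines": (1, 5),
}
excluded = {"secondary amides", "secondary sulfonamides"}

# Original rule: totals as stated on the slide.
flagged_before, positives_before = 51, 20
false_pos_before = flagged_before - positives_before

# Modified rule: only the retained subclasses still trigger the alert.
kept = [v for name, v in subclasses.items() if name not in excluded]
positives_after = sum(pos for pos, _ in kept)
flagged_after = sum(occ for _, occ in kept)
false_pos_after = flagged_after - positives_after

print(f"before: {positives_before}/{flagged_before} "
      f"({100 * positives_before / flagged_before:.0f}%), "
      f"{false_pos_before} false positives")
print(f"after:  {positives_after}/{flagged_after} "
      f"({100 * positives_after / flagged_after:.0f}%), "
      f"{false_pos_after} false positives")
```

No Ames-positive compound is lost by the exclusion, because both excluded subclasses have zero positives.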
                        Improving the System:
                   Substructures Identified by BMS

  Alert Name         Ames Positive   BMS DEREK Positive   Positive Accuracy
  Substructure 1          42                77                  54.5%
  Substructure 2          41               167                  24.6%
  Substructure 3          16                70                  22.9%
  Substructure 4          42                89                  47.2%
  Substructure 5           3                 3                 100.0%
  Substructure 6          54               110                  49.1%
  Substructure 7          16                18                  88.9%
  Substructure 8           4                 6                  66.7%
  Substructure 9           5                 7                  71.4%
  Substructure 10         52                84                  61.9%
  Substructure 11         86               128                  67.2%
  Substructure 12          1                 2                  50.0%
  Substructure 13          4                10                  40.0%
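
The "Positive Accuracy" column matches the share of alert-triggering compounds that were Ames positive, that is, Ames Positive divided by BMS DEREK Positive. The slide does not spell out the formula, so this is an inferred definition; the short sketch below recomputes each row and confirms the definition reproduces the reported figures.

```python
# (alert number, Ames positive, BMS DEREK positive, reported accuracy in %)
rows = [
    (1, 42, 77, 54.5), (2, 41, 167, 24.6), (3, 16, 70, 22.9),
    (4, 42, 89, 47.2), (5, 3, 3, 100.0), (6, 54, 110, 49.1),
    (7, 16, 18, 88.9), (8, 4, 6, 66.7), (9, 5, 7, 71.4),
    (10, 52, 84, 61.9), (11, 86, 128, 67.2), (12, 1, 2, 50.0),
    (13, 4, 10, 40.0),
]


def positive_accuracy(ames_pos: int, derek_pos: int) -> float:
    """Percentage of alert hits that were experimentally positive."""
    return round(100 * ames_pos / derek_pos, 1)


computed = {n: positive_accuracy(a, d) for n, a, d, _ in rows}
mismatches = [n for n, _, _, reported in rows if computed[n] != reported]
print(computed)
print("rows that disagree with the slide:", mismatches)
```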
                   Performance Assessment:
                  Has DEREK been Improved?
    1825 Compounds: 1255 Ames (-), 570 Ames (+)   Random Conc.: 57%

                     DEREK on Complete     BMS DEREK on Complete
                          Data Set               Data Set
  True Pos                  412                    502
  False Neg                 158                     68
  True Neg                  917                    702
  False Pos                 338                    553

  Concordance               73%                    66%
  Sensitivity               72%                    88%
  Specificity               73%                    56%
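
The three summary statistics follow from the four confusion-matrix cells. The sketch below reads the bar labels of the chart above as those cells; the assignment is inferred from the fact that the pairs sum to 570 Ames positives and 1255 Ames negatives. The "random concordance" formula is also an assumption: it treats the baseline as a random caller that predicts positive at the observed prevalence, which reproduces the quoted 57%.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # predicted positive, Ames positive
    fn: int  # predicted negative, Ames positive
    tn: int  # predicted negative, Ames negative
    fp: int  # predicted positive, Ames negative

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.tn + self.fp

    def concordance(self) -> float:
        return (self.tp + self.tn) / self.total

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def random_concordance(self) -> float:
        # Assumed baseline: random calls made at the observed prevalence p.
        p = (self.tp + self.fn) / self.total
        return p * p + (1 - p) * (1 - p)


native = Confusion(tp=412, fn=158, tn=917, fp=338)      # DEREK
enhanced = Confusion(tp=502, fn=68, tn=702, fp=553)     # BMS DEREK

for name, cm in (("DEREK", native), ("BMS DEREK", enhanced)):
    print(name,
          f"conc={cm.concordance():.0%}",
          f"sens={cm.sensitivity():.0%}",
          f"spec={cm.specificity():.0%}",
          f"random={cm.random_concordance():.0%}")
```

The numbers make the trade explicit: the enhanced rule set gains sensitivity at the cost of specificity and overall concordance, consistent with a filter that tolerates overprediction.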
                   Performance Assessment:
                  Has DEREK been Improved?
  534 Merck Index Drugs: 427 Ames (-), 107 Ames (+)   Random Conc.: 68%

                     DEREK on Drugs        BMS DEREK on Drugs
  True Pos                   69                     85
  False Neg                  38                     22
  True Neg                  328                    248
  False Pos                  99                    179

  Concordance               74%                    62%
  Sensitivity               65%                    79%
  Specificity               77%                    58%
                   Performance Assessment:
                  Has DEREK been Improved?
   416 BMS Compounds: 351 Ames (-), 65 Ames (+)   Random Conc.: 74%

                     DEREK on BMS          BMS DEREK on BMS
                       Compounds               Compounds
  True Pos                   35                     60
  False Neg                  30                      5
  True Neg                  271                    189
  False Pos                  80                    162

  Concordance               74%                    60%
  Sensitivity               54%                    92%
  Specificity               77%                    54%
                 BMS Enhanced DEREK
     SAR Around Substructures
Alerts are very general: target a specific chemical functionality
 Molecular context might modulate the toxicity

 Probe an individual mechanism & test hypothesis of action
 SAR around the toxic moiety
     Activating features - what makes the toxicity worse
     Deactivating features - what obviates the toxicity
 Collaboration with Prof. Peter Jurs (Penn State)
 Collaboration with Prof. Chihae Yang (LeadScope & Ohio State)

Target toxicophores which are relevant to discovery first, then expand
knowledge-base
     Secondary and Aromatic Amines
     Thiophenes
     Polycyclic Aromatic Hydrocarbons
     Quinolines / Quinolones
               Predictive Toxicology
  Comparing Apples to Apples
Secondary and Aromatic Amines:
The Data Set: 334 Compounds
      Selected for drug-likeness (expanded Lipinski filter)
      Clustered for diversity
      Commercially available from Aldrich at over 96% purity

Assayed in the SOS Chromotest assay for genotoxicity
      Induction of lacZ reporter gene under transcriptional control of
       SOS DNA damage repair pathway
      90% concordance with the Ames Assay
      High Reproducibility (± 0.05 fold)
      193 compounds considered non-toxic
      72 compounds considered weakly toxic
      69 compounds considered strongly toxic
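
The slides do not say what the "expanded" Lipinski filter adds. As a point of reference only, the sketch below implements the classic rule of five on precomputed descriptors (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors), allowing one violation. The `Descriptors` class and the example values are hypothetical; descriptor calculation itself is left to whatever chemistry toolkit is in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    mol_weight: float   # g/mol
    logp: float         # octanol/water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def is_drug_like(d: Descriptors, max_violations: int = 1) -> bool:
    """Classic reading: at most one violation is tolerated."""
    return lipinski_violations(d) <= max_violations


# Hypothetical descriptor values, not real compounds.
small_amine = Descriptors(mol_weight=180.2, logp=1.4, h_donors=1, h_acceptors=2)
greasy_giant = Descriptors(mol_weight=720.9, logp=6.3, h_donors=2, h_acceptors=9)

print(is_drug_like(small_amine), is_drug_like(greasy_giant))
```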
                 Comparing Bad Apples

   Method              % Concordance   % Sensitivity   % Specificity
   ADAPT                    72              69              74
   TopKat (v5.0)            60              54              63
   MultiCASE (A2I)          59              61              57
   MultiCASE (SOS)          64              64              64
   Leadscope†               74              65              83
   DEREK (v5.0)             41             100               0

 † Selected Leadscope fingerprints were combined with scaffolds and 8 properties.
 Logistic PLS method (50 factors) was used after selecting features – Preliminary Data.
         Improving Bad Apples
You have a positive assessment, now what?
   Correct Molecular Context?
       – Supporting data?
   Interpolating or Extrapolating?
       – Is compound within model’s scope?
   Mechanistic Support?
       – Does the biochemistry make sense?
   Confirmatory Assay
       – Positive
             Develop with caution
       – Negative
             Feed data back into model(s)
          Predictive Toxicology

Do you deliver this to the desktop?
  You don’t (consult a modeler)
      – Confusing results, many interpretations
      – Conflicting goals (show stopper vs. prioritizer)
  Each scientist has a copy on their PC
      – Makes sense for a small client base
  A web-based tool
      – Makes sense for many users
      – Centralized access/control for models and data
          Informatics Infrastructure

[Diagram: data sources and client groups linked through "Your Friendly
Neighborhood Modeler"]

Data Capture:
  Proprietary "omics" and HTS Databases
  NCBI/NLM ToxNet
  Commercial Databases: RTECS, ACD

Connect Directly to:
  Assay Data
  Calculated Properties
  Validation Data
Real Time Data Availability!

Client groups: Chemistry, Biology, Toxicology

Modeler activities:
  Focused Analysis
  QSAR; Expert System Model Development
  Present Predictive Tools to the User in a Usable Form!
Virtual Toxicology Portal
Sample Assessment

                All contain the Aromatic
                Amine Substructure:

                 True Positive

                 False Positive

                 True Negative

                 False Negative

                Not done yet! Need more
                information on this
                substructure – inline QSARs
              Virtual Toxicology Portal
     Assessment & Rule Example




                                       Explore Mutagenic
                                          Mechanism

                       Drill-down on
                       Assessments      Explore Validation
                                         & Related Data
Input Structure(s)                      Generate Reports
Annotation of Substructural Alerts

 95 mutagenicity alerts annotated
   76 Native DEREK mutagenicity alerts
    6 reclassified carcinogenicity alerts (genotoxic mechanism)
    13 alerts Implemented by BMS
    ~300 DEREK Literature References Extracted, Archived and
   Summarized
    Probable mechanism(s), including reactive intermediates,
   described
    Additional SARs & mechanisms derived using publicly available
   data (TOXNET, RTECS, NTP)
    Updated literature archived, integrated and summarized
         300+ additional references
    Lessons learned from QSARs included
    Validation Statistics included
         Other Enhancements

Additional models added:
   HERG binding

   Hepatotoxicity

   EH&S

   MultiCASE (off the shelf and in-house)

   Predicted Physical Properties

Additional data available:
   Proprietary assay data

   Public domain data
              Lessons Learned

•   10% of mutagenic substructures not covered by
    validation set (environmental mutagens, not drug-like)

•   BMS improvements have ~10% increase in positive
    accuracy

•   Compounds with multiple substructures are no more
    likely to be mutagenic

•   No correlation between the number of references and
    accuracy of a substructural alert

•   Has dramatically impacted candidate development

•   Data hungry…the more data it sees, the better it gets!
             Acknowledgements
Discovery Safety Evaluation (BMS)
       Stephen Durham               Laura Custer
Preclinical Candidate Optimization (BMS)
       Greg Pearl (IBM)             Scott Biller (Novartis)
       Donna Dambach                Oliver Flint
       Oneal Puri                   Jennifer Price
       Bruce Car                    Sondra Livingston-Carr
       Griff Humphreys              Leslie Sheppard
Informatics (BMS)
       Larry Allen                  George Goldsmith
       David Saul (Quodlibet)       Dave Benham (Quodlibet)
Molecular Biosciences (BMS)
       Deborah Loughney             Terry Stouch (Lexicon)
       Malcolm Davis                Stephen Johnson
Penn State University
       Peter Jurs                   Gregory Kauffman
       Phil Mosier                  Brian Mattioni
Discovery Chemistry (BMS)
       Nick Meanwell                Joe Tino
Leadscope
       Lofty Lucas                  Chihae Yang
Molecular Systems (Merck)

								