FBSS: Using Molecular Fields as a Novel Alignment Method for 3D QSAR
Valerie Gillet (1), Nick Jewell (1), Graham J. Sexton (2), David B. Turner (1) and Peter Willett (1)
The Field-Based Similarity Searcher (hereafter known as FBSS) has been developed by researchers at the University of Sheffield. FBSS can align series of molecules by optimising the molecular field overlap between the compounds using a Genetic Algorithm (GA). Previous work has focussed on the alignment of structures to assess three-dimensional molecular similarity in database searching, and the results of that work have been published [1].

Comparative Molecular Field Analysis (hereafter, CoMFA) [2] has become one of the most popular methods to develop three-dimensional quantitative structure-activity relationships (3D-QSAR). The basic CoMFA method is shown in Figure 1 below, although it is relatively common for researchers to alter key aspects of the CoMFA process, such as the calculation of novel molecular fields or the adoption of new [...]

1. Alignment of structures (either template-, functional group- or field-based).
2. Placement of structures within a molecular lattice.
3. Sampling of electrostatic and steric molecular fields at grid intersections within the lattice.
4. Statistical analysis (usually Partial Least Squares) producing a descriptive model from the molecular field data.
5. Analysis of model, including the removal of outliers and application to test sets.

Figure 1: The Basic CoMFA Method

It is proposed that FBSS can be used to align chemical structures as an initial step in a CoMFA analysis. The experiments discussed in this paper relate to the application of FBSS to aligning structures from previous CoMFA experiments. In each case, models have been generated using procedures taken from the original source (including, where possible, provided alignments) as well as from FBSS-generated superpositions. The models have been evaluated in terms of their robustness (given by Q2) and their predictivity (given by the predictive R2 when applied to the test set).

Calculation of Field-Based Overlap and Similarities

The process of calculating the molecular fields within FBSS using approximate Gaussian functions, and the subsequent determination of molecular similarity using the Carbo index, are detailed in an earlier publication by this group [1].

FBSS Parameters for Molecular Alignment

The FBSS algorithm allows the alignment of structures based on the overlap of electrostatic, steric or lipophilic fields. In addition, these fields can be used in any combination. An extensive phase of testing has led us to believe that, where other guiding factors are not present, the best results are obtained by applying FBSS with all three fields simultaneously (hereafter known as ‘All Fields’). In such a case, the resulting similarity is taken from an average of all the fields and their overlap.

The GA parameters used to derive the alignments are shown in Figure 2. It should be noted that FBSS was run 75 consecutive times, with the best single result (determined as the orientation which generates the greatest molecular similarity between each pair of compounds) being reported. In database searching, FBSS is normally only applied once per compound due to time constraints. However, with many fewer structures considered in a CoMFA experiment, multiple runs are possible.

GA Parameter            GA Value
GA Operations           10000
GA Population Size      125
GA Selection Pressure   1.10
Flexibility             RIGID and FIX-FLEX (see text)
Initial Population      Randomly Generated and Diverse (using Hamming Distance)

Figure 2: Basic FBSS Parameters

The GA has the capacity for flexible searching, using simple conformational rules encoded within the chromosome. It has been found that the RIGID methods provide the best CoMFA models, provided that the initial structures are of suitable conformation; attempts to incorporate flexibility are ongoing, as noted in the conclusions.

Testing the Relationship between Q2 and Molecular Similarity

The main assumption underlying this work is that by aligning compounds based on their molecular similarity to a target compound, and then by using that alignment to generate a CoMFA model, better Q2 values will be obtained.

By running FBSS with various GA settings, we are able to generate a series of molecular alignments that vary in their precision. As part of the software’s output, a value for the mean similarity between compounds and the target structure is obtained. This can be compared to the Q2 of the CoMFA model generated from the overlay. For the benchmark steroid data set, Figure 3 shows a graph of the relationship between these values; the squared correlation coefficient (R2) is 0.82. Such a high correlation suggests that there is a significant relationship between molecular similarity and the quality of the derived PLS model.

Figure 3: FBSS Mean Molecular Similarity (x-axis) plotted against the CoMFA model’s Cross-Validated R2 (y-axis) for FBSS Alignment

CoMFA Methods

CoMFA was performed using the QSAR and Advanced CoMFA modules contained within SYBYL version 6.4 [4]. Except where directly specified, the default CoMFA parameters provided with the software have been applied.

Corticosteroid-Binding Globulin Ligands: A CoMFA Model Developed Using FBSS Alignments

The CBG steroid set [2] was the first set of structures used to validate the CoMFA method, and is often referenced as a benchmark against which new methods of analysis in 3D QSAR can be compared. The data set consists of 31 steroids that show binding affinity toward both testosterone-binding globulin (TeBG) and corticosteroid-binding globulin (CBG). The training set consists of 21 structures, with an additional 10 compounds reserved for the test set (although these compounds are provided with only CBG binding data). The popularity of this data set has led to its inclusion in the SYBYL software release, as part of the CoMFA tutorial.

In the original literature, one of the ligands with the highest CBG affinity (deoxycortisol) was used as the template molecule. The superposition was based on a least-squares fit onto the carbon atoms of the template within the steran skeleton. The FBSS alignments (Figure 5) were also generated using deoxycortisol as a reference structure. The steroid compounds (after thorough checking and structure-correction by members of the group) were assigned semi-empirical MOPAC [3] PM3 charges and aligned to the reference structure using the GA. During the prediction of the test set activities from the PLS model, compound 31 was omitted from the analysis as an outlier (this has been done by a number of groups).

Contour plots summarising CoMFA PLS results using FBSS-derived alignments are given in Figures 4 and 5. The statistical details are provided in Table 1.

Figure 4: Electrostatic and Steric (coefficient*stdev) contour plots for the CBG Steroids after FBSS Alignment (Hydrogen Suppressed). Default contour values are used.

Figure 5: FBSS Alignment of Steroids (Hydrogen-Suppressed)

Grid Spacing/Å   Optimum Components   SECV    Q2
1.0              3                    0.466   0.866

R2      F         SE      pr-R2
0.982   233.102   0.170   0.917

Table 1: CoMFA Model from FBSS Alignment

Although it is straightforward to manually align the steroids, it is common practice to test any new 3D-QSAR technique using this data set. If the new method fails to produce a significant model using such simple data and alignment techniques, then it may not be worth pursuing with further data sets.

A manual alignment of the CBG set provides the CoMFA model given in Table 2. Graphical examples of the fields that can be generated through FBSS are presented in Figure 6.

Grid Spacing/Å   Optimum Components   SECV    Q2
1.0              2                    0.450   0.870

R2      F       SE      pr-R2
0.930   126.4   0.320   0.840

Table 2: CoMFA Model from the Manual Alignment

Figure 6: FBSS Electrostatic (top left), Steric (top right) and Hydrophobic (bottom) molecular fields for Aldosterone, an example of the steroid training set

Melatonin Receptor Antagonists: A CoMFA Model Developed Using FBSS Alignments

Melatonin is the principal hormone of the vertebrate pineal gland in humans. The production cycle within the body correlates with conditions of light and darkness surrounding the body, with high concentrations being produced during periods of darkness, for example at night. Melatonin inhibitors are finding new applications in the field of anti-cancer treatments, where the compounds appear to act as anti-tumour agents.

A previous 3D-QSAR experiment conducted by Sicsic et al. [5] utilised a training set of 48 compounds, all of which show some antagonist potency towards the melatonin receptor. The training set was built from a variety of rigid core structures which are summarised in Table 3.

Chemical Family of   Number of Training Set   Compound Reference   Notes
Core Structure       Compounds in Class       Numbers from [5]     (see below)
Indole-based         9                        1-9                  *
Naphthalene-based    23                       10-32                *
Tricyclic            2                        33,34                +
Tetraline-based

Notes: Compounds with (*) contain a highly-flexible ethylamido side chain on the aryl-group.
Compounds with (+) contain the ethylamido group bound directly to the cycle.

Table 3: Classification of Melatonin Receptor Antagonists

Sicsic et al. explored many alignment paradigms before developing an optimum CoMFA model. The FBSS procedure used compound 12 as a reference, as it was the most active inhibitor in the series and, in the absence of any additional information, may be the best starting point for field-based alignment. The best literature model removed compounds 14, 18, 47 and 48 as outliers for reasons explained within the original paper. Our model was developed from co-ordinates supplied by the authors in which these compounds were already omitted. The test set of 9 structures was taken from across the range of chemical families expressed in the training set.

The statistics from the original CoMFA model are given in Table 4, while the CoMFA model generated by the FBSS alignment is presented in Table 5. Rigorous aggregate reorientation was performed on the manual alignment using a 1 Å grid spacing, as the original CoMFA model was described using a coarse 2 Å lattice spacing.

Grid Spacing/Å   Optimum Components   SECV    Q2
1.0              5                    0.613   0.720

R2      F     SE      pr-R2
0.930   222   0.248   0.730

Table 4: CoMFA Model J from Original Melatonin CoMFA Paper [5] (after aggregate reorientation in a lattice with a 1.0 Å grid spacing).

Grid Spacing/Å   Optimum Components   SECV    Q2
1.0              5                    0.704   0.717

R2      F     SE      pr-R2
0.981   407   0.179   0.547

Table 5: CoMFA Model using FBSS-derived Alignment

The CoMFA models for both the manual and FBSS-derived alignments are displayed as contour plots within SYBYL. It is interesting to note that similar regions of CoMFA field are displayed for each type of plot, demonstrating consistency between the models and validating the FBSS alignment. The Sicsic et al. model is shown in Figure 7 and the FBSS-aligned CoMFA model in Figure 8.

Figure 7: Steric and Electrostatic (coefficient*stdev) plots for the Manual Alignment (Contour levels of 10.0 and 90.0 are used)

Figure 8: Steric and Electrostatic (coefficient*stdev) plots for the FBSS Alignment (Contour levels of 10.0 and 90.0 are used)

The manual alignment and CoMFA model presented here are the results of testing many different alignment rules. This manually-derived ‘optimum’ model performs as robustly (as indicated by Q2) as the FBSS-derived model and more predictively (as indicated by pr-R2). The major advantage of the FBSS procedure, however, is that it proceeds automatically and produces a very similar model (in terms of both statistics and suggesting direction for further drug design).

Conclusions

The FBSS program can align any set of structures for which there is the potential for a CoMFA study.

The field-based alignment generates statistical models and drug-design ideas that are consistent with those obtained using manual alignment procedures. It can be argued that a user would not want FBSS to generate an overlay identical to (or as close as possible to) the manual one, but rather a non-obvious alternative overlay. However, where manual alignment is difficult or not possible, the results here suggest that FBSS is a viable alternative.

Future work

We will study data sets that have no obvious manual alignment. In these cases, we hope FBSS will produce a useful overlay for direct use in a 3D-QSAR method or will suggest ideas for a potential manual alignment.

Incorporation of flexibility into the GA will be realised with the introduction of low-energy conformers through systematic or random conformational searching algorithms.

Important Note: FBSS results were compiled with no further attempts to optimise the Q2 value produced from the CoMFA model, either through modification of the lattice surrounding the superposed molecules or through variable selection procedures such as GOLPE [6], both of which are often employed in a CoMFA analysis.

References

1. Drayton, S.K.; Edwards, K.; Jewell, N.E.; Turner, D.B.; Wild, D.J.; Willett, P.; Wright, P.M. and Simmons, K. in www.ijc.com/articles/1998v1/37/
2. Cramer, R.D., III; Patterson, D.E.; Bunce, J.D. J. Am. Chem. Soc. 1988, 110, 5959-5967.
3. Quantum Chemistry Program Exchange, Creative Arts Building 181, Indiana University, Bloomington, Indiana 47405, USA.
4. Tripos, Inc. 1699 South Hanley Road, St. Louis, Missouri 63144-2913, USA.
5. Sicsic, S.; Serraz, I.; Andrieux, J.; Bremont, B.; Mathe-Allainmat, M.; Poncet, A.; Shen, S.; Langlois, M. J. Med. Chem. 1997, 40 (5), 739-748.
6. Baroni, M.; Costantino, G.; Cruciani, G.; Riganelli, D.; Valigi, R.; Clementi, S. Quant. Struct.-Act. Relat. 1993, 12, 9-20.

(1) Krebs Institute for Biomolecular Research, Department of Information Studies, University of Sheffield, Western Bank, Sheffield, UK, S10 2TN
(2) Zeneca Agrochemicals, Jealotts Hill Research Station, Bracknell, Berkshire, UK, RG42 6ET

Acknowledgements

This work was funded by a BBSRC CASE award with Zeneca Agrochemicals.

Thanks also to Dr. David Wilton and Dr. Ansgar Schuffenhauer for useful input at all stages of this project.
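
Illustrative sketch: the poster names the Carbo index and approximate Gaussian fields as the basis of the FBSS similarity score, but leaves the details to the earlier publication. The Python below is a minimal sketch of how a Carbo index can be computed for two molecules whose fields are modelled as sums of spherical Gaussians. It is not the FBSS implementation: the one-Gaussian-per-atom field model, the weights, the exponents and all function names are assumptions made for illustration only.

```python
import math

# Assumed field model (not taken from FBSS): each molecule is a list of
# (weight, exponent, (x, y, z)) tuples, one spherical Gaussian per atom,
# so the field is P(r) = sum_i w_i * exp(-a_i * |r - R_i|^2).


def gaussian_overlap(a, ra, b, rb):
    """Analytic integral over all space of exp(-a|r-ra|^2) * exp(-b|r-rb|^2)."""
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return (math.pi / (a + b)) ** 1.5 * math.exp(-a * b / (a + b) * d2)


def field_overlap(mol_a, mol_b):
    """Overlap integral of two Gaussian-sum fields."""
    return sum(
        wa * wb * gaussian_overlap(a, ra, b, rb)
        for wa, a, ra in mol_a
        for wb, b, rb in mol_b
    )


def carbo_index(mol_a, mol_b):
    """Carbo index: Z_AB / sqrt(Z_AA * Z_BB)."""
    z_ab = field_overlap(mol_a, mol_b)
    z_aa = field_overlap(mol_a, mol_a)
    z_bb = field_overlap(mol_b, mol_b)
    return z_ab / math.sqrt(z_aa * z_bb)


def translate(mol, shift):
    """Return a copy of the molecule moved by the given (dx, dy, dz)."""
    return [
        (w, a, tuple(c + s for c, s in zip(centre, shift)))
        for w, a, centre in mol
    ]


if __name__ == "__main__":
    # Hypothetical two-atom "molecule"; weights and exponents are arbitrary.
    mol = [(1.0, 0.5, (0.0, 0.0, 0.0)), (0.8, 0.5, (1.5, 0.0, 0.0))]
    print(carbo_index(mol, mol))                              # perfect overlay
    print(carbo_index(mol, translate(mol, (2.0, 0.0, 0.0))))  # poorer overlay
```

In an alignment search of the kind described above, a score like this would be evaluated for each trial orientation of a molecule against the reference structure, with the GA seeking the orientation that maximises it. An 'All Fields' score could then be formed by averaging the indices obtained for the separate fields, as the poster describes.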