A flexible and simple structure enumeration tool – Accord for Excel CombiChem add-on

Deqi Chen Alec Westley

High throughput screening (HTS) has been used widely in the pharmaceutical industry during the lead discovery stage. Meanwhile, virtual screening has also been gaining momentum and is used more and more widely, both receptor-based and ligand-based virtual screening. In any case, a physical or virtual compound library is the starting point for the whole process. To this end, combinatorial chemistry has provided fast access to a vast number of compounds for wet lab and virtual screening.

Normally, there are two methods used to build a combinatorial library: reaction-based and Markush structure-based. Medicinal chemists most often use the reaction-based methodology to assemble a compound library or prepare parallel syntheses. Conversely, to build a virtual library, the Markush structure-based approach has been used more frequently. Based on the Accord Chemistry SDK (Software Development Kit), Accelrys provides several easy solutions for compound library enumeration, both reaction-based and Markush structure-based. Among them, the simplest solution is the Accord for Excel CombiChem add-on, which focuses on small library enumeration fitting within Excel’s limitation of 64,000 rows of data. Accelrys also provides the CombiChem Enumeration Component (CEC), which is capable of building a million-compound library within a matter of minutes. Using CEC is a more advanced approach, and it is very easy to build your own graphical user interface (GUI) front end with this component, which will be the subject of the second part of this application note.
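The size of a full combinatorial library is the product of the number of reagents at each position, which makes it easy to check ahead of time whether a planned library fits within a spreadsheet. The following is a minimal sketch of that arithmetic only, not of any Accord functionality; the function names and the reagent counts are hypothetical, and the row limit is the 64,000 figure quoted above.

```python
from math import prod

# Row limit quoted in this note; treated here as an assumption.
EXCEL_ROW_LIMIT = 64_000


def library_size(reagent_counts):
    """Number of products when every reagent combination is enumerated."""
    return prod(reagent_counts)


def fits_in_excel(reagent_counts, row_limit=EXCEL_ROW_LIMIT):
    """True if the full library fits within the given row limit."""
    return library_size(reagent_counts) <= row_limit


# Hypothetical example: 40 acids x 50 amines, and a 100 x 100 x 100 library.
small = library_size([40, 50])
large = library_size([100, 100, 100])
print(small, fits_in_excel([40, 50]))
print(large, fits_in_excel([100, 100, 100]))
```

A two-component library of this size fits comfortably, while a three-component library of 100 reagents each reaches one million products and would call for a tool such as CEC.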

To enumerate a compound library as diversely as possible, it is essential to have a reliable resource and a good compound/reagent database. To ensure this, Accelrys provides such a database, Compounds Available for Purchase (CAP), which contains ~3.1 million unique structures (release 2006.1). The CAP database comes in two formats: CAP reagents and CAP complete, which combines the CAP reagents and screening compounds. This database provides a simple, easy-to-use, and thematic approach to allow searches based on structure (substructure, exact, and similarity) and on property or vendor. In the end, the query result can be exported in many formats, for example, SD files.

We will demonstrate herein the workflow from selecting reagents and finding their sources to enumerating the compound library.

Library Enumeration
In a recent publication¹, scientists from Pfizer reported a study on the structure-activity relationship of quinuclidine benzamides as agonists of α7 nicotinic acetylcholine receptors by testing a library of benzamides.

To get the appropriately substituted benzoic acids and amines, we used the CAP reagent database. As we were looking for benzoic acids substituted at positions 3 and/or 4, we were able to use a flexible Markush query to get the most informative result in a single search.


Figure 1 shows the search result in Accord for Excel Enterprise. After browsing through the hits and selecting the desired benzoic acids, we were able to easily export an SD file directly from the Excel sheet. Similarly, the commercially available amines were found and exported as well. Using those SD files, the reaction-based enumeration proceeded easily (Figure 2).
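Conceptually, reaction-based enumeration of an amide library pairs every acid with every amine. The sketch below illustrates only that pairing, under loud assumptions: the reagents are hypothetical examples rather than CAP hits, each is written as a SMILES fragment (acyl part ending in `C(=O)`, amine part starting with `N`), and plain string joining stands in for the reaction handling that a chemistry engine such as Accord performs. It does not use or represent the Accord API.

```python
from itertools import product

# Hypothetical acyl fragments (acid minus OH), as SMILES ending in C(=O).
ACYL_FRAGMENTS = {
    "benzoyl": "c1ccccc1C(=O)",
    "4-chlorobenzoyl": "Clc1ccc(cc1)C(=O)",
}

# Hypothetical amines, as SMILES starting with the reacting nitrogen.
AMINE_FRAGMENTS = {
    "methylamine": "NC",
    "3-aminoquinuclidine": "NC1CN2CCC1CC2",
}


def enumerate_amides(acyls, amines):
    """Pair every acyl fragment with every amine (toy string joining)."""
    rows = []
    for (acyl_name, acyl), (amine_name, amine) in product(
        acyls.items(), amines.items()
    ):
        rows.append(
            {"acid": acyl_name, "amine": amine_name, "product": acyl + amine}
        )
    return rows


amide_rows = enumerate_amides(ACYL_FRAGMENTS, AMINE_FRAGMENTS)
for row in amide_rows:
    print(row["acid"], "+", row["amine"], "->", row["product"])
```

Two acids and two amines give four rows, one per combination, which mirrors the one-product-per-row layout of a spreadsheet. A real enumeration would also have to handle protecting groups, multiple reaction sites, and stereochemistry, which string joining cannot do.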

The Accord for Excel CombiChem add-on has a built-in wizard that makes enumerating a manageable combinatorial library much easier and more flexible. The advantage of this lies in the Accelrys Accord Chemistry Engine. It can deal with many complicated situations, like single reaction site versus multiple reaction sites, at the same time. Accord will preserve the stereochemistry at the point of attachment, even if it is generated during the reaction, and it can handle both reaction-based and Markush-based enumeration. Also, Accord enumeration can produce the final results as either the reaction or the product.

In another example, for the synthesis of a library of hydroxyproline derivatives², chemists would have to carry out reactions in multiple steps in the laboratory. The Accord for Excel CombiChem add-on is ideal for this type of reaction-based enumeration, even if a step involves only a single reactant. The Excel workbook will contain each step of the reaction for the whole process. In Figure 3, the final results of the five-step reaction sequence are shown (10,200 enumeration products). Please note that during the Mitsunobu reaction, the stereochemistry of the carbon at the 3-position is inverted, and the CombiChem module handled this very well (Figure 4).


On the other hand, if a chemist only needs a summary sheet that shows the overall enumeration process, the Accord for Excel CombiChem add-on is capable of Markush-type enumeration, and more importantly, the stereochemistry of the product will be preserved (Figure 5).

This example demonstrates the flexibility of the CombiChem add-on. Such flexibility allows it to be used step-by-step with multiple Excel sheets, to be run as a single-step reaction using the Markush capability, or to be used for parallel synthesis. In addition, because Accord for Excel has many built-in property calculations, ADME/T predictions, and even user-defined calculations, one can use its filter and sorting functions to assist library profiling. In the example shown in Figure 6, AlogP, Lipinski’s Rule of Five, blood-brain barrier penetration, and CYP450 2D6 toxicity have been calculated at once.
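To make the filter-and-sort idea concrete, here is a minimal sketch of profiling a small library with Lipinski's Rule of Five (molecular weight at most 500, logP at most 5, no more than 5 hydrogen-bond donors, no more than 10 hydrogen-bond acceptors, with at most one violation tolerated). The property values are assumed to have been calculated already; the compound names and numbers are hypothetical, and nothing here reflects how Accord for Excel implements its calculations.

```python
from dataclasses import dataclass


@dataclass
class Props:
    """Precomputed properties for one enumerated compound (hypothetical)."""
    name: str
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def ro5_violations(p):
    """Count how many of the four Rule of Five criteria are violated."""
    return sum([
        p.mol_weight > 500,
        p.logp > 5,
        p.h_donors > 5,
        p.h_acceptors > 10,
    ])


def passes_ro5(p, max_violations=1):
    """Common convention: tolerate at most one violation."""
    return ro5_violations(p) <= max_violations


# Hypothetical library rows with made-up property values.
library = [
    Props("cmpd-1", 350.4, 2.1, 2, 4),
    Props("cmpd-2", 620.7, 6.3, 1, 5),
    Props("cmpd-3", 410.5, 1.2, 3, 6),
]

# Filter on the rule, then sort the survivors by logP.
shortlist = sorted((p for p in library if passes_ro5(p)), key=lambda p: p.logp)
for p in shortlist:
    print(p.name, p.logp, ro5_violations(p))
```

The same pattern extends to other calculated columns: each becomes a field, and each filter becomes a predicate applied before sorting.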

For a full list of available calculations, please see http://www.accelrys.com/products/datasheets/acc_excel_data.pdf

Another advantage of this module is that the program recognizes multiple functional groups within a molecule. One of the most commonly used methods to build a C-N bond is to react an amine with an alkyl halide. In addition, medicinal chemists commonly use (H)OCH2CH2NH2 (hydroxyethylamino group) or its repeating unit as a building block to build a linker or to try to improve lipophilicity. In a wet laboratory, chemists can react an alkyl halide selectively with a primary amine in the presence of the secondary amine on the hydroxyethylamino group by controlling the reaction conditions. But in a software-based enumeration process, it is always a challenge to distinguish between them without interrupting the enumeration. The Accord for Excel CombiChem add-on handles this type of situation extremely well. The software can recognize that for this particular structure, the reaction could potentially occur on both the primary and the secondary amine. It will allow the user to decide which one will be involved in the reaction, in this case the primary amine. After removing the two other possibilities (one for the secondary amine and one for both), the end results will be exactly what chemists obtain experimentally in the laboratory (Figure 7).
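The three possibilities mentioned above (primary amine only, secondary amine only, or both) are simply the non-empty subsets of the two candidate reaction sites. The sketch below shows that counting and the user's selection step; the site labels and function name are hypothetical, and the code does not perceive functional groups the way a chemistry engine would.

```python
from itertools import combinations


def site_combinations(sites):
    """All non-empty subsets of candidate reaction sites."""
    return [
        combo
        for size in range(1, len(sites) + 1)
        for combo in combinations(sites, size)
    ]


# Candidate sites on the hydroxyethylamino-containing reagent (labels assumed).
candidate_sites = ["primary amine", "secondary amine"]
all_outcomes = site_combinations(candidate_sites)

# The user keeps only the outcome that matches the wet-lab selectivity.
kept_outcomes = [o for o in all_outcomes if o == ("primary amine",)]

print(len(all_outcomes), "possible outcomes:", all_outcomes)
print("kept:", kept_outcomes)
```

With n candidate sites there are 2^n - 1 such outcomes, so an explicit selection step keeps the enumerated library from growing with products that would not be made in practice.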


Another useful feature is exclusion. If there are structures containing certain functional groups that you do not want to include in the enumeration process, you can use the match column function, based on a substructure match, to filter out those structures. For example, if an alpha-amino carbonyl group is present and it is desirable to exclude this type of substructure, one can add the match column, run a substructure search for the alpha-amino carbonyl group, and set the filter to FALSE. As a result, all structures with an alpha-amino carbonyl group will be excluded from the subsequent enumeration process (Figure 8).
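The exclusion step amounts to adding a boolean column and keeping only the rows where it is FALSE. The following sketch shows that logic on a list of rows; it assumes the substructure search has already been run elsewhere and its hits are available as a set of reagent IDs. The IDs, field names, and helper functions are all hypothetical, and no actual substructure matching is performed here.

```python
def add_match_column(rows, is_match):
    """Return copies of the rows with a boolean 'match' column added."""
    return [{**row, "match": is_match(row)} for row in rows]


def keep_unmatched(rows):
    """Keep only rows whose 'match' column is False (the filter set to FALSE)."""
    return [row for row in rows if row["match"] is False]


# Hypothetical reagent rows and hypothetical substructure-search hits.
reagents = [{"id": "R1"}, {"id": "R2"}, {"id": "R3"}]
alpha_amino_carbonyl_hits = {"R2"}

with_match = add_match_column(
    reagents, lambda row: row["id"] in alpha_amino_carbonyl_hits
)
for_enumeration = keep_unmatched(with_match)

print([row["id"] for row in for_enumeration])
```

Keeping the match flag as its own column, rather than deleting rows outright, leaves the excluded reagents visible and makes the filter easy to reverse.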

The Accord for Excel CombiChem add-on provides a simple and flexible enumeration tool. Although it is suited to smaller datasets, as dictated by Microsoft Excel capacity, it serves very well for enumeration of a manageably sized library for HTS and vHTS. In addition to traditional reaction-based enumeration, Markush-based enumeration further extends the usage of this tool. Taking advantage of available Accord for Excel functions, like property calculations, ADME/T predictions, and LogD and pKa calculations, this product is an ideal tool for medicinal and organic chemists in the field.

