Method for Modeling and Docking
3D model building
The initial model of Amiloride-sensitive cation channel 2, neuronal (ASCC2N) was built by
homology modeling with the MODELLER software, a program for comparative protein structure
modeling that optimally satisfies spatial restraints derived from the alignment and
expressed as probability density functions (pdfs) for the restrained features. The pdfs restrain Cα-Cα
distances, main-chain N-O distances, and main-chain and side-chain dihedral angles. The 3D model
of a protein is obtained by optimizing the molecular pdf so that the model violates the input
restraints as little as possible. The molecular pdf is derived as a combination of the pdfs restraining
individual spatial features of the whole molecule.
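The idea of combining individual pdfs into one molecular pdf can be sketched as follows. This is a minimal illustration, not MODELLER's implementation: it assumes that every restraint is an independent Gaussian pdf on a single interatomic distance, so that the negative log of the molecular pdf is a plain sum. The names `Restraint`, `neg_log_gaussian` and `objective`, and all numerical values, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Restraint:
    i: int        # index of the first atom
    j: int        # index of the second atom
    mean: float   # preferred distance in Å (hypothetical value)
    sigma: float  # spread of the Gaussian pdf in Å (hypothetical value)


def neg_log_gaussian(x, mean, sigma):
    """Negative natural log of a Gaussian pdf evaluated at x."""
    return 0.5 * ((x - mean) / sigma) ** 2 + math.log(sigma * math.sqrt(2.0 * math.pi))


def objective(coords, restraints):
    """Sum of -ln(pdf) over all restraints; lower means fewer violations.

    Assumption: restraints are independent, so the molecular pdf is the
    product of the individual pdfs and its negative log is a sum.
    """
    return sum(
        neg_log_gaussian(math.dist(coords[r.i], coords[r.j]), r.mean, r.sigma)
        for r in restraints
    )


# Two toy three-atom "models" scored against the same two distance restraints.
restraints = [Restraint(0, 1, 3.8, 0.1), Restraint(1, 2, 3.8, 0.1)]
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]  # satisfies both
model_b = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (7.6, 0.0, 0.0)]  # violates both

print(objective(model_a, restraints), objective(model_b, restraints))
```

An optimizer would move the atoms to lower this objective; here the model that satisfies the restraints simply receives the lower value.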
The optimization procedure is a variable target function method that applies the conjugate
gradients algorithm to the positions of all non-hydrogen atoms. The sequence of ASCC2N (P78348)
from Homo sapiens was obtained from NCBI and submitted to the SBASE server for domain
prediction. The predicted domain was searched against the PDB (Protein Data Bank) with the
BLAST (Basic Local Alignment Search Tool) program to find a related protein structure to use
as a template. The sequence that showed maximum identity with a high score and a low E-value
was aligned with the query and used as the reference structure to build a 3D model of ASCC2N.
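The template choice described above (maximum identity, high score, low E-value) can be expressed as a small selection rule. The sketch below is only one possible reading: the E-value threshold, the order in which the three criteria are applied, the `Hit` record and the hit names are all assumptions, and the numbers are invented for illustration rather than taken from the actual BLAST output.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str         # placeholder label, not a real PDB identifier
    identity: float   # percent sequence identity to the query
    bit_score: float  # BLAST bit score
    e_value: float    # BLAST expectation value


def pick_template(hits, max_e_value=1e-5):
    """Return the best hit, or None if no hit passes the E-value filter.

    Assumed rule: discard hits above max_e_value, then prefer higher
    identity, then higher bit score, then lower E-value.
    """
    significant = [h for h in hits if h.e_value <= max_e_value]
    if not significant:
        return None
    return max(significant, key=lambda h: (h.identity, h.bit_score, -h.e_value))


# Invented example hits, for illustration only.
hits = [
    Hit("hit_A", identity=35.0, bit_score=120.0, e_value=1e-30),
    Hit("hit_B", identity=62.0, bit_score=410.0, e_value=1e-110),
    Hit("hit_C", identity=80.0, bit_score=30.0, e_value=0.5),  # fails the filter
]

template = pick_template(hits)
print(template.name if template else "no significant template")
```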
The coordinates for the structurally conserved regions (SCRs) of ASCC2N were assigned
from the template using a multiple sequence alignment based on the Needleman-Wunsch algorithm.
The structure with the lowest MODELLER objective function was then improved
by molecular dynamics and equilibration methods in NAMD 2.5, using the CHARMM27
force field for lipids and proteins along with the TIP3P model for water. The energy of the structure
was minimized for 10,000 steps. A cutoff of 12 Å (switching function starting at 10 Å) for van der
Waals interactions was assumed. No periodic boundary conditions were included in this study. An
integration time step of 2 fs was used, permitting a multiple time-stepping algorithm to be employed
in which interactions involving covalent bonds were computed every time step, short-range
nonbonded interactions were computed every two time steps and long-range electrostatic forces were
computed every four time steps.
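The multiple time-stepping schedule described above can be illustrated with a short sketch. It only shows which force classes are due at each 2 fs step under the stated frequencies; it does not integrate any equations of motion, and the function and constant names are hypothetical.

```python
TIMESTEP_FS = 2.0      # integration time step stated in the text
NONBONDED_EVERY = 2    # short-range nonbonded forces: every two steps
FULL_ELECT_EVERY = 4   # long-range electrostatics: every four steps


def forces_due(step):
    """Return the force classes that are recomputed at a given step index."""
    due = ["bonded"]  # covalent-bond terms are computed at every step
    if step % NONBONDED_EVERY == 0:
        due.append("short_range_nonbonded")
    if step % FULL_ELECT_EVERY == 0:
        due.append("long_range_electrostatics")
    return due


def count_evaluations(n_steps):
    """Count how often each force class is evaluated over n_steps steps."""
    counts = {"bonded": 0, "short_range_nonbonded": 0, "long_range_electrostatics": 0}
    for step in range(n_steps):
        for name in forces_due(step):
            counts[name] += 1
    return counts


for step in range(4):
    print(f"t = {step * TIMESTEP_FS:.0f} fs:", ", ".join(forces_due(step)))
```

Over any block of eight steps, the expensive long-range term is evaluated only twice, which is the motivation for the scheme.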
The pair list of the nonbonded interaction was recalculated every ten time steps with a pair list
distance of 13.5 Å. The short-range nonbonded interactions were defined as van der Waals and
electrostatics interactions between particles within 12 Å. A smoothing function was employed for the
van der Waals interactions at a distance of 10 Å. CHARMM27 force-field parameters were used in
all simulations in this study. The equilibrated system was simulated for 1 ps with a 500 kcal/mol/Å²
restraint on the protein backbone under 1 atm constant pressure and 310 K constant temperature
(NPT), and the Langevin damping coefficient was set to 5 ps⁻¹ unless otherwise stated. Finally, the
structure with the lowest energy and a low RMSD (root mean square deviation) was used for further
studies; this step improved the quality of the initial model. The final structure was
analyzed with a Ramachandran plot using PROCHECK (Programs to Check the Stereochemical
Quality of Protein Structures) and with an environment profile using the ERRAT graph (Structure
Evaluation server). This model was used for identification of the active site and for docking of
the ligands to the protein.
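The smoothing of the van der Waals interactions between 10 Å and the 12 Å cutoff can be sketched as below. The text does not give the functional form, so this example assumes the commonly used CHARMM/X-PLOR-style switching polynomial; the Lennard-Jones parameters `epsilon` and `r_min` are hypothetical values chosen only to make the example run.

```python
def vdw_switch(r, switch_dist=10.0, cutoff=12.0):
    """Switching factor: 1 below switch_dist, 0 beyond cutoff, smooth between.

    Assumed form: (Rc^2 - r^2)^2 (Rc^2 + 2 r^2 - 3 Rs^2) / (Rc^2 - Rs^2)^3.
    """
    if r <= switch_dist:
        return 1.0
    if r >= cutoff:
        return 0.0
    rc2, rs2, r2 = cutoff ** 2, switch_dist ** 2, r ** 2
    return (rc2 - r2) ** 2 * (rc2 + 2.0 * r2 - 3.0 * rs2) / (rc2 - rs2) ** 3


def lennard_jones(r, epsilon=0.1, r_min=3.5):
    """Plain 12-6 Lennard-Jones energy (hypothetical parameters, kcal/mol and Å)."""
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 ** 2 - 2.0 * ratio6)


def switched_lennard_jones(r, **lj_params):
    """Lennard-Jones energy brought smoothly to zero at the cutoff."""
    return lennard_jones(r, **lj_params) * vdw_switch(r)


for r in (9.0, 10.5, 11.0, 11.5, 12.0):
    print(f"r = {r:4.1f} Å  switch = {vdw_switch(r):.3f}")
```

The factor leaves the interaction untouched inside 10 Å and removes it entirely at 12 Å, which avoids an abrupt energy jump at the cutoff.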
7.2 Active site Identification
The active site of ASCC2N was identified using the CASTp server. CASTp is a program for
automatically locating and measuring protein pockets and cavities, based on precise computational
geometry methods, including alpha shape and discrete flow theory. CASTp identifies and measures
pockets and pocket mouth openings, as well as cavities. The program specifies the atoms lining
pockets, pocket openings, and buried cavities; the volume and area of pockets and cavities; and the
area and circumference of mouth openings.
7.3 Docking method
GOLD method
1. Docking program
In this project, we used the GOLD (Genetic Optimization for Ligand Docking) program to
perform ligand-protein docking. GOLD is an automated ligand docking program that uses a
genetic algorithm (GA) to explore the full range of ligand conformational flexibility, with partial
flexibility of the protein, and satisfies the fundamental requirement that the ligand must
displace loosely bound water on binding.
2. Initialization of the protein ASCC2N and of the ligands
The ASCC2N protein structure was modeled and checked with PROCHECK, and the results showed
that the structure was of acceptable quality. Once all atom types and bond types are correct,
hydrogen atoms are placed on the appropriate atoms. The torsion angles of Ser, Thr and Tyr
hydroxyl groups are optimized by GOLD, so their initial positions do not matter. Specifically,
each Ser, Thr and Tyr OH is allowed to rotate to optimize its hydrogen bonding to the ligand.
Lysine NH3+ groups are similarly optimized, unless they are fixed by strong H-bonds to
neighboring protein residues. Local minimization is then performed in the presence of restraints
to relieve potential bad contacts while keeping the protein conformation very close to that of
the starting model.
We converted the 2D description of each ligand (from a molecule sketching program such as
ChemSketch or Discovery Studio) into 3D coordinates. All hydrogen atoms were added so that the
valencies of the heavy atoms were satisfied, and the structures were then checked manually. If
they are correct, GOLD deduces atom types automatically when atom typing is turned on. GOLD
ignores atom charges, both formal and partial, so the ligand charges do not need to be assigned.
Finally, the ligands were saved as MOL2 files.
3. Docking in GOLD
The GA settings directly affect running times and the likelihood of finding the global
optimum. The main parameters that affect running time and accuracy are the number of dockings
and the number of GA operations in each docking. Jones et al. compared the outcomes obtained
with 2, 5, 10 and 20 GA runs and concluded that GOLD generally requires fewer than 20 GA runs
to reproduce a binding mode. In our study, we ran the GOLD dockings with 10 and with 15 GA
runs, and the results were nearly the same. Therefore, we used 10 GA runs. Other parameters
are listed in the following table.
In GOLD, the ligand is initially placed in the protein binding site on the basis of fitting
points; then, a GA (Genetic algorithm) is used to explore the conformational space of the ligands
on the basis of fitness scores. The initial population of possible ligand poses is set up at random.
Each member of the population is encoded as a chromosome, which contains information on the
binding between protein and ligand.
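The general shape of such a genetic algorithm can be sketched with a toy example. This is not GOLD's algorithm or scoring function: the chromosome here holds only three ligand torsion angles, the fitness is an invented function that rewards closeness to a hypothetical target conformation (`TARGET`), and the operators, population size and operation count are assumptions chosen for brevity.

```python
import random

TARGET = [60.0, -120.0, 180.0]  # hypothetical "ideal" torsion angles in degrees


def angular_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def fitness(chrom):
    """Toy fitness: higher (closer to 0) when torsions are near TARGET."""
    return -sum(angular_diff(a, t) ** 2 for a, t in zip(chrom, TARGET))


def crossover(rng, p1, p2):
    """One-point crossover of two parent chromosomes."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]


def mutate(rng, chrom, sigma=15.0):
    """Perturb one randomly chosen torsion, wrapping into [-180, 180)."""
    child = list(chrom)
    k = rng.randrange(len(child))
    child[k] = (child[k] + rng.gauss(0.0, sigma) + 180.0) % 360.0 - 180.0
    return child


def run_ga(seed=0, pop_size=50, n_ops=2000):
    """Steady-state GA: each operation may replace the current worst member."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-180.0, 180.0) for _ in TARGET] for _ in range(pop_size)]
    history = [max(fitness(c) for c in pop)]
    for _ in range(n_ops):
        # Parents are chosen by size-3 tournaments.
        p1 = max(rng.sample(pop, 3), key=fitness)
        p2 = max(rng.sample(pop, 3), key=fitness)
        child = mutate(rng, crossover(rng, p1, p2))
        worst = min(range(pop_size), key=lambda i: fitness(pop[i]))
        if fitness(child) > fitness(pop[worst]):
            pop[worst] = child
        history.append(max(fitness(c) for c in pop))
    return max(pop, key=fitness), history


best, history = run_ga()
print("best torsions:", [round(a, 1) for a in best])
```

Because a child only ever replaces the worst member, the best fitness in the population never decreases from one operation to the next.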
4. Setting potential binding pockets
Because the binding site is not known, it is necessary to select the regions of the protein
that are the most likely candidates for the binding region. The active-site prediction shows
14 candidate binding sites in ASCC2N. We therefore defined each predicted binding site from a
list of residues and measured the affinity of the ligands for each site. This largely reduces
the running time and increases the efficiency of the search. By comparing the fitness scores,
we can identify the potential binding pockets.
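Comparing pockets by fitness score amounts to a simple ranking. The sketch below assumes that each pocket is summarized by the best (highest) fitness score among its docking solutions; the pocket labels and scores are invented placeholders, not results from this study.

```python
def rank_pockets(scores_by_pocket):
    """Rank pockets by their best fitness score, highest first.

    scores_by_pocket maps a pocket label to the fitness scores of its
    docking solutions; pockets without any solution are skipped.
    """
    best = {pocket: max(scores) for pocket, scores in scores_by_pocket.items() if scores}
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Invented scores for illustration only.
example_scores = {
    "pocket_1": [41.2, 44.8],
    "pocket_2": [52.3, 50.1],
    "pocket_3": [38.9],
    "pocket_4": [],  # no docking solution found
}

for pocket, score in rank_pockets(example_scores):
    print(pocket, score)
```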
5. Ligand flexibility
During ligand initialization, ligand amide groups are set to the trans conformation. In
practice, however, the amide bond in a ligand can flip between planar cis and planar trans on
binding to the protein. We therefore allowed amide bonds to flip. We also enabled corner
flipping for ligands that contain a non-aromatic ring.
6. Protein flexibility
By default, GOLD treats only the O-H and NH3+ groups of residue side chains as flexible,
although the whole side chain is in fact flexible. In this project, we allowed selected side
chains to undergo torsional rotation around one or more of their acyclic bonds.
7. Setting constraints
After a series of docking runs, we found that a certain protein atom can form a hydrogen
bond with the ligand. We therefore specified that this particular protein atom should be
hydrogen-bonded to the ligand, so that GOLD is biased towards finding solutions in which the
specified constraint is satisfied. In this way, we can optimize the GOLD solution.
8. GoldScore fitness function
GoldScore is a force-field-based scoring function made up of four components:
1) protein-ligand hydrogen bond energy (external H-bond);
2) protein-ligand van der Waals (vdw) energy (external vdw);
3) ligand internal vdw energy (internal vdw);
4) ligand torsional strain energy (internal torsion).
The fitness score is taken as the negative of the sum of the component energy terms, so larger fitness
scores are better. The external vdw score is multiplied by a factor of 1.375 when the total fitness
score is computed. This is an empirical correction to encourage protein-ligand hydrophobic contact.
The fitness function has been optimized for the prediction of ligand binding positions.
GoldScore = S(hb_ext) + 1.375 × S(vdw_ext) + S(hb_int) + S(vdw_int)
where S(hb_ext) is the protein-ligand hydrogen bond score, S(vdw_ext) is the protein-ligand van
der Waals score, S(hb_int) is the score from intramolecular hydrogen bonds in the ligand and
S(vdw_int) is the score from intramolecular strain in the ligand.
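As a worked illustration of how the terms combine, the sketch below sums the four scores and applies the factor of 1.375 to the external van der Waals term, as stated above. The function name and the input values are hypothetical; the numbers are not taken from any docking run.

```python
VDW_EXT_WEIGHT = 1.375  # empirical factor applied to the external vdw score


def gold_score(s_hb_ext, s_vdw_ext, s_hb_int, s_vdw_int):
    """Total fitness from the four component scores (larger is better)."""
    return s_hb_ext + VDW_EXT_WEIGHT * s_vdw_ext + s_hb_int + s_vdw_int


# Invented component scores for illustration only.
total = gold_score(s_hb_ext=10.0, s_vdw_ext=20.0, s_hb_int=0.0, s_vdw_int=-5.0)
print(total)  # 10.0 + 1.375 * 20.0 + 0.0 - 5.0 = 32.5
```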
Docking with the GOLD software requires two input files:
1. Ligand file
2. Receptor file