mega4 by SBMirza



           Asma Ashraf




Materials and methods:


We announce the release of the fourth version of MEGA software, which expands on the existing
facilities for editing DNA sequence data from autosequencers, mining Web-databases, performing
automatic and manual sequence alignment, analyzing sequence alignments to estimate evolutionary
distances, inferring phylogenetic trees, and testing evolutionary hypotheses. Version 4 includes a unique
facility to generate captions, written in figure legend format, in order to provide natural language
descriptions of the models and methods used in the analyses. This facility aims to promote a better
understanding of the underlying assumptions used in analyses, and of the results generated. Another
new feature is the Maximum Composite Likelihood (MCL) method for estimating evolutionary distances
between all pairs of sequences simultaneously, with and without incorporating rate variation among
sites and substitution pattern heterogeneities among lineages. This MCL method also can be used to
estimate transition/transversion bias and nucleotide substitution pattern without knowledge of the
phylogenetic tree. This new version is a native 32-bit Windows application with multi-threading and
multi-user support, and it can also run in a Linux desktop environment (via the Wine
compatibility layer) and on Intel-based Macintosh computers under the Parallels program. The
current version of MEGA is available free of charge.


Clustal is a widely used computer program for multiple sequence alignment. There are
two main variations:

        ClustalW: command line interface
        ClustalX: This version has a graphical user interface.[3] It is available for Windows, Mac OS, and

    This program accepts a wide range of input formats, including NBRF/PIR, FASTA, EMBL/Swissprot,
    Clustal, GCG/MSF, GCG9 RSF, and GDE. The output format can be one or many of the following:


I. Confirm similarity to one or more previously characterized Hops using BLAST analysis.

             1. Conduct BLASTP analyses to determine whether a given protein is similar to any
                previously characterized Hops.
             2. If there IS NO significant similarity to one or more previously characterized Hops, and
                the newly identified Hop has been confirmed by criteria other than sequence
                similarity (see Criteria for Hop name assignment), go to Name structure and Selection:
                How to name a Hop for guidelines on naming novel Hop proteins.

1314,1312,1303                                                                                     Page 1

            3. If there IS significant similarity to one or more previously characterized Hops (roughly
               defined as a BLAST expect value of less than 10^-5 with an alignment extending over
               60% of the length of the protein), follow the steps below to assign a subgroup
               classification, or contact the PPI site administrator and a subgroup classification can be
               generated for you.
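
The rough significance rule in step 3 can be sketched as a small check. This is only an illustration: the function name, its parameters, and the example hits are hypothetical, and "length of the protein" is assumed here to mean the length of the newly identified protein.

```python
def is_significant_hit(evalue, alignment_length, protein_length,
                       max_evalue=1e-5, min_coverage=0.60):
    """Return True when a BLASTP hit meets both rough thresholds.

    Assumption: coverage is the aligned length divided by the length of
    the newly identified protein.
    """
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")
    coverage = alignment_length / protein_length
    return evalue < max_evalue and coverage > min_coverage


# Hypothetical hits: (label, expect value, alignment length, protein length)
example_hits = [
    ("hit_1", 3e-40, 310, 400),  # low E-value, alignment covers 77.5%
    ("hit_2", 1e-3, 390, 400),   # E-value too high
    ("hit_3", 1e-20, 100, 400),  # alignment covers only 25%
]

for label, evalue, aln_len, prot_len in example_hits:
    verdict = is_significant_hit(evalue, aln_len, prot_len)
    print(label, "significant" if verdict else "not significant")
```

Both conditions must hold at once, so a very low expect value over a short local alignment is still treated as not significant.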

II. Sequence alignment in ClustalW

A. Obtain a file listing all sequences in the Hop family of interest

            1. Go to the list of assigned Hop names and obtain the sequences for all members of the
               appropriate family by clicking on the family name.
            2. Save the list of sequences as a text file. Note that the sequences are listed in FASTA
               format.
            3. Add the sequence of the newly identified Hop to the list (also in FASTA format) and save
               the file.
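
Step 3 can also be done with a short script instead of a text editor. The sketch below is an assumption-laden illustration: the function names, the file path, and the record name are hypothetical, and the only FASTA convention relied on is a ">" header line followed by the sequence.

```python
def format_fasta(name, sequence, width=60):
    """Format one record as FASTA text: a '>' header, then wrapped sequence lines."""
    cleaned = "".join(sequence.split()).upper()  # drop whitespace and line breaks
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def append_record(path, name, sequence):
    """Append a FASTA record to the family file, starting on a fresh line."""
    try:
        with open(path, "r") as handle:
            existing = handle.read()
    except FileNotFoundError:
        existing = ""
    # Add a newline first if the existing file does not end with one.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    with open(path, "a") as handle:
        handle.write(separator + format_fasta(name, sequence))
```

For example, `append_record("hop_family.txt", "NewHop", "MKTAIL")` would add the new record after the last sequence already in the (hypothetical) file `hop_family.txt`.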

B. Generate alignment file

ClustalW is a general purpose tool for alignment of multiple sequences. It is available through numerous
websites, including the EMBL-EBI site described here. (The HopA family is used to illustrate the various
steps.)

        Go to ClustalW at EMBL-EBI. The following window will appear.

    1. ClustalW will generate the files listed in the window below. Click on the link to the sequence
       alignment file (designated by an .aln extension). Save the .aln file as a text file, renaming it if
       desired.


III. Clustering analysis and genetic distance calculation in MEGA
MEGA (Molecular Evolutionary Genetics Analysis) is a free software package for comparative sequence
analysis.

       A. Download and install MEGA

          1. Go to the MEGA 4 download site, provide the requested information, and download the
             program. An .exe file will appear on your desktop.
          2. Open the .exe file and follow the instructions for installation of MEGA4.

       B. Convert Clustal alignment file to MEGA format

          3. Open the installed MEGA program. The following window will appear.

          4. Under File on the menu select "Convert to MEGA format". The following window will
             appear.


         5. Select the output file you saved from ClustalW as "Data file to convert:" and for "Data
            format" select ".aln (CLUSTAL)". Click OK.
            The converted file will appear in the "Text File Editor and Format Converter" window.
         6. Scroll through the converted file to check the format.
            If line numbers are present, either manually remove them or return to ClustalW and
            generate a new .aln file without line numbers.
            If any extraneous symbols are present following the last sequence, delete them.
         7. Save the output file with a .meg extension.
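
As an alternative to removing numbers by hand in step 6, the clean-up can be sketched in a few lines. This assumes that the "line numbers" are the cumulative residue counts that ClustalW can append to the end of each sequence row; the function name is hypothetical and the sketch should be checked against the actual file before use.

```python
import re

# A numbered sequence row: name, whitespace, alignment characters, trailing count.
_NUMBERED_ROW = re.compile(r"^(\S+\s+[A-Za-z*.\-]+)\s+\d+\s*$")


def strip_residue_counts(aln_text):
    """Remove a trailing number from each sequence row of Clustal .aln text.

    Header lines, blank lines and conservation lines do not match the
    pattern and are passed through (with trailing whitespace trimmed).
    """
    cleaned = []
    for line in aln_text.splitlines():
        match = _NUMBERED_ROW.match(line)
        cleaned.append(match.group(1) if match else line.rstrip())
    return "\n".join(cleaned) + "\n"
```

The pattern only touches rows that consist of a name, one block of alignment characters, and a final integer, so the "CLUSTAL" header line is left as it is.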

      C. Perform clustering analysis in MEGA

         8. Return to the initial MEGA window (shown above in step III.B.3) and select "click me to
             activate a data file".
         9. Select the .meg file that you saved in step III.B.7. An "Input Data" window will appear.
             Under "Data Type" select "Protein Sequences" and click OK.
         10. The window shown below will appear, with the open data file indicated at the bottom.

         11. Generate a phylogenetic tree from the active data file using Phylogeny > Bootstrap Test
             of Phylogeny.

             At this point, a number of options can be selected, including UPGMA Tree, Neighbor-
             Joining Tree, Minimum Evolution and Maximum Parsimony. Similarly, in the "Analysis
             Preferences" window for UPGMA, Neighbor-Joining, or Minimum Evolution, models can
             be selected under Models>Amino Acid. Users are encouraged to generate trees using a
             variety of these options. The general clustering patterns should be similar, regardless of
             the options chosen.

             The method that best approximates those described in Lindeberg et al., 2005 (using
             MEGA 4 rather than MEGA 2.1) involves Neighbor-Joining using the Amino Acid:
             p-distance model (the "Analysis Preferences" window for Neighbor-Joining is shown
             below).
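
For orientation, the p-distance between two aligned sequences is the proportion of compared sites at which they differ. A minimal sketch follows; the function name is hypothetical, and skipping gapped sites pair by pair (pairwise deletion) is an assumption here, since the treatment of gaps in MEGA depends on the options selected.

```python
def p_distance(seq_a, seq_b, gap_chars="-?"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or missing-data symbol are
    skipped (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip sites that cannot be compared
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return different / compared
```

For example, `p_distance("MK-AIL", "MRTAIV")` compares five sites, two of which differ, giving 0.4.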



          The Bootstrap consensus tree resulting from Neighbor-Joining analysis for the HopA family is
          shown below.


          The genetic distance between the new Hop and the established subgroups is then calculated.
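
One way to picture this calculation is as the mean pairwise distance between the new sequence and the members of each subgroup. The sketch below is an assumption, not a description of what MEGA computes internally: all names are hypothetical, the sequences are invented, and the distance function is a simple stand-in that counts mismatching aligned positions.

```python
def mean_distance_to_groups(new_seq, groups, distance):
    """Mean distance from new_seq to the members of each subgroup.

    groups maps a subgroup name to a list of aligned sequences;
    distance is any function taking two aligned sequences.
    """
    means = {}
    for name, members in groups.items():
        if not members:
            raise ValueError("subgroup %r has no members" % name)
        total = sum(distance(new_seq, member) for member in members)
        means[name] = total / len(members)
    return means


def closest_group(means):
    """Name of the subgroup with the smallest mean distance."""
    return min(means, key=means.get)


def mismatch_fraction(a, b):
    """Stand-in distance: fraction of aligned positions that differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Invented example data, for illustration only.
example_groups = {
    "subgroup_1": ["MKTAIL", "MKTAIV"],
    "subgroup_2": ["MRSGIV", "MRSGLV"],
}
example_means = mean_distance_to_groups("MKTAIL", example_groups, mismatch_fraction)
print(closest_group(example_means))
```

A new Hop would then be a candidate for the subgroup with the smallest mean distance, subject to whatever distance threshold the naming guidelines set.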

1. Gojobori T, Li WH, Graur D. 1982. Patterns of nucleotide substitution in pseudogenes and functional
genes. J Mol Evol. 18:360–369.
2. Hall BG. Phylogenetic trees made easy: A how-to manual. Sunderland (MA): Sinauer Associates.
3. Kumar S, Dudley J. 2007. Bioinformatics for biologists in the genomics era. Bioinformatics.
doi:10.1093/bioinformatics/btm239.
4. Kumar S, Tamura K, Nei M. 2004. MEGA3: an integrated software for Molecular Evolutionary Genetics
Analysis and sequence alignment. Brief Bioinform. 5:150–163.
5. Saitou N, Nei M. 1987. The Neighbor-Joining method: a new method for reconstructing phylogenetic
trees. Mol Biol Evol. 4:406–425.
6. Tamura K, Nei M. 1993. Estimation of the number of nucleotide substitutions in the control region of
mitochondrial DNA in humans and chimpanzees. Mol Biol Evol. 10:512–526.
7. Tamura K, Nei M, Kumar S. 2004. Prospects for inferring very large phylogenies by using the Neighbor-
Joining method. Proc Natl Acad Sci USA. 101:11030–11035.


8. Thompson JD, Higgins DG, Gibson TJ. 1994. ClustalW: improving the sensitivity of progressive
multiple sequence alignment through sequence weighting, position-specific gap penalties and weight
matrix choice. Nucleic Acids Res.
