Phylogeny Programs
Here are 176 phylogeny packages, and 16 free servers, that I know about. This list is an attempt to be completely comprehensive. I have not made any attempt to exclude programs that do not meet some standard of quality or importance. Updates to these pages are made about twice a year. Some of these programs are available over the Internet from ftp server machines, or by World Wide Web. The programs listed below include both free and non-free ones; in some cases I do not know whether a program is free. I have listed as free those that I knew were free; for the others you have to ask their distributor. If you discover any inaccuracies, or feel that I have left any important programs or facts out, or if links do not work properly, please e-mail me (joe@genetics.washington.edu).
List of packages arranged ...
... by methods available
... by computer systems on which they work
... cross-referenced by method and by computer system.
... by ones which analyze particular kinds of data.
... to show the most recent listings
... to show ones most recently changed
Phylogeny programs formerly listed here but no longer distributed
http://evolution.genetics.washington.edu/phylip/software.html (1 of 18) [14/11/2000 5:28:06 pm]
Which kinds of programs are and are not listed
Other lists of phylogeny software
Table of contents by methods available
General-purpose packages
Parsimony programs
Distance matrix methods
Computation of distances
Maximum likelihood and related methods
Quartets methods
Artificial-intelligence methods
Invariants (or Evolutionary Parsimony) methods
Interactive tree manipulation
Looking for hybridization or recombination events
Bootstrapping and other measures of support
Compatibility analysis
Consensus trees and distances between trees
Tree-based alignment
Biogeographic analysis and host-parasite comparison
Comparative method analysis
Simulation of trees or data
Examination of shapes of trees
Clocks, dating and stratigraphy
Description or prediction of data from trees
Tree plotting/drawing
Sequence management/job submission
Teaching about phylogenies
Web or e-mail servers that can analyze data for you

General-purpose packages
PHYLIP
PAUP*
MEGA
VOSTORG
Fitch programs
Phylo_win
ARB
DAMBE
PAL

Parsimony programs
PAUP*
Hennig86
MEGA
Tree Gardener
RA
Nona
PHYLIP
TurboTree
Freqpars
Fitch programs
CAFCA
Phylo_win
sog
gmaes
LVB
GeneTree
TAAR
ARB
DAMBE
MALIGN
POY
DNASEP
SEPAL
Gambit

Distance matrix methods
PHYLIP
PAUP*
MEGA
gmaes
DENDRON
Molecular Analyst Fingerprinting
BIONJ
TFPGA
MVSP
SOTA
ARB
BIOSYS-2
Darwin
T-REX
sendbs
nneighbor
DAMBE
weighbor
QR2
DNASIS
minspnet
PAL
Arlequin
vCEBL
MacT
ODEN
Fitch programs
ABLE
TREECON
DISPAN
RESTSITE
NTSYSpc
METREE
TreePack
TreeTree
GDA
Hadtree, Prepare and Trees
Wisconsin Sequence Analysis Package (GCG)
SeqPup
PHYLTEST
Lintre
WET
Phylo_win
njbafd
Gambit

Computation of distances
PHYLIP
TFPGA
REAP
MVSP
SOTA
RSTCALC
Genetix
BIOSYS-2
RAPD-PCR package
DISTANCE
Darwin
sendbs
K2WuLi
GeneStrut
Arlequin
DAMBE
DnaSP
PAML
puzzleboot
MATRIX
PAL
PAUP*
RAPDistance
MULTICOMP
MARKOV
RSVP
Microsat
DIPLOMO
OSA
DISPAN
RESTSITE
NTSYSpc
TREE-PUZZLE
Hadtree, Prepare and Trees
Wisconsin Sequence Analysis Package (GCG)
AMP
GCUA
DERANGE2
POPGENE

Maximum likelihood and related methods
PHYLIP
PAUP*
fastDNAml
MOLPHY
PAML
Spectrum
SplitsTree
PLATO
SPOT
TREE-PUZZLE
Hadtree, Prepare and Trees
SeqPup
Phylo_win
PASSML
ARB
Darwin
BAMBE
DAMBE
Modeltest
TreeCons
VeryfastDNAml
PAL
dnarates

Quartets methods
TREE-PUZZLE
STATGEOM
SplitsTree
PHYLTEST
GEOMETRY
PICA95
Darwin
PhyloQuart
Willson quartets programs
Gambit

Artificial-intelligence methods
SOTA

Invariants (or Evolutionary Parsimony) methods
PHYLIP
PAUP*

Interactive tree manipulation
MacClade
PHYLIP
PDAP
TreeTool
ARB
WINCLADA
TreeEdit
UO

Looking for hybridization or recombination events
PLATO
Bootscanning Package
TOPAL
reticulate
RecPars
partimatrix
homoplasy test
LARD
Network

Bootstrapping and other measures of support
PHYLIP
PAUP*
PARBOOT
ABLE
Random Cladistics
AutoDecay
TreeRot
RASA
DNA Stacks
OSA
DISPAN
TreeTree
PHYLTEST
Lintre
sog
njbafd
PICA95
TAXEQ2
BIOSYS-2
RAPD-PCR package
TreeCons
BAMBE
DAMBE
puzzleboot
CodonBootstrap
DNASEP
SEPAL
Gambit
MEAWILK

Compatibility analysis
COMPROB
PHYLIP
PICA95
reticulate
partimatrix
SECANT
CLINCH
MEAWILK

Consensus trees and distances between trees
COMPONENT
TREEMAP
NTSYSpc
PHYLIP
PAUP*
REDCON
TAXEQ2
TreeCons
QUARTET2
RadCon

Tree-based sequence alignment
TreeAlign
ClustalW
MALIGN
GeneDoc
Wisconsin Sequence Analysis Package (GCG)
TAAR
Ctree
DAMBE
POY
ALIGN
DNASIS

Biogeographic analysis and host-parasite comparison
COMPONENT
TREEMAP

Comparative method analysis
PHYLIP
CAIC
COMPARE
PA
CMAP
CoSta
PDAP
ACAP
ANCML
RIND
MacroCAIC
Fels-Rand
Phylogenetic Independence

Simulation of trees or data
COMPONENT
Bi-De
SEQEVOLVE
TheSiminator
Seq-Gen
Treevolve and PTreevolve
PSeq-Gen
COMPARE
ROSE
PAML
ProSeq
PAL

Examination of shapes of trees
End-Epi
MacroCAIC

Clocks, dating and stratigraphy
StratCon
QDate
DIVERSI
K2WuLi
Modeltest
PAML
TipDate
RRTree
vCEBL

Description or prediction of data from trees
CONSERVE
TreeDis

Tree plotting/drawing
PHYLIP
PAUP*
TreeTool
TreeView
Fitch programs
NJplot
DendroMaker
Tree Draw Deck
Phylodendron
ARB
unrooted
DAMBE
TREECON
Mavric

Sequence management/job submission
PARBOOT
Random Cladistics
Tree Gardener
GDE
MUST
DNA Stacks
SeqPup
ARB
BioEdit
Singapore PHYLIP web interface

Teaching about phylogenies
Phylogenetic Investigator

Table of contents by computer systems on which they work
Unix (source code in C or executables)
PC's
... under Windows
... under DOS or in a Windows "DOS box"
Macintoshes or PowerMacs
VMS executables or C sources with VMS compilation support
e-mail or Web servers that can analyze data for you

Unix (source code in C or executables). I have included programs that are available as C source code because most Unix workstations have a C compiler. (A few programs with FORTRAN source code are included too).
PHYLIP
PAUP*
Seq-Gen
TreeTool
GDE
sog
TreePack
Phylodendron
Treevolve and PTreevolve
PSeq-Gen
njbafd
gmaes
GCUA
DERANGE2
LVB
BIONJ
TAAR
ANCML
QDate
Bootscanning Package
Ctree
SOTA
PASSML
TOPAL
reticulate
RecPars
ARB
BIOSYS-2
RAPD-PCR package
TreeCons
DIVERSI
DISTANCE
Darwin
sendbs
partimatrix
BAMBE
nneighbor
unrooted
ROSE
weighbor
PhyloQuart
QR2
VeryfastDNAml
LARD
puzzleboot
Willson quartets programs
POY
RIND
TipDate
RRTree
Fels-Rand
PAL
Mavric
dnarates
CLINCH
UO
Arlequin
vCEBL
Fitch programs
Phylo_win
ODEN
TreeTree
Wisconsin Sequence Analysis Package (GCG)
SeqPup
Lintre
RSVP
Microsat
OSA
TREE-PUZZLE
AMP
fastDNAml
MOLPHY
PAML
SplitsTree
PLATO
SPOT
STATGEOM
PHYLTEST
PARBOOT
TreeAlign
ClustalW
MALIGN
GeneDoc
COMPARE
TheSiminator

PC's
as Windows executables (not counting executing in a "DOS box")
PHYLIP
PAUP*
Tree Gardener
TREECON
GDA
SeqPup
MOLPHY
WET
GeneDoc
COMPONENT
TREEMAP
COMPARE
RAPDistance
TreeView
Phylodendron
Molecular Analyst Fingerprinting
POPGENE
TFPGA
Ctree
GeneTree
MVSP
RSTCALC
PHYLIP
Genetix
NJplot
unrooted
Arlequin
DAMBE
DnaSP
PAML
LVB
DNASIS
minspnet
BioEdit
ProSeq
RRTree
Fels-Rand
PAL
WINCLADA
SECANT
Nona
DNASEP
SEPAL
Phylogenetic Independence
vCEBL
ANCML
REAP
MVSP
Lintre
BIOSYS-2
RAPD-PCR package
DIVERSI
T-REX
sendbs
K2WuLi
homoplasy test

under DOS (MSDOS, PCDOS) or in a Windows "DOS box"
PAUP*
RAPDistance
MEGA
DIPLOMO
Fitch programs
TREE-PUZZLE
Hennig86
ABLE
MEGA
ClustalW
RA
MALIGN
Nona
GeneDoc
TurboTree
COMPARE
Freqpars
CMAP
Fitch programs
Random Cladistics
TREECON
Microsat
DISPAN
RESTSITE
NTSYSpc
METREE
Hadtree, Prepare and Trees
PHYLTEST
TheSiminator
CoSta
njbafd
GEOMETRY
PDAP
PICA95
REDCON
TAXEQ2
BIONJ
weighbor
POY
TreeDis
QUARTET2
Network
CLINCH
Gambit
MEAWILK

Macintosh or PowerMac executables
PHYLIP
PAUP*
CAFCA
MacT
TreeTree
SeqPup
Microsat
TREE-PUZZLE
fastDNAml
MacClade
Spectrum
SplitsTree
PLATO
SPOT
AutoDecay
RASA
ClustalW
TREEMAP
CAIC
COMPARE
PA
Bi-De
SEQEVOLVE
Seq-Gen
T-REX
unrooted
GeneStrut
COMPONENT Lite
weighbor
Modeltest
PAML
LARD
MATRIX
Willson quartets programs
ALIGN
CodonBootstrap
DNASIS
TipDate
RRTree
MacroCAIC
Fels-Rand
PAL
RadCon
TreeEdit
Arlequin
vCEBL
End-Epi
StratCon
CONSERVE
TreeView
NJplot
DendroMaker
MUST
DNA Stacks
Phylogenetic Investigator
Tree Draw Deck
Phylodendron
TreeRot
Treevolve and PTreevolve
PSeq-Gen
Molecular Analyst Fingerprinting
BIONJ
GCUA
ACAP
GeneTree
QDate
LVB

VMS executables or C sources with VMS compilation support. (Many of the programs listed under Unix above have C source code which can also be compiled under VMS).
PHYLIP
Wisconsin Sequence Analysis Package (GCG)
MARKOV
TREE-PUZZLE
fastDNAml
TreeAlign
ClustalW
Analyzing particular types of data
Here you will find lists of programs that analyze types of data other than molecular sequence data. We will gradually expand this list of data types.

Microsatellite data
RSTCALC
njbafd
Microsat

RAPDs, RFLPs, or AFLPs
tfpga
RAPD-PCR
RAPDistance
Molecular Analyst Fingerprinting
Continuous quantitative characters (under construction: coming soon)
Recent listings
Here are the packages that have most recently been added to these listings (the most recent ones first). Entries are retained in this list for about 6 months.
vCEBL (3 November 2000)
MEAWILK (2 November 2000)
UO (20 April 2000)
Gambit (18 April 2000)
Network (13 April 2000)
TreeEdit (5 April 2000)
dnarates (31 March 2000)
Phylogenetic Independence (29 March 2000)
DNASEP (18 March 2000)
SEPAL (18 March 2000)
Mavric (6 March 2000)
RadCon (26 February 2000)
Recent changes
Here are the packages whose entries have most recently been changed. The date on which each change was entered is shown. Entries are retained in this list for about 6 months. (Note that changes may be as small as updated version numbers). The most recent changes are first.
Random Cladistics (2 November 2000)
Arlequin (24 April 2000)
puzzleboot (23 April 2000)
TREE-PUZZLE (23 April 2000)
RASA (23 April 2000)
ClustalW (18 April 2000)
CLINCH (17 April 2000)
Fels-Rand (16 April 2000)
PAL (14 April 2000)
TurboTree (13 April 2000)
Hadtree, Prepare and Trees (13 April 2000)
weighbor (13 April 2000)
ProSeq (13 April 2000)
Microsat (11 April 2000)
njbafd (4 April 2000)
Lintre (4 April 2000)
sendbs (4 April 2000)
gmaes (4 April 2000)
Nona (15 March 2000)
SECANT (15 March 2000)
TipDate (10 March 2000)
NTSYSpc (28 February 2000)
Other lists of phylogeny software
The University of California Museum of Paleontology page of Phylogenetics Software Resources at http://www.ucmp.berkeley.edu/subway/phylo/phylosoft.html. Few programs are listed, but there is a very nice list of software lists there.
The BioCatalog phylogeny page at the European Bioinformatics Institute, located at http://corba.ebi.ac.uk/Biocatalog/Phylogeny.html.
The Institut Pasteur in Paris has the Bio NetBook, a search facility for biocomputing resources. It is located at http://www.pasteur.fr/recherche/BNB/bnb-en.html. Programs for phylogenies can be found by, for example, selecting software from the Resource Type list and evolution from the Biological Domain list without selecting any Organism.
A brief list of programs at the Willi Hennig Society's home pages. It reflects a rather different worldview, centered on the parsimony method.
Classification and clustering programs available for free by network are described in a useful Web page from the Classification Society of North America at http://www.pitt.edu/~csna/software.html. Note, however, that inferring phylogenies and making clusters are different tasks; the software described on that list will be of most use to people who are trying to cluster or classify but not to infer phylogenies.
Genamics, a company located in Hamilton, New Zealand, maintains the SoftwareSeek searchable index of bioinformatics software at http://genamics.com/software/index.htm in a number of categories. One of them is Phylogenetic Analysis. They have a reasonably large number of entries under that heading, though it also includes some statistical genetics software that is really not phylogenetic. Their listing has links to the web sites of the software; for those programs that are not available by Web they maintain copies for download at their server.
David Robertson of the Department of Zoology, University of Oxford has a very informative web site at http://grinch.zoo.ox.ac.uk/RAP_links.html listing programs and their web sites that test for the presence of recombination or hybridization events in DNA sequence data. It lists some programs that are covered here, and others that are outside the scope of these web pages.
Georg Fuellen at the University of Bielefeld, Germany, has a very good page on Multiple Alignment Resources at http://www.techfak.uni-bielefeld.de/bcd/Curric/MulAli/welcome.html.
Don Gilbert, of the Department of Biology of the University of Indiana, has a good web page on Free Software in Molecular Biology for Macintosh and MS Windows computers at http://iubio.bio.indiana.edu/soft/molbio/Listings.html. It lists some popular packages and all packages and programs kept at the IUBio ftp server (see our description of that server). Unfortunately the web links on that page are not active so the addresses must be retyped by hand.
Andrea Hansen, of the Universität Braunschweig, Germany, has created the bioinformatik.de index of resources. It includes a list of software located at http://www.bioinformatik.de/cgi-bin/browse/Catalog/Software. The phylogeny programs listings there are located within the categories for different operating systems.
The National Biotechnology Information Facility has a list of phylogeny programs (including some population genetics programs as well) at http://www.nbif.org/software/software.html#phylogenetic_analysis.
The list of phylogeny software compiled by David Maddison and Wayne Maddison as part of their "Tree of Life" project on the World Wide Web. Its URL is: http://phylogeny.arizona.edu/tree/programs/programs.html. This list has not been updated in a while.
Dan Jacobson posted an extensive list of biological software and database sites at http://www.bis.med.jhmi.edu/Dan/software/biol-links.html. It has not, as far as I know, been updated in a while.
Phylogeny Programs (continued)
PHYLIP version 3.5c is the package described in this Web site. It is available free, from our Web site, in C source code, or as executables for pre-386 DOS, 386/486/Pentium DOS, Windows 3.1, Windows95/98/NT, 68k Macintosh, or PowerMac. The C source code is easily compiled on Unix systems, and VMS compilation support is also available in the package. It includes programs to carry out parsimony, distance matrix methods, maximum likelihood, and other methods on a variety of types of data, including DNA and RNA sequences, protein sequences, restriction sites, 0/1 discrete characters data, gene frequencies, continuous characters and distance matrices. It is the most widely-distributed phylogeny package, with over 6,000 registered users, some of them satisfied. It competes with PAUP* to be the program responsible for the most published trees. It has been distributed since October, 1980. PHYLIP is distributed at the PHYLIP web site at http://evolution.genetics.washington.edu, or by anonymous ftp from evolution.genetics.washington.edu in directory pub/phylip.
David Swofford of the Laboratory of Molecular Systematics, National Museum of Natural History, Smithsonian Institution, Washington, D.C. has written PAUP* (which originally meant Phylogenetic Analysis Using Parsimony). PAUP* version 4.0beta has been released as a provisional version by Sinauer Associates, of Sunderland, Massachusetts. It has Macintosh, PowerMac, Windows, and Unix/OpenVMS versions. PAUP* is the most sophisticated parsimony program, with many options and close compatibility with MacClade. It has become much broader with the inclusion of more methods. It includes parsimony, distance matrix, invariants, and maximum likelihood methods and many indices and statistical tests. It is described in a web page at http://www.sinauer.com/Titles/frswofford.htm, and in more detail at its web site at the LMS at http://www.lms.si.edu/PAUP/about.html. It is available for the following types of systems:
For PowerMacs and 68k Macintoshes in a version with full mouse-windows user interface,
For Windows95, Windows98, or WindowsNT in a version with a character-based command-line interface (which appears in a Windows window),
For DOS or a Windows (3.1, 95, 98 or NT) DOS box in a version which has command-line interface, and
In a Unix/VMS version, with command-line interface, for Alpha Compaq/Digital Unix, Alpha Linux, PowerPC Linux, Intel-compatible Linux, Sun SPARC/UltraSPARC Solaris, and Alpha VMS1.
The price is $100 US for the Macintosh and PowerMac executable versions, $85 for the Windows executable version, and $150 for the Unix source code version, plus $20 for shipment. The Beta version comes without the manual for the moment, but with a promised upgrade that will provide the manual and the completed version of the program when those are available. Their ISBN numbers are 0-87893-805-2, -806-0, -807-9, and -808-7. Orders can be placed with Sinauer through their orders web page at http://www.sinauer.com/formpurch.htm, by e-mail at orders@sinauer.com, by telephoning Sinauer Associates at (413) 549-4300, by fax at (413) 549-1118, or by mail at: Sinauer Associates, Inc., 23 Plumtree Road, Sunderland, MA 01375-0407. The international distributor is W. H. Freeman at Macmillan Press, Brunel Road, Houndsmills, Basingstoke, Hampshire RG21 6XS, U.K. Tel: +44-1256-3302699 Fax: +44-1256-364733. Their e-mail address is mdl@macmillan.co.uk.
http://evolution.genetics.washington.edu/phylip/software.pars.html (1 of 8) [14/11/2000 5:28:14 pm]
If you have a Macintosh computer and any interest in discrete-state parsimony methods (including DNA and protein parsimony), you should definitely get MacClade. It was written by Wayne Maddison and David Maddison of the University of Arizona. All distribution is by Sinauer Associates, 23 Plumtree Road, Sunderland, Massachusetts 01375-0407, USA. The Sinauer Associates web page for MacClade is at http://www.webcom.com/~sinauer/system.shtml#Maddison. Sinauer's phone number is (413) 549-4300 and their fax number is (413) 549-1118. A disk with program, help file, and example data files, plus book (which has about 100 pages of intro to phylogenetic theory, and 250 pages of program instructions), is $100 U.S. ($40 for the book alone). Site licenses are also available. MacClade is described on its Web page, at http://phylogeny.arizona.edu/macclade/macclade.html. A demonstration version of MacClade 3 is also available there. MacClade enables you to use the mouse-window interface to specify and rearrange phylogenies by hand, and watch the number of character steps and the distribution of states of a given character on the tree change as you do so. An earlier and less capable version, 2.1 (which for example cannot read nucleic acid sequences and has fewer features for discrete characters), is also available by anonymous ftp from the Indiana and EMBL molecular biology software servers at ftp.bio.indiana.edu and ftp.ebi.ac.uk, in directories molbio/mac and pub/software/mac, respectively, as a BinHexed and squeezed archive, macclade21.hqx.
J. S. Farris has produced Hennig86, a fast parsimony program including branch-and-bound search for most parsimonious trees and interactive tree rearrangement. Although complete benchmarks have not been published, it is said to be faster than Swofford's PAUP*; both are a great many times faster than the parsimony programs in PHYLIP. The program is distributed in executable object code only and costs $50, plus $5 mailing costs ($10 outside of the U.S.). The user's name should be stated, as copies are personalized as a copy-protection measure. It is distributed by Arnold Kluge, Amphibians and Reptiles, Museum of Zoology, University of Michigan, Ann Arbor, Michigan 48109-1079, U.S.A. (akluge@umich.edu) and by Diana Lipscomb at George Washington University (biodl@gwuvm.gwu.edu). It runs on PC-compatible microcomputers with at least 512K of RAM and needs no math coprocessor or graphics monitor. It can handle up to 180 taxa and 999 characters. It was described in the paper: Farris, J.S. 1989. Hennig86: a PC-DOS program for phylogenetic analysis. Cladistics 5: 163.
Mark Siddall, Assistant Curator of Annelida at the American Museum of Natural History, New York (siddall@amnh.org) has released Random Cladistics, version 4.0.3, a set of programs that can carry out bootstrapping, jackknifing, a variety of kinds of permutation tests, and search for "islands" of trees, using Hennig86 or NONA to analyze the data. It can also mark ranges of sites for inclusion or exclusion, compare trees from the analyses, and do many other operations. To use it you must have a copy of Hennig86 (for whose distribution see above). Random Cladistics will carry out the appropriate transformations of your data and will call Hennig86 and have it analyze them, and then it will summarize the results. Random Cladistics is described by its author as no longer being supported software -- he says that "Winclada is far superior and provide's a nice interface." Random Cladistics is distributed by its author from its web site at http://research.amnh.org/~siddall/rc.html as DOS executables. Some other programs (ArNo and HardArn) are also distributed from that web site. ArNo computes Farris's length incongruence difference between data sets.
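For readers unfamiliar with the resampling step that bootstrapping and jackknifing rely on, here is a minimal Python sketch. It is not Random Cladistics' code: the function names, the dictionary-of-strings data layout, and the 50% jackknife retention default are assumptions made for illustration. Bootstrapping draws characters (columns) with replacement until the matrix has its original width; jackknifing keeps a random subset of characters without replacement. Each replicate would then be handed to a tree-building program.

```python
import random


def bootstrap_columns(matrix, rng):
    """Resample characters with replacement (hypothetical helper).

    matrix maps taxon name -> string of character states; all strings
    must have the same length. The replicate has the same width.
    """
    n_chars = len(next(iter(matrix.values())))
    picks = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: "".join(states[i] for i in picks)
            for taxon, states in matrix.items()}


def jackknife_columns(matrix, rng, keep_fraction=0.5):
    """Keep a random subset of characters, without replacement.

    keep_fraction is an assumed default, not a value from any program.
    """
    n_chars = len(next(iter(matrix.values())))
    n_keep = max(1, int(n_chars * keep_fraction))
    keep = sorted(rng.sample(range(n_chars), n_keep))
    return {taxon: "".join(states[i] for i in keep)
            for taxon, states in matrix.items()}


if __name__ == "__main__":
    data = {"taxonA": "ACGTAC", "taxonB": "ACGTTC", "taxonC": "TCGAAC"}
    rng = random.Random(2000)  # fixed seed so a run can be repeated
    print(bootstrap_columns(data, rng))
    print(jackknife_columns(data, rng))
```

The same column indices are applied to every taxon, so each resampled character keeps its states aligned across taxa.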
Tiago Ramos of the Museu de Zoologia, Universidade de Sao Paulo, Sao Paulo, Brazil (tcramos@ibm.net) has developed Tree Gardener version 2.2.1, a shell to run Hennig86 interactively on Windows systems. The program allows the user to edit data files, submit jobs, including successive weighting runs, rerooting, and consensus trees. It displays the resulting trees and allows the user to edit them. It is freely available provided that the user has a registered copy of Hennig86. Tree Gardener is available from the Digital Taxonomy web site at http://www.geocities.com/RainForest/Vines/8695/software.html#Cladistics.
Torsten Eriksson of the Bergius Foundation of the Swedish Academy of Sciences, Stockholm, (torsten@bergianska.se) has written a program, AutoDecay, which generates Decay Indices from an existing PAUP* 4.0 treefile. It is intended to simplify the task of creating reverse constraint trees in PAUP* 4.0 and subsequent generation of Bremer support values. (Bremer, K. 1994. Cladistics 10: 295-304). AutoDecay version 3.0 is available for PowerMac or 68k Macintoshes, in standalone versions that include the AutoDecay Hypercard stack plus the Hypercard runtime engine. It is also available as a smaller Hypercard stack which requires that you have Hypercard or Hypercard Player. An older C program compiled for the Macintosh also is available, which may not work with recent versions of PAUP*. AutoDecay can be obtained by World Wide Web from http://www.bergianska.se/personal/TorstenE/.
Doug Eernisse of the California State University, Fullerton (DEernisse@fullerton.edu) has constructed DNA Stacks version 1.2, a Macintosh HyperCard stack that can carry out a variety of analyses on DNA sequences. It does not do phylogenies itself. It has an alignment editor, and can carry out various kinds of translation, and codon bias analysis. It can write out data sets in PAUP*, Hennig86, and PHYLIP formats. It is included here because in its "Support Index Blocks..." menu item it is able to prepare jobs for PAUP* to enable Decay Index (Support Index) analysis. It is available by World Wide Web from http://biology.fullerton.edu/deernisse/dnastacks.html.
Michael Sorenson of the Department of Biology, Boston University (msoren@bu.edu) has released TreeRot, version 2a, a Macintosh program that helps make Bremer Support Indices ("decay indices") for parsimony analyses. It generates a PAUP* command file with a constraint statement for each node in a given shortest or strict consensus tree and with commands to search for trees inconsistent with each of these constraint statements in turn. For nodes with decay indices of more than a few steps, the constraint statement approach is much more effective than simply finding all trees 1, 2, 3, 4, etc. steps longer than the shortest tree and then examining their strict consensus for which nodes are lost. This version also supports the determination of partitioned Bremer support indices introduced by Baker, R.H., and R. DeSalle. 1997. Multiple sources of character information and the phylogeny of Hawaiian Drosophilids. Systematic Biology 46: 654-673, and it will also parse the PAUP* log file, automatically calculating the decay index for each node. A PowerMac executable and documentation is available at its web site at http://mightyduck.bu.edu/TreeRot. A Macintosh executable of the earlier version is also available by anonymous ftp from ftp.vims.edu in directory pub/hennig as file TreeRot.sea.bin.
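The arithmetic behind a decay index can be sketched in a few lines of Python. This is an illustration only, not TreeRot's code: the function name, the node labels, and the tree lengths are hypothetical. For each node, the index is the length of the shortest tree found that is inconsistent with that node (the reverse-constraint search) minus the length of the overall shortest tree.

```python
def decay_indices(shortest_length, constrained_lengths):
    """Return the decay index (Bremer support) for each node.

    shortest_length: length in steps of the most parsimonious tree.
    constrained_lengths: maps a node label to the length of the shortest
        tree found that does NOT contain that node.
    """
    indices = {}
    for node, length in constrained_lengths.items():
        if length < shortest_length:
            # A constrained search cannot beat the unconstrained optimum;
            # if it does, the original search missed a shorter tree.
            raise ValueError(f"node {node!r}: constrained tree is shorter "
                             "than the supposed shortest tree")
        indices[node] = length - shortest_length
    return indices


if __name__ == "__main__":
    # Hypothetical lengths, as might be read from a search log.
    lengths = {"(A,B)": 123, "(C,D)": 121}
    print(decay_indices(120, lengths))
```

A node whose best contradicting tree is only one step longer than the shortest tree is weakly supported; larger differences indicate stronger support.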
James Lyons-Weiler of the Institute of Molecular and Evolutionary Genetics, Pennsylvania State University, (JFL8@psu.edu) has released RASA, version 2.4, software for Macintoshes that will perform "Relative Apparent Synapomorphy Analysis", a test for the presence of phylogenetic signal in any type of discrete character data matrix (morphological or molecular). The RASA program carries out the test and plots the results. RASA is menu-driven. The test compares the observed and null rates of increase in cladistic similarity among pairs of taxa predicted by an increase in the phenetic similarity among taxon pairs. The test is described in a paper: Lyons-Weiler, J., G.A. Hoelzer, and R.J. Tausch. 1996. Relative Apparent Synapomorphy Analysis (RASA) I: the statistical measurement of phylogenetic signal. Molecular Biology and Evolution 13: 749-757. The taxon variance plot tool in RASA was described in the paper: Lyons-Weiler, J., and G.A. Hoelzer. 1997. Escaping from the Felsenstein Zone by detecting long branches in phylogenetic data. Molecular Phylogenetics and Evolution 8: 375-384, and outgroup selection issues were discussed in Lyons-Weiler, J., G. A. Hoelzer and R. J. Tausch. 1998. Optimal outgroup analysis. Biological Journal of the Linnean Society 64: 493-511. The programs are available by World Wide Web at http://test1.bio.psu.edu/LW/rasatext.html as a binhexed self-extracting archive, and version 2.2 by anonymous ftp at loco.biology.unr.edu in directory pub/rasa.
J. S. Farris has recently released RA (Rapid nucleotide Analysis). It features rapid bootstrapping. It is available from Arnold Kluge, Amphibians and Reptiles, Museum of Zoology, University of Michigan, Ann Arbor, Michigan 48109-1079, U.S.A. (akluge@umich.edu) and Diana Lipscomb at George Washington University (BIODL@gwuvm.gwu.edu) who may be contacted for details. The cost is said to be about $30 US.
Kevin Nixon of the L. H. Bailey Hortorium at Cornell University in Ithaca, New York (kcn2@cornell.edu) has written WINCLADA version 0.9.98, an interactive program that can read and edit trees and data files, display character state changes inferred by parsimony on diagrams of the trees, and launch runs of the programs NONA, PIWE, and Hennig86. WINCLADA is available as a Windows95/98/NT executable from its web site at http://www.cladistics.com/about_winc.htm. It is available on a shareware basis: the user who downloads it must pay $50 to Kevin Nixon at Winclada/Kevin C. Nixon, 2210 Ellis Hollow Road, Ithaca, New York 14850. WINCLADA supersedes and combines features of Nixon's earlier programs ClaDOS and DADA, which are no longer distributed.
MEGA (Molecular Evolutionary Genetic Analysis) has been released by Sudhir Kumar, Koichiro Tamura, and Masatoshi Nei of the Institute of Molecular Evolutionary Genetics, 328 Mueller Lab, Pennsylvania State University, University Park, Pennsylvania 16802, U.S.A. It is an executable program for DOS machines, and is menu-driven with context-sensitive help. It also runs under Windows in a DOS Window. It analyzes data from DNA, RNA and protein sequences, and distance matrices produced from other kinds of data as well. It includes the Neighbor-Joining distance matrix method, a branch and bound parsimony method, and bootstrapping. It also plots trees on many kinds of printers. The program costs $15 (for the documentation). Inquiries can also be made by mail to Joyce White at the above address or by electronic mail to jlw7@psu.edu. The MEGA manual is also available on-line in HTML at http://evolgen.biol.metro-u.ac.jp/MEGA/manual/default.html.
Xuhua Xia of the Department of Ecology and Biodiversity of the University of Hong Kong (xxia@hkusua.hku.hk) has released DAMBE (Data Analysis in Molecular Biology and Evolution), version 3.7.29, a general-purpose package for DNA and protein sequence phylogenies. It can read and convert a number of file formats, and has many features for descriptive statistics. It can compute a number of commonly-used distance matrix measures and infer phylogenies by parsimony, distance, or likelihood methods, including bootstrapping and jackknifing. There are a number of kinds of statistical tests of trees available. It can also display phylogenies. DAMBE includes a copy of ClustalW; there is also code from PHYLIP. An interesting feature is a simple web browser that allows sequences to be fetched over the web while running DAMBE. DAMBE consists of Windows95 executables. It is available from its web site at http://web.hku.hk/~xxia/software/software.htm.
Alexei Drummond and Korbinian Strimmer, respectively of the School of Biological Sciences, University of Auckland, Auckland, New Zealand (a.drummond@auckland.ac.nz), and the Department of Zoology, University of Oxford, Oxford, U.K. (korbinian.strimmer@zoo.ox.ac.uk), have released PAL (Phylogenetic Analysis Library), version 0.9, a free collection of Java classes for use in molecular phylogenetics. It is intended to facilitate the rapid construction of both general applications as well as special-purpose tools for phylogenetic analysis. PAL focuses on probabilistic data modelling and provides, e.g., routines for maximum-likelihood and least squares analysis and probability models for nucleotide/amino acid substitution, including constraints for a molecular clock. In addition, PAL also includes a number of Java applications (currently using character-based interface) based on the modules in the library:
- MLDIST, which computes distance matrices by maximum likelihood for a number of models of substitution,
- MLTREE, which computes maximum likelihood phylogenies,
- EVOLVE, which generates artificial data sets,
- JUMBLE, which randomizes sequence order in data sets,
- LSTREE, which determines least-squares branch lengths,
- UPGMA, which computes a UPGMA tree,
plus a number of utilities that do other tasks such as rerooting trees, stripping out sites, etc.
PAL contains modules to read and write trees and alignments, define and use stochastic models of substitution, adjust for rate variation, and perform statistical tests, among many other things. It is available at its web site at http://users.ox.ac.uk/~strimmer/pal/. Two user interfaces are available (links to them are at the PAL site):
- Vanilla (by Strimmer): a simple bare-bones text front end
- Pebble (by Drummond): a GUI interface to PAL plus a functional command language
John Czelusniak of the Department of Anatomy and Cell Biology, Wayne State University, Detroit, Michigan (jc@tree.roc.wayne.edu) has written sog, a C program demonstrating an algorithm to find the most parsimonious phylogeny along with the parsimony strength of grouping (or Bremer decay index) for nucleotide sequences in one pass of a branch and bound algorithm. This differs from the implementation in PAUP* which uses a separate branch and bound search to find the strength of grouping for each group in the tree, using the tree group exclusion option. John says that "sog is a rather ugly hack which will be optimized and streamlined. It IS ALPHA SOFTWARE, which means it has not been tested extensively on datasets other than our primate datasets." It is available by anonymous ftp from ftp.bio.indiana.edu in directory molbio/evolve. It is distributed as C source code. It will compile and run on Linux and Nextstep for Intel systems.
Naoko Takezaki (ntakezak@lab.nig.ac.jp) of the Center for Information Biology, National Institute of Genetics, Mishima, Japan, has written gmaes, a program that estimates a gamma distribution parameter for rate variation among sites by counting the minimum number of substitutions at each site for a given tree topology. The program runs on Sun workstations. It is distributed as C source code for Unix along with a neighbor-joining program njboot from her web site at http://cib.nig.ac.jp/dda/ntakezak.html#gmaes, and by ftp from the IUBIO archive (ftp.bio.indiana.edu) in directory molbio/evolve.
Daniel Barker (sokal@holyrood.ed.ac.uk) of the Institute of Cell and Molecular Biology of the University of Edinburgh, Scotland, has written LVB version 1.0, a program for inferring phylogenies using parsimony and simulated annealing. Simulated annealing is intended to allow searches for most parsimonious trees with large numbers of species. It is described as often giving good results with large matrices.
Up to 16383 objects and 32766 characters may be used. Aligned nucleotide sequences and/or discrete morphological characters can be used. The program is currently available in ANSI C source code
as a Unix tar file, and as executables for PowerMac, for Windows95, for Windows 3.1, and for OS/2. The text of a manual can also be downloaded from the web site. LVB is available from its Web site at http://www.icmb.ed.ac.uk/sokal.html. It is also available as a Web server from the Institut Pasteur.
Rod Page (dpage@udcf.gla.ac.uk), of the Division of Environmental and Evolutionary Biology of the University of Glasgow has released GeneTree, a program that produces "reconciled trees" that fit a tree of gene copies to a species tree. It uses a parsimony criterion where the penalty is the number of deletions and duplications required to reconcile the gene tree with the species tree. The program is described as "preliminary". The algorithm is described in the paper: Page, R. D. M. and M. A. Charleston. 1997. From gene to organismal phylogeny: Reconciled trees and the gene tree/species tree problem. Molecular Phylogenetics and Evolution 7: 231-240. It is available as a PowerMac executable and as an executable for Windows 95 or Windows NT. They are available from the GeneTree web site at http://taxonomy.zoology.gla.ac.uk/rod/genetree/genetree.html. A manual is also available online there.
John Huelsenbeck (johnh@brahms.biology.rochester.edu) of the Department of Biology of the University of Rochester, in Rochester, New York has released CodonBootstrap version 1.0. This is a utility that will generate non-parametric bootstrap data sets from a DNA sequence file. The program re-samples codons, which (1) avoids problems when analysing data under models that assume coding structure (e.g., rates partitioned by sites), and (2) lets the user re-sample sites while maintaining the original autocorrelation among positions within the codon. CodonBootstrap is available as a PowerMac executable from the Huelsenbeck laboratory software web site at http://johnh@brahms.biology.rochester.edu/software.html.
Ben Salisbury (ben@aya.yale.edu) of the Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, has released DNASEP, which uses (with permission) some code from my PHYLIP program DNAPARS to carry out Salisbury's criterion of Strongest Evidence Parsimony. The criterion is described in a paper: Salisbury, B. A. 1999. Strongest evidence: maximum apparent phylogenetic signal as a new cladistic optimality criterion. Cladistics 15: 137-149. DNASEP is available as a Windows95 executable from Salisbury's web site at http://jkim.eeb.yale.edu/salisbur/. It has been partially superseded by a later program of Salisbury's, SEPAL, which has more functions.
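The codon re-sampling idea behind CodonBootstrap, described above, can be illustrated with a short sketch. This is not code from CodonBootstrap itself; the function name codon_bootstrap, the dictionary input format, and the example sequences are assumptions made only for illustration. It draws whole codon columns with replacement, so the three positions of a codon always travel together.

```python
import random


def codon_bootstrap(alignment, rng=None):
    """Return one bootstrap replicate in which whole codons are re-sampled.

    alignment: dict mapping taxon name -> aligned sequence string.
    All sequences must have the same length, a multiple of 3.
    (Hypothetical helper for illustration; not part of CodonBootstrap.)
    """
    rng = rng or random.Random()
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must all have the same length")
    n_sites = lengths.pop()
    if n_sites % 3 != 0:
        raise ValueError("alignment length must be a multiple of 3")
    n_codons = n_sites // 3
    # Draw codon columns with replacement; the same draw is applied to
    # every taxon so that the columns stay aligned.
    picks = [rng.randrange(n_codons) for _ in range(n_codons)]
    return {
        name: "".join(seq[3 * c:3 * c + 3] for c in picks)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    # Made-up two-taxon alignment of three codons.
    example = {"A": "ATGAAACCC", "B": "ATGAAGCCC"}
    print(codon_bootstrap(example, random.Random(1)))
```

Re-sampling single sites instead would break up the codons, which is the problem the program is described as avoiding.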
Ben Salisbury (ben@aya.yale.edu) of the Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, has written SEPAL, version 1.01, which can search for the trees that maximize the Strongest Evidence criterion, Apparent Phylogenetic Signal. It can also do
Iterative SE, parsimony, and parsimony jackknifing. It can also calculate decay values (Bremer supports) for either parsimony or Strongest Evidence. It also has some options for removing characters that are particularly noisy. The criterion is described in a paper: Salisbury, B. A. 1999. Strongest evidence: maximum apparent phylogenetic signal as a new cladistic optimality criterion. Cladistics 15: 137-149. SEPAL is available as a Windows95 executable from Salisbury's web site at http://jkim.eeb.yale.edu/salisbur/.
Jun Adachi and Masami Hasegawa have written a package MOLPHY 2.2, carrying out maximum likelihood inference of phylogenies for either nucleotide sequences or protein sequences. Their protein sequence maximum likelihood program, ProtML, is a successor to the one they made available to me for distribution on a nonsupported basis in PHYLIP, and is much improved over that. It is one of two protein maximum likelihood programs available. The package is distributed free in C source code, with documentation, by ftp from sunmh.ism.ac.jp. An executable version for Windows95 or Windows NT on Intel processors, and also one that works on Windows NT on DEC Alpha processors, is available from Russell Malmberg at the Botany Department of the University of Georgia (russell@dogwood.botany.uga.edu) by World Wide Web at http://dogwood.botany.uga.edu/malmberg/software.html
Gary Olsen, of the Department of Microbiology, University of Illinois, Urbana, Illinois (gary@phylo.life.uiuc.edu) has developed a speeded-up replacement for my program DNAML coded in C, called fastDNAml. It achieves a number of economies and also is organized so that it can be run on parallel processors; he and his co-workers have constructed trees of very large size on a high-speed parallel processor. The program can be compiled using the "p4" portable parallel processing toolkit. It can also be run in ordinary serial mode on workstations where it is faster than DNAML.
- The C program is available from the Ribosomal Database Project by Web from http://www.cme.msu.edu/RDP/cgis/aftpdir_show.cgi?ftpdir=pub/RDP/programs/fastDNAml&title=Phylogenetic%20tree%20inference%20(fastDNAml;%20C)&showdir=yes
- It is also available by ftp from rdp.life.uiuc.edu in directory pub/RDP/programs/fastDNAml.
- The C program and PowerMac executables are also available by anonymous ftp from the Indiana University Biology ftp server at ftp.bio.indiana.edu in directory molbio/evolve.
- A Debian Linux executable package for fastDNAml has been made available by Stephane Bortzmeyer at the Institut Pasteur in Paris. It is available through its web page at http://www.debian.org/Packages/unstable/misc/fastdnaml.html.
Denis Beaumont (beaumont@transpac.atlas.fr) has made a parallelized version of fastDNAml called VeryfastDNAml. It is parallelized with the TreadMarks distributed shared memory system, which is a not-quite-free environment for parallelization that runs on many workstation-class machines. The C source code of VeryfastDNAml is available by ftp from the Institut Pasteur server ftp.pasteur.fr in directory /pub/GenSoft/unix/evolution/FastDNAml as file fastDNAml-tmk.tar.gz. There is a web page access to this ftp distribution at http://bioweb.pasteur.fr/seqanal/soft-pasteur.html#veryfastdnaml, which includes a link to the TreadMarks project.
Ziheng Yang of the Department of Genetics and Biometry, University College London, (z.yang@ucl.ac.uk) has released PAML, version 2.0g, a package of programs for the maximum likelihood analysis of nucleotide or protein sequences, including codon-based methods that take into account both amino acids and nucleotides. The programs can estimate branch lengths in a phylogenetic tree and parameters in the evolutionary model such as the transition/transversion rate ratio, the gamma parameter for variable substitution rates among sites, rate parameters for different genes, and synonymous and nonsynonymous substitution rates. They can also test evolutionary models, calculate substitution rates at particular sites, reconstruct ancestral nucleotide or amino acid sequences, simulate DNA and protein sequence evolution, compute distances based on the synonymous and nonsynonymous changes, and of course do phylogenetic tree reconstruction by maximum likelihood and Bayesian Markov Chain Monte Carlo methods. The strength of the package lies in its rich implementation of evolutionary models, though Yang comments that tree-making is not a strong point of the current version. The autocorrelated gamma distribution implementation is analogous to the Hidden Markov Model scheme available in PHYLIP.
The package is available as ANSI C source code for Unix systems, as PowerMac executables and as executables that run on Windows95, Windows98, and WindowsNT. See the PAML web page at http://abacus.gene.ucl.ac.uk/ziheng/paml.html. It can be downloaded by ftp from abacus.gene.ucl.ac.uk in directory pub/paml.
Bret Larget and Donald Simon of the Department of Mathematics and Computer Science, Duquesne University, Pittsburgh, Pennsylvania (bambe@mathcs.duq.edu) have written BAMBE (Bayesian Analysis in Molecular Biology and Evolution) version 2.01 beta, a program for Bayesian analysis of phylogenies with DNA sequence data. It uses a prior distribution of trees and a rearrangement mechanism introduced in the paper: Mau, B., M. A. Newton, and B. Larget. 1997. Bayesian phylogenetic inference via Markov chain Monte Carlo methods. Molecular Biology and Evolution 14: 717-724. The trees and parameter values are sampled by Markov Chain Monte Carlo using a Metropolis algorithm. The resulting posterior distribution can be used to characterize the uncertainty about not only the tree, but the parameters of the substitution model as well. The program is in C++ source code for Unix, and is distributed from its web page at http://www.mathcs.duq.edu/larget/bambe.html.
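For readers unfamiliar with the Metropolis algorithm mentioned for BAMBE, here is a toy sketch of the accept/reject step. It is not BAMBE's code and it samples no trees: the state is a single made-up parameter (a proportion p with 7 successes in 10 trials and a flat prior), and the names metropolis and log_posterior are hypothetical. In a phylogenetic sampler the state would instead be a tree plus substitution-model parameters, and the proposal would be a tree rearrangement.

```python
import math
import random


def log_posterior(p):
    """Toy log posterior (up to a constant): 7 successes, 3 failures, flat prior."""
    if p <= 0.0 or p >= 1.0:
        return float("-inf")
    return 7.0 * math.log(p) + 3.0 * math.log(1.0 - p)


def metropolis(log_post, start, step, n_iter, rng):
    """Sample from exp(log_post) using a symmetric uniform random-walk proposal."""
    x = start
    lp_x = log_post(x)  # the starting point must have non-zero posterior
    samples = []
    for _ in range(n_iter):
        y = x + rng.uniform(-step, step)  # symmetric proposal
        lp_y = log_post(y)
        # Accept with probability min(1, posterior(y) / posterior(x)).
        if rng.random() < math.exp(min(0.0, lp_y - lp_x)):
            x, lp_x = y, lp_y
        samples.append(x)  # a rejected proposal repeats the current state
    return samples


if __name__ == "__main__":
    draws = metropolis(log_posterior, 0.5, 0.2, 20000, random.Random(42))
    kept = draws[2000:]  # discard an initial burn-in period
    print("posterior mean estimate:", sum(kept) / len(kept))
```

The spread of the retained draws is what characterizes the uncertainty about the parameter, in the same way that the sampled trees characterize uncertainty about the phylogeny.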
David Posada and Keith Crandall at the Department of Biology, Brigham Young University (dp47@email.byu.edu) have released Modeltest version 3.0, a program to test a hierarchy of statistical models of DNA evolution using the Likelihood Ratio Test criterion and the AIC (Akaike Information Criterion). The likelihood values are obtained by running PAUP*. MODELTEST accepts likelihood scores corresponding to 56 models of DNA substitution including whether transition and transversion rates are equal, whether rates at different sites are equal, and whether there are invariant sites. Modeltest is described in the paper: Posada, D. and K. A. Crandall. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14: 817-818. It is available as executables for Macintosh and PowerMac, for Windows95/98/NT, for Linux and for Suns, and in source code as C for the Metrowerks C compiler. It is distributed from its web site at http://bioag.byu.edu/zoology/crandall_lab/modeltest.htm.
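The two criteria that Modeltest applies can be written down in a few lines. The sketch below is not taken from Modeltest; the function names and the log-likelihood values are made up for illustration. It computes AIC = -2 lnL + 2k, the likelihood ratio statistic 2(lnL1 - lnL0) for a pair of nested models, and a chi-square tail probability for that statistic, using only the standard library.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: smaller is better."""
    return -2.0 * log_likelihood + 2.0 * n_params


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic for a null model nested in the alternative."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df >= 1.

    Uses the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with z = x / 2.
    """
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    z = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))  # df = 1
    else:
        a, q = 1.0, math.exp(-z)             # df = 2
    while 2 * a < df:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


if __name__ == "__main__":
    # Made-up log-likelihoods for two nested models; HKY85 has four more
    # free substitution parameters than JC69 (kappa and three base frequencies).
    lnl_jc, lnl_hky = -2510.3, -2475.8
    stat = lrt_statistic(lnl_jc, lnl_hky)
    print("LRT statistic:", stat, "P =", chi2_sf(stat, 4))
    print("AIC JC69:", aic(lnl_jc, 0), "AIC HKY85:", aic(lnl_hky, 4))
```

The chi-square approximation is only an approximation, particularly when a parameter of the null model lies on the boundary of its range, so the tail probabilities should be read with that caveat in mind.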
Nick Grassly, currently of the Zoologisches Institut, Universität München (grassly@zi.biologie.uni-muenchen.de), has written PLATO, version 2.11, a program that takes sequential PHYLIP-style DNA sequences followed by their maximum likelihood phylogeny, and using a likelihood approach with sliding window analysis and Monte Carlo simulation of the null distribution detects anomalously evolving regions in the DNA sequences and assesses their significance. This may lead to the detection of, for example, recombination, gene conversion or convergence, or reveal variable selective pressures along the gene sequence. A general substitution model is used that can allow the test to reveal differences due to recombination while ignoring those due to varying rate of evolution. The method is described in the paper: Grassly, N. C., and E. C. Holmes. 1997. A likelihood method for
the detection of selection and recombination using sequence data. Molecular Biology and Evolution 14: 239-247. It is available for Macintoshes (including PowerMacs) or in source code for Unix systems. It requires substantial amounts of memory, especially when the sequences analysed are long. Use of Power Macintoshes or UNIX systems is recommended. It is distributed free from the University of Oxford Zoology Web server at http://evolve.zoo.ox.ac.uk/Plato/Plato2.html.
Mika Salminen and Wayne Cobb (msalminen@hiv.hjf.org and wcobb@reed.hjf.org), of the Henry M. Jackson Foundation for the Advancement of Military Medicine, Walter Reed Army Institute of Research, Bethesda, Maryland, have released the Bootscanning Package, version 1.0beta. This is a series of shell scripts and programs that analyze DNA sequences for evidence of recombination. It breaks the sequence into separate pieces that are analyzed for the bootstrap support of various groups, and it looks for evidence of significant conflict among trees for different parts of the sequence. The programs are currently available only as Sun executables. They require GDE 2.2a and PHYLIP version 3.4 to work. They are available by anonymous ftp from http://www.ktl.fi in directory /hiv/mirrors/pub/programs.
Gráinne McGuire and Frank Wright (grainne@bioss.sari.ac.uk and frank@bioss.sari.ac.uk) of Biomathematics and Statistics Scotland, in Dundee, have released TOPAL, which checks for evidence of past recombination events, by looking for changes in the inferred phylogenetic tree TOPology between adjacent regions of a multiple sequence ALignment. Their method detects recombinations by sliding a window along a sequence alignment, and measuring the discrepancy between the trees suggested by the first and second halves of the window, using distance matrix methods. This is described in the paper: McGuire, G., F. Wright, and M. J. Prentice. 1997. A graphical method for detecting recombination in phylogenetic data sets.
Molecular Biology and Evolution 14: 1125-1131. The TOPAL program is also described in a paper: McGuire, G. and F. Wright. 1997. TOPAL: recombination detection in DNA and protein sequences. Bioinformatics 14: 219-220. TOPAL is a set of Unix Bourne shell scripts and C code, plus four programs in C from my PHYLIP package. These are available from the TOPAL web site at http://www.bioss.sari.ac.uk/~frank/Genetics/topal.html.
Ingrid Jakobsen and Simon Easteal of Australian National University, Canberra, have released reticulate. (Ingrid Jakobsen is now at the Institute of Molecular Evolutionary Genetics, Pennsylvania State University, and her e-mail address is ibj1@psu.edu). It is a compatibility matrix program for DNA sequences that has features designed to test for evidence of reticulate evolution (such as recombination). The program computes and displays a pairwise compatibility matrix for all pairs of sites. It can randomize the order of sites and compute the fraction of compatible sites in a region for the randomizations, to test whether there is a pattern suggesting reticulation. The program is distributed as C source code for Unix and X Windows, though there are some limited ways of running it without X Windows. It is described in the paper: Jakobsen, I. B. and S. Easteal. 1996. A program for calculating and displaying compatibility matrices as an aid in determining reticulate evolution in molecular sequences. CABIOS 12: 291-295. It is available from its web site at http://jcsmr.anu.edu.au/dmm/humgen/ingrid/reticulate.htm.
Kim Fisker (kfisker@daimi.aau.dk) of the Computer Science Department at Aarhus University, Denmark has released RecPars, which does a parsimony analysis of DNA sequences. It tries to find the best phylogenies for different regions of the sequences and thereby postulates a recombination event between these segments. The method is described in a paper: Hein, J. 1993.
A heuristic method to reconstruct the history of sequences subject to recombination. Journal of Molecular Evolution 36: 396-406. RecPars is available as C source code for Unix. It is distributed by ftp from ftp.daimi.aau.dk in directory pub/empl/kfisker/programs/RecPars. John Maynard Smith and Noel Smith of the School of Biological Sciences of the University of Sussex (noelsmith@yahoo.com) have released programs to carry out their homoplasy test for recombination in sequences. The test is described in a paper: Maynard Smith, J. and N. H. Smith. 1998. Detecting recombination from gene trees. Molecular Biology and Evolution 15: 590-599. The programs are distributed in QBASIC for DOS and must be run using QBASIC. They are available from Maynard Smith's web site at http://www.biols.susx.ac.uk/Home/John_Maynard_Smith/.
Andrew Rambaut of the Department of Zoology, University of Oxford, England (andrew.rambaut@zoo.ox.ac.uk) has produced LARD (Likelihood Analysis of Recombination in DNA) version 2.2, a program to detect the presence of recombination in a set of sequences. LARD looks at the set of sequences to discover which are the most plausible parents of a potentially recombinant sequence, and performs a likelihood ratio test for each possible breakpoint position of whether the three-species tree differs on the two sides of the breakpoint. LARD is described as an extension of a method suggested by John Maynard Smith: Maynard Smith, J. 1992. Analysing the mosaic structure of genes. Journal of Molecular Evolution 34: 126-129. It is described in a paper: Holmes, E. C., M. Worobey, and A. Rambaut. 1999. Phylogenetic evidence for recombination in dengue virus. Molecular Biology and Evolution 16: 405-409. LARD is available as C source code and as a Macintosh executable from its web site at http://evolve.zoo.ox.ac.uk/Lard/Lard.html.
Andrew Rambaut of the Department of Zoology, University of Oxford, (andrew.rambaut@zoo.ox.ac.uk) and Nick Grassly, currently of the Zoologisches Institut, Universität München (grassly@zi.biologie.uni-muenchen.de), have written SPOT (Sequence Parameters Of Trees). SPOT is a program that will calculate the likelihood of a given tree topology for a set of aligned nucleotide sequences. For each topology, SPOT will estimate the maximum likelihood values of branch lengths and other parameters of the model of nucleotide evolution that has been chosen. Such parameters include the ratio of transitions to transversions (TS/TV ratio) and relative rates of substitution at different codon positions. Branch lengths can also be constrained to assume a molecular clock hypothesis. Multiple datasets and multiple trees can be analysed, which is useful for performing Monte Carlo simulations of hypotheses (parametric bootstraps). Although SPOT does not estimate tree topology, an accompanying program, SPOTSHELL, will iterate between fastDNAml and SPOT until the maximum likelihood parameters and topology have been found (or at least something close to them). SPOT is available as C source code for Unix workstations, or as Macintosh sources and executables. It can be obtained from the SPOT Web page at http://evolve.zoo.ox.ac.uk/Spot/Spot.html.
Gary Olsen of the Department of Microbiology, University of Illinois, Urbana, Illinois (gary@phylo.life.uiuc.edu) has written dnarates version 1.0. It reads a set of DNA sequences and a tree, and for that tree makes a maximum likelihood estimate of the rate of evolution at each site. This is done by taking the rate at each site as a separate parameter and maximizing the likelihood with respect to all those parameters. The program is available as generic C source code. It is based in part (with my permission) on code from my PHYLIP program DNAML. dnarates is available by ftp from the IUBIO ftp server at ftp://rdp.life.uiuc.edu/pub/RDP/programs/DNArates/.
Mike Charleston (mcharles@udcf.gla.ac.uk) of the Division of Environmental and Evolutionary Biology of the University of Glasgow has developed Spectrum, a program for finding bipartition spectra from phylogenetic molecular and distance data, according to the method of Hendy et al. (1994) (Hadamard transforms) for moderately sized data sets (up to 18 taxa). The program also implements a branch-and-bound search for the "closest tree", that is, the tree whose expected spectrum is closest to the spectrum derived from the observed data. PowerMac, 68k Macintosh, and Windows95 or Windows NT executables are available from its Web site in the Glasgow Taxonomy web pages: http://taxonomy.zoology.gla.ac.uk/~mac/spectrum/spectrum.html.
Ingrid Jakobsen, Susan Wilson, and Simon Easteal, of Australian National University, Canberra, have released partimatrix. (Ingrid Jakobsen is now at the Institute of Molecular Evolutionary Genetics, Pennsylvania State University, and her e-mail address is ibj1@psu.edu). This program computes a "partition matrix" from aligned DNA sequence data. The method finds partitions of the sequences into two groups and presents a matrix which describes the conflict and agreement among these partitions. The objective is to discover parts of the DNA sequence which imply different trees. It is described in the paper by I. B. Jakobsen, S. R. Wilson and S. Easteal. 1997. The Partition Matrix: Exploring variable phylogenetic signals along nucleotide sequence alignments. Molecular Biology and Evolution 14: 474-484. The program is distributed as C source code for Unix systems with X Windows. It is available from its web site at http://jcsmr.anu.edu.au/dmm/humgen/ingrid/partimatrix.htm .
Pablo Goloboff, of INSUE - Fundación e Instituto Miguel Lillo 205, 4000 S. M. de Tucumán, Argentina, has written Nona (Noname), PiWe (Parsimony with Implied WEights), and SPA to carry out parsimony including weighted parsimony analyses.
Nona searches for most parsimonious trees according to character weights defined by the user a priori. Pee-Wee calculates weights of the characters by a method introduced by Goloboff, a noniterative version of J. S. Farris's "successive weighting". It was described in Goloboff's paper in Cladistics 9: 83-91, 1993. SPA is a generalized parsimony program that allows differential weighting of changes between different states. Nona is said to be faster than other parsimony programs. A Windows 95/98/NT version of Nona which includes the functionality of Piwe and SPA is available as shareware (with a free 30-day trial period) from its web page at http://www.cladistics.com/about_nona.htm. The shareware fee of $40 should be paid to the author at the above address or to James M. Carpenter, Department of Entomology, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024. Send the money and the name in which the copies are to be registered. An earlier demo version of these programs which runs on DOS is also available. It requires the user to hit an extra key each time they execute a command. It can be fetched by anonymous ftp from ftp.vims.edu in directory other/hennig as file pars-pag.exe, a self-extracting DOS Zip file.
Yasuo Ina of the National Institute of Agrobiological Resources, Tsukuba, Japan has developed ODEN, a package of programs for doing distance matrix analyses on nucleotide or protein sequences. It is described in CABIOS 10: 11-12 (1994). It is available free by anonymous ftp from directory pub/unix/oden on ftp.dna.affrc.go.jp as C source code for Unix systems.
A. Luettke and R. Fuchs have written MacT, a package of programs for Macintoshes that compute distances and compute Neighbor-Joining phylogenies for them. The programs work on 4 through 26 sequences, and source code in Microsoft QuickBasic is provided as well as compiled executables. The package is free and is available on the molecular biology software servers.
For example, it is available by anonymous ftp on the Indiana University IUBIO server ftp.bio.indiana.edu, where it will be found in directory molbio/mac. The programs are described in CABIOS 8: 591-594, 1992.
Andrey A. Zharkikh, Andrey Rzhetsky, and co-workers in the Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia, have produced VOSTORG, a package of programs for alignment (both manual and automatic) and inferring phylogenies by distance methods and parsimony for molecular sequences. (Zharkikh and Rzhetsky are currently in the US; their e-mail addresses are zharkikh@myriad.com and andrey@genome2.cpmc.columbia.edu). VOSTORG runs under DOS on PC-compatibles and includes some rather fancy graphics (for DOS). It is available from its Web page in Russia from http://molevol.bionet.nsc.ru/vs.htm. The programs are described in a paper: Zharkikh, A. A., A-Yu. Rzhetsky, P. S. Morosov, T. L. Sitnikova, and J. S. Krushkal. 1991. VOSTORG: a package of microcomputer programs for sequence analysis and construction of phylogenetic trees. Gene 101: 251-254.
Walter Fitch (wfitch@uci.edu), of the Department of Ecology and Evolutionary Biology, of the University of California at Irvine, has available by anonymous ftp at daedalus.bio.uci.edu in directory pub/outgoing/evoprog about 20 programs which carry out various kinds of phylogeny estimation and related tasks. They are available in source code in FORTRAN 77 (except for a few which are in C) and also as Sun SPARC executables and as DOS executables. They include:
- ANCESTOR, which searches for most parsimonious trees for nucleotide or protein sequences,
- EVOLVES, which carries out the original Fitch-Margoliash distance matrix method,
- WTDPARS, programs for weighted parsimony analysis according to the methods he has introduced in the papers by P. L. Williams and W. M. Fitch, in pages 453-470 of the Nobel Symposium on the Hierarchy of Life, edited by B. Fernholm, K.
Bremer, and H. Jornval, Elsevier, 1989, and the paper by P. L. Williams and W. M. Fitch, in Advances in Enzymology, volume 183, pages 615-625, 1990. There are also many programs that convert sequences among various formats, generate all possible trees, shuffle sequences, align sequences, and do various other functions. The programs are available by anonymous ftp from daedalus.bio.uci.edu in directory pub/outgoing/evoprog. There is also TDRAW which draws a tree in Postscript. This program is in C, and is not available as a DOS executable. It is available in directory pub/outgoing/tdraw. Nicholas Galtier of the University of Lyon (galtier@biomserv.univ-lyon1.fr) has written Phylo_win, a "graphic interface" for molecular phylogenetic inference. It performs neighbor-joining, parsimony and maximum likelihood methods and can bootstrap with any of them. Many distances can be used including Jukes & Cantor, Kimura, Tajima & Nei, Galtier & Gouy (1995), LogDet for nucleotidic sequences, Poisson correction for protein sequences, Ka and Ks for codon sequences. Species and sites to include in the analysis are selected by mouse. Reconstructed trees can be drawn, edited, printed, stored, evaluated according to numerous criteria. Taxonomic species groups and sets of conserved regions can be defined by mouse in both tools and stored into sequence files, thus avoiding multiple data files. It is entirely mouse-driven. Most usual sequence file formats are read: CLUSTAL, FASTA, PHYLIP, MASE. It runs under X windows on many Unix workstations. It is described in the paper: Galtier, N., M. Gouy, and C. Gautier. 1996. SeaView and Phylo_win, two graphic tools for sequence alignment and molecular phylogeny. Computer Applications in the Biosciences 12: 543-548. It is distributed as C source code (to compile it one needs the NCBI Vibrant tool kit). It is also available as executables for SunOS, Solaris, SGI Unix, IBM RISC
Unix, Linux, HP/UX, and DEC Alpha (Digital Unix). It can be fetched from its web page at http://pbil.univ-lyon1.fr/software/phylowin.html. It can also be obtained by anonymous ftp from biom3.univ-lyon1.fr in directory pub/mol_phylogeny. A PC Linux executable is available at http://evolution.bmc.uu.se/~thomas/mol_linux. A Digital OpenVMS executable is also available as http://seqaxp.bio.caltech.edu:8000/pub/SOFTWARE/phylo_win_vms.zip.
F. James Rohlf has written NTSYSpc (Numerical Taxonomy System, Version 2.0), a clustering program that includes calculation of various kinds of distance measures, hierarchical clustering methods such as UPGMA, and also Neighbor-Joining and consensus trees. It can also do a variety of other things including ordination, scatter diagrams, and elliptic Fourier transforms (for shape analysis). NTSYSpc 2.0 is a Windows95 executable which will also run on Windows NT. It is available for $275 ($210 for educational and government institutions). 10-user site licenses are also available. It is distributed by Exeter Software (the biological software company, not the warehouse-inventory-software house of the same name). Their e-mail address is sales@exetersoftware.com. Their toll-free telephone number (within the USA) is 800-842-5892, their not-so-free phone number is +1-631-689-7838, and their fax number is +1-631-689-0103. Their mailing address is 100 North Country Road, Setauket, NY 11733-1345 USA. Further information is available on their Web page at http://www.exetersoftware.com/cat/ntsyspc.html.
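As an illustration of the UPGMA clustering mentioned above, here is a minimal sketch in Python. It is not taken from NTSYSpc or any other package listed here; the function name upgma, the nested-tuple tree representation, and the dictionary-of-frozensets distance format are all assumptions made for the example. It returns only the tree topology, without branch heights.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA and return the tree as nested tuples.

    names: list of taxon labels
    dist:  dict mapping frozenset({a, b}) -> distance between taxa a and b
    """
    # Each current cluster is keyed by its subtree; the value is its taxon count.
    sizes = {name: 1 for name in names}
    d = dict(dist)
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance from the merged cluster to each remaining cluster is the
        # average of the old distances, weighted by cluster size.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))
```

Calling upgma(["A", "B", "C", "D"], dist) with a distance dictionary keyed by frozenset pairs returns a nested tuple in which the most similar taxa are grouped innermost.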
Rino Zandee (zandee@rulsfb.leidenuniv.nl), of the Institute of Evolutionary and Ecological Science, Van der Klaauw Laboratory, Leiden University, has written CAFCA version 1.5j, the Collection of APL Functions for Comparative Analysis. It carries out a search for the most parsimonious tree with discrete-character data (either two-state or multistate), using a search for cliques of component compatibility (monothetic subsets) to propose the candidates for most parsimonious trees. The program is written as functions in the APL language, but Macintosh and PowerMac executables are distributed. The program is free and is available from the CAFCA Web Site http://wwwbio.leidenuniv.nl/~zandee/cafca.html.
Korbinian Strimmer (http://users.ox.ac.uk/~strimmer), now at the Department of Zoology, University of Oxford, U.K., and Arndt von Haeseler (haeseler@eva.mpg.de), now at the Max-Planck-Institute for Evolutionary Anthropology, Leipzig (both previously of the Zoologisches Institut of the Universität München), have developed TREE-PUZZLE version 4.0.2 (formerly called PUZZLE), a program for maximum likelihood analysis for nucleotide and amino acid alignments. It infers phylogenies by "quartet puzzling", a method that applies maximum likelihood tree reconstruction to all possible quartets of taxa and subsequently tries to combine most of the four-taxa maximum likelihood trees to construct an overall maximum likelihood tree. Usually there are several possible solutions. A consensus tree generated from the quartet puzzling trees shows nodes that are well supported. More details about the algorithm and on the phylogenetic accuracy can be found in the papers: K. Strimmer and A. von Haeseler. 1996. Molecular Biology and Evolution 13: 964-969 and K. Strimmer, N. Goldman, and A. von Haeseler. 1997. Molecular Biology and Evolution 14: 210-211. TREE-PUZZLE supports all popular models of sequence evolution of nucleotides and proteins, and can take rate heterogeneity among sites into account. It computes pairwise maximum likelihood distances for many different models of sequence evolution (TN, HKY, F84, SH, Dayhoff, JTT, mtREV24 and BLOSUM62), and estimates parameters of the models. It can estimate maximum-likelihood branch lengths for user-specified trees and perform likelihood ratio tests of the molecular clock hypothesis as well as Kishino-Hasegawa-Templeton tests. The program is written in ANSI C and is compatible with PHYLIP files. Precompiled executables are distributed for PowerMac and for Windows 95/98/NT. For UNIX and VMS systems files for automated compilation are provided.
It is available from the TREE-PUZZLE web page at http://www.tree-puzzle.de or by anonymous ftp from:
- ftp.ebi.ac.uk in directory pub/software,
- ftp.bio.indiana.edu in directory molbio/evolve, and
- ftp.pasteur.fr in directory /pub/GenSoft.
Its online manual can be viewed at http://www.tree-puzzle.de/manual.html. A Debian Linux package of TREE-PUZZLE is available at its web site at http://www.debian.org/Packages/unstable/misc/puzzle.html.
Mike Holder (holder@mbl.edu) and Andrew Roger (roger@mbl.edu) of the Marine Biological Laboratory in Woods Hole, Massachusetts are distributing a shell script program for Unix systems, puzzleboot, that allows the analysis of multiple data sets with TREE-PUZZLE. It is designed for use with the distance matrix option of TREE-PUZZLE, to make use of the distance calculation methods. It is available from the TREE-PUZZLE web page at http://www.tree-puzzle.de.
Kay Nieselt-Struwe (kns@phy.auckland.ac.nz) of the Department of Physics of the University of Auckland, New Zealand has released version 1.0 of STATGEOM. It carries out computation of the statistical geometry in distance and in sequence space of a set of aligned DNA/RNA, amino acid or binary sequences. The user can decide either to compute the overall tree-likeness of the whole set or of a certain subset, or, given a tree of the sequences, to compute the reliability of certain edges in the tree. Postscript files of the graphs of the statistical geometry are automatically generated. A sequence reformatting utility allows various sequence formats to be read in. STATGEOM is written in ANSI C; source code with documentation and a Sun SPARC executable are available by anonymous ftp at cage.mpibpc.gwdg.de (or 134.76.209.64) in directory pub/kniesel. The method of statistical geometry was originally published in: Eigen, M., Winkler-Oswatitsch, R. and Dress, A. 1988. Statistical geometry in sequence space: a method of comparative sequence analysis. Proc. Natl. Acad. Sci. USA 85: 5913-5917.
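The quartet-based programs described above all start by examining every set of four taxa. The sketch below shows that step in Python, under a simplifying assumption: instead of the maximum likelihood evaluation that TREE-PUZZLE uses, each quartet is resolved with the four-point condition on a distance matrix. The function names and the distance format (a dict keyed by frozenset pairs) are hypothetical and are not the interface of any listed program.

```python
from itertools import combinations


def quartet_topology(d, w, x, y, z):
    """Return the best-supported split of four taxa and its support margin.

    d: dict mapping frozenset({p, q}) -> distance between taxa p and q
    """
    def dist(p, q):
        return d[frozenset((p, q))]

    # The three possible unrooted topologies for four taxa, each scored by
    # the sum of the distances within its two pairs.
    sums = {
        ((w, x), (y, z)): dist(w, x) + dist(y, z),
        ((w, y), (x, z)): dist(w, y) + dist(x, z),
        ((w, z), (x, y)): dist(w, z) + dist(x, y),
    }
    ranked = sorted(sums.items(), key=lambda item: item[1])
    best, best_sum = ranked[0]
    # Four-point condition: on tree-like distances the two larger sums are
    # equal, and their excess over the smallest sum is twice the length of
    # the interior branch. That excess serves here as a support margin.
    return best, ranked[1][1] - best_sum


def all_quartets(names, d):
    """Resolve every quartet of taxa; keys are 4-tuples of taxon names."""
    return {q: quartet_topology(d, *q) for q in combinations(names, 4)}
```

A margin near zero for many quartets suggests that the data are not very tree-like, which is the kind of signal that statistical geometry examines in more detail.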
Rainer Wetzel and Daniel Huson of the University of Bielefeld (huson@mathematik.uni-bielefeld.de) have developed a program, SplitsTree2. SplitsTree2 carries out the split decomposition method of H.-J. Bandelt and A. Dress (Molecular Phylogenetics and Evolution 1: 242-252 (1992)), Bandelt and Dress's p-splits, and spectral analysis (Hendy, Penny, Szekely, Steel & Erdos). It can process sequence or restriction site data, and can do bootstrapping. It also contains an implementation of Cooper, Penny and Steel's method for dating divergences and also their molecular clock test. The program is discussed in the paper: Huson, D. H. 1998. SplitsTree: analyzing and visualizing evolutionary data. Bioinformatics 14: 68-73. SplitsTree2 is currently available as a Mac program or in a Unix version for a number of different machines (Sun, SGI, DEC and HP). The Unix versions use the program ghostview to draw the computed graph. The Mac version draws the graph in its own window, and the picture can be copied and pasted or printed in the usual way. There is no Windows version at present. It is available by ftp from ftp.uni-bielefeld.de in directory pub/math/splits. SplitsTree2 is software under development. A server is also maintained which uses SplitsTree2 to analyze data submitted via its web page.
Igor Kuznetsov and Pavel Morozov (kuznets@mailhost.bionet.nsk.su) of the Institute of Cytology and Genetics, Novosibirsk, Russia, have produced GEOMETRY, a package for nucleotide sequence analysis using the method of statistical geometry in sequence space (M. Eigen, R. Winkler-Oswatitsch, and A. Dress. 1988. Statistical geometry in sequence space: A method of quantitative comparative sequence analysis, Proc. Natl. Acad. Sci. USA 85: 5913-5917). The program is described in an article by Kuznetsov and Morozov in 1996 in CABIOS 12: 297-301. The package uses the same data formats for sequence and tree input as the ones used in the VOSTORG package. GEOMETRY is available as a DOS executable. It is available for downloading by Web from http://molevol.bionet.nsc.ru/soft.htm or by ftp from ftp.bionet.nsk.su in directory incoming/molevol and also from the EMBL file server ftp.ebi.ac.uk in directory pub/software/dos.
Vincent Berry of the Université Jean Monnet in St.-Etienne, France (vberry@univ-st-etienne.fr) has released PhyloQuart version 1.3, a package of programs for inferring phylogenies from quartets. It is able to use either nucleotide sequences or distances. It implements the Q* method of tree reconstruction, which is inspired by the work of Bandelt and Dress, and is described in the forthcoming paper: Berry, V. and O. Gascuel. Inferring Evolutionary Trees with Strong Combinatorial Evidence. Theoretical Computer Science, to appear. PhyloQuart is available as C source code which can be compiled on Unix systems, from its web site at http://www.univ-st-etienne.fr/eurise/LOGICIELS/PHYLOQUART/main-pq.html. PhyloQuart is also available as a Web server from the server of the Institut Pasteur.
Stephen J. Willson (swillson@iastate.edu) of the Department of Mathematics, Iowa State University, has produced a package of programs to infer phylogenies from quartets of species.
They infer phylogenies of individual quartets by parsimony, and in combining them use either information on how strongly the phylogeny for that quartet is preferred over its alternatives, or measures of how well the group fits into a given placement on a tree, as judged by quartets. The methods are described in two papers: Willson, S. J. 1998. Measuring inconsistency in phylogenetic trees. Journal of Theoretical Biology 190: 15-36, and Willson, S. J. 1999. Building phylogenetic trees from quartets by using local inconsistency measures. Molecular Biology and Evolution 16: 685-693. The programs are in C and are described as having successfully been compiled on PowerMac systems using the Codewarrior C compiler. PowerMac executables are also provided. The programs are available at Willson's software web site at http://www.public.iastate.edu/~swillson/software.html.
James Lake of the Department of Molecular, Cell and Developmental Biology of the University of California, Los Angeles (lake@mbi.ucla.edu) has released Gambit, which implements a method called Bootstrapper's Gambit. The method involves bootstrap sampling sequences, computing trees for quartets of species, and assembling larger trees out of quartets that have significant bootstrap support. One of the methods available to estimate trees from the quartets is paralinear (LogDet) distances. Other distance methods and parsimony are also available. The program is available as a DOS executable, free to noncommercial users on a trial basis until January 15, 2001. Commercial users are asked to pay $50 on a shareware basis. The program is available at its web site at http://www.lifesci.ucla.edu/mcdbio/Faculty/Lake/Research/Programs/.
Arne Röhl, Peter Forster, and Hans-Jürgen Bandelt (Forster is at pf223@cus.cam.ac.uk) have written Network 2.0b, a program to infer networks (which have more connections than trees). The networks are median-joining networks, a method which is described in a paper: Bandelt, H-J., P.
Forster, and A. Röhl. 1999. Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution 16: 37-48. The program is available as shareware (free until 1 June 2000) as a DOS executable from Fluxus Engineering at its web site at http://www.fluxus-engineering.com/sharenet.htm.
George Estabrook of the Department of Biology of the University of Michigan, Ann Arbor, Michigan (gfred@umich.edu) distributes MEAWILK (MEAcham and WILKinson criteria), which uses a randomization test to evaluate support from character data for hypothesized monophyletic groups. It uses criteria published by Christopher Meacham (1994. Phylogenetic relationships at the basal radiation of angiosperms: Further study by probability of character compatibility. Systematic Botany 19: 506-522) and by Mark Wilkinson (1998. Split support and split conflict randomization tests in phylogenetic inference. Systematic Biology 47: 673-695). MEAWILK is a DOS executable distributed from Estabrook's programs web site at http://www-personal.umich.edu/~gfred/.
Pierre Rioux and Tim Littlejohn of the Informatics Division of the Organelle Genome Megasequencing Program at the Universite de Montreal have made available PARBOOT, a program that takes bootstrap sampled data sets and splits them up, submitting each to a different computer, so as to run bootstrapping quickly on networks of computers. It is available free as C source code by ftp from megasun.bch.umontreal.ca in directory pub/parboot. It requires a networked system of computers with PHYLIP, a Perl interpreter, and appropriate accounts and permissions.
Naoko Takezaki (ntakezak@lab.nig.ac.jp) of the Center for Information Biology of the National Institute of Genetics, Mishima, Japan, has written Lintre (Phylogenetic tests of the molecular clock and linearized tree), a package of programs for Sun workstations.
The programs include:
- njboot -- construct a neighbor-joining (NJ) tree
- postree -- create a postscript file of trees
- tpcv -- conduct the two-cluster test
- branch -- conduct the branch length test
- branbst -- conduct the branch length test by bootstrap
The two-cluster test is essentially the relative rate test for many sequences. The branch length test is the test of rate difference for each sequence under the tree root from the average rate of all sequences. The tests are described in: Takezaki, N., A. Rzhetsky, and M. Nei. 1995. Phylogenetic test of the molecular clock and linearized trees. Molecular Biology and Evolution 12: 823-33. The programs are available as C source code and also as DOS executables. They are distributed (as a compressed tar archive of the source code with examples and documentation, and also as a self-extracting archive of sources and DOS executable) from her web site at http://cib.nig.ac.jp/dda/ntakezak.html and by anonymous ftp from the IUBio server ftp.bio.indiana.edu in directory molbio/evolve.
Naoko Takezaki (ntakezak@lab.nig.ac.jp) of the Center for Information Biology, National Institute of Genetics, Mishima, Japan, has written njbafd, which constructs a neighbor-joining tree or a UPGMA tree from microsatellite data and other allele frequency data. Bootstrapping can be carried out. The program includes Goldstein et al.'s distance for microsatellite loci. There is a source code Unix version and an executable DOS version. They are available from her web site at http://cib.nig.ac.jp/dda/ntakezak.html. They are also available by ftp in directory molbio/evolve of the IUBIO archive.
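For readers unfamiliar with microsatellite distances, the following Python sketch computes one distance commonly attributed to Goldstein and co-workers, the squared difference in mean allele size, (delta mu)^2, averaged over loci. Which of Goldstein et al.'s distances njbafd actually uses, and how it computes it, is not stated here, so this is an assumption, as are the function names and the data layout (one dict of repeat count to allele frequency per locus).

```python
def mean_allele_size(locus):
    """Mean repeat count at one locus.

    locus: dict mapping allele size (repeat count) -> frequency
    """
    total = sum(locus.values())  # normalise in case frequencies do not sum to 1
    return sum(size * freq for size, freq in locus.items()) / total


def delta_mu_squared(pop_a, pop_b):
    """Squared difference in mean allele size, averaged over loci.

    pop_a, pop_b: lists of loci in the same order, one dict per locus
    """
    diffs = [
        (mean_allele_size(la) - mean_allele_size(lb)) ** 2
        for la, lb in zip(pop_a, pop_b)
    ]
    return sum(diffs) / len(diffs)
```

A matrix of such pairwise values between populations is the kind of input that a neighbor-joining or UPGMA step would then take.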
Joaquin Dopazo of the Bioinformatics department of GlaxoWellcome SA, Spain (jd19662@glaxowellcome.co.uk) has written WET (Windows Easy Tree), version 1.3, which is an easy-to-use program for inferring phylogenies from sequence data by distance matrix methods. The main goal in the development of WET was to make a truly user-friendly program able to interact with other phylogenetic packages. WET can import files of a number of different formats. It calculates distances by a number of different methods and constructs phylogenetic trees using neighbor-joining, UPGMA and WPGMA procedures. It is a Windows95/98/NT executable. It is available from its web site at http://www.cnb.uam.es/~dopazo/software/wet.html.
David Penny (Institute of Molecular Biosciences, Massey University, Palmerston North, New Zealand) has been offering for free distribution two DOS programs. One is a fast parsimony program, TurboTree. The other, Great Deluge, carries out an approximate search for the most parsimonious tree by a quasi-random method. He tells me that funding exigencies are such that he may soon have to start charging for these. His electronic mail address is dpenny@massey.ac.nz.
David Penny of the Institute of Molecular Biosciences, Massey University, Palmerston North, New Zealand (dpenny@massey.ac.nz), has made available through his Farside Institute three programs, Hadtree, Prepare, and Trees. These run on DOS systems, and compute bipartition spectra by Hadamard transformations (conjugations and the distance Hadamard), character weighting, distance transformations (including LogDet), base composition tests, resampling schemes, and tree selection. The programs are available from the Farside Institute downloads page at http://imbs.massey.ac.nz/Research/MolEvol/Farside/programs.htm.
David Swofford, of the Laboratory of Molecular Systematics of the Smithsonian Institution, Washington, D.C., has written Freqpars. It implements parsimony analysis based on gene frequencies.
The method was described by D. L. Swofford and S. H. Berlocher in a paper in Systematic Zoology 36: 293-325, 1987. The program is available in FORTRAN 77 source code. The search for most parsimonious trees under Swofford and Berlocher's criterion is not very extensive, Swofford notes, because the individual tree evaluations are computationally difficult. The source code in FORTRAN, with documentation, is available by anonymous ftp from onyx.si.edu in directory freqpars.
Jotun Hein (Institute of Genetics and Ecology, University of Aarhus, 8000 Aarhus C, Denmark) has produced TreeAlign, a multiple sequence alignment program that builds trees as it aligns DNA or protein sequences. It uses a combination of distance matrix and approximate parsimony methods. TreeAlign uses too much memory to run on DOS or Macintosh systems; it is really designed for a workstation or mainframe. It is available by anonymous ftp at the European Bioinformatics Institute molecular biology software distribution site ftp.ebi.ac.uk in directories pub/software/unix and pub/software/vms.
Another multisequence alignment program that estimates trees as it aligns multiple sequences is ClustalW. Currently it is in version 1.8. It is distributed as C source code and as executables for DOS, Macintosh, and some Unix systems. ClustalW was written by Des Higgins (now at University College, Cork, Ireland) (des@chah.ucc.ie), Julie Thompson (julie@IGBMC.u-strasbg.fr), Toby Gibson (Gibson@EMBL-Heidelberg.DE), and François Jeanmougin (pingouin@igbmc.u-strasbg.fr). It is a complete rewrite and upgrade of the Clustal and ClustalV packages, which were developed by Des Higgins. New features include the ability to detect and read different input formats (NBRF/PIR, Fasta, EMBL/Swissprot); align old alignments; produce phylogenetic trees after alignment (Neighbor Joining trees with a bootstrap option); write different alignment formats (Clustal, NBRF/PIR, GCG, PHYLIP); and a full command line interface.
It is described in the following papers:
- Thompson, J.D., D. G. Higgins and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22: 4673-4680.
- Higgins, D. G., J. D. Thompson, and T. J. Gibson. 1996. Using CLUSTAL for multiple sequence alignments. Methods in Enzymology 266: 383-402.
ClustalW is available in a number of forms and places:
- The program is available by anonymous ftp at the Indiana and EMBL molecular biology distribution sites: ftp.bio.indiana.edu and ftp.ebi.ac.uk. In the Indiana archive one must enter directory molbio/align, and in the EBI archive it is in directory pub/software in four directories unix/clustalw, vms/clustalw, mac/clustalw, and DOS/clustalw. These also contain the older ClustalV executables, as well as a version, ClustalX, that has a windowing interface. ClustalX is made available as executables for PowerMac, PC (32 bit), and UNIX (Linux, Alpha, SGI, Sun).
- Both ClustalW and ClustalX are also available by ftp at ftp-igbmc.u-strasbg.fr in directory pub and there is a description of ClustalX on its web page at http://www-igbmc.u-strasbg.fr/BioInfo/ClustalX/Top.html and in a paper: Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Research 25: 4876-4882.
- Silicon Graphics, Inc. (SGI) has available parallelized versions of ClustalX and ClustalW that are optimized to run on their R10000 parallel computers on IRIX 6.5. They are available for download at their web page at http://www.sgi.com/chembio/resources/clustalw/parallel_clustalw.html. E-mail contact at SGI for this version is Dmitri Mikhailov (dmitri@sgi.com).
- For ClustalV, there exists a Macintosh Hypercard stack, ClustToTree, that can convert its tree files to Newick Standard format (used by many other programs). ClustToTree is made available by Kai-Uwe Fröhlich in Tübingen, Germany at http://yeamob.pci.chemie.uni-tuebingen.de/Archiv/ClustToTree.html.
Ward Wheeler and David Gladstein (wheeler@amnh.org) have written MALIGN, version 2.7, a parsimony-based alignment program for molecular sequences. It implements the original suggestion by Sankoff, Morel, and Cedergren (1973) that alignment and phylogenies could be done at the same time by finding the tree that minimizes the total alignment score along the tree. Jotun Hein's program TreeAlign (mentioned above) is another, more approximate but possibly faster, attempt to implement the Sankoff-Morel-Cedergren suggestion. MALIGN is one of only two non-approximate implementations of the original method (Wheeler and Gladstein's other program POY is the other). MALIGN is described in a paper: Wheeler, W. 1996. Optimization alignment: the end of multiple sequence alignment in phylogenetics? Cladistics 12: 1-9. MALIGN is available by ftp from the American Museum of Natural History's anonymous ftp site, ftp.amnh.org, in directory pub/molecular. It is available as C source code and as binaries for DOS (with DOS extender), Sun, SGI, HPUX, and Linux. The C source code archive also contains a special makefile for a version for parallel computation.
Hitachi Software Engineering Co. Ltd. and Molecular Biology Insights, Inc. of Cascade, Colorado, sell DNASIS, a general-purpose DNA and protein sequence analysis system. It has many functions including primer design, plasmid maps, contig assembly, alignment, database searching, and many kinds of protein plots. For our purposes what is relevant is the ability to do multiple sequence alignment by the Higgins-Sharp method of progressive sequence alignment (the one used in ClustalV), with one of the results being a UPGMA tree based on pairwise sequence alignment scores. DNASIS is available as Macintosh executables (MacDNASIS version 3.7) and as DNASIS for Windows version 2.6. A description and demo versions are available at its Hitachi web site at http://www.hitachi-soft.com/gs/dnasis/index.htm and at its MBI web site at http://oligo.net/dnasis.htm. Both versions cost $1,895 or $3,000 for a 1-10 user network license.
Karl Nicholas (ketchup@cris.com) and Hugh Nicholas (nicholas@psc.edu) of Pittsburgh Supercomputing Center have written GeneDoc, version 2.5, a program for the shading and editing of multiple sequence alignments. It reads .MSF files and Fasta files. The alignment can be edited by changing the position of residues in the sequences. GeneDoc includes scoring functions to assist in determining whether your alignment changes are improving the score. Support for obtaining a score via sum-of-pairs or by a phylogenetic tree is included. Phylogenetic trees can be built with the GUI interface or imported as Nexus or Phylip format tree descriptions. The program runs on Windows 3.1, Windows95, and Windows/NT, as both 16-bit and 32-bit executables are distributed. It can be downloaded from its Web site at http://www.cris.com/~ketchup/genedoc.shtml. A Windows NT version for Digital Alpha processors is available from Russell Malmberg at the Botany Department of the University of Georgia (russell@dogwood.botany.uga.edu) by World Wide Web at http://dogwood.botany.uga.edu/malmberg/software.html.
Feng Liu and Tao Jiang (jiang@church.dcss.McMaster.CA), of the Department of Computing and Software at McMaster University, Hamilton, Ontario, have written TAAR (Tree Alignment And Reconstruction), version 1.0, which constructs multiple sequence alignments and phylogenies based on the idea of tree alignment. It is a graphical environment capable of "approximately optimal" parsimony-based tree alignment. It can also infer trees by parsimony. It can handle DNA or protein data. It is available as C source code for Unix with X Windows (X11R5) and MOTIF 1.2. It is also available in a version for Linux with Lesstif. It is distributed through its home page at http://www.dcss.mcmaster.ca/~fliu/taar_download.html.
David States (states@ibc.wustl.edu) of the Institute for Biomedical Computing, Washington University, St.
Louis, Missouri, has released Ctree version 1.0, a tree alignment program that uses a Hidden Markov Model method of representing the ambiguities in alignments of groups of sequences. The Ctree program is based on a neighbor joining algorithm in which sequences and groups of sequences are represented by Hidden Markov Models. HMMs are aligned using a Smith/Waterman dynamic programming algorithm to find the best local alignment. At each step in building the MSA alignment tree, the highest scoring pair of HMMs is merged into a new HMM. In this sense it is similar to the progressive alignment algorithm used in ClustalW, but the use of an HMM to represent clusters retains more information about the ambiguities than the Clustal algorithm does. It is possible to write the tree to a dendrogram output file. Ctree is available by anonymous ftp from www.ibc.wustl.edu in directory pub/ctree. It is available as C source code and also as executables for Solaris, SGI, Linux, and Windows95/NT. Ctree is also usable as a server but that version seems not to give trees as output.
Ward Wheeler and David Gladstein (wheeler@amnh.org) of the American Museum of Natural History, New York, have written POY, a program that implements David Sankoff's method of searching for the tree that minimizes a parsimony criterion that includes penalties for gaps, thereby searching for phylogenies and alignments at the same time. POY has algorithmic improvements by Wheeler and Gladstein that speed up the algorithm. (Their program MALIGN is the only other program carrying out the full Sankoff proposal). The method is described in a paper: Wheeler, W. 1996. Optimization alignment: the end of multiple sequence alignment in phylogenetics? Cladistics 12: 1-9. POY is available in C source or in executables for Linux, HPUX, SGI, Sun, and DOS. It is distributed by ftp from ftp.amnh.org in directory pub/people/wheeler/poy.
Russell Doolittle (rdoolittle@ucsd.edu) and Dafei Feng, of the Department of Chemistry and Biochemistry of the University of California at San Diego, released ALIGN in 1990. A version for Macintoshes was coded by Peter Markeiwicz. ALIGN implements the "progressive alignment" strategy described in their paper: Feng, D.-F. and R. F. Doolittle. 1987. Progressive sequence alignment as a prerequisite to correct phylogenetic trees. Journal of Molecular Evolution 25: 351-360. This is also the basis for the Clustal family of programs as well as the Pileup program in the GCG package. The ALIGN program can align sequences as well as print out a tree (which does not have branch lengths). It uses Doolittle's own formats, and so three other programs are included with ALIGN to convert formats. The programs are distributed by ftp from the EBI ftp software server at ftp.ebi.ac.uk in directory pub/software/mac as file align.hqx.
Pietro Lio, of the Department of Genetics, University of Cambridge (P.Lio@gen.cam.ac.uk), has written PASSML and PASSML_TM, which use likelihood methods with Hidden Markov models to infer phylogeny and also secondary structure from protein data. PASSML is for general proteins and PASSML_TM is for membrane proteins. The methods used are described in the paper: Goldman, N., J. L. Thorne, and D. T. Jones. 1998. Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics 149: 445-458. PASSML is described in the paper: Lio, P., N. Goldman, J. L. Thorne and D. T. Jones. 1998. PASSML: combining evolutionary inference and protein secondary structure prediction. Bioinformatics 14: 726-733, and PASSML_TM is described in the paper: Lio, P. and N. Goldman. 1999. Using protein structural information in evolutionary inference: transmembrane proteins. Molecular Biology and Evolution 16: 1696-1710. The programs are available as ANSI C for Unix workstations; PASSML has also been successfully compiled under a port of gcc to a PC. The source code is available via the web page at http://ng-dec1.gen.cam.ac.uk/hmm/Passml.html.
Rod Page (dpage@udcf.gla.ac.uk) of the Division of Environmental and Evolutionary Biology of the University of Glasgow has written COMPONENT version 2.0, a program for Windows systems for comparing cladograms for use in phylogeny and biogeography studies. It has many tree comparison and consensus methods, and far more features for biogeographic studies (such as comparing species and area cladograms) than any other package. It also can generate random trees. It runs under Windows 3.0 or higher. Its cost is 40 pounds U.K., and it can be ordered from Anna Hutson at the Department of Botany, Natural History Museum, Cromwell Road, London,
SW7 5BD, U.K. (annah@nhm.ac.uk) (fax (0171) 938 9260). Details on how to order may be found at the order form web page at http://taxonomy.zoology.gla.ac.uk/rod/order.html. There is a review of the program in Cladistics 9: 351-353 (1993). COMPONENT has a web site at http://taxonomy.zoology.gla.ac.uk/rod/cpw.html. The documentation is available in Adobe Acrobat at http://taxonomy.zoology.gla.ac.uk/rod/cplite/Manual.html. A very early development Macintosh version ("COMPONENT Lite") is available free from the COMPONENT Lite web site at http://taxonomy.zoology.gla.ac.uk/rod/cplite/guide.html (though Rod says "be prepared for bugs").
Rod Page (dpage@udcf.gla.ac.uk), of the Division of Environmental and Evolutionary Biology of the University of Glasgow, has written TREEMAP, a free, experimental program for comparing host and parasite phylogenies. It allows you to interactively compare host and parasite trees, build reconstructions of the history of the association, and perform some simple randomisation tests of hypotheses of cospeciation. The program is available as an executable for Macintoshes or an executable for Windows PCs (the two versions are essentially identical). They can be downloaded from its WWW site: http://taxonomy.zoology.gla.ac.uk/rod/treemap.html. The site also has an online manual, or you can download the documentation as a Postscript file. For a description of the method used by TreeMap, see Page, R.D.M. 1994. Parallel Phylogenies: Reconstructing the history of host-parasite assemblages. Cladistics 10: 155-173.
Tatsuya Ota (imeg@psuvm.psu.edu) of the Institute of Molecular Evolutionary Genetics, Pennsylvania State University, has written a package, DISPAN (Genetic Distance and Phylogenetic Analysis), which computes for gene frequency data the heterozygosity, gene diversity, Nei's standard genetic distance or the DA distance, and their standard errors. It also constructs phylogenies using the neighbor-joining (NJ) method or the UPGMA method. These trees can also be bootstrapped. A tree editor allows the user to rearrange the tree and print it out. The package consists of two programs, GNKDST and TREEVIEW. The first is a rewrite of a program by A. K. Roychoudhury, Y. Tateno, D. Graur, N. Saitou, and R. Schwartz; the second was written by Koichiro Tamura. DISPAN is distributed as DOS executables and is available by ftp from ftp.bio.indiana.edu in directory molbio/ibmpc as files dispan.*.
Sudhir Kumar (IMEG@PSUVM.PSU.EDU), of the Institute of Molecular Evolutionary Genetics, Pennsylvania State University, has written PHYLTEST, version 2.0. It is a DOS executable program for testing phylogenetic hypotheses about four clusters of DNA sequences. It implements the four-cluster analysis, a comparison of three alternative phylogenetic trees for four monophyletic clusters of sequences: Rzhetsky, A., S. Kumar, and M. Nei. 1995. Four-cluster analysis: a simple method to test phylogenetic hypotheses. Molecular Biology and Evolution 12: 163-167. It can also carry out the interior branch test of whether an interior branch length is significantly longer than zero (Rzhetsky, A. and M. Nei. 1992. A simple method for estimating and testing minimum-evolution trees. Molecular Biology and Evolution 9: 945-967), as well as the estimation of average pairwise distances (and standard errors) within and between clusters of sequences, relative rate tests, and the computation of the time of divergence.
PHYLTEST is distributed by anonymous ftp from ftp.bio.indiana.edu in directory molbio/ibmpc. It is distributed as a self-extracting archive, containing the executables and examples. The program can be run under DOS or in the DOS box of Windows 3.1, Windows95, or Windows NT.
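Since several packages in this part of the list (DISPAN among them) report Nei's standard genetic distance, here is a minimal sketch of that calculation in Python. It is only an illustration of the textbook formula D = -ln(Jxy / sqrt(Jx Jy)), not code taken from DISPAN or any other listed package; the function name and the input layout (a list of loci, each a list of allele frequencies) are assumptions made for the example, and no sampling-bias correction or standard error is attempted.

```python
import math

def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance D = -ln(I) between two populations.

    freqs_x and freqs_y are lists of loci; each locus is a list of allele
    frequencies, with alleles in the same order in both populations.
    (Hypothetical input layout, chosen only for this sketch.)
    """
    jx = jy = jxy = 0.0
    for px, py in zip(freqs_x, freqs_y):
        jx += sum(p * p for p in px)                  # homozygosity in population X
        jy += sum(q * q for q in py)                  # homozygosity in population Y
        jxy += sum(p * q for p, q in zip(px, py))     # cross-population identity
    identity = jxy / math.sqrt(jx * jy)               # Nei's normalized identity I
    return -math.log(identity)

# Two loci: the first is shared, the second is fixed for different alleles.
pop_a = [[0.5, 0.5], [1.0, 0.0]]
pop_b = [[0.5, 0.5], [0.0, 1.0]]
print(nei_standard_distance(pop_a, pop_b))
```

Because the same number of loci enters all three sums, summing over loci gives the same ratio as averaging over them.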
TREECON version 1.3b is a software package developed by Yves Van de Peer (yvdp@uia.ua.ac.be) for the construction and drawing of phylogenetic trees based on distance data. Several equations are included to convert dissimilarity into evolutionary distance and several methods (such as neighbor-joining) are included for inferring the tree topology. It also includes bootstrap analysis. The DOS version of the program is available for free and runs on 80386 (and higher) computers. It was described in CABIOS 9: 177-182 (1993). It is available from its web site at http://alt-www.uia.ac.be/u/yvdp/treecond.html. A 32-bit Windows version, which will work on Windows95, Windows98, and Windows NT, and on Windows 3.1 systems that have the Win32 32-bit extensions loaded, is also available. It also has good facilities for drawing trees. It was announced in the paper: Van de Peer, Y. and R. De Wachter. 1994. TREECON for Windows: a software package for the construction and drawing of evolutionary trees for the Microsoft Windows environment. Computer Applications in the Biosciences (CABIOS) 10: 569-570. A fee of 65 Euros is asked for it. A demonstration version of the Windows version and more information about TREECON can be found at http://www.uia.ac.be/u/yvdp/treeconw.html or you can contact the author at the Department of Biochemistry, University of Antwerp (UIA), Universiteitsplein 1, B-2610 Antwerpen, Belgium.
Andrey Rzhetsky and Masatoshi Nei of the Institute of Molecular and Evolutionary Genetics at Pennsylvania State University (Rzhetsky is now at the Department of Medical Informatics, Columbia University, New York
City) have produced METREE version 1.2, a program for carrying out the minimum-evolution distance matrix method. METREE is written in Turbo C 2.0 and runs on IBM-compatible personal computers (PC/AT compatible and above) that have a math coprocessor. It computes minimum evolution distance matrix trees from DNA and amino acid sequence data and tests the statistical significance of topological differences and of the branch lengths. Different distance matrix measures may be used. The package is menu driven, and the TREEVIEW program written by Koichiro Tamura for visualizing and printing out the final tree is also included. The method is described in the paper by A. Rzhetsky and M. Nei. 1992. A simple method for estimating and testing minimum-evolution trees. Molecular Biology and Evolution 9: 945-967, and the program is described in a paper by A. Rzhetsky and M. Nei. 1994. METREE: a program package for inferring and testing minimum-evolution trees. Computer Applications in the Biosciences 10: 409-412. METREE is described in the Web page at http://www.bmb.psu.edu/597a/students/btn1/METREE.HTM and is distributed by anonymous ftp at ftp.bio.indiana.edu in directory molbio/ibmpc. Rzhetsky's e-mail address is ar345@columbia.edu.
Igor Belyi (Igor_Belyi@transarc.com) has developed TreePack, a minimum evolution program for Unix workstations. TreePack can be obtained by ftp from ftp.cse.psu.edu in directory pub/belyi. It is available as Unix source code in C.
Naruya Saitou of the Laboratory for Evolutionary Genetics, National Institute of Genetics, Japan (nsaitou@genes.njg.ac.jp) has produced TreeTree, a set of programs for neighbor-joining distance matrix analysis with bootstrapping. Macintosh executables are provided, and documentation and Pascal or C source code are included in the package.
The package consists of three main programs: NJ, a standard neighbor-joining program; NJorg, which makes an unrooted neighbor-joining tree; and bootNJ, which bootstraps the analysis, given a data file with multiple distance matrices, one for each bootstrap replicate. It can be downloaded by anonymous ftp from ftp.nig.ac.jp in directory pub/mac/TreeTree.
Olivier Gascuel (gascuel@lirmm.fr) of the Laboratoire d'Informatique, de Robotique et de Micro-Electronique de Montpellier (LIRMM) of the Universite de Montpellier II, France has written BIONJ, an improved version of Neighbor-Joining based on a simple model of sequence data. It follows the same agglomerative scheme as NJ but uses a simple, first-order model of the variances and covariances of evolutionary distance estimates. This model is appropriate when these estimates are obtained from aligned sequences. It retains the speed advantages of Neighbor-Joining while using a slightly different criterion to select pairs of taxa to join, one which will perform better when distances between taxa are large. It is described in the paper: Gascuel, O. 1997. BIONJ: An Improved Version of the NJ Algorithm Based on a Simple Model of Sequence Data. Molecular Biology and Evolution 14: 685-695. C source code and Sun, DOS, and Macintosh executables of BIONJ are available at its web page at http://www.crt.umontreal.ca/~olivierg/bionj.html.
Olivier Gascuel and Denise Levy (gascuel@lirmm.fr) at the Laboratoire d'Informatique, de Robotique et de Micro-Electronique de Montpellier (LIRMM) of the Universite de Montpellier II, France have produced QR2 version 1.0, a program which approximates a dissimilarity (or distance) matrix by a tree. The method is described in a paper: Gascuel, O. and D. Levy. 1996. A reduction algorithm for approximating a (nonmetric) dissimilarity by a tree distance. Journal of Classification 13: 129-155. The program is available in C++ source code by ftp from lirmm.lirmm.fr in directory pub/genome/phylo.
It is also available as a server from the Institut Pasteur.
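Many of the programs in this section are built around the neighbor-joining agglomerative scheme, so a small sketch of its first step may help readers who have not seen it written out. The Python below applies the ordinary Saitou and Nei criterion, Q(i,j) = (n-2) d(i,j) - R(i) - R(j), where R is a row sum of the distance matrix; it is not BIONJ's variance-based variant and not code from any listed program. The function name and return layout are made up for this example.

```python
def nj_pick_pair(d):
    """First step of ordinary neighbor-joining on a square distance matrix.

    Returns (i, j, length_i, length_j): the pair of taxa to join and the
    branch lengths from each of them to the new internal node.
    Assumes a symmetric matrix with at least three taxa.
    """
    n = len(d)
    row_sums = [sum(row) for row in d]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Standard NJ criterion; the pair with the smallest Q is joined.
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    length_i = 0.5 * d[i][j] + (row_sums[i] - row_sums[j]) / (2.0 * (n - 2))
    length_j = d[i][j] - length_i
    return i, j, length_i, length_j

# A small five-taxon example matrix.
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_pick_pair(example))
```

A full implementation would then replace the joined pair by the new node, recompute distances to it, and repeat until the tree is resolved.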
William C. Black of the Department of Microbiology, Colorado State University (wcb4@lamar.colostate.edu) makes available BIOSYS-2. This is a modified version of David Swofford and Richard B. Selander's 1981 program BIOSYS-1, adding some features. Swofford's program was originally distributed by the Illinois Natural History Survey but has not been in distribution for some years; this is the only version in distribution. Although in many respects it has been superseded by other population genetics packages (such as the ones that follow this listing), it computes gene frequencies, linkage disequilibria, and many other population genetics analyses on electrophoretic genotypes. For our purposes, it is most relevant to note that it can compute the genetic distances of Nei (unbiased distance), Rogers, and Cavalli-Sforza. It can also carry out UPGMA, WPGMA, Complete Linkage, and Single Linkage clustering of populations (inferring clocklike phylogenies), or infer nonclocklike phylogenies by Farris's Distance Wagner method. Black has added the capabilities of making bootstrap replicates of the genetic distances and writing them out in PHYLIP format. BIOSYS-2 is available as a DOS executable with documentation and FORTRAN source code. It is distributed by ftp from lamar.colostate.edu in directory pub/wcb4.
William J. Bruno of the Los Alamos National Laboratory (billb@lanl.gov) has released nneighbor, a modification of the PHYLIP Neighbor-Joining distance matrix program that avoids negative branch lengths (its name means Non-Negative Neighbor). The program is available as generic C code. It is available at one of Bruno's web pages at http://www.t10.lanl.gov/billb/related_links.html.
William J. Bruno, Nicholas D. Socci, and Aaron L. Halpern of the Los Alamos National Laboratory (billb@lanl.gov) have produced weighbor (WEIGHted neighBOR joining), version 1.0.1, a distance matrix program for performing a weighted version of the Neighbor-Joining method.
The weighting used is for nucleotide sequences and more correctly reflects the uncertainty of the longer distances in the tree than does ordinary Neighbor-Joining. It is thus closer to approximating maximum likelihood and will be more accurate than Neighbor-Joining on large trees. It is described in a paper: Bruno, W. J., N. D. Socci, and A. L. Halpern. 2000. Weighted neighbor joining: a likelihood-based approach to distance-based phylogeny reconstruction. Molecular Biology and Evolution 17: 189-197. Weighbor is available as C source code and as PowerMac and DOS executables from its web site at http://www.t10.lanl.gov/billb/weighbor/index.html.
Paul Lewis and Dmitri Zaykin, then of North Carolina State University (but Lewis is now at the University of Connecticut) (plewis@uconnvm.uconn.edu), have written GDA version 1(d13), a set of programs to carry out many of the statistical methods for analyzing gene frequencies and sequence data that are described in Bruce Weir's book Genetic Data Analysis II (Sinauer Associates, Sunderland, Massachusetts, 1996). The programs run under Windows and include the calculation of UPGMA and Neighbor-Joining phylogenies. The program is described in a Web site maintained by Paul Lewis at http://alleyn.eeb.uconn.edu/gda/. There are two versions of GDA: one is a 16-bit version suitable for all Windows platforms (Windows 3.x, Windows 95, Windows NT); the second is a 32-bit version suitable only for Windows 95, Windows 98 or Windows NT.
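UPGMA turns up in many of these entries (GDA, BIOSYS-2, DISPAN and others), so here is a compact sketch of the clustering rule in Python. It is a plain illustration under my own assumptions, not the code of any listed package: the function name is invented, trees are returned as nested tuples rather than in any standard tree format, and ties are broken by whichever pair happens to come first.

```python
def upgma(names, d):
    """Cluster taxa by UPGMA; return (tree, root_height).

    names: list of taxon labels; d: symmetric distance matrix (list of lists).
    The tree is a nested tuple of labels (a format chosen for this sketch).
    """
    n = len(names)
    # Each active cluster: id -> (subtree, number of tips, node height)
    clusters = {i: (names[i], 1, 0.0) for i in range(n)}
    dist = {(i, j): float(d[i][j]) for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Join the closest pair of clusters.
        (a, b), dab = min(dist.items(), key=lambda kv: kv[1])
        tree_a, size_a, _ = clusters.pop(a)
        tree_b, size_b, _ = clusters.pop(b)
        new_dist = {}
        for c in clusters:
            dac = dist[(min(a, c), max(a, c))]
            dbc = dist[(min(b, c), max(b, c))]
            # Size-weighted average, so every tip counts equally (the "U" in UPGMA).
            new_dist[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b, dab / 2.0)
        next_id += 1
    tree, _, height = next(iter(clusters.values()))
    return tree, height

tree, height = upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])
print(tree, height)
```

Replacing the size-weighted average with a simple mean of the two distances would give WPGMA instead.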
Mark Miller (mpm2@nauvax.ucc.nau.edu) of the Department of Biological Sciences, Northern Arizona University, Flagstaff, Arizona, has written TFPGA (Tools For Population Genetics Analysis), a Windows program for the analysis of allozyme and molecular population genetic data. It can calculate genetic
distances. In addition, this program calculates descriptive statistics and F-statistics, and performs tests for Hardy-Weinberg equilibrium, exact tests for genetic differentiation, Mantel tests, and UPGMA cluster analyses. Additional features include the ability to analyze hierarchical data sets as well as data from either codominant markers such as allozymes or dominant markers such as AFLPs or RAPDs. It is available from his web page at http://herb.bio.nau.edu/~miller/ as a Windows executable.
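For readers unfamiliar with tests for Hardy-Weinberg equilibrium, the simplest version for one locus with two alleles can be sketched as below. This is the ordinary chi-square goodness-of-fit test, chosen here only because it is short; I do not know which test TFPGA itself uses, and the function name and argument order are assumptions of the example.

```python
def hardy_weinberg_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg proportions at a biallelic locus.

    Arguments are observed counts of the three genotypes AA, AB and BB.
    Returns (frequency of allele A, chi-square statistic with 1 d.f.).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)      # estimated frequency of allele A
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (a fixed allele) to avoid dividing by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2

print(hardy_weinberg_chi_square(30, 40, 30))
```

With small samples or rare alleles an exact test is usually preferred to this approximation.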
François Bonhomme (genetix@crit.univ-montp2.fr) has released Genetix version 4.0. This is a Windows95/98/NT executable program that does a wide variety of population genetic procedures. The part relevant to the present list is that it computes the Nei and the Cavalli-Sforza genetic distances, both with and without bias correction. It also calculates F statistics and linkage disequilibrium, and performs permutation tests on the results. One limitation (or advantage, depending on your perspective) is that the interface is in French. Genetix is available from its web site at http://www.univ-montp2.fr/~genetix/genetix.htm.
William C. Black of the Department of Microbiology, Colorado State University (wcb4@lamar.colostate.edu) has produced PROGRAMS FOR ANALYSIS OF RAPD-PCR DATA. There are 7 programs. The ones relevant to this listing are
• RAPDPLOT, which computes distances between RAPD patterns using either Nei and Li's distance or a percent difference score,
• RAPDBOOT, which makes these distance matrices while bootstrap sampling the data set, and
• RAPDBIOS, which converts the RAPD data into a gene frequency dataset for analysis by BIOSYS-2. (Black's site is also the distribution site for BIOSYS-2.)
Other programs compute FST and linkage disequilibria from RAPD data. The programs are in FORTRAN source code with DOS executables. They are distributed by ftp from lamar.colostate.edu in directory pub/wcb4 as file RAPDS.ZIP.
María Jesús Martín and Joaquín Dopazo at the R&D Department of TDI (TDI-EMBNet), Spain (martin@tdi.es or dopazo@tdi.es) have developed OSA (Optimal Sequence Analysis), version 2.0. It finds, within large sequences, those regions with an information content similar to that of the whole sequence, and it selects, among them, the shortest ones. This program was formerly called ORF.
The algorithm used is based on comparing pairwise genetic distances, calculated for windows of variable size and position, to the distance matrix obtained for the whole sequence. Either uncorrected genetic distances or Jukes-Cantor distances can be used. Two methods are used to set cutoff levels: simulation-based significance values or bootstrapping. A variety of options for search among possible windows are available. The method has been described in a paper: M. J. Martín, F. Gonzalez-Candelas, F. Sobrino and J. Dopazo. 1995. A method for determining the position and size of optimal sequence regions for phylogenetic analysis. Journal of Molecular Evolution 41: 1128-1138. OSA uses aligned sequences in a number of common formats as input. It runs on UNIX-based machines. Currently GNU Pascal source code and executable versions for the Solaris and IRIX operating systems are available. The program can analyze up to 50 sequences of a maximum length of 10,000 bp. It can be obtained at its website http://www.tdi.es/programas/osa-i.htm. It can also be obtained by ftp from ftp.ebi.ac.uk in directory pub/software/unix/osa, and from ftp.no.embnet.org in directory pub/programs/dist/OSA. In all of these places the file names are osa-solaris.2.4.tar.Z for the Solaris version, and osa-irix.5.3.tar.Z for the IRIX version.
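The two distance options mentioned for OSA, uncorrected and Jukes-Cantor, are easy to state in code. The sketch below is a generic Python illustration of those two standard formulas and is not taken from OSA; the function names are mine, and the decision to skip any site where either sequence has something other than A, C, G or T is an assumption of the example.

```python
import math

def p_distance(seq_a, seq_b):
    """Uncorrected distance: the fraction of compared sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Only sites where both sequences have an unambiguous base are compared.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def jukes_cantor(p):
    """Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3) for a proportion p of differing sites."""
    if p >= 0.75:
        raise ValueError("distance is undefined when p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

p = p_distance("ACGTACGT", "ACGTACGA")
print(p, jukes_cantor(p))
```

A windowed analysis of the kind OSA performs would apply these functions to slices of the alignment and compare the resulting matrices to the one for the full alignment.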
Joyce Miller (jmiller@genome.wi.mit.edu) has written RESTSITE, version 1.2, a package of DOS programs for computing distances between species based on restriction sites or restriction fragments. The programs also include NJTREE and UPGMA, which can infer phylogenies by the Neighbor-Joining and UPGMA distance matrix methods. The programs are written in Microsoft C; source code is available too. The programs, documentation, and source code are distributed by its Web site, http://www-genome.wi.mit.edu/~jmiller/restsite.htm. The programs were described in a paper: Miller, J. C. 1991. RESTSITE: A phylogenetic program that sorts raw restriction data. Journal of Heredity 82: 262-263.
Doug McElroy (Doug.McElroy@wku.edu) of Western Kentucky University distributes REAP, the Restriction Enzyme Analysis Package, written by him, Paul Moran, Eldredge Bermingham, and Irv Kornfield. REAP can calculate distances from restriction sites, restriction fragments data, and from nucleotide sequences (the Kimura 2-parameter distance). REAP is a package of DOS executables available from McElroy's web site at http://bioweb.wku.edu/faculty/mcelroy/. It is described in the paper: McElroy, D., P. Moran, E. Bermingham, and I. Kornfield. 1992. REAP: An integrated environment for the manipulation and phylogenetic analysis of restriction data. Journal of Heredity 83: 157-158.
Ken Rice (krice@saul.cis.upenn.edu) of SmithKline Beecham, Upper Merion, Pennsylvania (formerly of the University of Pennsylvania) has produced AMP (Accepted Mutation Parsimony), a program which calculates stepmatrices for protein parsimony analysis, for use in PAUP* and MacClade. It uses transition probabilities under models of protein evolution to calculate these stepmatrices. It is available as C source code for Unix from its web site at http://phylofarm.bio.upenn.edu/phylofarm/amp.html. It is also available by anonymous ftp from phylogeny.harvard.edu in directory pub/rice/amp.
Genetics Computer Group, Inc.
("GCG") of Madison, Wisconsin, produces the Wisconsin Sequence Analysis Package, version 10.0 (usually called "the GCG Package"), a leading package of sequence search and analysis programs, together with updates of the leading sequence databases. Included are programs for tree-based multiple sequence alignment, calculation of distances, and estimating phylogenies by the neighbor-joining and UPGMA distance matrix methods:
• PileUp creates a multiple sequence alignment of up to 500 sequences using the method of Feng and Doolittle, similar to the ClustalW method of Higgins and Sharp. However, PileUp uses UPGMA clustering instead of Neighbor-Joining clustering, and does not allow as much flexibility in substitution matrices as ClustalW. A dendrogram illustrating sequence similarity is also created.
• Distances writes a matrix of the pairwise evolutionary distances between aligned sequences. To correct for multiple substitutions several methods may be chosen: for nucleic acid sequences, Kimura's two-parameter method, the Tajima and Nei method, and the Jin and Nei method; for protein sequences, the Kimura method; and for either type of sequence, the Jukes-Cantor method.
• GrowTree creates a phylogenetic tree using the neighbor-joining method or UPGMA.
Future versions of the GCG package will contain an updated version of the GDE front end, and also will be able to act as a front end to PAUP*. (PAUP* will be included in the package in the near future.) The GCG package runs on Digital Alpha AXP workstations running OpenVMS or Digital Unix, Silicon Graphics workstations, Sun workstations, and Digital VAXes running VMS. GCG's address is Customer Relations, Genetics Computer Group, Inc., 575 Science Drive, Madison, Wisconsin 53711. Their e-mail address is info@gcg.com. Their
telephone and fax numbers are (608) 231-5200 and (608) 231-5202. The programs are described on a company Web page at http://www.gcg.com/products/software.html. Prices are no longer given on the web site (you are asked to contact them). The most recently posted prices were $5,000 (plus $3,000 per year thereafter) for an academic installation ($18,000 and $6,000 for a nonacademic installation). These have probably changed since then.
Leader Vision Software House (P.O. Box 9150, Stanford, California 94309-9150, USA) distributes DENDRON, a computer-assisted system for analyzing DNA fingerprinting gels. It reads and compares gel images. One feature is an average-linkage clustering algorithm that can produce trees from the gel images. For information and pricing, contact Leader Vision. Their web page is at http://www.leadervision.com/software.htm.
BioRad, division of Sadtler USA, Inc. (Sadtler_USA_Sales@bio-rad.com) distributes Molecular Analyst Fingerprinting, a package for quantitative RFLP and fingerprinting analysis. The package is available for PC and Macintosh, and includes average linkage clustering of the gel patterns. The web page for this software is at http://www.bio-rad.com/1013018.html. Their address is Bio-Rad Life Sciences Research Group, 2000 Alfred Nobel Drive, Hercules, California 94547, and their phone number is 1-800-977-8437. Other contact information is available on their web page.
James McInerney (J.McInerney@nhm.ac.uk), of The Natural History Museum, London, has written GCUA (General Codon Usage Analysis). It computes codon usage and amino acid usage statistics, and also performs correspondence analysis/principal components analysis on both codon usage and amino acid usage statistics. Its relevance to the present list is that it also produces a distance matrix, based on Relative Synonymous Codon Usage (RSCU) statistics, whose format is compatible with PHYLIP and PAUP* 4.0.
Although McInerney cautions that this matrix should not be used for phylogenetic inference, I wonder whether this distance does not have some phylogenetic information. It is available as SunOS, PowerMac, and Linux binaries at the moment, and will be followed soon by DOS, Windows, SGI, DEC and other binaries. The code isn't available yet, "because it is so embarassingly poor". It can be retrieved via anonymous ftp from ftp.nhm.ac.uk in directory pub/gcua.
Mathieu Blanchette and David Sankoff (blanchet@hans.crm.umontreal.ca) of the Centre de Recherches Mathématiques of the Université de Montréal, Quebec, Canada, have produced DERANGE2, a program to reconstruct the history of two gene maps using weighted inversions, transpositions and inverted transpositions. It can thus construct a set of distances based on the gene orders (not the sequences of the genes themselves). It is available as standard C source code and can readily be compiled on Unix systems. It is available by anonymous ftp from ftp.ebi.ac.uk in directory pub/software/unix.
Laurent Excoffier of the Laboratoire de Génétique et Biométrie of the Department of Anthropology and Ecology of the University of Geneva, Switzerland (Laurent.Excoffier@anthro.unige.ch) has produced minspnet, a program that produces a minimum spanning tree and network from a distance matrix. It is available as a Windows executable. It can be obtained from a web page which is a directory listing at http://anthropologie.unige.ch/LGB/software/win/min-span-net/ which lists the program file and a README file.
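To show what computing a minimum spanning tree from a distance matrix involves, here is a short sketch using Prim's algorithm in Python. I have not seen how minspnet does it, so this is only one standard way to get such a tree; the function name and the edge format are invented for the example, and it returns a single tree rather than the network of equally short alternative connections that minspnet is said to produce.

```python
def minimum_spanning_tree(d):
    """Prim's algorithm on a symmetric distance matrix (list of lists).

    Returns a list of edges (i, j, distance) connecting all items with the
    smallest possible total distance.
    """
    n = len(d)
    in_tree = {0}                 # start from the first item
    edges = []
    while len(in_tree) < n:
        # Cheapest link from any item already in the tree to any item outside it.
        w, i, j = min((d[i][j], i, j)
                      for i in in_tree
                      for j in range(n) if j not in in_tree)
        edges.append((i, j, w))
        in_tree.add(j)
    return edges

matrix = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 6],
    [3, 5, 6, 0],
]
print(minimum_spanning_tree(matrix))
```

This simple version rescans all candidate links at every step, which is adequate for the few dozen populations or haplotypes typical of such analyses.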
Francis Yeh (francis.yeh@ualberta.ca) of the Department of Renewable Resources at the University of Alberta, Canada, has released POPGENE version 1.32, a free program for the analysis of genetic variation among and within populations using co-dominant and dominant markers. The feature that is relevant to the present list is that it can compute a number of genetic distances for gene frequencies. It is distributed as a Windows executable (which can run on Windows95, Windows 3.1 or Windows NT) from its Home Page at http://www.ualberta.ca/~fyeh/index.htm.
Warren Kovach of Kovach Computing Services, Anglesey, Wales (info@kovcomp.co.uk) has produced MVSP, a comprehensive multivariate statistical package for the PC platform. It can do many kinds of analyses (principal components, clustering, etc.) but the features relevant to this listing are clustering with a variety of methods and a variety of distance measures, including Li and Nei's restriction sites distance. MVSP may be ordered from Kovach Software through its web site at http://www.kovcomp.com/mvsp/. MVSP 3.1 for Windows costs 85 pounds UK or US$ 140. Version 2.2 for DOS costs 65 pounds UK or US$ 100. Free evaluation versions, which work for a limited period, can be downloaded from the Kovach Computing download web page at http://www.kovcomp.co.uk/downl2.html#mvsp. An evaluation version of version 2.2 for DOS is also available for downloading by ftp from garbo.uwasa.fi in directory pc/stat/. MVSP is also distributed by Exeter Software at its web site at http://www.ExeterSoftware.com/cat/mvsp.html. Version 3.0 costs $140, or $875 for a 10-user license or $1875 for a 25-user license. An evaluation disk is $15. Other vendors include Rockware and GeoMem.
Simon Goodman, of the Institute of Cell, Animal, and Population Biology of the University of Edinburgh (simon.goodman@ed.ac.uk) has produced RSTCALC, version 2.2.
It is primarily intended to perform analyses of population structure, genetic differentiation and gene flow using microsatellite data. It calculates estimates of the Rst measure of differentiation among a number of populations, but in addition you can also use RSTCALC to obtain estimates of the delta-mu^2 distance measure. Its calculations are described in a paper: Goodman, S. J. 1997. Rst Calc: a collection of computer programs for calculating estimates of genetic differentiation from microsatellite data and determining their significance. Molecular Ecology 6: 881-885. The program runs on Windows95 and is available from its web site http://helios.bto.ed.ac.uk/evolgen/rst/rst.html as a Windows95 executable.
William J. Bruno and Lars Arvestad (billb@t10.lanl.gov) of the Theoretical Biology and Biophysics Group at Los Alamos National Laboratory have released DISTANCE, version 1.0. It estimates the most general reversible substitution matrix corresponding to a given collection of aligned DNA sequences. This matrix can then be used to calculate evolutionary distances between pairs of sequences. The method is described in a paper: Arvestad, L. and W. J. Bruno. 1997. Estimation of reversible substitution matrices from multiple pairs of sequences. Journal of Molecular Evolution 45: 696-703. The program is written in C, and distributed from its web site at http://www.t10.lanl.gov/evolution/, along with Sun SPARC binaries.
Gaston Gonnet and Chantal Korostensky (korosten@inf.ethz.ch) of the Computational Biochemistry Research Group at ETH in Zürich, Switzerland, have made available Darwin, Data Analysis and Retrieval With Indexed Nucleotide/peptide sequences. It is an environment which enables the user to carry out a variety of kinds of analysis with sequences, including phylogeny methods. These seem to include distance matrix,
split decomposition, and a form of likelihood method. Darwin is available as executables for a variety of Unix workstations: DEC Alpha with Digital Unix, DEC MIPS with Ultrix, SGI R8000, HP PA-RISC machines, Sun SPARC machines, and Intel-based Linux. These are available free if the user registers by e-mailing sekwr@inf.ethz.ch, including their postal address. The executables can then be transferred to the user by ftp or by e-mail of encoded files. Details and distribution policies are explained further at Darwin's web page. Darwin is also made available as a server.
Vladimir Makarenkov and Philippe Casgrain (makarenv@ere.umontreal.ca and casgrain@ere.umontreal.ca) of the Département de Sciences Biologiques of the Université de Montréal have released T-REX (Tree Reconstruction). This program performs several methods of fitting an additive distance (distance in a nonclocklike tree) to a given dissimilarity. The methods available include Sattath and Tversky's ADDTREE method, Nei and Saitou's Neighbor-Joining method, Gascuel's UNJ Unweighted Neighbor-Joining method, the Circular order reconstruction method of Makarenkov and Leclerc (1997) and Yushmanov (1984), and the MW weighted least-squares method by Makarenkov (1997) and Makarenkov and Leclerc (1998). Executables for Macintosh (the version 1.21a "fat" executable for both 68k Macs and PowerMacs) and a 32-bit DOS version that will work in a DOS window on Windows 95/98/NT are available at the T-REX web site at http://alize.ere.umontreal.ca/~casgrain/en/labo/t-rex/index.html.
Lars Sommer Jermiin of the John Curtin School of Medical Research of the University of Canberra, Australia (lars.jermiin@anu.edu.au) has released K2WuLi version 1.0, a program to calculate the Kimura 2-parameter distance among DNA sequences, to compute its standard deviation, and to carry out the relative rate test of Wu and Li (Wu, C.-I. and W.-H. Li. 1985. Evidence for higher rates of nucleotide substitution in rodents than in man. Proceedings of the National Academy of Sciences, USA 82: 1741-1745), in the form suggested by Muse and Weir (Muse, S. W. and B. S. Weir. 1992. Testing for equality of evolutionary rates. Genetics 132: 269-276). The program is available as a DOS executable, with Turbo Pascal source code as well, from its web page at http://jcsmr.anu.edu.au/dmm/humgen/lars/k2wulitop.htm.
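The Kimura 2-parameter distance that K2WuLi and several other programs here compute can be sketched in a few lines of Python. This follows the standard formula d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions; it is my own illustration, not code from K2WuLi, and it omits the standard deviation. The function name and the rule of skipping sites with gaps or ambiguity codes are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def kimura_2p(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue                      # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1              # A<->G or C<->T
        else:
            transversions += 1            # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))

# 20 sites with two transitions (A->G) and one transversion (A->C).
print(kimura_2p("A" * 20, "GGC" + "A" * 17))
```

The logarithm fails for very divergent pairs (when 1 - 2P - Q or 1 - 2Q is not positive), which is a property of the distance itself and not of this sketch.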
David Posada (dp47@email.byu.edu) of Keith Crandall's lab at the Division of Ecology and Systematics of the Department of Zoology, Brigham Young University, Provo, Utah has produced MATRIX version 1.5, a program to calculate a matrix of pairwise distances (absolute, uncorrected, JC69 or K80) from a set of aligned DNA sequences in PHYLIP sequential or Nexus format, treating gaps as a fifth state by default. It is available as a Macintosh executable from its web site at http://bioag.byu.edu/zoology/crandall_lab/programs.htm.
Clare Constantine and colleagues at the Division of Veterinary and Biomedical Sciences at Murdoch University, Perth, Australia (constant@numbat.murdoch.edu.au) have written GeneStrut, a Macintosh program which computes a range of standard measures for the analysis of genetic structure from discrete genetic data. The input data are multilocus genotypes. It can calculate genotypic and allelic frequencies, statistics for Hardy-Weinberg disequilibrium, genetic diversity within populations, genetic identities between populations, and indices of population structure (F-statistics). For our purposes the important feature is that it can also calculate Nei's genetic distance between populations, with standard deviations. It is described in the paper: Constantine C. C., R. P. Hobbs and A. J. Lymbery. 1994. FORTRAN programs for analysing population structure from
multilocus genotype data. Journal of Heredity 85: 336-337. It is available as a Macintosh executable, at its web site at http://wwwvet.murdoch.edu.au/vetschl/imgad/GenStrut.htm.
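Genetic diversity within a population, one of the quantities mentioned for GeneStrut and DISPAN, is usually measured as expected heterozygosity. The Python sketch below uses Nei's small-sample estimator, H = n/(n-1) (1 - sum of squared allele frequencies), with n the number of gene copies sampled at the locus. Whether the listed programs use exactly this estimator I cannot say; the function name and the choice of allele counts as input are assumptions.

```python
def gene_diversity(allele_counts):
    """Unbiased gene diversity (expected heterozygosity) at one locus.

    allele_counts: number of sampled gene copies of each allele
    (for diploids, twice the number of individuals in total).
    """
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("need at least two sampled gene copies")
    sum_sq = sum((c / n) ** 2 for c in allele_counts)   # sum of squared frequencies
    return n / (n - 1.0) * (1.0 - sum_sq)                # small-sample correction

print(gene_diversity([5, 5]))    # two equally common alleles
print(gene_diversity([10, 0]))   # a fixed allele has no diversity
```

Averaging this quantity over loci gives the usual per-population summary.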
Laurent Excoffier of the Department of Anthropology of the University of Geneva, Switzerland (laurent.excoffier@anthro.unige.ch) has released Arlequin version 2.0, a program for population genetics analysis. It can perform many kinds of population genetic tasks including estimation of gene frequencies, testing of linkage disequilibrium, and analysis of diversity between populations. For the purposes of this list, the relevant feature is its ability to compute a variety of genetic distance measures, including that of Jukes and Cantor, the Kimura 2-parameter distance, and the Tamura-Nei distance, each of these with or without correction for gamma-distributed rates of evolution. It can also compute a Minimum Spanning Tree network. Arlequin has its interactive "front end" written in Java, and requires the Java Runtime Environment (which is available from the Arlequin site for those who do not already have it). The core routines are available as binaries for Windows95/98/NT/2000, for MacOS for the PowerPC processor, and for Linux for Intel-compatible x86 processors. The binaries, Java code, Java Runtime Environment, and a PDF documentation file are available at its web site at http://acasun1.unige.ch/arlequin/.
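As an illustration of what a correction for gamma-distributed rates does to a distance, here is the Jukes-Cantor case in Python, using the standard formula d = (3/4) a [(1 - 4p/3)^(-1/a) - 1], where a is the shape parameter of the gamma distribution. This is a sketch of the textbook formula and makes no claim about how Arlequin implements it; the function name and parameter names are mine.

```python
import math

def jukes_cantor_gamma(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates across sites.

    p: proportion of differing sites (must be below 0.75).
    alpha: gamma shape parameter; smaller values mean more rate variation.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)

plain_jc = -0.75 * math.log(1.0 - 4.0 * 0.3 / 3.0)   # uncorrected-for-gamma value
print(plain_jc, jukes_cantor_gamma(0.3, 0.5))
```

As alpha grows the rates become nearly equal across sites and the value approaches the ordinary Jukes-Cantor distance; for small alpha the corrected distance is noticeably larger.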
Julio Rozas and Ricardo Rozas of the Departament de Genetica, Universidad de Barcelona, Spain (julio@porthos.bio.ub.es) have released DnaSP version 3.0, a software package for the analysis of nucleotide polymorphism from aligned DNA sequence data. DnaSP can estimate several measures of DNA sequence variation within and between populations (in noncoding, synonymous or nonsynonymous sites), as well as linkage disequilibrium, recombination, gene flow and gene conversion parameters. It can also carry out several tests of neutrality. Additionally, it can estimate the confidence intervals of some test statistics by the coalescent. The results of the analyses are displayed in tabular and graphic form. For the purposes of this web site, the relevant features are the calculation of some measures of population divergence, which can be used as distances in phylogeny reconstruction. DnaSP is described in the papers:
- Rozas, J. and R. Rozas. 1995. DnaSP, DNA sequence polymorphism: an interactive program for estimating Population Genetics parameters from DNA sequence data. Computer Applications in the Biosciences (CABIOS) 11: 621-625.
- Rozas, J. and R. Rozas. 1997. DnaSP version 2.0: a novel software package for extensive molecular population genetics analysis. Computer Applications in the Biosciences (CABIOS) 13: 307-311.
- Rozas, J. and R. Rozas. 1999. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15: 174-175.
It is distributed as a Windows95/98/NT executable from its web site at http://www.bio.ub.es/~julio/DnaSP.html.
Allen Rodrigo, Alexei Drummond, and Matthew Goode of the Computational and Evolutionary Biology Laboratory, School of Biological Sciences, University of Auckland, New Zealand (a.rodrigo@auckland.ac.nz) have released vCEBL 0.3a, the virtual Computational and Evolutionary Biology Laboratory.
This is a graphical user interface around a functional programming language for evolutionary inferences. The system is written in Java. This alpha release provides the basic user interface and some component packages. The following analyses and tools are available in vCEBL 0.3a:
- Construction of serial sample phylogenies using sUPGMA or sWPGMA with sampling times known exactly or ordinally.
- Construction of Neighbor-Joining, UPGMA and WPGMA phylogenies.
- Estimation of pairwise distance matrices using user-specified rate matrices (but not yet allowing variation of rates between sites).
- Estimation of population parameters including substitution/mutation rates using pairwise distances with or without parametric bootstrap confidence intervals.
- Maximum-likelihood branch-length optimization of user-specified tree, including serial sampled clocklike trees.
- ML estimation of divergence between serial samples assuming constant or varying mutation rates.
- Simulation of genealogies and sequences under a constant-sized population model with or without serial sampling.
This package supersedes this laboratory's separate release of sUPGMA, which has therefore been withdrawn. Self-installing versions of vCEBL for Macintosh or Windows 95/98 can be obtained from its web page at http://www.cebl.auckland.ac.nz/pages/vcebl.html. It requires Java VM 1.1.1 or higher. It can also be obtained there as an applet for your browser, with some features lacking.
Naoko Takezaki of the Center for Information Biology of the National Institute of Genetics, Mishima, Japan (ntakezak@lab.nig.ac.jp) has written sendbs. It computes average nucleotide substitutions within and between populations. The method is described in the paper by M. Nei and L. Jin (1989, Molecular Biology and Evolution 6: 290-300). However, sendbs differs from their method by using a bootstrap across sites to obtain standard errors of the distances. It also constructs a tree of populations using a neighbor-joining method. It is distributed as source code for Unix, and also as a DOS executable, from her web site at http://www.cib.nig.ac.jp/dda/ntakezak.html#sendbs.
Web or e-mail servers that can analyze data for you
For the moment we are giving only a casual description of each service. Congratulations to these groups for providing these free services.
- GeneBee group of the Belozersky Institute, Moscow State University, Russia. Distance matrix analysis from protein or nucleotide sequences.
- The Computational Biochemistry Group of the ETH in Zürich, Switzerland has the following programs available on its server:
    - TreeGen, which uses a distance matrix method with a distance matrix that you supply,
    - AllAll, which can compute either a PAM matrix from protein sequences or rooted or unrooted phylogenies using a distance method,
    - Darwin, an environment which has available within it distance matrix methods and split decomposition methods for phylogenies. Darwin is also available for distribution.
    - MultAlign, a multiple-sequence alignment program that includes not only alignment but also several distance matrix methods for inferring trees.
- Phyltree, a Java applet that generates a random tree, a distance matrix for it, and uses clustering methods to estimate and display the resulting tree and the true one. From David Joyce of the Computer Science Department at Clark University, Worcester, Massachusetts.
- The Suggest Tree function of the old web page of the Ribosomal Database Project at the University of Illinois (the main project has now moved to Michigan State University but has not yet implemented this server on its web page there). You submit a large- or small-subunit ribosomal RNA sequence and this server will align it with their database of sequences, place it in the best position it can on the tree of those sequences, and return the nearby part of that tree.
- A server at the Forschungsschwerpunkt Mathematisierung at the University of Bielefeld, Germany, that runs the SplitsTree 2 program. You paste in a lower-triangular distance matrix, or one in the Nexus format, and it returns the results as a file stored on your system. SplitsTree versions 1 and 2 are also available by ftp.
- Three groups have made available servers using PHYLIP:
    1. The Internet Bioinformatics Group of the Internet Research and Development Unit of the National University of Singapore has a server at http://sdmc.krdl.org.sg:8080/~lxzhang/phylip/.
    2. The Institut Pasteur, Paris has a server at http://bioweb.pasteur.fr/seqanal/phylogeny/phylip-uk.html.
    3. Joaquin Dopazo at the Centro Nacional de Biotecnologia, Madrid, Spain has made a server available.
- The Ribosomal Database Project at Michigan State University has a server (located at http://www.cme.msu.edu/RDP/html/analyses.html) that takes uploaded sequences of either large or small subunit ribosomal RNA, aligns them against its existing database of ribosomal RNA sequences, and can provide a similarity matrix between them. This is related to a distance matrix, but is not corrected for superimposed changes.
- David Robertson (david@igs.cnrs-mrs.fr) of the CNRS in Marseille has implemented TREE and FRAG_TREE, Neighbor-Joining servers that take a ClustalW .ALN file of aligned sequences, use ClustalW to align them and my program DRAWTREE to draw the resulting tree. TREE does the whole alignment, and FRAG_TREE a specified region of the alignment. The TREE server will be found at http://193.50.234.246/~beaudoin/anrs/Tree.html and the FRAG_TREE server will be found at http://193.50.234.246/~beaudoin/anrs/Frag.html.
- The European Bioinformatics Institute (EBI) has available a server for ClustalW tree-based multiple alignment runs. This also produces tree files in Clustal's own format or in PHYLIP format.
- The Institut Pasteur in Paris has made available a server with many phylogeny programs. It is available through its web page in English at http://bioweb.pasteur.fr/seqanal/phylogeny/intro-uk.html, or its web page in French at http://bioweb.pasteur.fr/seqanal/phylogeny/intro.html. At present it includes web servers for
    - PHYLIP
    - fastDNAml
    - MOLPHY
    - BIONJ
    - TREE-PUZZLE
    - Seq-Gen
    - TreeAlign
    - QR2
    - PhyloQuart
    - LVB
    - BAMBE
- A server for the ROSE program, which simulates the evolution of DNA, RNA, or protein sequences with insertions and deletions, is available from the Technische Fakultat of the University of Bielefeld, Germany at http://bibiserv.TechFak.Uni-Bielefeld.de/rose/.
- The INRA in Toulouse, France, has a server for Florence Corpet's MultAlin multiple sequence alignment method which is similar to progressive alignment. It clusters by UPGMA. A "cluster file" is provided which may contain information on the phylogeny.
- The Section of Molecular Evolution of the Institute of Cytology and Genetics in Novosibirsk, Russia has a server at http://molevol.bionet.nsc.ru/www_vs.htm for the VOSTORG package. The server can compute distances and find UPGMA, WPGMA, or Neighbor-Joining phylogenies.
- Emília Martins of the Department of Biology, University of Oregon, Eugene, Oregon has made available a server for her program COMPARE using Java. It carries out comparative methods analysis.
The ftp servers
There are three major collections of biology software available for transfer by ftp. These are
- the IUBio archive maintained by Don Gilbert at the Biology Department of the University of Indiana. Its ftp address is ftp.bio.indiana.edu. There is also a web page at http://ftp.bio.indiana.edu/software/. Relevant code will be found in directories:
    - molbio, in subdirectories
        - evolve for phylogeny software,
        - align for alignment software
    - and biology.
  and also in machine-specific directories where binaries and sources specific to those families of machines are to be found:
    - unix,
    - ibmpc,
    - mac,
    - vax, and
    - java
  A list of the software on the IUBio server can be found at http://iubio.bio.indiana.edu/soft/molbio/Listings.html. The links on that page are unfortunately not active but must be copied out by hand.
- the EBI software archive at the European Bioinformatics Institute, Cambridge, England (ftp.ebi.ac.uk in directory pub/software), and
- the Pasteur Institute archive in Paris (ftp.pasteur.fr in directory pub/GenSoft/).
These three archives often contain many of the same programs.
Mirrors
The Indiana IUBio archive is mirrored (exact copies are maintained) in a number of countries:
Japan
    - a full mirror is maintained at ftp.gdbnet.ad.jp in directory ftpsync/ftp.bio.indiana.edu
    - a mirror of the molbio section only is maintained at ftp.nig.ac.jp in directory pub/mirror/IUBIO/molbio
Finland
    A mirror of the molbio section is maintained at ftp.funet.fi in directory pub/sci/molbio/iubiomolbio
Sweden
    A mirror of the molbio section is maintained at ftp.sunet.se in directory pub/molbio
Spain
    A mirror of the molbio section is maintained at ftp.uam.es at pub/mirror/molbio
Israel
    A mirror of the search, unix, ibmpc, mac, and vax subdirectories of the molbio section is maintained at bioinformatics.weizmann.ac.il in directory pub/software
France
    A mirror of the molbio section is maintained at ftp.pasteur.fr in directory /pub/GenSoft/mirrors/IUBio/molbio.
U.K.
    A mirror of the molbio section is maintained at http://mic3.hensa.ac.uk/hosts/iubio.bio.indiana.edu/molbio/.
India
    A mirror of the molbio section is available at imtech.chd.nic.in in directory /pub/mirror_sites/iubio/
(This is the bottom of the software listings).
Mark Wilkinson, of the Department of Zoology, The Natural History Museum, London, U.K. (marw@nhm.ac.uk) has produced REDCON, a program to implement his method of reduced consensus trees. These find a tree with possibly fewer species that satisfies a strict or a majority rule consensus criterion. REDCON reads trees in PAUP* format. It is a DOS executable, and is available at his software Web site at http://www.bio.bris.ac.uk/research/markwilk/software.htm.
Mark Wilkinson, of the Department of Zoology, The Natural History Museum, London, U.K. (marw@nhm.ac.uk) has produced TAXEQ2, a program to carry out Safe Taxonomic Reduction, which means dropping some species to get a set whose phylogenetic relationships are less ambiguous. The method is described in Wilkinson's Ph.D. thesis (1992, Department of Geology, University of Bristol) and an example of its use will be found in the paper by Wilkinson, M. 1995. Coping with missing entries in phylogenetic inferences using parsimony. Systematic Biology 44: 435-439. TAXEQ2 is distributed as a DOS executable with documentation and sample data set from his software Web site at http://www.bio.bris.ac.uk/research/markwilk/software.htm.
Lars Jermiin and Olena Anpilogova (jermiin@angis.usyd.edu.au), when Jermiin was at the Human Genetics Group, John Curtin School of Medical Research, Australian National University, have produced TreeCons version 1.0. It generates a weighted consensus tree from trees obtained by maximum likelihood analysis, generates relative likelihood support on edges in this and other user-specified trees, and does the Kishino-Hasegawa test with any level of significance. It reads output files and tree files produced by some of the programs in PHYLIP, MOLPHY and TrExMl. The output file from TreeCons is in a format that can then be fed back into PHYLIP's program Consense.
A number of weighting schemes to compute tree weights from their likelihoods are allowed. The weighting schemes and the underlying theory are described in a paper: Jermiin L. S., G. J. Olsen, K. L. Mengersen, and S. Easteal. 1997. Majority-rule consensus of phylogenetic trees obtained by maximum likelihood analysis. Molecular Biology and Evolution 14: 1296-1302. TreeCons is distributed as C source code. It is available, with documentation and sample input and output, from its web site at http://jcsmr.anu.edu.au/dmm/humgen/lars/treeconssub.htm.
Joseph Thorley of the School of Biological Sciences, University of Bristol, U.K. (j.l.thorley@bris.ac.uk) and Rod Page of the University of Glasgow have written RadCon, a program to compute consensus trees. It is intended in future releases to also compute distances between trees. It can compute strict, Adams, and majority-rule consensus trees, and a number of others as well, including a consensus supertree method. It is currently available as a test version which expires at the start of the next month, so that a new version needs to be fetched once a month. This will change when the first full version is released. RadCon is a MacOS executable for MacOS 7.5 or later. RadCon is available at its web site at http://taxonomy.zoology.gla.ac.uk/~jthorley/radcon/radcon.html and its manual can also be downloaded or viewed from there. George Estabrook of the Department of Biology, University of Michigan, Ann Arbor, Michigan (Estabrook@umich.edu) has written QUARTET2, which calculates measures of difference between phylogenies based on quartets (subtrees of four tips). The methods are described in a paper: Estabrook, G. F. 1992. Evaluating undirected positional congruence of individual taxa between two estimates of the phylogenetic tree for a group of taxa. Systematic Biology 41: 172-177. QUARTET2 is available as a DOS executable from his web page of computer programs at http://www-personal.umich.edu/~gfred/.
Ross Crozier (genrhc@lure.latrobe.edu.au) of the Department of Genetics and Human Variation, Latrobe University, Bundoora, Victoria, Australia and Paul-Michael Agapow (p.agapow@ic.ac.uk) have written CONSERVE, version 3.1.2, a Macintosh program to use phylogenetic information to calculate biodiversity and test the feasibility of conservation schemes. It measures the distinctiveness of species using genetic distances and also tests whether particular assemblages of populations preserve statistically significantly more biodiversity than other assemblages. Biodiversity is determined using GD (probability of more than one allele) or PD (length of evolutionary history) methods, from data in the form of unrooted trees produced in standard treefile format. It is available as a Macintosh executable in a self-extracting archive from its web site at http://evolve.bio.ic.ac.uk/software/conserve/index.html or an alternative web site at http://www.cs.latrobe.edu.au/~agapow/Software/.
Georg Weiller, of the Bioinformatics Group at the Research School of Biological Sciences of the Australian National University, Canberra (weiller@rsbs.anu.edu.au) has released TreeDis version 2.0. TreeDis finds the patristic distances (the total length of branches between each pair of taxa) in a phylogeny. It takes as input the tree file in Newick standard form or in the format for NJTREE. It is distributed as a DOS executable (a C++ source code version can also be obtained from Weiller). It is available from its web site at http://life.anu.edu.au/molecular/software/tredis/.
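A patristic distance is easy to compute once the tree is in memory. The following Python sketch is not TreeDis code: it assumes a hypothetical storage scheme in which each node maps to its parent and the length of the branch leading to it, and the example tree and all names are invented for illustration. Reading a Newick file is left out to keep the sketch short.

```python
from itertools import combinations

# Hypothetical example tree, stored as node -> (parent, branch length).
# In Newick form: ((A:1.0,B:2.0)X:0.5,(C:1.5,D:1.0)Y:1.0)root;
EXAMPLE_TREE = {
    "root": None,
    "X": ("root", 0.5),
    "Y": ("root", 1.0),
    "A": ("X", 1.0),
    "B": ("X", 2.0),
    "C": ("Y", 1.5),
    "D": ("Y", 1.0),
}

def distances_to_ancestors(tree, node):
    """Map the node itself and each of its ancestors to the path length from the node."""
    distances = {node: 0.0}
    total = 0.0
    while tree[node] is not None:
        parent, length = tree[node]
        total += length
        distances[parent] = total
        node = parent
    return distances

def patristic_distance(tree, a, b):
    """Sum of branch lengths on the path between a and b (lengths assumed nonnegative)."""
    up_a = distances_to_ancestors(tree, a)
    up_b = distances_to_ancestors(tree, b)
    # The most recent common ancestor gives the smallest combined path length.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)

def tips(tree):
    """Nodes that are not the parent of any other node."""
    parents = {entry[0] for entry in tree.values() if entry is not None}
    return sorted(n for n in tree if n not in parents)

def patristic_matrix(tree):
    """Patristic distance for every pair of tips."""
    return {(a, b): patristic_distance(tree, a, b)
            for a, b in combinations(tips(tree), 2)}

for pair, distance in patristic_matrix(EXAMPLE_TREE).items():
    print(pair, distance)
```

The resulting table of pairwise values can be compared directly with the distance matrix from which a tree was estimated.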
Andrew Purvis of the Department of Biology, Imperial College, Silwood Park, U.K. (a.purvis@ic.ac.uk) and Andrew Rambaut (andrew.rambaut@zoo.ox.ac.uk) of the Department of Zoology, University of Oxford, England, have written CAIC (Comparative Analysis of Independent Contrasts), version 2.6. It is a Macintosh program that carries out the contrasts method (like my CONTRAST) but with some modifications by others to cope with lack of resolution of the phylogeny. It will run on any Macintosh, and is available free from CAIC's Web page http://evolve.bio.ic.ac.uk/software/caic/index.html or (in an earlier version) by anonymous ftp from directory packages/CAIC at evolve.zoo.ox.ac.uk. It is described in the paper by A. Purvis and A. Rambaut (1995) Comparative analysis by independent contrasts (CAIC): an Apple Macintosh application for analysing comparative data. Computer Applications in the Biosciences (CABIOS) 11: 247-251. Emília P. Martins (emartins@work.uoregon.edu), of the University of Oregon, has released version 4.0 of COMPARE, a package of programs for comparative methods analysis. COMPARE includes various programs for conducting statistical analyses of comparative data in a phylogenetic context. At the moment, it includes programs to compute independent contrasts, do spatial autocorrelation analyses, sum of squares parsimony, generate random
data, trees and/or branch lengths, and various other things. New programs will be added as they are ready. COMPARE is written in Java and is available both as standalone Java (including source code) and also as a Compare server. It requires a Java runtime environment. COMPARE is available from its web site at http://work.uoregon.edu/~COMPARE/. Earlier Windows95 and Sun Solaris executables and C source code of COMPARE 3.1 are available from the COMPARE 3.1 web site at http://work.uoregon.edu/~COMPARE/indexV3.html and Windows95, Windows3.1, Sun Solaris, and HP/UX executables as well as C source code of COMPARE 2.0 are available from the COMPARE 2.0 web site at http://work.uoregon.edu/~ftp/download.html.
Hang-Kwang Luh, John Gittleman, and Mark Kot of the University of Tennessee at Knoxville have made available PA, a package of Macintosh programs that implement the phylogenetic autocorrelation comparative method introduced by Gittleman and Kot (Systematic Zoology, 1990). It is free and is available by anonymous ftp from ftp.math.utk.edu in directory pub/luh.
Emilia Martins (emartins@work.uoregon.edu), of the Department of Biology of the University of Oregon, has written CMAP, the Comparative Method Analysis Package, for comparative methods analysis. This package was developed when she and Ted Garland were conducting the simulation study described in the paper: Martins, E. P. and T. Garland, Jr. 1991. Phylogenetic analyses of the correlated evolution of continuous characters: a simulation study. Evolution 45: 534-557. It can be used to estimate the correlation between two continuous characters measured in different species while taking phylogenetic information into account. Methods for doing so include several versions of Felsenstein's (1985) independent contrasts, and the sum-of-squared-changes parsimony algorithm. The programs in CMAP are described by Martins as "slow" and "unfriendly". The executables are available only for DOS machines.
She is no longer developing this package, and is now concentrating her efforts on her other package COMPARE, which will soon be able to do everything that CMAP can. CMAP is available from its Web page at http://work.uoregon.edu/~emartins/programs/cmap.html or by anonymous ftp from evolution.uoregon.edu in directory CMAP.
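For readers unfamiliar with the independent contrasts method that CMAP, COMPARE, CAIC and several other programs on this page implement, here is a minimal Python sketch of the basic calculation on a fully bifurcating tree. It is not code from any of those packages. The tree encoding (a tip is a name, an internal node is a tuple of left subtree, left branch length, right subtree, right branch length), the function names and the trait values are all assumptions made for illustration, and all branch lengths are assumed to be positive.

```python
import math

def independent_contrasts(node, values, contrasts):
    """Append standardized contrasts for the subtree to `contrasts`.

    Returns (estimated trait value at this node, extra length to add to the
    branch above it), following the usual pruning recursion.
    """
    if isinstance(node, str):
        return values[node], 0.0
    left, len_left, right, len_right = node
    x_left, extra_left = independent_contrasts(left, values, contrasts)
    x_right, extra_right = independent_contrasts(right, values, contrasts)
    v_left = len_left + extra_left      # branch lengths lengthened to reflect
    v_right = len_right + extra_right   # uncertainty in the estimated node values
    contrasts.append((x_left - x_right) / math.sqrt(v_left + v_right))
    x_node = (x_left / v_left + x_right / v_right) / (1.0 / v_left + 1.0 / v_right)
    return x_node, v_left * v_right / (v_left + v_right)

def contrast_correlation(tree, values_x, values_y):
    """Correlation through the origin between the contrasts of two characters."""
    cx, cy = [], []
    independent_contrasts(tree, values_x, cx)
    independent_contrasts(tree, values_y, cy)
    numerator = sum(a * b for a, b in zip(cx, cy))
    return numerator / math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))

# Hypothetical three-species tree ((A:1,B:1):1,C:2) with made-up trait values.
TREE = (("A", 1.0, "B", 1.0), 1.0, "C", 2.0)
TRAIT_X = {"A": 1.0, "B": 3.0, "C": 5.0}
TRAIT_Y = {"A": 2.0, "B": 6.0, "C": 10.0}

x_contrasts = []
independent_contrasts(TREE, TRAIT_X, x_contrasts)
print(x_contrasts)
print(contrast_correlation(TREE, TRAIT_X, TRAIT_Y))
```

A tree with n tips yields n-1 contrasts, which under a Brownian motion model are independent and can be passed to ordinary statistical procedures.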
Patrik Lindenfors (Patrik.Lindenfors@zoologi.su.se), of the Department of Zoology, Stockholm University, has written CoSta version 1.03, a DOS program which carries out the Contingent States Test for the correlation of changes in two characters along a tree, which is described in the paper: Sillén-Tullberg, B. 1993. The effect of biased inclusion of taxa on the correlation between discrete characters in phylogenetic trees. Evolution 47: 1182-1191. The program reads MacClade data files, and also text files saved from MacClade. The program can be fetched at its Web site at http://www.zoologi.su.se/personal/patrik/costa.html.
Ted Garland, of the Department of Zoology of the University of Wisconsin (tgarland@facstaff.wisc.edu) and his colleagues (Jason A. Jones, Allan W. Dickermann, Peter E. Midford, and Ramon Diaz-Uriarte) have developed PDAP version 5.0, Phenotypic Diversity Analysis Programs, a series of DOS programs to perform various comparative analyses. At present, the following phylogenetically based statistical methods are included: independent contrasts, squared-change parsimony reconstructions of ancestral states and estimation of evolutionary correlations, and phylogenetic analysis of covariance via computer-simulated (Monte Carlo) null distributions. PDTREE can also read, write, and edit trees. PDAP is described in a web page at http://www.wisc.edu/zoology/faculty/fac/Gar/PDAP.html. The methods used are described in a number of recent papers by these authors, including: Diaz-Uriarte, R., and T. Garland, Jr. 1996. Testing hypotheses of correlated evolution using phylogenetically independent contrasts: sensitivity to deviations from Brownian motion. Systematic Biology 45: 27-47, and Purvis, A., and T. Garland, Jr. 1993. Polytomies in comparative analyses of continuous characters. Systematic Biology 42: 569-575. PDAP is distributed by e-mail as a self-extracting executable file, obtainable for free (contact Garland by e-mail).
Alternatively, a DOS disk can be mailed.
David Ackerly (dackerly@leland.Stanford.EDU) of the Department of Biological Sciences, Stanford University, Stanford, California has released ACAP 2 (Another Comparative Analysis Program) to carry out independent contrasts methods for comparative analysis. It also incorporates linear parsimony methods into the program, in order to calculate consistency indices for continuous characters. The program is written in Think Pascal for Macintosh systems, and is available from its web site at http://www.stanford.edu/~dackerly/ACAP.html as a Macintosh executable which will run on Macintosh or PowerMacintosh computers.
Simon Blomberg, of the Department of Zoology and Entomology of the University of Queensland, St. Lucia, Australia (S.Blomberg@mailbox.uq.edu.au) has announced the beta-release of a small comparative method program, Fels-Rand version 0.91beta [look folks, don't blame me, I had nothing to do with naming this program]. It is designed to analyse data when the phylogeny is only poorly known, as when there is one or several polytomies. The program is said to be inspired by a 1994 paper in Systematic Biology by Jonathan Losos. It retains known tree topology and randomises the unknown parts of the tree, unlike some other programs, which randomize the whole tree. The statistics are calculated on independent contrasts from fully (randomly) resolved trees. Fels-rand is written in XLISP-STAT, and runs in the XLISP-STAT environment (in other words you first must get and install XLISP-STAT on your computer to run the code, which is written in the XLISP-STAT language). XLISP-STAT is available for Macintosh, Windows, and Unix. Fels-rand is available from Blomberg's home page at http://dingo.cc.uq.edu.au/~ansblomb/. Its README file can be found in a newsgroup posting at http://life.biology.mcmaster.ca/~brian/evoldir/Other/ComparativeMethod.software.
Ehab Abouheif (abouheif@duke.edu) of the Department of Zoology, Duke University, Durham, North Carolina has written (together with J. Reeve) Phylogenetic Independence version 1.1. It carries out Abouheif's Test For Serial Independence (TFSI) on continuously valued characters and his Runs Test on discretely valued characters. These are described in his paper: Abouheif, E. 1999. A method to test the assumption of phylogenetic independence in comparative data. Evolutionary Ecology Research 1: 895-909. The program is available as a Windows95/98/NT executable, and its manual may be viewed as a web page, at its web site at http://life.bio.sunysb.edu/ee/ehab.
Dolph Schluter (schluter@zoology.ubc.ca) of the Department of Zoology of the University of British Columbia, in Vancouver, Canada, has released ANCML, a program which estimates ancestor states for a continuous trait, and provides a "standard error" for the marginal distribution of each estimate. The method is described in Schluter, D., T. Price, A. Ø. Mooers and D. Ludwig. 1998. Likelihood of ancestor states in adaptive radiation. Evolution 51: 1699-1711. The method assumes a Brownian motion model for the evolution of the trait. ANCML was written by modifying the program CONTRAST in PHYLIP version 3.5, and it uses similar input conventions. ANCML is available from its web page at http://www.zoology.ubc.ca/~schluter/ancml.html. It is available as generic C source code and as a SunOS executable or as a DOS executable.
Bill Bruno (billb@lanl.gov) of the Theoretical Biology and Biophysics Group T10, Los Alamos Scientific Laboratory, has produced RIND (Reconstructed INDependence), a program which takes a tree supplied by the user, or uses a distance method of the user's choosing (one which can be found in PHYLIP), and computes a maximum likelihood estimate of the number of times each residue in aligned protein sequences was replaced in each position. The method is described in: Bruno, W. J. 1996. Modeling residue usage in aligned protein sequences via maximum likelihood. Molecular Biology and Evolution 13: 1368-1374. RIND is available as C source code for a Unix environment, and assumes that PHYLIP is also installed.
Joaquin Dopazo of the R&D Department of TDI (TDI-EMBNet), Spain, (dopazo@samba.cnb.uam.es) has written a program ABLE (Analysis of Branch Length Errors) which implements the method described by Adell and Dopazo in J. Mol. Evol. 38: 305-309 (1994). This is a parametric bootstrap test of constancy in evolutionary rates. The idea of the test is to simulate a large number of data sets under the model of rate constancy and then to examine the distribution of the branch lengths. After a tree is reconstructed without the constraint of rate constancy, it can be checked whether the observed branch length values fall within the expected distribution. The program is intended for use with the PHYLIP programs FITCH and KITSCH. It is available as a DOS executable from Dopazo's software web page at http://www.cnb.uam.es/~bioinfo/Software/Ximo/www1.html or by anonymous ftp at: ftp.cnb.uam.es in directory pub/cnb/molevol.
Kent Fiala (72470.1407@compuserve.com) (most recently of SAS Institute) produced CLINCH (CLadistic INference by Compatibility of Hypothesized characters) version 6.2. It is a general-purpose compatibility program capable of handling multiple unordered states.
It is available as a DOS executable, including FORTRAN source code, from the Digital Taxonomy web page at http://www.geocities.com/RainForest/Vines/8695/software.html#Cladistics.
Benjamin Salisbury (ben@aya.yale.edu), of the Department of Ecology and Evolutionary Biology, Yale University has released SECANT version 2.2, based on an earlier program, CLINCH, by Kent Fiala, now of SAS Institute. SECANT was previously known as CLINCH2. It is probably the most sophisticated compatibility analysis (clique analysis) program, capable of handling unordered multiple states. It can also group characters by Salisbury's own Strongest Evidence criterion. It and its criteria are described in a paper: Salisbury, B. A. 1999. Strongest evidence in compatibility: clique and tree evaluation using apparent phylogenetic signal. Taxon 48: 755-766. SECANT is available as a Windows95/NT executable, and its source code is described as available on request. It is available from its web page at http://jkim.eeb.yale.edu/salisbur/.
Mark Wilkinson, of the Department of Zoology, The Natural History Museum, London, U.K. (marw@nhm.ac.uk) has written PICA95, a package of programs for character weighting and randomization tests for compatibility analysis for 0/1 or multistate characters. These carry out a variety of tests for nonrandomly compatible characters and include methods developed by Sharkey, Le Quesne, Meacham and Alroy. They include the ability to process data that reflect the splits method of Bandelt and Dress. The programs are available as a package of DOS executables, from his software Web site at http://www.bio.bris.ac.uk/research/markwilk/software.htm.
Christopher Meacham (Museum Informatics Project, University of California, Berkeley, California 94720, U.S.A.) produces COMPROB, a Pascal program to compute probabilities that characters would be compatible at random, thus telling us which clique is "most surprising". He can be contacted as meacham@violet.berkeley.edu about receiving a copy. The program is free.
The program MARKOV computes a distance measure between pairs of nucleotide sequences.
It also constructs phylogenies from these and summarizes the 4x4 substitution matrices between the pairs of species. It uses a more general model of substitution than used in PHYLIP, the Stationary Markov Model described in the paper by Saccone et al. in Methods in Enzymology volume 183, pages 570-583, 1990. Bootstrapping is used to analyze the statistical error of the results. Output files from CLUSTAL and PILEUP, as well as some other formats, can be used for input, and analysis can be confined to certain codon positions in coding sequences. The program is written in FORTRAN and runs on VMS and Unix systems. It was produced by Dr. Graziano Pesole and Professor Cecilia Saccone at the University of Bari, Italy, and is available (for free?) from Dr. Cecilia Lanave at CSMME-CNR, Dipartimento di Biochimica e Biologia Molecolare, Università di Bari, via Orabona 4, 70126 Bari, Italy. Her phone number is 39-80-243305, her fax number is 39-80-243317, and her e-mail address is lanave@vaxba0.ba.it or mvx36@ibacsata.it.
J. S. Armstrong, A. J. Gibbs, R. Peakall and G. Weiller, (johna@rsbs-central.anu.edu.au) of Gibbs's group at the Research School of Biological Sciences of the Australian National University, Canberra, have produced RAPDistance version 1.04, a package for DOS or Windows systems for computing distance matrices for RAPD analyses. It has a comprehensive range of options for creating data files, editing them and using application programs to analyse them. RAPDistance is available free on the World Wide Web at http://life.anu.edu.au/molecular/software/rapd.html, or by anonymous ftp from directory pub/RAPDistance at life.anu.edu.au.
P. R. Reeves and colleagues at Sydney University, Australia, have produced MULTICOMP, a program for computing various distances from sequence data. It is described in a paper by Reeves et al. in CABIOS 10: 281-284 (1994). I do not know what computer systems it runs on.
Reeves may be contacted at reeves@angis.su.oz.au for distribution information.
Ken Rice (krice@saul.cis.upenn.edu) of the University of Pennsylvania (formerly of Harvard University) has produced RSVP (restriction site variability program), which calculates several measures of genetic variability based on restriction map data. It also produces Jukes-Cantor corrected distance matrices with standard errors from collections of restriction maps. C source code for Version 2.08 of RSVP is available free by anonymous ftp from phylogeny.harvard.edu in directory pub/rice. It runs under Unix.
Microsat, by Eric Minch (minch@crick.stanford.edu), is a program for calculating distances from microsatellite data. It uses the methods developed by David Goldstein et al., and presented in their papers of 1995 in Proc.
Natl. Acad. Sci. USA 92: 6720-6727 and Genetics 139: 463-471. The distance is based on the mean microsatellite array size, implementing the "Delta mu" distance that they defined, which corrects for within-population variability and provides a distance that is independent of population size. It is available for free from a page in Luca Cavalli-Sforza's lab web site at http://human.stanford.edu/microsat/microsat.html. The program is written in ANSI C. Source code is distributed, and so are executables for DOS, PowerMac and Macintosh.
Georg Weiller, of the Bioinformatics Laboratory, Australian National University, Canberra, Australia (weiller@rsbs-central.anu.edu.au) has produced DIPLOMO (DIstance PLOt MOnitor) version 1.03. It compares different distance measures with each other by displaying them as a scatter plot. It then helps one instantly identify all individual comparisons within the plot. Individual taxa can be excluded or included in the plots. DIPLOMO enables you to see whether different taxa have different mutational characteristics (such as having relatively more transitions in some lineages), and whether different distance measures correlate. The program takes as input a file with several different distance matrices. This file is in a simple format which can readily be produced by editing distance matrices produced by other packages. A program to compute the distance matrices is currently under development. Although DIPLOMO is intended to be ported to multiple platforms, the current version runs on DOS on PC-compatibles. DIPLOMO is free; it can be obtained by World Wide Web from http://life.anu.edu.au/molecular/software/diplomo/, or by anonymous ftp from life.anu.edu.au in /pub/molecular_biology/software/diplomo. Floppy disk distribution is also possible. It is described in a publication: Weiller, G. F. and A. Gibbs. 1995. DIPLOMO: The tool for a new type of evolutionary analysis. CABIOS 11: 535-40.
Joaquin Dopazo and J. M.
Carazo (jd19662@ggr.co.uk and carazo@embnet.cnb.uam.es) have produced SOTA, a package to carry out the Self Organizing Tree Algorithm. It is based on Kohonen's unsupervised neural network of self-organizing maps and on Fritzke's growing cell structures algorithm to construct phylogenetic trees from biological molecular sequence data. It is described in a paper: Dopazo, J. and J. M. Carazo. 1997. Phylogenetic reconstruction using an unsupervised growing neural network that adopts the topology of a phylogenetic tree. Journal of Molecular Evolution 44a: 226-233. SOTA can use sequence data, distance matrix data, or dipeptide frequencies from proteins. SOTA is available as source code in C for Unix, as executables for SGI workstations, and also with a Windows program called Drawer that draws the resulting trees. The package with documentation is available by anonymous ftp from ftp.cnb.uam.es in directory pub/cnb/sota.
MUST, a package of sequence management programs, is distributed on a shareware basis by Herve Phillippe, of the Laboratoire de Biologie Cellulaire (URA CNRS 1134 D), Batiment 444, Universite de Paris-Sud, 91405 Orsay cedex, France. His e-mail address is hp@bio4.bc4.u-psud.fr, and his phone and fax numbers are respectively 33.1.69.41.64.81 and 33.1.69.41.21.30. MUST is available for free if you send 5 1.44-Mb diskettes, or on a shareware basis (with $100 registration fee) if you do not send diskettes. It runs on DOS systems using DOS version 3 or later. It is intended as complementary to existing phylogeny and alignment programs and can produce output files in the formats of PHYLIP, PAUP*, Hennig86, and CLUSTAL. It contains a variety of sequence input, editing, checking, and storage functions, as well as a sequence editor and a phylogeny plotter. It also allows further analyses of the results from these phylogeny programs. It is not yet available by ftp.
Steve Smith, formerly of the Harvard Genome Laboratory, has written an X-Windows interactive sequence editor, GDE (Genetic Data Environment), version 2.2, which allows the user to edit sequences and align them by hand, and to select subsets of sites and sequences and call a variety of analysis programs including ClustalV and many of the PHYLIP 3.5 programs. The GDE 2.2 system will run on many workstations that have the X windowing system. It also includes the TreeTool tree-plotting program (see below). GDE 2.0 is free and is available at the molecular biology software servers: a web page is at http://ftp.bio.indiana.edu/soft/molbio/unix/GDE/ and it can be obtained by anonymous ftp from megasun.bch.umontreal.ca in directory pub/gde. At the latter location there are also Linux binaries and Sun binaries.
Don Gilbert (gilbertd@bio.indiana.edu) of the Department of Biology of the University of Indiana, has written SeqPup version 0.7, a biological sequence editor and analysis program usable on Macintosh, MS-Windows and X-Windows systems. It includes links to network services and external analysis programs. It includes phylogenetic analysis of alignments with the fastDNAml and LSADT programs. It can be obtained by anonymous ftp from iubio.bio.indiana.edu, in directory molbio/seqpup, or by World Wide Web at http://iubio.bio.indiana.edu/soft/molbio/seqpup. The most recent version of SeqPup is written in the Java language, and does not exist as an executable, since you can run the Java code directly if you have the Java 1.1 system (which is available on Windows, Macintoshes, and X Windows Unix systems). An earlier version (0.5) is available in C++, and executables of that are available for Macintosh (PowerMac and 68K), MS Windows (Win95, WinNT and Win3), and Unix/XWindows systems including Sun Solaris, SGI Irix, DEC Unix, Linux. It is currently the more complete version, since it can also run some PHYLIP programs. The C++ source code for that version is available by anonymous ftp at iubio.bio.indiana.edu in directory util/dclap/source/.
Wolfgang Ludwig and Oliver Strunk of the Lehrstuhl für Mikrobiologie of the Technische Universität München (wolfgang.ludwig@biol.chemie.tu-muenchen.de) distribute ARB, an environment for 16s/18s/23s ribosomal RNA sequence data. It provides a windowing environment for building up databases of RNA sequences, aligning them, and searching, editing, modifying, aligning, profiling, and constructing trees. ARB uses its own RNA sequence databases which are made available to ARB over the Web. For phylogenies it uses programs from PHYLIP and fastDNAml, as well as its own ARB Neighbor-Joining program. ARB also incorporates a variety of other sequence analysis software.
It can handle large numbers of sequences and has sophisticated tree drawing and manipulation. ARB is distributed as executables for a variety of versions of Unix, requiring that Motif be available. At the moment these are: SUN OS 4.1.x, SUN Solaris >2.4, Silicon Graphics >5.0, Linux for PC, and Digital OSF. ARB is available from its web site at http://www.mikro.biologie.tu-muenchen.de/pub/ARB/ or by ftp from ftp.mikro.biologie.tu-muenchen.de.
Tom Hall of the Department of Microbiology at North Carolina State University (tahall2@unity.ncsu.edu) has produced BioEdit, version 4.8.4. This is a sequence editor with many kinds of general molecular biology functions available (alignment, BLAST searches, plasmid drawing, restriction mapping, sequence machine trace viewing, etc.). For our purposes the feature worth mentioning is that it comes with a number of existing phylogeny programs which can be automatically run from within BioEdit. These are: Treeview, fastDNAml, and six DNA and protein programs from PHYLIP. BioEdit is available as Windows95/98/NT executables from its web site at http://www.mbio.ncsu.edu/RNaseP/info/programs/BIOEDIT/bioedit.html.
Louxin Zhang (lxzhang@krdl.org.sg) of the Internet Bioinformatics Group of the Internet Research and Development Unit of the National University of Singapore has produced a web interface for the PHYLIP package, which can submit jobs to it. It can be obtained by e-mailing him at the above address. The interface can be used at his site as a server.
Andrew Rambaut of the Department of Zoology, University of Oxford (andrew.rambaut@zoo.ox.ac.uk), has written Bi-De version 0.1, to simulate the evolution of trees using various models of lineage birth and death, and sampling lineages from among those extant. It can simulate branching with or without regulation of the number of lineages. It also allows the user to specify the relationship between the number of lineages and the birth rate of lineages. The program is available free for Macintoshes with system 7.0 or later, from the University of Oxford Zoology Web server at http://evolve.zoo.ox.ac.uk/Bi-De/Bi-De.html or by ftp from evolve.zoo.ox.ac.uk in directory packages as file Bi-De01b.hqx.
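For readers unfamiliar with birth-death simulation, the following is a minimal Python sketch of the general idea. It is not Bi-De's code, and it assumes the simplest case: constant per-lineage birth and death rates with no regulation of the number of lineages. The function names and parameters are hypothetical, chosen only for illustration. It tracks the number of lineages through time rather than the full tree, and then samples among the extant lineages.

```python
import random


def simulate_birth_death(birth_rate, death_rate, max_time, seed=None):
    """Simulate lineage counts under a constant-rate birth-death process.

    Starts with one lineage at time 0 and returns a list of
    (time, lineage_count) pairs, one per event, up to max_time.
    Illustrative only: rates are assumed constant per lineage.
    """
    rng = random.Random(seed)
    time, lineages = 0.0, 1
    history = [(time, lineages)]
    while lineages > 0:
        total_rate = lineages * (birth_rate + death_rate)
        if total_rate <= 0.0:
            break  # nothing can happen if both rates are zero
        # Waiting time to the next event is exponential with the total rate.
        time += rng.expovariate(total_rate)
        if time > max_time:
            break
        # The event is a birth with probability birth / (birth + death).
        if rng.random() < birth_rate / (birth_rate + death_rate):
            lineages += 1
        else:
            lineages -= 1
        history.append((time, lineages))
    return history


def sample_extant(lineage_count, sampling_fraction, seed=None):
    """Return how many extant lineages are kept when each one is
    sampled independently with probability sampling_fraction."""
    rng = random.Random(seed)
    return sum(1 for _ in range(lineage_count) if rng.random() < sampling_fraction)
```

Calling simulate_birth_death(1.0, 0.5, 3.0, seed=2) gives one realization; the last entry of the returned list holds the number of lineages at the final event before max_time, which can be passed to sample_extant.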
Andrew Rambaut of the Department of Zoology, University of Oxford (andrew.rambaut@zoo.ox.ac.uk), has written End-Epi (Endemic-Epidemic) version 1.0, a program to examine trees to assess relative cladogenesis (whether there is evidence that one clade has speciated more than another), and make lineages-through-time plots, with the objective of discovering whether the rate of speciation has been constant through time ("endemic") or has been higher initially ("epidemic"). The program is available free for Macintoshes with system 7.0 or later, from its Web page at http://evolve.zoo.ox.ac.uk/End-Epi/End-Epi.html or by ftp from evolve.zoo.ox.ac.uk in directory packages as file End-Epi10.hqx. However, the Macintosh executable will not work with operating systems later than version 7.5, unless you switch off Modern Memory Manager in the control panel first.
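To make the idea of a lineages-through-time plot concrete, here is a small Python sketch. It is not taken from End-Epi. It assumes a fully bifurcating tree whose internal node times are measured forward from the root, so that the root produces two lineages and every later node adds one more. The function names are hypothetical.

```python
import math


def lineages_through_time(node_times):
    """Return (time, lineage_count) steps for a bifurcating tree.

    node_times holds the times of all internal nodes, measured forward
    from the root (the earliest node). The root yields 2 lineages and
    each later bifurcation adds one.
    """
    return [(t, i + 2) for i, t in enumerate(sorted(node_times))]


def log_lineage_points(node_times):
    """Same steps with the lineage count on a natural-log scale, the
    scale on which a constant rate of pure birth looks roughly linear."""
    return [(t, math.log(n)) for t, n in lineages_through_time(node_times)]
```

Plotting the points from log_lineage_points against time gives the usual picture: a curve that rises steeply at first and then flattens suggests a rate of speciation that was higher initially.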
Paul-Michael Agapow of the Department of Biology, Imperial College, Silwood Park, U.K. (p.agapow@ic.ac.uk) has released MacroCAIC, which was developed from CAIC, by Andy Purvis and Andrew Rambaut. MacroCAIC uses phylogenies and data sets of character values to examine correlates of species richness in the phylogeny. MacroCAIC is a PowerMac and Mac binary executable. It is available from its web site at http://evolve.bio.ic.ac.uk/software/macrocaic/index.html.
Nick Grassly, currently of the Zoologisches Institut, Universität München (grassly@zi.biologie.uni-muenchen.de), has written SEQEVOLVE, a program that takes standard (Newick) formatted treefiles and evolves sequences along them following a stochastic process with the expected number and type of substitutions calculated according to a model of molecular evolution. A variety of nucleotide substitution models are implemented: Jukes and Cantor (1969), Kimura (1980), Felsenstein (1981), Hasegawa et al. (1985), and the DNAML model from PHYLIP (Felsenstein, 1995). A PowerMacintosh and Macintosh executable is available, as well as source code files for Unix systems. SEQEVOLVE does not allow for rate heterogeneity among sites or among codon positions as his more recent program Seq-Gen does. SEQEVOLVE is available by ftp from evolve.zoo.ox.ac.uk in directory packages/grassly/Seqevolve as files seqevolve-mac.hqx or seqevolve.tar.Z.
John Huelsenbeck (johnh@brahms.biology.rochester.edu) of the Department of Biology of the University of Rochester has written TheSiminator, a program that simulates the evolution of nucleotide sequences along a given tree or trees. It allows for gamma-distributed rate variation among sites, and the Hasegawa-Kishino-Yano 1985 model of nucleotide substitution. It is distributed as C source code and as a Macintosh executable, with examples of input files. It can be fetched from the Slatkin Lab's software Web page at http://ib.berkeley.edu//labs/slatkin/software.html.
Andrew Rambaut of the Department of Zoology, University of Oxford (andrew.rambaut@zoo.ox.ac.uk), and Nick Grassly, currently of the Zoologisches Institut, Universität München (grassly@zi.biologie.uni-muenchen.de), have written Seq-Gen (Sequence Generator), version 1.1, a program that will simulate the evolution of nucleotide sequences along a phylogeny or multiple phylogenies, using common models of the substitution process. A range of models of molecular evolution are implemented including the general reversible model. Nucleotide frequencies and other parameters of the model may be given and site-specific rate heterogeneity may also be incorporated in a number of ways. The models available are the Hasegawa, Kishino and Yano (HKY) model, the Felsenstein F84 model, the general reversible model, the Kimura 2-parameter model and the Jukes-Cantor model. Rate heterogeneity among sites or among the different positions within a codon can be specified. A PowerMacintosh and Macintosh executable is available, as well as source code files for Unix systems. It is available from its Web page at http://evolve.zoo.ox.ac.uk/Seq-Gen/Seq-Gen.html or by ftp from evolve.zoo.ox.ac.uk in directory packages/Seq-Gen as files Seq-Gen10.hqx or Seq-Gen10.tar.Z.
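As an illustration of what such simulators do along a single branch, the sketch below evolves a nucleotide sequence under the Jukes-Cantor model, the simplest of the models named above. It is not Seq-Gen's implementation; it assumes equal rates at all sites and a branch length given in expected substitutions per site, and the function names are hypothetical. Under Jukes-Cantor the probability that a site ends in a different base after a branch of length d is 3/4 (1 - exp(-4d/3)), with the three other bases equally likely.

```python
import math
import random

BASES = "ACGT"


def jc69_change_probability(branch_length):
    """Probability that a site differs at the end of a branch under
    Jukes-Cantor, with branch_length in expected substitutions per site."""
    return 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))


def evolve_jc69(sequence, branch_length, rng):
    """Return a descendant of sequence after one branch, treating every
    site independently (no rate heterogeneity among sites)."""
    p_change = jc69_change_probability(branch_length)
    descendant = []
    for base in sequence:
        if rng.random() < p_change:
            # Each of the three other bases is equally likely.
            descendant.append(rng.choice([b for b in BASES if b != base]))
        else:
            descendant.append(base)
    return "".join(descendant)
```

A whole tree can be simulated by applying evolve_jc69 recursively from the root sequence down each branch, for example evolve_jc69("ACGTACGT", 0.1, random.Random(1)) for one branch of length 0.1.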
Nick Grassly (currently of the Zoologisches Institut, Universität München) and Andrew Rambaut of the Department of Zoology, University of Oxford (grassly@zi.biologie.uni-muenchen.de and andrew.rambaut@zoo.ox.ac.uk), have written PSeq-Gen (Protein-Sequence Generator), version 1.0, which will simulate the evolution of protein sequences along evolutionary trees. Three common models of amino acid substitution are implemented (PAM, JTT, and mREV), allowing for user-defined amino acid frequencies. Site-specific rate heterogeneity following a gamma distribution is allowed. The program can handle multiple trees and produce multiple data sets. PSeq-Gen is available from its Web site at http://evolve.zoo.ox.ac.uk/PSeq-Gen/PSeq-Gen.html as Unix source code and also as PowerMac executables. An online manual can also be viewed at that site.
Nick Grassly (grassly@zi.biologie.uni-muenchen.de) of the Zoologisches Institut, Universität München, and Andrew Rambaut, of the Department of Zoology, University of Oxford, have released Treevolve, version 1.32, and also Ptreevolve, programs that simulate the evolution of DNA and protein sequences respectively. The molecular sequences are simulated under coalescent models with constant population size, or with exponential population size growth. In addition, different levels of recombination can be specified. In Treevolve, it is also possible to have an island model of population subdivision. Treevolve and Ptreevolve are written in ANSI C and should compile on most UNIX systems and workstations. They will also compile using Metrowerks Codewarrior on the Apple Macintosh; a project file and compiled 'fat' executable are included in the Macintosh archive. They can be obtained, and the manual of the programs viewed, from their Web site http://evolve.zoo.ox.ac.uk/Treevolve/treevolve.html. They can also be obtained by anonymous ftp from evolve.zoo.ox.ac.uk in directories packages/Treevolve or packages/Ptreevolve as compressed tar archives (for the Unix source code version) or Binhexed archives for the Macintosh executables and sources.
Jens Stoye, Dirk Evers and Folker Meyer of the Research Center for Interdisciplinary Studies on Structure Formation (FSPM) and the Technische Fakultät of the University of Bielefeld, Germany (j.stoye@dkfz-heidelberg.de, dirk@TechFak.Uni-Bielefeld.de, and folker@TechFak.Uni-Bielefeld.de) have released ROSE, the Random model Of Sequence Evolution, version 1.0.1. It simulates the evolution of DNA, RNA, or protein sequences on a randomly generated tree, allowing for the possibility of insertions and deletions as well. It can also use a predefined tree that is input in standard format.
It can report ancestral sequences or sequences at the tips of the tree, and it also keeps a record of the true multiple sequence alignment for comparison with the results of multiple sequence alignment programs. ROSE is described in the paper: Stoye, J., D. Evers and F. Meyer. 1998. Rose: generating sequence families. Bioinformatics 14: 157-163. ROSE is available in source code at its web site at http://bibiserv.TechFak.Uni-Bielefeld.DE/cgi-bin/bibi_download?tool=rose. Version 1.0 is available as binary executables for SunOS and for SGI Unix by anonymous ftp from ftp.Uni-Bielefeld.de in directory pub/projects/techfak/pi/rose/. ROSE is also available as a server.
Dmitry Filatov, of the Institute of Cell, Animal, and Population Biology of the University of Edinburgh (Dmitry.Filatov@ed.ac.uk) has released ProSeq (PROcessor of SEQuences) version 2.4. ProSeq is a sequence-editing environment that can do sequence alignment editing, translation, detection of polymorphic sites, and a variety of tests, many of a population-genetic nature, for neutrality and recombination. The part of its capabilities that is relevant to this listing is that it can simulate the evolution of a set of DNA sequences along a coalescent tree, with or without recombination. ProSeq is a Windows 95/98/NT program available from its web site at http://helios.bto.ed.ac.uk/evolgen/filatov/proseq.html.
John Huelsenbeck (johnh@brahms.biology.rochester.edu) of the Department of Biology of the University of Rochester has written StratCon, a program to test the consistency of a tree with stratigraphy of the species. It uses a permutation test described in the paper Huelsenbeck, J. 1994. Measuring and testing the fit of the stratigraphic record to phylogenetic trees. Paleobiology 20: 470-483. The program is available as a Macintosh executable. It can be fetched from the Slatkin Lab's software Web page at http://ib.berkeley.edu//labs/slatkin/software.html.
Andrew Rambaut (andrew.rambaut@zoo.ox.ac.uk) of the Department of Zoology, University of Oxford, has released QDate version 1.1. QDate estimates the date of divergence between two pairs of sequences given that the date of divergence of the members of each pair is known. It analyzes the data under three models: (1) a perfectly clocklike model, (2) a model in which one pair has a different rate of divergence than the other, and (3) a model in which all branches have different rates. The method is described in the paper: Rambaut, A., and L. Bromham. 1998. Estimating divergence dates from molecular sequences. Molecular Biology and Evolution 15: 442-. QDate is available from its web site at http://evolve.zoo.ox.ac.uk/QDate/QDate.html. It is available as C source code for Unix or as a Macintosh executable.
Andrew Rambaut, of the Department of Zoology, University of Oxford (andrew.rambaut@zoo.ox.ac.uk) has written TipDate version 1.01. TipDate is an application for estimating the rate of molecular evolution (and hence a time-scale) for a phylogeny consisting of dated tips. These will most frequently be from viruses or other fast-evolving pathogens that have been isolated over a range of dates. The program can also return the likelihood for the simple molecular clock model (i.e., assuming that all sequences are contemporary) or the non-clock model. These are useful for likelihood ratio tests of the fit of the model to the data. TipDate is available as PowerMac or Windows executables and as source code for Unix from its web site at http://evolve.zoo.ox.ac.uk/TipDate/TipDate.html.
Emmanuel Paradis, now of the Department of Biological Sciences of the University of East Anglia (E.Paradis@uea.ac.uk) released, when he was at the Institut des Sciences de l'Evolution de Montpellier, DIVERSI, a program for the analysis of diversification using phylogenetic data. It uses several methods to estimate and test for variations in diversification rates using phylogenetic data, including tests for temporal or among-clade variations in diversification rates using a maximum likelihood method. The program takes divergence times as its input. The tests are described in a paper: Paradis, E. 1997. Assessing temporal variations in diversification rates from phylogenies: estimation and hypothesis testing. Proceedings of the Royal Society of London B 264: 1141-1147. It is available as FORTRAN source code and also as a DOS executable, by ftp from evol.isem.univ-montp2.fr in directory /pub/pc/Log-manu.
Marc Robinson-Rechavi, of the Laboratoire de Biologie Moléculaire et Cellulaire of the École Normale Supérieure de Lyon, France (marc.robinson@ens-lyon.fr) has written RRTree (Relative Rate tests within a Tree), version 1.1.
It carries out relative rate tests for equality of evolutionary rates in DNA or protein sequences between lineages, taking into account the structure of the tree, which can be input in a number of common formats. In addition sequences are read in. The methods are described in a paper: Robinson, M., M. Gouy, C. Gautier, and D. Mouchiroud. 1998. Sensitivity of the relative-rate test to taxonomic sampling. Molecular Biology and Evolution 15: 1091-1098. RRTree is available as C source code and as executables for Windows, PowerMac, and SGI or Solaris Unix systems from its web page at http://pbil.univ-lyon1.fr/software/rrtree.html or by anonymous ftp from ftp://pbil.univ-lyon1.fr in directory /pub/mol_phylogeny/rrtree/
Mike Maciukenas, at the Department of Microbiology of the University of Illinois, has written a wonderful X-windows based interactive tree-plotting program called TreeTool. It takes as input a PHYLIP tree file, with branch lengths if they are provided, displays the tree in either rooted or unrooted form on any X-windows screen, and allows the user to modify the form of the tree and the placement of nodes and labels. When the tree is in final form the user can have it written to a Postscript file and/or printed to a Postscript-compatible printer. TreeTool is free as a C program for X windows using the Xview library. However, Xview seems to be available mostly on Sun workstations. TreeTool is available by Web from http://www.cme.msu.edu/RDP/cgis/aftpdir_show.cgi?ftpdir=pub/RDP/programs/TreeTool&title=Phylogenetic%20tree%20editor%20(TreeTool;%20Unix;%20C)&showdir=yes. You can ftp the current TreeTool version 2.0.2 with source from rdp.life.uiuc.edu, from directory pub/RDP/programs/TreeTool. It will also be found in directory molbio/unix/treetool at iubio.bio.indiana.edu. It is also included in the GDE 2.0 sequence analysis environment mentioned above. A Debian Linux package for Intel-compatible processors is available at its web page at http://www.debian.org/Packages/unstable/x11/treetool.html. It has links to the Debian package for Xview.
Rod Page of the University of Glasgow, Scotland (dpage@udcf.gla.ac.uk), has written TreeView, a program for displaying trees on Apple Macs and Windows PCs. It can draw rooted and unrooted trees, display bootstrap values, and supports the native font and graphics file formats of both Macs and PCs. The program reads NEXUS, PHYLIP, and Hennig86 style tree files (including files produced by fastDNAml and CLUSTALW), and can save trees in the same formats so that it can convert trees among these formats. The Mac and Windows versions have almost identical interfaces. They can support the standard TrueType and Postscript fonts available on Macs and PCs, and they support the standard PICT and Windows Metafile formats for output, allowing tree pictures to be copied into other applications, as well as being saved in files. There is a MacClade/COMPONENT-style interactive tree editor, and tree descriptions can be pasted directly into the program. There are print preview and drag-and-drop facilities. Currently (version 1.5) TreeView can read up to 100 trees with up to 500 taxa. The program is free, and can be obtained by World Wide Web from http://taxonomy.zoology.gla.ac.uk/rod/treeview.html. It comes in 68K Mac, PowerMac, and Windows 95/NT executable versions (and in a Windows 3.1 executable for version 1.4). There is also online help including an online manual.
Manolo Gouy of the University of Lyon, France (mgouy@biomserv.univ-lyon1.fr), has produced NJplot, which plots rooted phylogenies (input in the standard form) and saves the plots as Postscript (for Macintosh, PICT) files. It displays branch lengths and bootstrap information (if present) and allows the user to swap branches and change the position of the root. It is described in the paper Perrière, G. and M. Gouy 1996. WWW-Query: an on-line retrieval system for biological sequence banks. Biochimie 78: 364-369. It is available free as executables for Macintosh, Windows3.1, Windows95, SunOS, SGI, IBM Unix, Linux and DEC Alpha, and as source code in C. It can be retrieved using its web page at http://pbil.univ-lyon1.fr/software/njplot.html. A Debian Linux package is available from its web page at http://www.debian.org/Packages/unstable/x11/njplot.html. It has links to a number of other Debian packages that are needed to run it, including Lesstif and the NCBI Vibrant toolkit.
Manolo Gouy of the University of Lyon, France (mgouy@biomserv.univ-lyon1.fr), has written unrooted, which draws unrooted phylogenies and saves them in Postscript files (for Macintosh, PICT files). It is available free as executables for Macintosh, Windows95, SunOS, Sun Solaris, SGI, IBM Unix, Linux and DEC Alpha, and as source code in C. It can be retrieved using its web page at http://pbil.univ-lyon1.fr/software/unrooted.html (the Windows95 and Linux executables are not listed on the web page but may be accessed through the ftp link on that page, which accesses the ftp server at pbil.univ-lyon1.fr in directory pub/mol_phylogeny/njplot/).
Tadashi Imanishi (timanish@genes.nig.ac.jp) of the National Institute of Genetics, Mishima, Japan has produced DendroMaker, version 4.1, Macintosh or PowerMacintosh programs which can draw trees on the screen and print them in Postscript files or MacPaint files. They read the trees in from tree files produced by the Neighbor-Joining or UPGMA options of the Oden package, and can also read standard Newick-format tree files. They can then edit them and reroot them in a variety of ways. They can produce PICT files of the trees. DendroMaker is distributed as several Macintosh binaries or PowerMacintosh binaries (each available in both English and Japanese versions) from its web page at http://www.cib.nig.ac.jp/dda/timanish/dendromaker/home.html and by anonymous ftp from ftp.nig.ac.jp in directory pub/mac/bio/dendromaker.
Don Gilbert (gilbertd@bio.indiana.edu) of the Department of Biology of the University of Indiana, has written Tree Draw Deck, a Hypercard deck that draws trees on the screen from standard Newick-format tree files. The deck, which runs on Macintoshes or PowerMacs that have Hypercard Player, is based on two programs, Drawtree and Drawgram, from the 3.3 version of PHYLIP. It allows mouse-driven interactive display of the trees with selection of options for display, and allows the resulting plot to be cut to the system clipboard and pasted into drawing programs such as MacDraw or Canvas. Its printing and file-saving options do not at present work, but the clipboard method works. It is available by anonymous ftp from ftp.bio.indiana.edu in directory molbio, from ftp.ebi.ac.uk in directory pub/software/mac, and by ftp from ftp.pasteur.fr in directory pub/GenSoft/Macintosh/evolution/. A Macintosh version is available from a web page at http://yeamob.pci.chemie.uni-tuebingen.de/AAA/AAA-Tree.html by Kai-Uwe Frölich of Tübingen, Germany.
Don Gilbert (gilbertd@bio.indiana.edu) of the Department of Biology of the University of Indiana, has written Phylodendron version 0.8d, a Java application for drawing phylogenetic trees. It will read tree data in standard Newick format, then display graphical views of the phylogenetic tree. Various options allow you to modify, adorn and edit the tree. Standard application functions to save, print, edit and manage preferences are included. This program will not estimate or produce the tree data. Phylodendron is written as a Java application. This means that it will run on most personal computers and workstations as a standard program. This application is an enhancement of the Mac Hypercard program Tree Draw Deck released by the author in 1990, and uses tree drawing algorithms from PHYLIP. Phylodendron is available from its Web site
http://iubio.bio.indiana.edu/soft/molbio/java/apps/trees/ or by anonymous ftp from iubio.bio.indiana.edu in directory molbio/java/apps/trees/. Source code is also available there.
Rick Ree of the Department of Organismic and Evolutionary Biology, Harvard University (rree@oeb.harvard.edu) has written Mavric, a package in the Python language for the manipulation and visualization of phylogenetic data. It is intended to be flexible and easily customizable to suit the needs of phylogenetic biologists who use Python. Mavric is named for its core application, which aims to provide a graphical interface to phylogenies in a manner similar to MacClade. (The name Mavric is from its central application: a tool to Manipulate And Visualize RIck's Cladograms). Currently, its main usefulness is to view and manipulate phylogenetic trees. Branches can be moved around, pruned, rotated, etc. Mavric is available as Python source code from its web site at http://www.bioinformatics.org/mavric/index.html.
Andrew Rambaut (andrew.rambaut@zoo.ox.ac.uk) and Mike Charleston, both of the Department of Zoology, University of Oxford have produced the TreeEdit Phylogenetic Tree Editor version 1.0 alpha3-47. TreeEdit is an application for organizing, manipulating and viewing sets of trees. It is intended as a tool for preparing sets of trees for use in phylogeny packages. It can read and write trees in standard formats (PHYLIP, NEXUS, and CAIC). It allows drag-and-drop editing of branches, cut and paste of trees between windows (as NEXUS text), rerooting by clicking on branches, editing species labels, rotating branch order at polytomies, including and excluding species, and many other interactive features. TreeEdit is available as a PowerMac executable for MacOS System 8 or later at its web site at http://evolve.zoo.ox.ac.uk/software/TreeEdit/main.html.
Dmitri Yu. Sherbakov of the Laboratory of Molecular Systematics, Limnological Institute, Russian Academy of Sciences, Irkutsk (dysh@sherb.lin.irk.ru) has written UO (User Option), a tree-making utility for Linux and similar Unixes which allows the user to write Newick standard treefiles manually. It gets species names from a sequence file in Sequential PHYLIP format with up to 150 sequences, then allows you to build multiple trees by clicking on species names. It allows multifurcations. UO is distributed as C sources and Linux binaries from its web page at http://sherb.lin.irk.ru/uo.html. It requires X windows and the XForms library.
Steven Brewer (Steven.Brewer@wmich.edu) and Robert Hafner of Western Michigan University have developed Phylogenetic Investigator, a teaching program that allows students to connect together organisms from a data set provided to them, to make phylogenies and examine them. The program is written as a Supercard 2.0 stack for Macintosh and PowerMacintosh systems; it is also available as a 680x0 Macintosh or PowerMac standalone executable. Version 1.6 was the last freeware version.
A more recent version of Phylogenetic Investigator appears in the yearly BioQUEST Library CDROM from the BioQUEST consortium (whose web page is at http://bioquest.org), a nonprofit publisher of interesting biological teaching software. It is listed on one of the BioQuest web pages (at http://www.apnet.com/bioquest/review.htm) as one of the new programs that have received a favorable initial review and are being released for use, and are candidates for inclusion in the core of the BioQuest teaching programs. It is not free; the cost of the CDROM with an individual license for one year is $99, but for that you get many different teaching programs. Site licenses for the BIOQUEST Library are $650 for single academic departments or institutions and $350 for secondary schools and secondary school districts.
To next section of software page
http://evolution.genetics.washington.edu/phylip/software.etc2.html (8 of 8) [14/11/2000 5:28:44 pm]
Phylogeny programs by method and system
Phylogeny programs cross-referenced by method and system
This table lists packages by method (down the side of the table) cross-referenced with systems on which the programs work (across the top). I have used the descriptions provided by the authors of the packages where possible, but errors and omissions are entirely possible. If you see any, please e-mail me (joe@genetics.washington.edu) about them.
Some packages are distributed only as generic source code (such as ANSI C source code). I have judged that they will be compilable mostly on Unix systems, unless specific support is provided by the author for compilers on PCs, Macs or VMS systems. So I have listed them under "Unix" only. If you feel comfortable running the compiler on a non-Unix system you might also want to look into the Unix listings to see which packages are available as generic source code. (Note, though, that there are now implementations of the Gnu C++ compiler for Windows systems, such as CygWin and Mingw32, that enable generic C source code to be compiled there using only Makefiles.)
Programs in the Java language are listed here under Unix, Windows, and Macintosh, as Java is widely available on these systems. Programs requiring interpreted languages such as Tcl/Tk and Python are listed under Unix but may be runnable on other systems if you have the language loaded.
System on which they work: Unix (or generic source code only); DOS (or in a Windows "DOS box"); Windows; Mac or PowerMac; VMS or OpenVMS
Method
General-purpose
Unix (or generic source code only): PHYLIP, PAUP*, Fitch programs, Phylo_win, ARB, PAL
DOS (or in a Windows "DOS box"): PHYLIP, PAUP*, MEGA, VOSTORG, Fitch programs
Windows: PHYLIP, PAUP*, DAMBE, PAL
Mac or PowerMac: PHYLIP, PAUP*, PAL
VMS or OpenVMS: PHYLIP, PAUP*, Phylo_win
http://evolution.genetics.washington.edu/phylip/software.xref.html (1 of 7) [14/11/2000 5:28:52 pm]
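Parsimony
Unix (or generic source code only): PHYLIP, PAUP*, Fitch programs, gmaes, Phylo_win, sog, LVB, TAAR, ARB, MALIGN, POY
DOS (or in a Windows "DOS box"): PHYLIP, PAUP*, Hennig86, MEGA, RA, Nona, TurboTree, Freqpars, Fitch programs, MALIGN, POY, Gambit
Windows: PHYLIP, PAUP*, Tree Gardener, GeneTree, DAMBE, LVB, Nona, DNASEP, SEPAL
Mac or PowerMac: PHYLIP, PAUP*, CAFCA, GeneTree
VMS or OpenVMS: PHYLIP, PAUP*, Phylo_win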
Distance matrix
Unix (or generic source code only): PHYLIP, PAUP*, ODEN, Fitch programs, GCG, SeqPup, Lintre, njbafd, gmaes, TreePack, Phylo_win, BIONJ, ARB, BIOSYS-2, Darwin, sendbs, nneighbor, weighbor, QR2, minspnet, PAL, Arlequin, vCEBL
DOS (or in a Windows "DOS box"): PHYLIP, PAUP*, MEGA, Fitch programs, ABLE, TREECON, DISPAN, RESTSITE, NTSYSpc, METREE, Hadtree, PHYLTEST, njbafd, BIONJ, Lintre, BIOSYS-2, T-REX, sendbs, weighbor, Gambit
Windows: PHYLIP, PAUP*, TREECON, GDA, SeqPup, WET, Molecular Analyst, BIONJ, TFPGA, DAMBE, DNASIS, PAL, Arlequin, vCEBL
Mac or PowerMac: PHYLIP, PAUP*, MacT, TreeTree, SeqPup, Molecular Analyst, T-REX, weighbor, PAL, Arlequin, vCEBL
VMS or OpenVMS: PHYLIP, PAUP*, GCG, Phylo_win
Compute distances
Unix (or generic source code only): PHYLIP, PAUP*, MARKOV, RSVP, Microsat, OSA, TREE-PUZZLE, GCG, AMP, GCUA, DERANGE2, BIOSYS-2, RAPD-PCR, DISTANCE, Darwin, sendbs, PAML, puzzleboot, PAL, Arlequin
DOS (or in a Windows "DOS box"): PHYLIP, PAUP*, Microsat, DIPLOMO, DISPAN, RESTSITE, NTSYSpc, TREE-PUZZLE, Hadtree, REAP, MVSP, BIOSYS-2, RAPD-PCR, sendbs, K2WuLi
Windows: PHYLIP, PAUP*, TREECON, GDA, WET, TFPGA, MVSP, RSTCALC, Genetix, Arlequin, DAMBE, DnaSP, PAML, PAL
Mac or PowerMac: PHYLIP, PAUP*, RAPDistance, Microsat, TREE-PUZZLE, GCUA, POPGENE, GeneStrut, PAML, MATRIX, PAL, Arlequin
VMS or OpenVMS: PHYLIP, PAUP*, MARKOV, Microsat, TREE-PUZZLE, GCG
Maximum likelihood
Unix (or generic source code only): PHYLIP, PAUP*, fastDNAml, MOLPHY, PAML, SplitsTree, PLATO, SPOT, TREE-PUZZLE, SeqPup, Phylo_win, ARB, Darwin, BAMBE, TreeCons, VeryfastDNAml, PAL
PHYLIP PAUP* PHYLIP MOLPHY PAUP* SeqPup SPOT Spectrum TREE-PUZZLE DAMBE Hadtree PAML PAL Modeltest
PHYLIP PAUP* fastDNAml PAML Spectrum SplitsTree PLATO SPOT TREE-PUZZLE SeqPup Modeltest PAL
PHYLIP PAUP* fastDNAml TREE-PUZZLE Phylo_win
Quartets methods
dnarates TREE-PUZZLE STATGEOM SplitsTree Darwin PhyloQuart Willson quartets programs SOTA PHYLIP PAUP* PHYLIP ARB UO PLATO Bootscanning Package TOPAL reticulate RecPars partimatrix LARD PHYLIP PAUP* PARBOOT OSA Lintre sog njbafd BIOSYS-2 RAPD-PCR TreeCons BAMBE puzzleboot
TREE-PUZZLE PHYLTEST GEOMETRY PICA95 Gambit
TREE-PUZZLE SplitsTree TREE-PUZZLE Willson quartets programs
Artificial Intelligence Invariants Tree rearrangement
PHYLIP PAUP* PHYLIP PDAP
PHYLIP PAUP* PHYLIP WINCLADA
PHYLIP PAUP* MacClade PHYLIP TreeEdit
PHYLIP PAUP* PHYLIP
Recombination
homoplasy test Network
PLATO LARD
Bootstrapping and other measures of support
PHYLIP PAUP* ABLE Random Cladistics DISPAN PHYLTEST njbafd PICA95 TAXEQ2 BIOSYS-2 RAPD-PCR Gambit MEAWILK
PHYLIP PAUP* DAMBE DNASEP SEPAL
PHYLIP PAUP* AutoDecay TreeRot PHYLIP RASA PAUP* DNA Stacks TreeTree CodonBootstrap
Compatibility
COMPROB PHYLIP reticulate partimatrix CLINCH
PHYLIP PICA95 CLINCH MEAWILK NTSYSpc PHYLIP PAUP* REDCON TAXEQ2 TreeCons QUARTET2
PHYLIP SECANT
PHYLIP
PHYLIP
Consensus trees PHYLIP and distances PAUP* between trees
PHYLIP COMPONENT PAUP* TREEMAP TREEMAP PHYLIP PHYLIP COMPONENT PAUP* RadCon
Tree-based Alignment
Unix (or generic source code only): TreeAlign, ClustalW, MALIGN, TAAR, Ctree, POY
DOS (or in a Windows "DOS box"): ClustalW, MALIGN, POY
Windows: GeneDoc, Ctree, DAMBE, DNASIS
Mac or PowerMac: ClustalW, ALIGN
VMS or OpenVMS: ClustalW, TreeAlign, GCG
Biogeographic or host-parasite PHYLIP COMPARE ANCML RIND Fels-Rand PHYLIP COMPARE CMAP CoSta PDAP ACAP ANCML
Comparative method
COMPONENT TREEMAP TREEMAP COMPONENT PHYLIP MacClade PHYLIP CAIC Fels-Rand PA PHYLIP Phylogenetic MacroCAIC Independence Fels-Rand COMPARE
Simulation
Bi-De SEQEVOLVE TheSiminator Seq-Gen P/Treevolve PSeq-Gen COMPARE ROSE PAML PAL COMPARE
COMPARE
Bi-De SEQEVOLVE TheSiminator COMPONENT Seq-Gen COMPARE P/Treevolve PAML PSeq-Gen ProSeq COMPONENT PAL PAML ProSeq COMPARE End-Epi MacroCAIC StratCon QDate Modeltest PAML TipDate RRTree vCEBL CONSERVE PHYLIP PHYLIP PAUP* PAUP* Tree Draw Deck TreeView NJplot PHYLIP Phylodendron TreeView PAUP* NJplot DendroMaker unrooted Phylodendron DAMBE unrooted TREECON Tree Gardener DNA Stacks SeqPup SeqPup BioEdit Singapore Singapore PHYLIP PHYLIP web interface web interface Phylogenetic Investigator
Shapes of trees QDate DIVERSI PAML TipDate RRTree vCEBL
Clocks, dating, and stratigraphy
DIVERSI K2WuLi
PAML RRTree TipDate vCEBL
Prediction of data PHYLIP PAUP* TreeTool Fitch programs Phylodendron ARB NJplot unrooted Mavric PARBOOT GDE SeqPup ARB Singapore PHYLIP web interface
TreeDis
Tree drawing
PHYLIP PAUP*
Data management or job submission
Random Cladistics MUST
Singapore PHYLIP web interface
Teaching
Back to the main Phylogeny Programs page
Old Phylogeny Programs (ones no longer distributed)
To go to top of Software page
Old Phylogeny Programs
(ones no longer distributed)
The programs described in this web page were formerly listed in the Phylogeny Programs web pages but are no longer distributed. The entries describing them have been moved here. The web links and ftp addresses in these listings typically do not work any longer. It may be possible to track down the authors of these packages and get information directly from them. I have added comments at the top of some entries about what I know of their status.
Allan Dickermann, then of the University of Arizona, had a server version of his program HyperPars which analyzed data with recombining sequences, reconstructing histories that included recombinations.
The Houston ftp archive maintained by Dan Davison at the University of Houston (ftp.bchs.uh.edu) seems to have become inactive.
Two groups that once made PHYLIP available on a server basis:
1. The Schering-Plough Research Institute at Boston University, Boston, Massachusetts has a server available using the Singapore software at cyrus.bu.edu/phylip/index.html.
2. The Wistar Institute at the University of Pennsylvania, in Philadelphia, has made PHYLIP available as a server using the Singapore software at http://sing.wistar.upenn.edu:8888/phylip/index.html.
(This program is one that is still in distribution. It has been removed from the list, not because there is anything wrong with it, but because I have reconsidered where the boundary of this listing is. As a data-conversion utility it lies outside it.)
Mauro J. Cavalcanti (maurobio@geocities.com), of the Departamento de Biologia Geral, Universidade Santa Ursula, Rio de Janeiro, Brazil, has written Tonex, a data set translation program for converting Hennig86 data sets into Nexus format. It is available as MSDOS and Windows executables and as Turbo Pascal 7.0 source code at its web site at http://www.geocities.com/RainForest/Vines/8695/software.html#Cladistics .
Not sure this one has been in distribution any time in the last 10 years.
http://evolution.genetics.washington.edu/phylip/software.old.html (1 of 4) [14/11/2000 5:29:04 pm]
J. S. Farris and Mary Mickevich earlier released a package of phylogeny programs, PHYSYS, which, at about $5,000, was extremely expensive (in my opinion, which is certainly a biased one). I am not sure whether, from whom, or under what conditions it is still available.
Kevin Nixon has withdrawn ClaDOS: it is replaced by his more recent program WINCLADA which incorporates the features of ClaDOS and of another program, DADA.
ClaDOS, an interactive program which allows rearrangement of trees and their evaluation, mapping of characters into them, and more, is available for DOS systems from Kevin Nixon, L. H. Bailey Hortorium, Cornell University, 467 Mann Library, Ithaca, New York 14853. Rumor has it that the cost is in the vicinity of $55 US.
I do not know whether these programs are being distributed right now, and if so, from where.
Andrey Zharkikh (zharkikh@hgc6.sph.uth.tmc.edu), (then) of the Genetics Centers at the University of Texas at Houston, has written a series of Unix programs to carry out various phylogeny methods. They are easily compiled on standard Unix C compilers. They include programs for
- Translating, reformatting and printing aligned sequences
- Calculation of evolutionary distances among sequences
- Inferring phylogenetic trees and bootstrapping, by parsimony or neighbor-joining distance methods
- Drawing the tree structure to a PostScript printer
These are available by World Wide Web from http://hgc6.sph.uth.tmc.edu:8080/bootstrap.dir/index.html or by anonymous ftp from hgc6.sph.uth.tmc.edu in directory pub/zharkikh/bootstrap.
I do not think this program is currently publicly distributed.
James Lake (lake@uclaue.mbi.ucla.edu) distributes Evomony, a program for using the "evolutionary parsimony" (invariants) method for inferring phylogenies from DNA or RNA sequences. It runs on 286 or higher DOS systems with at least 500k bytes of memory. A Macintosh version was also contemplated. I do not know what the current distribution arrangements are.
Lake's address is Department of Biology, University of California, Los Angeles, California 90024.
This server no longer responds and that unit has no pointers to these programs on its web pages. I do not know of any place distributing them.
Andrey Zharkikh (zharkikh@hgc6.sph.uth.tmc.edu) of the Genetics Centers at the University of Texas Health Sciences Center in Houston has programs for double-bootstrapping of nucleotide sequences, using his innovative complete-and-partial bootstrap method for getting less biased P values. They are available free by World Wide Web at http://hgc6.sph.uth.tmc.edu:8080/CP-bootstrap.dir, or by anonymous ftp at
hgc6.sph.uth.tmc.edu/pub/zharkikh/double-bootstrap. His technique is described in the paper by Zharkikh, A. and W.-H. Li. 1995. Estimation of confidence in phylogeny: Complete-and-Partial bootstrap technique. Molecular Phylogenetics and Evolution 4: 44-63 and briefly reviewed in Li, W.-H. and A. Zharkikh. 1995. Statistical tests of DNA phylogenies. Systematic Biology 44: 49-63.
There is no sign that Fujitsu still distributes this package, which is a version of a phylogeny program called IDEN that is used at the National Institute of Genetics, Japan (but is not in general distribution).
Fujitsu Ltd. ("a $21 billion [now it's 37.7 billion] global leader in advanced computer, telecommunications, and electronic devices") sells a Fujitsu S family workstation complete with a program, SINCAIDEN, which allows "experimental researchers, even those unfamiliar with such analyses, [to] easily create phylogenetic trees in their own laboratories." The program also allows searches of the major nucleic acid sequence and protein databases (the ad I saw does not make it clear whether these databases are provided with the workstation). The methods available are UPGMA, neighbor-joining, Farris's (Distance Wagner) and the modified Farris distance matrix methods. The workstation is SPARC compatible and runs SunOS. The SINCAIDEN program was developed by the group at the National Institute of Genetics, Japan under Dr. Takashi Gojobori. Fujitsu Ltd. may be contacted at 21-8, Nishi-Shinbashi 3-chome, Minato-ku, Tokyo 105, Japan (phone 81-3-3437-5111 ext. 2831, fax 81-3-5472-4354), or in the U.S. at Fujitsu America Inc., 3055 Orchard Drive, San Jose, California 95134-2017 (phone 1-408-432-1300 ext. 5168, fax 1-408-434-1045). There is a web page in Japanese at http://sinca.fqs.co.jp/InfoSINCA/. Several years ago the price of SINCAIDEN (with workstation) was $28,000.
This teaching program seems no longer to be in distribution (it is not clear that its author decided this).
Arnold G. Kluge (akluge@umich.edu), of the Department of Biology of the University of Michigan, has written Systack, a teaching program designed to teach the principles of synapomorphy/homology analysis in the context of chordate phylogeny. It implements a hierarchical filing system in the form of a phylogeny, with character information available on chordates and with users able to add new characters. It is a Hypercard stack for Macintosh computers, and is available free for noncommercial use. A Web site is available at http://www.ummz.lsa.umich.edu/herps/systack.html to download it.
Perhaps this one has expired because Michael Schoeniger is no longer working there.
Johannes Schaefer and Michael Schoeniger of the Lehrstuhl für Theoretische Chemie of the Technische Universität München (schaefer@theochem.tu-muenchen.de and schoeniger@theochem.tu-muenchen.de) have written DISTREE. It computes pairwise distances (substitution rates) of aligned nucleotide sequences utilizing various models of base substitution. Moreover it provides the user with information on the goodness of fit of the models to the given set of sequence data. Each of the models is implemented in two variants, assuming identical and gamma
distributed substitution rates across sequence sites. It is available as a DOS executable with C source code, or as source code for Unix systems. DISTREE is distributed through its Web page at http://evol10.theochem.tu-muenchen.de/pub,
The ftp links from this page have gone dead.
John Brzustowski (jbrzusto@gpu.srv.ualberta.ca) of the Department of Biological Sciences of the University of Alberta, Canada, has written qclust, a program to carry out a number of clustering methods including Neighbor-Joining. The neighbor-joining method has been improved over our own Neighbor program, so as to be able to handle large numbers of taxa much more quickly. The program is available along with another program, calcdist, which calculates distances from 0/1 data. The programs are available as C source and as DOS executables from its web page at http://www.biology.ualberta.ca/jbrzusto/dosclust.html. It is also available as Java from a web page at http://www.biology.ualberta.ca/jbrzusto/cluster.html and as a web server. These are also available by anonymous ftp from www.biology.ualberta.ca in directory pub/public/jbrzusto/trees.
This program has been withdrawn because it is now contained within their later system vCEBL.
Allen Rodrigo of the Computational and Evolutionary Biology Laboratory, School of Biological Sciences, University of Auckland, Auckland, New Zealand (a.rodrigo@auckland.ac.nz) distributes sUPGMA (Serial sample UPGMA). It reconstructs evolutionary histories/genealogies under the assumption of a molecular clock when sequences are obtained serially in time. Input is a distance matrix. sUPGMA can also carry out the ordinary UPGMA, WPGMA and Neighbor-Joining methods. sUPGMA is a Java applet. It is available for downloading from its web site at http://www.cebl.auckland.ac.nz/ or for use as a server. sUPGMA runs under Netscape Communicator 4.7 or Microsoft Internet Explorer 5 on PCs running Windows 95/98, and Microsoft Internet Explorer 4.5 on Macintoshes. It can also be run in standalone mode if you have a Java runtime engine.
For similar reasons the sUPGMA server has been discontinued.
Allen Rodrigo of the Computational and Evolutionary Biology Laboratory, School of Biological Sciences, University of Auckland, Auckland, New Zealand has a server which runs sUPGMA (Serial sample UPGMA). It reconstructs evolutionary histories/genealogies from distance matrices under the assumption of a molecular clock when sequences are obtained serially in time. It can also carry out UPGMA and WPGMA clustering as well as Neighbor-Joining. The Java code for sUPGMA is also available.
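For readers unfamiliar with the ordinary UPGMA clustering mentioned in this entry, the following is a minimal Python sketch of it, working from a distance matrix and writing the tree in Newick form. It is an illustration only: it is not the sUPGMA code and does not implement the serial-sample variant, and the names `upgma`, `names` and `dist` are made up for this example.

```python
def upgma(names, dist):
    """Ordinary UPGMA on a symmetric distance matrix; returns a Newick string.

    names: list of taxon labels (hypothetical input format)
    dist:  list of lists with dist[i][j] the distance between taxa i and j
    """
    # Active clusters: id -> (Newick text, number of tips, node height)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by the pair of cluster ids
    d = {frozenset((i, j)): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        # Join the closest pair of clusters
        pair = min(d, key=d.get)
        i, j = sorted(pair)
        (ni, si, hi), (nj, sj, hj) = clusters.pop(i), clusters.pop(j)
        h = d.pop(pair) / 2.0  # height of the new node under a clock
        newick = "(%s:%.3f,%s:%.3f)" % (ni, h - hi, nj, h - hj)
        # Distance to each remaining cluster is the size-weighted average
        for k in clusters:
            dik = d.pop(frozenset((i, k)))
            djk = d.pop(frozenset((j, k)))
            d[frozenset((next_id, k))] = (si * dik + sj * djk) / (si + sj)
        clusters[next_id] = (newick, si + sj, h)
        next_id += 1
    (newick, _, _), = clusters.values()
    return newick + ";"


# Small made-up example: four taxa with clock-like distances
example_names = ["A", "B", "C", "D"]
example_dist = [[0, 2, 6, 10],
                [2, 0, 6, 10],
                [6, 6, 0, 10],
                [10, 10, 10, 0]]
print(upgma(example_names, example_dist))
```

The size-weighted averaging step is what distinguishes UPGMA from WPGMA, which would average the two distances equally regardless of cluster size.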
Programs that are and are not listed
To go to top of Software pages
Programs that are and are not listed
Setting the boundaries of a listing such as this is a difficult task, but there have to be some boundaries. I have decided that this list will not include:
- Data conversion utilities. These are important and useful programs that convert your data from the format needed by one program to the format needed by another. As they do not themselves perform any data analysis, I have reluctantly decided not to list them. One, Tonex, was formerly listed here and has been removed from the list.
- Clustering programs that are not intended to infer phylogenies. There is a large and active literature on clustering methods, with many programs in distribution. However, making clusters and inferring phylogenies are not the same thing (though closely related). I have tried to decide which packages had some material on inferring phylogenies, and include only those.
- Tree alignment programs that do not infer trees. Programs or servers that use tree-based alignment of sequences, but do not allow you to output the resulting tree, fall slightly outside of this listing, as any tree that they have is not visible to the outside world. This means that some ClustalW servers are not listed here (those that do not allow you access to the tree).
- Programs for detecting evidence of recombination that are not using trees intensively enough. For example, there are useful and helpful tests that look at pairs of sites but not at anything involving trees more directly. This is a particularly difficult boundary to define but there has to be one here somewhere.
- Sequence editors that do not submit jobs to phylogeny programs. I have listed a number of sequence-editing environments that allow one to automatically invoke phylogeny programs on selected sets of sequences or sets of sites. Sequence editors that do not have these job submission capabilities lie outside the domain of this listing.
I will continue to agonize over these boundaries, but for now this is the policy. If you feel that these policies are wrong, you can always (a) argue with me, or (b) make your own phylogeny programs web pages.
http://evolution.genetics.washington.edu/phylip/software.not.html [14/11/2000 5:29:09 pm]