Mol. Systematics

Q. Q. Fang Homepage

| Q. Fang Home | Course Information | Study Guide | Reference Room | Graduate Programs | Lab_Protocols | Mol. Systematics | Data Analysis | Useful Web Sites |



Molecular Systematics and Population Genetics

The simplest definition of molecular systematics is the study of systematics using molecular data, or the study of systematics at the molecular level.  Reconstructing the phylogenetic relationships of organisms on Earth (building the tree of life) from molecular data is one of the major aims of systematic biology today.


Hillis et al. (1996)* defined three main applications of molecular systematics:

  1. Reconstruction of phylogenetic relationships of organisms.

  2. Studies of population structure, including geographic variation, mating systems, heterozygosity, and individual relatedness.

  3. Identification of species boundaries including hybridization. 

* See Hillis, David M., Craig Moritz, and Barbara K. Mable. 1996. Molecular Systematics. 2nd ed. Sinauer Associates, Inc. & Publishers, Sunderland, Massachusetts, USA.


Phylogenetic Analysis of Molecular Data 

The principle of analyzing molecular data is similar to that of analyzing binary numerical data coded from morphological characters.  However, which method should be used with a particular molecular technique depends on the principles and assumptions of that technique.  The method employed should not violate those principles and assumptions.


Type of Data

All experimental data gathered by molecular techniques fall into one of two broad categories: discrete characters and similarities (or distances).  Discrete data are qualitative: each character takes one of two or more discrete states.  Distance (or similarity) data are quantitative: they vary continuously and are measured on an interval scale.
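The two data categories can be illustrated with a minimal Python sketch. The taxon names and sequences are made up for illustration, and the uncorrected p-distance (the proportion of differing sites) is only one of many possible distance measures; it is an assumption of this example, not a method prescribed above.

```python
# Discrete data: each aligned site is a character with states A, C, G, or T.
# The taxon names and sequences below are hypothetical.
alignment = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "ACGTACGTAT",
    "taxonC": "ACGAACGTTT",
}


def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq1, seq2) if a != b)
    return diffs / len(seq1)


def distance_matrix(aln):
    """Distance data: one continuous value for every pair of taxa."""
    names = sorted(aln)
    return {(x, y): p_distance(aln[x], aln[y]) for x in names for y in names}


matrix = distance_matrix(alignment)
for (x, y), dist in sorted(matrix.items()):
    print(x, y, dist)
```

The alignment itself is the discrete data set; the dictionary returned by `distance_matrix` is the corresponding distance data set, in which the individual character states are no longer visible.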


Maximum Parsimony (MP) method

In this method, the DNA (or amino acid) sequences of ancestral species are inferred from those of extant species, considering a particular tree topology, and the minimum number of evolutionary changes that are required to explain all the observed differences among the sequences is computed.  This number is obtained for all possible topologies, and the topology which shows the smallest number of evolutionary changes is chosen as the final tree.

This method is used mainly for finding the topology of a tree, but branch lengths can be estimated under certain assumptions.  When the MP method is applied to morphological characters, it is customary to assume that the primitive and derived character states are known. 

In the case of molecular data, this assumption generally does not hold, and different character states are often reversible.  It is, therefore, important to use an MP method that permits reversible mutations.  In numerical taxonomy, this type of MP method is sometimes called the Wagner parsimony method.
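The counting step described above can be sketched in Python. This example uses Fitch's algorithm, which treats character states as unordered and freely reversible; it is a stand-in chosen for illustration, not necessarily the exact procedure of any program listed on this page. The four taxa, their sequences, and the function names are hypothetical.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    tree is a taxon name (leaf) or a pair (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The children can share a state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: one change is needed on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Minimum number of changes over all sites for a given topology."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, site)[1]
    return total


# Hypothetical four-taxon alignment and its three possible unrooted
# topologies (each written with an arbitrary root for the computation).
alignment = {"A": "AAGT", "B": "AAGC", "C": "GTGC", "D": "GTAC"}
topologies = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}
scores = {name: parsimony_length(t, alignment) for name, t in topologies.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

With only four taxa every topology can be scored exhaustively. For larger data sets the number of topologies grows too quickly for this, so real programs rely on heuristic or branch-and-bound searches.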


Neighbor-Joining method (Saitou & Nei, 1987)

In contrast to cluster analysis, neighbor joining keeps track of nodes on a tree rather than taxa or clusters of taxa.  The raw data are provided as a distance matrix, and the initial tree is a star tree.  A modified distance matrix is constructed in which the separation between each pair of nodes is adjusted on the basis of their average divergence from all other nodes.

The tree is constructed by linking the least-distant pair of nodes as defined by this modified matrix.  When two nodes are linked, their common ancestral node is added to the tree and the terminal nodes with their respective branches are removed from the tree. 
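One joining step can be sketched in Python as follows. The adjusted separation is computed here with the commonly used formulation Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), where r(i) is the sum of distances from node i to all other nodes. The five-taxon distance matrix and the function name `nj_step` are illustrative assumptions.

```python
from itertools import combinations


def nj_step(nodes, d):
    """Perform one neighbor-joining step (requires at least three nodes).

    nodes: list of node labels.
    d: dict mapping frozenset({x, y}) to the distance between x and y.
    Returns (joined pair, branch lengths to the new node, new node list,
    new distance dict).
    """
    n = len(nodes)
    # Total divergence of each node from all other nodes.
    r = {x: sum(d[frozenset((x, y))] for y in nodes if y != x) for x in nodes}
    # Modified (adjusted) distance matrix.
    q = {(x, y): (n - 2) * d[frozenset((x, y))] - r[x] - r[y]
         for x, y in combinations(nodes, 2)}
    i, j = min(q, key=q.get)  # least-distant pair under the modified matrix
    dij = d[frozenset((i, j))]
    len_i = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
    len_j = dij - len_i
    u = "(" + i + "," + j + ")"  # label of the new ancestral node
    rest = [k for k in nodes if k not in (i, j)]
    new_d = {frozenset((x, y)): d[frozenset((x, y))]
             for x, y in combinations(rest, 2)}
    for k in rest:
        new_d[frozenset((u, k))] = (
            d[frozenset((i, k))] + d[frozenset((j, k))] - dij) / 2
    return (i, j), (len_i, len_j), rest + [u], new_d


# Illustrative distance matrix for five taxa.
labels = ["a", "b", "c", "d", "e"]
rows = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
d = {frozenset((labels[i], labels[j])): rows[i][j]
     for i in range(5) for j in range(i + 1, 5)}

pair, lengths, nodes, d2 = nj_step(labels, d)
print(pair, lengths, nodes)
```

Note that taxa d and e have the smallest raw distance (3), yet a and b are joined first because the adjustment accounts for each node's divergence from all other nodes. Calling `nj_step` again on `nodes` and `d2` continues the process until only a few nodes remain.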


Maximum Likelihood (ML) method (Felsenstein, 1981)

In this method, the nucleotides of all DNA sequences at each nucleotide site are considered separately, and the log-likelihood of observing these nucleotides is computed for a given topology by using a particular probability model.

These log-likelihoods are summed over all nucleotide sites, and the sum is maximized to estimate the branch lengths of the tree.  This procedure is repeated for all possible topologies, and the topology that shows the highest likelihood is chosen as the final one.
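The per-site computation can be sketched in Python with Felsenstein's pruning algorithm. The sketch assumes the Jukes-Cantor (JC69) substitution model with equal base frequencies, which is only one possible probability model. To stay short it evaluates the log-likelihood for fixed, hypothetical branch lengths rather than optimizing them, so it illustrates the scoring step only. The alignment and all function names are made up for illustration.

```python
import math

BASES = "ACGT"


def jc69(t):
    """JC69 probabilities (same base, one specific different base) for a
    branch of length t, measured in expected substitutions per site."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e


def partial(node, site):
    """Conditional likelihoods P(data below node | base x at node).

    node is a taxon name or a tuple (left, t_left, right, t_right).
    """
    if isinstance(node, str):
        return {b: 1.0 if site[node] == b else 0.0 for b in BASES}
    left, t_left, right, t_right = node
    children = ((partial(left, site), t_left), (partial(right, site), t_right))
    out = {}
    for x in BASES:
        prod = 1.0
        for child, t in children:
            same, diff = jc69(t)
            prod *= sum((same if x == y else diff) * child[y] for y in BASES)
        out[x] = prod
    return out


def log_likelihood(tree, alignment):
    """Sum of per-site log-likelihoods under JC69."""
    n_sites = len(next(iter(alignment.values())))
    total = 0.0
    for i in range(n_sites):
        site = {name: seq[i] for name, seq in alignment.items()}
        cond = partial(tree, site)
        total += math.log(sum(0.25 * cond[x] for x in BASES))
    return total


def add_lengths(tree, t):
    """Give every branch of a nested-pair topology the same length t."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    return (add_lengths(left, t), t, add_lengths(right, t), t)


alignment = {"A": "AAGT", "B": "AAGC", "C": "GTGC", "D": "GTAC"}
topologies = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}
# Hypothetical choice: every branch fixed at 0.1 substitutions per site.
loglik = {name: log_likelihood(add_lengths(t, 0.1), alignment)
          for name, t in topologies.items()}
best = max(loglik, key=loglik.get)
print(loglik, best)
```

A full ML analysis would additionally adjust the branch lengths of each topology to maximize its log-likelihood before the topologies are compared.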


Bayesian Analysis

Bayesian analysis is a comparatively recent method of phylogenetic analysis (Rannala and Yang 1996, Mau and Newton 1997, Mau et al. 1999).  It is very similar to the maximum likelihood method but differs in its use of posterior probabilities: probabilities of trees that are estimated from the data under some model and prior expectations.  Instead of seeking the tree that maximizes the likelihood of observing the data, it seeks those trees with the greatest probability given the data.  Bayesian analysis produces a set of trees of roughly equal likelihoods.
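The relationship between likelihood, prior, and posterior probability can be sketched in Python for a handful of candidate trees. The log-likelihood values and the uniform prior below are hypothetical, and each value is assumed to already account for branch lengths and model parameters. Programs such as MrBayes do not enumerate trees this way; they approximate the posterior by Markov chain Monte Carlo sampling.

```python
import math


def posterior(log_likelihoods, priors):
    """Posterior P(tree | data) from log P(data | tree) and prior P(tree)."""
    log_joint = {tree: ll + math.log(priors[tree])
                 for tree, ll in log_likelihoods.items()}
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_joint.values())
    norm = sum(math.exp(v - top) for v in log_joint.values())
    return {tree: math.exp(v - top) / norm for tree, v in log_joint.items()}


# Hypothetical log-likelihoods for the three four-taxon topologies.
log_likelihoods = {
    "((A,B),(C,D))": -100.0,
    "((A,C),(B,D))": -101.0,
    "((A,D),(B,C))": -103.0,
}
priors = {tree: 1.0 / 3.0 for tree in log_likelihoods}  # uniform prior

post = posterior(log_likelihoods, priors)
for tree, p in post.items():
    print(tree, round(p, 3))
```

With a uniform prior the ranking of trees matches their likelihood ranking, but the output is a probability for every tree rather than a single best tree.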


Super Tree

Supertrees are phylogenies (rooted evolutionary trees) assembled from smaller phylogenies that share some but not necessarily all taxa (leaf nodes) in common. Thus, supertrees can make novel statements about relationships of taxa that do not co-occur on any single input tree while still retaining hierarchical information from the input trees. As a method of combining existing phylogenetic information, supertrees potentially solve many of the problems associated with other methods (e.g., absence of homologous characters, incompatible data types, or non-overlapping sets of taxa). In addition to helping synthesize hypotheses of relationships among larger sets of taxa, supertrees can suggest optimal strategies for taxon sampling (either for future supertree construction or for experimental design issues such as choice of outgroups), can reveal emerging patterns in the large knowledge base of phylogenies currently in the literature, and can provide useful tools for comparative biologists who frequently have information about variation across much broader sets of taxa than those found in any one tree. (from: M. J. Sanderson, D. Gusfield, and O. Eulenstein)


Links to Sites of Systematics and Phylogenetics

Computer Software Packages Available for Data Analysis

  • Phylogeny Programs: Joe Felsenstein's site lists 387 phylogeny reconstruction packages and 28 free servers (Department of Genome Sciences and the Department of Biology at the University of Washington).

  • PAUP* 4.0 - Phylogenetic Analysis Using Parsimony and other Methods.

  • PHYLIP - package of programs for inferring phylogenies.

  • MacClade - A useful software package for phylogenetic analysis.

  • Mesquite -  a modular system for evolutionary analysis, by W. P. Maddison and D. R. Maddison. 

  • Hennig86 -  A PC-DOS program for phylogenetic analysis.

  • MEGA -  Molecular Evolutionary Genetics Analysis.  MEGA is a DOS program for analyzing molecular data.  Developed by Sudhir Kumar, Koichiro Tamura and Masatoshi Nei (1993). 

  • MrBayes - A program for constructing phylogenetic trees by the Bayesian method (Huelsenbeck 2000).

  • DAMBE - Data Analysis in Molecular Biology and Evolution, developed by X. Xia at the University of Hong Kong. 

  • LVB -  Reconstructing evolution with parsimony and simulated annealing.

  • Other Programs -  Maintained by Dr. Joe Felsenstein, Department of Genetics, The University of Washington.

  • Clann - Constructing supertrees from partially overlapping datasets, designed and written by Chris Creevey at the Bioinformatics and Pharmacogenomics Laboratory at NUI Maynooth.

  • SuperTree software by Nicolas Salamin, Biophore - 1015 Lausanne - Switzerland.

  • Supertree Server by Mike Sanderson, D. Gusfield, and Oliver Eulenstein, Computational Biology Laboratory, Department of Computer Science, Iowa State University, Ames, IA.

  • Supertree software by Roderic D. M. Page, DEEB, IBLS, University of Glasgow, Glasgow G12 8QQ, United Kingdom.

  • TreeBASE by Bill Piel, M. Donoghue and Mike Sanderson.  TreeBASE is a relational database of phylogenetic information hosted by the Yale Peabody Museum. 



Population Genetics Links 

Other Programs 

  • Softwareseek - a repository and database of freely-distributable and commercial tools for use in molecular biology. 

  • Tree View - a simple program for displaying phylogenies on Apple Macintosh and Windows PCs.


Copyright  1997-present by Q. Q. Fang 
Last modified: November 21, 2008