ESTIMATES OF DIVERGENCE TIMES
Table of Contents:
Files and installation
How to use
(c) Copyright 2003 by Galina Glazko and the Pennsylvania State University. Permission is granted to copy this document provided that no fee is charged for it and that this copyright notice is not removed. TIMER is distributed free of charge by
Institute of Molecular Evolutionary Genetics
University of Rochester
TIMER is designed to estimate divergence times using the linearized-tree approach. Compared with the program LINTREE (Takezaki et al. 1995), TIMER has a more user-friendly interface. More importantly, TIMER can be used to estimate divergence times for concatenated or very large sets of genes (proteins). Several statistical methods are available when multiple genes are used. For detailed information, please refer to our paper (Nei et al., 2001).
In brief, TIMER can fulfill the following tasks:
(1) Constructing phylogenetic trees by the neighbor-joining (NJ) method for individual genes (proteins);
(2) Constructing phylogenetic trees by the neighbor-joining (NJ) method for multiple genes (proteins);
(3) Estimating the branch lengths and divergence times for the individual genes (proteins);
(4) Estimating the branch lengths and divergence times for the concatenated genes (proteins);
(5) Running the Two-Cluster Test to check the rate constancy.
For multiple-gene (protein) data, you can choose any combination of the genes from the list. The estimated divergence times can then be compared with the estimates from an individual gene (protein) or from another combination of genes.
Files and Installation
Download the file "timer.zip" onto your computer. Note that the program is currently available only for Windows PCs. The package contains the following files:
timer.exe - the executable file
input.dat - an example input data file containing alignments of multiple proteins from 7 different species
readme.txt - the document file containing the introduction and help on use of the program
The simplest way to use the TIMER program is as follows:
1. Unzip the package to a folder (e.g. c:\timer) using any compress/uncompress program (e.g. WinZip, PowerArchive);
2. The TIMER program is then ready to use by running the executable file "timer.exe" directly. To run the file, you can try:
START menu -> "Run" option -> "browse" -> select "timer.exe" -> "open" -> "OK"; or type "timer" at the command prompt.
How to use
I. Input Data format:
The sequence input file format is just like the MEGA input format, except:
(1) Gaps are excluded from the sequence alignment, as with the "complete deletion" option in MEGA.
(2) The names of genes and species are marked by the '>' and '#' symbols, respectively.
(3) The order of taxa must be the same for every gene under study, so that concatenated genes are analyzed correctly.
(4) The taxon to be used as the outgroup during divergence time estimation must be placed last in the order of taxa. If you don't know which one to choose, refer to the standard phylogenetic trees as the best hint.
If you have K genes and N taxa, the input file should look like this:
Also, you can refer to the sample input data file included in the package, "input.dat".
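Before loading a file into TIMER, it can help to check rules (1) to (4) above with a small script. The following is only a sketch in Python and is not part of TIMER. It assumes a hypothetical layout in which each gene starts with a '>' line, each taxon starts with a '#' line, and the sequence follows the taxon name on the same line or on the lines below it. The function names and the sample data are made up for illustration; the file "input.dat" remains the authoritative example of the format.

```python
def parse_timer_like(text):
    """Parse text into {gene: [(taxon, sequence), ...]}, keeping taxon order.

    Assumed layout (hypothetical): '>' starts a gene, '#' starts a taxon.
    """
    genes = {}
    current_gene = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current_gene = line[1:].strip()
            genes[current_gene] = []
        elif line.startswith("#"):
            if current_gene is None:
                raise ValueError("taxon line found before any gene line")
            parts = line[1:].split(None, 1)
            if not parts:
                raise ValueError("taxon line without a name")
            seq = parts[1].replace(" ", "") if len(parts) > 1 else ""
            genes[current_gene].append([parts[0], seq])
        else:
            # Continuation of the sequence of the most recent taxon.
            if current_gene is None or not genes[current_gene]:
                raise ValueError("sequence line found before any taxon line")
            genes[current_gene][-1][1] += line.replace(" ", "")
    return {g: [(n, s) for n, s in rows] for g, rows in genes.items()}


def check_rules(genes):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    orders = {g: [n for n, _ in rows] for g, rows in genes.items()}
    reference = next(iter(orders.values()), [])
    for gene, rows in genes.items():
        if orders[gene] != reference:
            problems.append(f"{gene}: taxon order differs from the first gene")
        if len({len(s) for _, s in rows}) > 1:
            problems.append(f"{gene}: sequences have unequal lengths")
        if any("-" in s or "?" in s for _, s in rows):
            problems.append(f"{gene}: gap or missing-data characters present")
    return problems


# Made-up sample: two genes, three taxa, outgroup (chicken) placed last.
sample = """>geneA
#human ACGT
#mouse ACGA
#chicken ACTA
>geneB
#human MKV
#mouse MKI
#chicken MRV
"""
genes = parse_timer_like(sample)
print(check_rules(genes))
```

The last taxon of the first gene (here "chicken") is the one TIMER would treat as the outgroup under rule (4).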
II. Analysis scheme
Once you run the program, you will see five menus in the main window: "File options", "Calibration", "Gene selection", "Individual estimates" and "Multigene estimates". The first three menus are for entering the input data file, setting the parameters for time estimation, and selecting individual or multiple genes (proteins). The last two menus are used for the actual time estimation for individual or multiple genes (proteins). The detailed description is as follows:
(1) "File options" menu:
Load your input data in the "Select input file" window, and select the folder where you want to keep your analysis results in the bottom window;
(2) "Calibration" menu: Specify calibration parameters in the left panel:
a. "species name": the name of the calibration taxon;
b. "Calibration time, Ma": the assumed time, in millions of years ago, at which this taxon diverged from the other taxa;
c. "Alpha parameter": the alpha value, if a gamma distribution model is used.
Note: The alpha value can be estimated with an external program (e.g. GAMMA, Gu and Zhang, 1997).
Specify input data type in the right panel:
a. For proteins, you can use either the "Poisson corrected" model or the "Gamma distance" model;
b. For genes (nucleotide sequences), three options are available: "Kimura", "Kimura gamma" and "Jukes-Cantor" distances.
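For orientation, the textbook formulas behind distance options of these names can be written in a few lines of Python. This is a sketch and not TIMER's source code: it assumes that "Kimura" refers to the Kimura two-parameter distance and that the gamma variants use the alpha value entered in the left panel. Here p is the proportion of differing sites, and P and Q are the proportions of transitional and transversional differences.

```python
import math


def p_distance(seq1, seq2):
    """Proportion of sites that differ between two aligned, gap-free sequences."""
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)


def poisson_corrected(p):
    """Poisson-corrected amino acid distance: d = -ln(1 - p)."""
    return -math.log(1.0 - p)


def gamma_distance(p, alpha):
    """Gamma distance for amino acids: d = a * ((1 - p)^(-1/a) - 1)."""
    return alpha * ((1.0 - p) ** (-1.0 / alpha) - 1.0)


def jukes_cantor(p):
    """Jukes-Cantor nucleotide distance: d = -(3/4) * ln(1 - 4p/3)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p(P, Q):
    """Kimura two-parameter distance from transition (P) and transversion (Q) proportions."""
    return -0.5 * math.log(1.0 - 2.0 * P - Q) - 0.25 * math.log(1.0 - 2.0 * Q)


def kimura_2p_gamma(P, Q, alpha):
    """Gamma version of the Kimura two-parameter distance."""
    a = alpha
    return (a / 2.0) * (
        (1.0 - 2.0 * P - Q) ** (-1.0 / a)
        + 0.5 * (1.0 - 2.0 * Q) ** (-1.0 / a)
        - 1.5
    )


p = p_distance("ACGTACGTAC", "ACGTACGTAA")  # one difference in ten sites
print(p, poisson_corrected(p), gamma_distance(p, alpha=2.0), jukes_cantor(p))
```

As alpha becomes large, the gamma distances approach their non-gamma counterparts, which is a quick way to sanity-check an alpha value.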
(3) "Gene selection" menu:
Select either individual or multiple genes (proteins) for analysis. Also, any kind of combination of genes (proteins) could be easily selected from the list on the left.
(4) Analysis for individual and multiple genes (proteins):
The two menus have the same layout, with three windows. The left window shows the genes (proteins) you selected in the "Gene selection" menu (see above). Once you click the "Analyze!" button below, the analysis starts and the results are shown in the middle and right windows. The right window shows the constructed neighbor-joining (NJ) tree. The middle window shows the estimated divergence time for each internal node, the Z-test value for the evolutionary rate test, and the estimated branch lengths with standard errors. The columns are as follows:
a. DTE + SE column: the estimated divergence time for each internal node and standard error computed by bootstrap.
b. Z-test column: the value of Two-Cluster test (Takezaki et al., 1995) for individual node.
c. BL + SE column: the estimated branch length for external branches and heights for internal nodes, together with standard error computed by bootstrap.
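The relation between node heights and divergence times in a linearized tree can be illustrated as follows. This Python sketch shows the general idea only and is not taken from TIMER: the height of the calibration node divided by the calibration time gives a rate, and every other height is divided by that rate. The node labels and heights are invented for the example.

```python
def times_from_heights(heights, calibration_node, calibration_time_ma):
    """Convert linearized-tree node heights to divergence times in Ma.

    heights: {node label: height in substitutions per site}
    calibration_node: label of the node whose age is assumed to be known
    calibration_time_ma: assumed age of that node, in millions of years
    """
    h_cal = heights[calibration_node]
    if h_cal <= 0 or calibration_time_ma <= 0:
        raise ValueError("calibration height and time must be positive")
    rate = h_cal / calibration_time_ma  # substitutions per site per million years
    return {node: h / rate for node, h in heights.items()}


# Invented heights for three internal nodes; node "B" is calibrated at 90 Ma.
heights = {"A": 0.05, "B": 0.10, "C": 0.20}
times = times_from_heights(heights, "B", 90.0)
print(times)
```

Because all times scale with the single calibration point, an error in the calibration time shifts every estimate by the same factor.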
Note: The standard error is computed by the bootstrap method. In this process, the gene, rather than the amino acid or nucleotide site, is used as the resampling unit.
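To make the idea of resampling genes rather than sites concrete, here is a small Python sketch. It is an illustration under simplifying assumptions, not TIMER's procedure: it bootstraps a single pairwise Poisson-corrected distance computed from concatenated genes, whereas TIMER reports standard errors for divergence times and branch lengths. The per-gene counts and the number of replicates are invented.

```python
import math
import random
import statistics


def concat_distance(gene_counts):
    """Poisson-corrected distance from concatenated genes.

    gene_counts: list of (differences, sites) pairs, one pair per gene.
    """
    diffs = sum(d for d, _ in gene_counts)
    sites = sum(n for _, n in gene_counts)
    return -math.log(1.0 - diffs / sites)


def gene_bootstrap_se(gene_counts, replicates=500, seed=0):
    """Standard error of the distance, resampling whole genes with replacement."""
    rng = random.Random(seed)
    k = len(gene_counts)
    estimates = []
    for _ in range(replicates):
        sample = [gene_counts[rng.randrange(k)] for _ in range(k)]
        estimates.append(concat_distance(sample))
    return statistics.stdev(estimates)


# Invented counts for one pair of species across four genes.
counts = [(10, 100), (30, 200), (5, 150), (22, 180)]
print(concat_distance(counts), gene_bootstrap_se(counts))
```

With genes as the unit, the standard error reflects rate differences among genes, so it shrinks as more genes are added rather than as individual genes get longer.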
Tip: The first time through, you may want to analyze each gene separately to understand its evolutionary rate (refer to the Z-test value). If a gene evolves too fast or too slowly, you may want to exclude it from further time estimation using multiple genes (proteins). Try different combinations and see how the time estimates change.
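The screening step in the tip can be organized with a short script once the Z-test values have been copied out of the result tables. The Python sketch below is an assumption-laden illustration: it treats Z as approximately standard normal and uses 1.96 (the two-sided 5% level) as the cutoff, which is a common convention and not a TIMER setting. The gene names and Z values are invented.

```python
def flag_rate_heterogeneous(z_by_gene, threshold=1.96):
    """Return {gene: largest |Z|} for genes with any node exceeding the cutoff.

    z_by_gene: {gene name: list of Z-test values, one per internal node}
    threshold: assumed cutoff; 1.96 is the two-sided 5% normal critical value
    """
    flagged = {}
    for gene, z_values in z_by_gene.items():
        worst = max((abs(z) for z in z_values), default=0.0)
        if worst > threshold:
            flagged[gene] = worst
    return flagged


# Invented Z values for three genes.
z_table = {
    "geneA": [0.4, -1.1, 0.9],
    "geneB": [2.8, 0.3, -0.5],
    "geneC": [-0.2, -2.1, 1.0],
}
print(flag_rate_heterogeneous(z_table))
```

Genes returned by this function are candidates for exclusion; whether to drop them remains a judgment call based on the tree and the data.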
You can save the analysis results by clicking the small disk icon above the main window or by choosing the "save" option in the "session" main menu at the top. Two files will be saved in the output folder you chose earlier: "Multigene.bmp" and "Multigene" if you selected multiple genes (proteins), or "gene_name.bmp" and "gene_name" if you selected an individual gene (protein).
The "*.bmp" file contains the NJ tree shown in the right window (see above). You can also copy the inferred tree from the window directly into a Word document ("Ctrl+C"). The other file is a text file containing a table with the three columns shown in the middle window (see above).
1. The input data format should be checked carefully. Try to avoid blank characters and empty strings between genes and taxa.
2. The program has not been tested on very large data sets with more than 10 species and 200 genes. In individual gene analysis, the program is sometimes unable to estimate the standard error of the divergence time when the protein is very short.
Please contact firstname.lastname@example.org (Galina Glazko) if you find any problems in using this program. Thank you for your cooperation.
Takezaki, N., A. Rzhetsky, and M. Nei (1995) Phylogenetic test of the molecular clock and linearized trees. Mol. Biol. Evol. 12:823-833.
Gu, X. and J. Zhang (1997) A simple method for estimating the parameter of substitution rate variation among sites. Mol. Biol. Evol. 14:1106-1113.
Nei, M., P. Xu, and G. Glazko (2001) Estimation of divergence times from multiprotein sequences for a few mammalian species and several distantly related organisms. Proc. Natl. Acad. Sci. 98:2497-2502.
Glazko, G. V. and M. Nei (2003) Estimation of divergence times for major lineages of primate species. Mol. Biol. Evol. 20:424-434.
Department of Biology
Eberly College of Science