A Modified Nei-Gojobori Method for Computing Synonymous and Nonsynonymous Distances
Institute of Molecular Evolutionary Genetics
and Department of Biology
322 Mueller Laboratory
The Pennsylvania State University
University Park, PA 16802, USA
Associate Professor of Ecology
and Evolutionary Biology
University of Michigan
Ann Arbor, MI
Zhang J., H. F. Rosenberg, and M. Nei (1998) Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc. Natl. Acad. Sci. USA 95:3708-3713
NG-NEW estimates synonymous and nonsynonymous distances between protein-coding DNA sequences. It modifies the original Nei and Gojobori (1986) method to take transition bias into account. The program is written in C and runs on IBM PC-compatible computers under the Windows 95 operating system.
First make sure that the diskette you have received contains the following files.
To install NG-NEW on your computer's hard disk drive (drive "C" is used
here as an example), create a directory to hold the files of this
package. To do this, type the following: c:\md ng-new
To use the program, you need an input file containing the protein-coding DNA sequences (see rnase.seq for an example). The first line of this file gives two numbers: the number of sequences and the number of nucleotides per sequence (sequence length). The second line gives the name of the first sequence, the third line gives the first sequence itself, and the remaining sequences follow in the same name-then-sequence pattern. Only the characters A, G, C, T, a, g, c, and t are allowed. Sequences must be aligned beforehand, with all gaps removed. They should include only protein-coding regions, with stop codons removed.
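The input layout described above can be checked before running the program. The following is a minimal Python sketch, not part of NG-NEW itself. The function name parse_ng_input is hypothetical, and the sketch assumes that each sequence occupies exactly one line and that blank lines can be ignored; neither assumption is confirmed by the program's documentation.

```python
VALID_BASES = set("AGCTagct")  # the only characters the format allows


def parse_ng_input(text):
    """Parse and validate text in the NG-NEW input layout (hypothetical helper).

    Assumes one line per sequence name and one line per sequence.
    Returns a list of (name, uppercase_sequence) tuples.
    """
    # Drop blank lines (an assumption; the real program may be stricter).
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty input")

    # First line: number of sequences and number of nucleotides per sequence.
    header = lines[0].split()
    if len(header) != 2:
        raise ValueError("first line must hold exactly two numbers")
    n_seq, length = int(header[0]), int(header[1])

    body = lines[1:]
    if len(body) != 2 * n_seq:
        raise ValueError("expected a name line and a sequence line per sequence")

    records = []
    for i in range(n_seq):
        name, seq = body[2 * i], body[2 * i + 1]
        if len(seq) != length:
            raise ValueError(f"{name}: length {len(seq)} differs from {length}")
        bad = set(seq) - VALID_BASES
        if bad:
            raise ValueError(f"{name}: characters not allowed: {sorted(bad)}")
        records.append((name, seq.upper()))
    return records
```

A gap character or an ambiguity code such as N raises an error in this sketch, which mirrors the requirement that gaps be removed beforehand.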
To compute S, N, s, n, ps, pn, ds, and dn, type: c:\ng-new\ng-new filename
For example, to try the rnase.seq data, type: c:\ng-new\ng-new rnase.seq
You will be asked to input the transition/transversion ratio (R), which should be estimated beforehand. If you want to use the original Nei-Gojobori method, input R=0.5. The variances and covariances of distances are computed according to Ota and Nei (1994).
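The text does not say how R should be estimated. As one rough possibility, the Python sketch below counts the observed transitional and transversional differences between two aligned sequences and takes their ratio. This is an assumption for illustration only: it applies no correction for multiple substitutions at the same site, and the function names count_ts_tv and crude_R are hypothetical.

```python
PURINES = {"A", "G"}  # A and G are purines; C and T are pyrimidines


def count_ts_tv(seq1, seq2):
    """Count transitional and transversional differences between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == y:
            continue
        # Same chemical class (purine<->purine or pyrimidine<->pyrimidine)
        # means a transition; otherwise the difference is a transversion.
        if (x in PURINES) == (y in PURINES):
            ts += 1
        else:
            tv += 1
    return ts, tv


def crude_R(seq1, seq2):
    """Uncorrected ratio of transitional to transversional differences (illustrative only)."""
    ts, tv = count_ts_tv(seq1, seq2)
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ts / tv
```

For closely related sequences this uncorrected ratio may be a usable first guess; for divergent sequences a model-based estimate is preferable.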
There are several output files with different formats.
(1) outfile: this is the most useful file; it includes S, N, s, n, ps, pn, ds, dn, and the variances.
(2) sn.rst: this file includes covariances, in addition to those quantities given in outfile.
(3) s.dis: this file is used as an input file for bn-bs.exe.
(4) n.dis: this file is used as an input file for bn-bs.exe.
The files sn.rst, s.dis, and n.dis are generated only when ng-new.exe is used.
R: transition/transversion ratio. R=0.5 means no transition bias. Note that R is not the transition/transversion rate ratio (which is often denoted by kappa). Under Kimura's model, 2R=kappa.
S: number of synonymous sites of a sequence.
N: number of nonsynonymous sites of a sequence.
s: number of synonymous differences between two sequences.
n: number of nonsynonymous differences between two sequences.
ps: p-distance (proportion) of synonymous difference.
pn: p-distance (proportion) of nonsynonymous difference.
ds: Jukes-Cantor distance of synonymous difference.
dn: Jukes-Cantor distance of nonsynonymous difference.
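The relations among these quantities can be sketched in a few lines of Python. This is an illustration, not the NG-NEW source: it assumes that ps = s/S and pn = n/N, and that ds and dn apply the standard Jukes-Cantor correction d = -(3/4) ln(1 - 4p/3) to ps and pn. The function names are hypothetical.

```python
import math


def p_distance(differences, sites):
    """Proportion of differences per site, e.g. ps = s / S or pn = n / N."""
    if sites <= 0:
        raise ValueError("number of sites must be positive")
    return differences / sites


def jukes_cantor(p):
    """Jukes-Cantor distance d = -(3/4) * ln(1 - (4/3) * p)."""
    if not 0 <= p < 0.75:
        # The logarithm is undefined once p reaches 3/4.
        raise ValueError("p must lie in [0, 0.75) for the correction to apply")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kappa_from_R(R):
    """Rate ratio kappa under Kimura's model, using 2R = kappa."""
    return 2.0 * R
```

With R=0.5 (no transition bias), kappa_from_R gives kappa=1, which is consistent with the definition of R above. The Jukes-Cantor distance is always at least as large as the p-distance it is computed from, and the two are nearly equal when p is small.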