Exploiting genomic data to understand viral evolution
 
RNA viruses evolve quickly and have small genomes (always less than ~32 kb, usually less than ~15 kb).  Both of these traits are attributed to the high mutation rate of RNA viruses, which itself is attributed to the lack of proofreading by RNA polymerases and the inability of error-correction enzymes to operate on RNA.  However, there are other kinds of size-limited viruses, such as ssDNA viruses; none larger than ~13 kb have been found.  Do ssDNA viruses evolve as quickly as the similarly sized RNA viruses?
 
We use a Bayesian MCMC phylogenetics program called BEAST to analyze viral sequences sampled from nature at different times and so estimate the rate of molecular evolution in ssDNA viruses. It calculates a posterior distribution for the nucleotide substitution rate: the number of fixed mutations (substitutions) per site per year.  If the evolution is completely neutral, then the substitution rate reflects only the mutation rate, so organisms that have a high neutral substitution rate must have a high mutation rate.  If there is substantial positive selection -- when a virus is adapting to a novel host, perhaps -- then the substitution rate can be higher than the neutral substitution rate.  Conversely, if there is purifying selection, as occurs in important protein-coding genes, the substitution rate will be lower than the neutral substitution rate, as most mutations are selected against and lost from the population.
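The relationship between mutation rate, selection and substitution rate described above can be illustrated with a small numerical sketch. This is not how BEAST works and not part of our analyses; it uses Kimura's diffusion approximation for the fixation probability of a new mutation in an idealized haploid population. The function names, the population size, the selection coefficients and the mutation rate are all hypothetical values chosen for illustration.

```python
import math


def fixation_probability(s, n):
    """Approximate fixation probability of a single new mutation.

    Kimura's diffusion approximation for an idealized haploid population
    of constant size n (an assumption), with selection coefficient s.
    Intended for moderate values of n * s; very large magnitudes can
    overflow the exponential.
    """
    if abs(s) < 1e-12:
        # Neutral limit: every one of the n copies is equally likely to fix.
        return 1.0 / n
    # (1 - exp(-2s)) / (1 - exp(-2ns)), written with expm1 for accuracy.
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * n * s)


def substitution_rate(mu, s, n):
    """Substitutions per site per unit time.

    n * mu new mutations arise per site per unit time, and each one
    fixes with probability fixation_probability(s, n).
    """
    return n * mu * fixation_probability(s, n)


if __name__ == "__main__":
    mu = 1e-4   # hypothetical mutation rate per site per year
    n = 1000    # hypothetical population size
    for label, s in [("neutral", 0.0), ("positive", 0.01), ("purifying", -0.01)]:
        k = substitution_rate(mu, s, n)
        print(f"{label:>9}: substitution rate / mutation rate = {k / mu:.3g}")
```

Under these assumptions the neutral case returns a substitution rate equal to the mutation rate, the positively selected case returns a higher rate, and the case under purifying selection returns a much lower one, which is the reasoning behind reading a high neutral substitution rate as a high mutation rate.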
 
The largest group of emerging ssDNA viruses is the plant-pathogenic geminiviruses.  They are thought to evolve quickly and emerge easily on novel hosts because they recombine very frequently, not because they have high mutation rates. We tested whether the geminiviruses have high nucleotide substitution rates and evolve quickly even in the absence of recombination.  The viruses that cause tomato yellow leaf curl disease, the East African cassava mosaic viruses and two mastreviruses evolve as quickly as RNA viruses; this holds even for their coat protein genes, which are under substantial purifying selection. In the non-coding intergenic region, the substitution rates were even higher, indicating that the neutral substitution rate of geminiviruses should be as high as that of some RNA viruses.  This implies that the geminiviruses have mutation rates comparable to those of RNA viruses, and perhaps these mutation rates underlie their ability to emerge so frequently in novel hosts.
 
This is intriguing because ssDNA viruses are unlike most other DNA viruses and all RNA viruses.  ssDNA viruses don’t encode their own polymerase -- they use the polymerases of their host cells to replicate.  We know that these cells have mutation rates several orders of magnitude lower than what we would expect from our analyses, and lower than the measured mutation rates of two ssDNA bacteriophages.  We are currently using both bioinformatic and molecular techniques to elucidate how these small ssDNA viruses attain a high mutation rate while replicating with high-fidelity host DNA polymerases.
 
 
 
Two attacks of the killer tomato (virus)!
 
It had been thought that the Old World geminivirus, tomato yellow leaf curl virus, had been introduced into the New World once.  TYLCV from Israel first appeared in the Caribbean in the 1990s, and appeared to spread from there throughout North and Central America.  However, not all of the North American TYLCV genomes group together into a single clade -- the expected pattern from a single introduction (Duffy and Holmes 2007).  I found support for two clades of TYLCV -- one more closely related to the Israeli strain of TYLCV (isolates in this clade shown in orange) and another more closely related to Asian isolates (isolates shown in purple).  This indicates there has been a second, cryptic introduction of this devastating tomato pathogen into North America, most likely from across the Pacific.
Outline map from www.worldatlas.com, which permits use of its outline maps without explicit permission.