P. vivax sequencing projects

A number of genome-scale P. vivax projects have been initiated over the past ten years, including the recently completed Salvador I reference genome and, even more recently, a project to map P. vivax genetic diversity by sequencing six P. vivax laboratory isolates.

  1. The P. vivax genetic diversity map

    A project to sequence six P. vivax laboratory-adapted strains has recently been funded through the NIAID/NIH Microbial Sequencing Contract (Spring 2008). Details concerning the project can be obtained from the white paper submitted by members of the vivax community to the joint NHGRI/NIAID Eukaryotic Pathogens and Vectors working group assigned to facilitate the approval of such proposals by NIH.

    Briefly, P. vivax lab strains India VII, North Korean, Indonesia XIX, AMRU I, Brazil I, and Mauritania I were chosen for their different geographical origins and phenotypes such as drug resistance and relapse type. Currently, parasite material is being generated with a view to whole-genome shotgun sequencing later in 2008 at the Broad Institute. As with the Salvador I sequencing project (see below), all sequence data will be deposited in public databases in a timely manner. For more details please contact Jane Carlton (jane.carlton@nyumc.org).

    A project to sequence nine more P. vivax lab strains and isolates of the closely related non-human primate malaria species P. cynomolgi, P. inui, P. coatneyi, and P. fragile was recently approved by NHGRI. Another white paper outlines this proposal.

  2. The TIGR P. vivax genome sequencing project

    Back-to-back papers describing the genome sequences of P. vivax and P. knowlesi, and comparative analysis with P. falciparum, are currently under review at Nature. We hope these papers will be published by late summer 2008.

    The whole genome shotgun (WGS) sequencing project of the P. vivax laboratory-adapted Salvador I strain is close to completion at The Institute for Genomic Research (TIGR). Started in the fall of 2001, this project used remaining funds from the NIAID- and US DoD-funded P. falciparum sequencing project at TIGR, with the aim of generating a P. vivax genome sequence as good as, if not better than, that of P. falciparum. The project was halted for nine months in 2004 due to a lack of funds, but was rescued by funding from the Burroughs Wellcome Fund and NIAID's Microbial Sequencing Contract, until April 2006.

    The Salvador I strain of P. vivax, isolated from a naturally acquired infection of a patient from El Salvador (Collins et al., 1972), was chosen for sequencing. This strain has been passaged through human volunteers and Aotus (owl) and Saimiri (squirrel) monkeys by mosquito and blood infection; it has been the subject of drug susceptibility and relapse activity studies (Contacos et al., 1972); and it has been used to test the immunogenicity and protective efficacy of recombinant antigen constructs (Collins et al., 1997; Yang et al., 1997). Salvador I chromosomes can be separated by pulsed-field gel electrophoresis for karyotype and physical mapping studies (see below), and more than 7,000 GSSs have been generated for this strain (see above). Thus, like the 3D7 clone of P. falciparum, it is often regarded as the standard reference strain for P. vivax. Genomic DNA for the random sequencing phase of the project was provided by John Barnwell at the Centers for Disease Control, from parasites grown in splenectomized Saimiri monkeys.

    All genome sequence data have been made freely available throughout the course of the project via TIGR's P. vivax-specific web pages, GenBank and PlasmoDB.

  3. Partial genomic sequencing projects
    • Sequence of a 150 kb internal region of one P. vivax chromosome, cloned as part of a YAC library from a patient isolate: PMID: 11298455
    • Sequence of a 200 kb subtelomeric region of one chromosome, cloned as part of the same YAC library: PMID: 11738711

  4. EST, GSS and flcDNA projects
    • ~20,000 GSSs (genome survey sequences) generated from Salvador I and Belem laboratory-adapted strains: PMID: 11738710
    • ~800 ESTs (expressed sequence tags) from a cDNA library of a Brazilian patient isolate: PMID: 12914668
    • ~20,000 ESTs from a cDNA library of a Thai patient isolate: PMID: 16085323
    • ~1,500 putative flcDNAs (full-length cDNA sequences) from Indonesian patients: PMID: 17151081

  5. Microarray and proteomics projects
    • No studies concerning gene or protein expression have been published yet. Long-oligo microarrays are available free of charge (see Resources).

What do we know about the P. vivax genome?

Genome size & chromosome number

P. vivax nuclear DNA is distributed between 14 linear chromosomes that range in size from 1.2 to 3.5 Mb (Carlton et al., 1999). Initial estimates based on pulsed-field gel separation of P. vivax chromosomes put the genome size at 35-40 Mb, but this is likely to be an overestimate given the final 23 Mb genome sizes of P. falciparum (Gardner et al., 2002), P. yoelii (Carlton et al., 2002) and other Plasmodium species (Carlton et al., 2004). A large-scale sampling of approximately 20% of the P. vivax genome (11,000 genome survey sequences, GSSs) has given some insight into the coding potential of the parasite (Carlton et al., 2001). Homologs of previously identified Plasmodium genes were identified, and similar numbers of proteins were found to be common between the P. vivax proteome and the proteomes of P. falciparum and a rodent malaria species, P. berghei. A significant finding has been the identification of a large multi-gene family of 600-1,000 variant genes termed the vir family (del Portillo et al., 2001), orthologs of which have been found in several rodent malaria species but not in P. falciparum (Carlton et al., 2002; Janssen et al., 2001). It seems probable that these genes, which appear to be located in subtelomeric regions of the chromosomes and which may play a role in antigenic variation of the parasite, will be found to make up a large proportion of P. vivax genes. Their essential role in interacting with and evading the host immune response makes elucidation of the complete repertoire of vir genes an important component of the P. vivax sequencing project.
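As a rough sanity check on the size estimates above, the chromosome count and size range alone bound the total genome size. The short Python sketch below uses only the figures quoted in the text (14 chromosomes, 1.2 to 3.5 Mb each); the function name is hypothetical and chosen for illustration.

```python
# Figures quoted in the text above (Carlton et al., 1999).
N_CHROMOSOMES = 14
MIN_CHROM_MB = 1.2   # smallest chromosome, in Mb
MAX_CHROM_MB = 3.5   # largest chromosome, in Mb


def genome_size_bounds(n_chromosomes, min_mb, max_mb):
    """Return (lower, upper) bounds on total genome size in Mb.

    The lower bound assumes every chromosome is as small as the smallest;
    the upper bound assumes every chromosome is as large as the largest.
    """
    return n_chromosomes * min_mb, n_chromosomes * max_mb


lower, upper = genome_size_bounds(N_CHROMOSOMES, MIN_CHROM_MB, MAX_CHROM_MB)
print(f"Genome size lies between {lower:.1f} and {upper:.1f} Mb")
```

Both the 35-40 Mb gel-based estimate and a size near 23 Mb fall inside these bounds, so the karyotype data alone cannot decide between them; that is why the comparison with sequenced Plasmodium genomes carries the argument.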

[Figure: P. vivax karyotype, low MW gel; P. vivax karyotype, high MW gel]

Karyotype comparison of P. falciparum (clones 3D7 and Dd2) with P. vivax (Salvador I strain). Genomic DNA preparations were electrophoresed under conditions for resolving low or high molecular weight (MW) chromosomes. P. falciparum 3D7 chromosomes are numbered according to standard practice; Roman numerals are used for numbering the P. vivax Sal I strain chromosomes. The two rightmost lanes are a Southern blot of a P. vivax high MW separation that was probed with four radiolabelled P. vivax (Pv) sequences. Note the presence of a high MW, non-hybridizing band in the P. vivax lanes, which represents contaminating host genomic DNA. Abbreviations: Mb = megabases; PM = plasmepsin; CP = cysteine proteinase; ESP-1 = excreted soluble protein 1; MSP-1 = merozoite surface protein 1. Adapted from:

Carlton JM, Galinski MR, Barnwell JW, Dame JB
Karyotype and synteny among the chromosomes of all four species of human malaria parasite.
Mol Biochem Parasitol, 101(1-2):23-32.
Copyright (1999), adapted with permission from Elsevier.

Comparative Plasmodium genomics

Coming soon!
Much of the comparative analysis between the P. vivax genome and other Plasmodium species such as P. falciparum is unpublished. In order to abide by the publication rules of many peer-reviewed journals, we are unable to post details here until after the P. vivax genome paper has been accepted.