Malaria parasites adapt at a frightening rate. To mark World Malaria Day on 25 April 2015, Roberto Amato describes a new global collaboration that has compiled the largest collection of open access P. falciparum genomes and is using this resource to try and keep up. (Cross-posted from the Sanger blog.)
Plasmodium falciparum parasites are responsible for the majority of over 500,000 malarial deaths every year. An adaptive foe, these parasites can hide from the body’s immune system, cope with changes in the Anopheles (mosquito) vector, and develop resistance to antimalarial drugs at a frightening rate.
Genomics is one of the most powerful tools available to observe these evolutionary processes in action. Much of our early work studying natural genetic variation in Plasmodium parasites came about in collaboration with many different researchers around the world as part of the MalariaGEN P. falciparum Community Project. To date, this collaboration has built a catalogue of 1 million single nucleotide polymorphisms (SNPs) in more than 6,000 falciparum samples collected directly from malaria patients in Africa, Asia, Latin America and Oceania.
Using this rich data resource – the largest collection of Plasmodium genomes in the world – we are starting to understand the complex genetics of Plasmodium parasites. For example, we have explored the intricate genetic architecture underpinning resistance to the frontline drug, artemisinin.
But this is just the beginning. We are still far from a comprehensive and precise understanding of how this parasite evolves in the wild and how we should respond to these constant changes. There are of course limitations with our current methods, but beyond that our view of genetic variation is primarily based on SNPs, leaving out other forms of variation such as indels (short insertions or deletions). Nor can we yet accurately detect changes in key regions of the Plasmodium genome including, for example, the hypervariable var genes, which contribute to the parasites’ ability to evade our immune system.
To generate a more complete, fine-grained view of genetic variation in Plasmodium parasites, we need solid reference genomes, good baseline data and reliable analytical methods to lay solid foundations for future analyses.
These technical challenges are the key focus for the pilot phase of the Pf3k project, a global collaboration led by researchers at the Wellcome Trust Sanger Institute, the University of Oxford and the Broad Institute. Established within the past year, the Pf3k Consortium aims to analyse 3,000 P. falciparum samples from the major malaria-endemic regions of the world. The overall aim is to provide a high-resolution view of natural variation in P. falciparum including those regions of the genome that are inaccessible using standard methods.
At the moment, we are very busy generating thousands of whole genomes from field samples that can act as high-quality reference genomes and assessing various methods to genotype them. This is a big leap forward from the current gold standard of using one reference, 3D7 v3, which is the whole genome sequence of a single parasite. Relying on a single reference limits our ability to access the genome, particularly in regions that differ from the reference.
One good example is the challenge of genotyping crt, a clinically significant gene involved in chloroquine resistance that may also play a role in emerging artemisinin resistance. This gene is so important that it remains one of the first places researchers tend to look.
The current reference has a very specific version of crt which is quite different from what we see in most genomes in Southeast Asia. And crt in Southeast Asia is again different from what we observe in other parts of the world. This geographical diversity makes aligning sequences from various parts of the world challenging; having reference genomes drawn from different populations will allow us more readily to compare like with like and, ultimately, increase the accuracy with which we can spot variants.
The Pf3k Consortium has prepared an initial data set comprising 2,375 samples sequenced at the Sanger Institute as well as 137 samples from our colleagues at the Broad Institute in Boston, USA. This represents the full pilot set of samples, collected in major malaria-endemic regions in Africa and Asia.
Reflecting our commitment to the early and open release of data, earlier this month the Pf3k Consortium made this large data set public. As with previous Pf3k data releases, these data are made available under Fort Lauderdale conditions and can be downloaded or explored using a user-friendly web application designed by colleagues at the MRC Centre for Genomics and Global Health.
Our attention is now focused on evaluating the methods used to generate this baseline data. These methods are often optimised for human genomes, so we need to understand to what degree they can be used ‘straight off the shelf’ to analyse the Plasmodium genome, which differs in many ways from ours.
It may sound surprising but even some basic concepts, like allele frequencies and genetic distance, are not straightforward when dealing with Plasmodium genomes. When samples come directly from a patient, we’re getting not a single parasite, but a population of parasites. Depending on a variety of ecological and epidemiological factors, these populations may be so inbred that they appear as a single genome (‘clonal sample’) or may be very diverse (‘mixed infections’).
A strange consequence of mixed infection is that some Plasmodium genomes look as though they have an extra set of chromosomes at certain positions. To further increase the complexity, in areas where other Plasmodium parasites are co-endemic, these populations might even be made up of different species.
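To make the distinction between clonal samples and mixed infections concrete, here is a minimal sketch in Python. It is an illustration only, not the Pf3k pipeline: the function names, the read counts, the minimum depth and the 10% threshold are all assumptions chosen for the example. The idea is that in a clonal sample nearly all reads at a variable position carry the same allele, whereas in a mixed infection many positions show reads supporting both alleles.

```python
from typing import List, Tuple


def within_sample_allele_frequency(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at one position that carry the non-reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total


def fraction_mixed_positions(
    read_counts: List[Tuple[int, int]],
    min_depth: int = 10,            # hypothetical coverage cut-off
    minor_threshold: float = 0.1,   # hypothetical threshold for "both alleles present"
) -> float:
    """Share of well-covered positions where both alleles are clearly present."""
    covered = 0
    mixed = 0
    for ref_reads, alt_reads in read_counts:
        if ref_reads + alt_reads < min_depth:
            continue  # too few reads to judge this position
        covered += 1
        freq = within_sample_allele_frequency(ref_reads, alt_reads)
        if minor_threshold <= freq <= 1 - minor_threshold:
            mixed += 1
    if covered == 0:
        raise ValueError("no position reaches the minimum depth")
    return mixed / covered


# Made-up (reference reads, non-reference reads) counts at four variable positions.
clonal_sample = [(50, 0), (0, 48), (61, 1), (0, 55)]
mixed_sample = [(30, 20), (0, 48), (25, 35), (40, 12)]

print(fraction_mixed_positions(clonal_sample))
print(fraction_mixed_positions(mixed_sample))
```

A sample in which a large share of positions look mixed is unlikely to be a single parasite genome, which is why a population-level quantity such as allele frequency has to be defined with care before it is computed across patient samples.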
As we improve the resolution and accuracy of our analyses, we’ll be able to delve deeper into key scientific questions, such as how populations of Plasmodium parasites are evolving, migrating to different locations and developing drug resistance.
Roberto Amato is now based in the Wellcome Trust Sanger Institute’s Malaria Programme, but continues to work closely with colleagues in the Kwiatkowski group at WTCHG.