Anna Gloyn and Cecilia Lindgren participate in international consortia dedicated to finding genetic associations with diabetes, obesity and other metabolic conditions. Their latest study took a slightly different tack.
In 2003 the international Human Genome Project finally completed the sequence of all 3 billion letters that contain the information to make a human being. This tremendous achievement raised expectations that the genetic causes of disease would quickly emerge from comparisons of the genomes of patients with the reference genome. Experience has shown that it’s not that simple, but recent studies by us and other groups in diabetes begin to suggest how changes in the genome might work in concert to cause the disease.
Genome-wide association studies (GWAS) have been very successful at identifying single-letter changes, called single nucleotide polymorphisms (SNPs), that are associated with risk for diabetes or with other measures of the body’s ability to manage blood sugar and insulin levels. However, the vast majority of these SNPs are in parts of the sequence that do not code for proteins, which makes it very difficult to work out how they affect our physiology. Most of the variants identified to date have only a small effect on diabetes risk and account for only a small proportion of the variability in glucose and insulin levels in the healthy population.
As we need to study the genomes of lots of people to detect any differences, we collaborate with many research groups around the world to share data, giving us sample sizes in the tens and sometimes hundreds of thousands. Recently we took a slightly different approach to the standard GWAS. We focused on the coding sequences, which make up less than 3 per cent of the genome, in over 30,000 healthy people, looking for associations between variants and fasting blood sugar and insulin levels. Our primary focus was on low-frequency variants, present in fewer than one person in a hundred, which had not been well captured by previous GWAS.
We found a number of ‘signals’ driven by genetic variants that alter protein sequences. One is in a protein that we know is important for the correct function of the cells that make and release the hormone insulin, and that is the target of a drug widely used to treat type 2 diabetes. Another variant, in a gene that makes an enzyme in the cells that produce and release insulin, not only alters glucose levels in the normal population but also influences diabetes risk.
Our study has also helped unlock the biology at some previously reported GWAS loci. The vast majority of the previously reported variants sit outside protein-coding regions of the genome, which makes it difficult to know which of the neighbouring genes are affected by them. Our study has identified protein-altering variants in some of these regions that are independently associated with glucose or insulin levels. This strongly suggests that the non-coding variants are working through altered regulation of these proteins. We’ve learned through our study that it is crucial to consider how groups of variants along the genome, in both coding and non-coding regions, might work together to influence differences in biological function or even disease risk.
Our approach certainly did not uncover all possible associated variants in the coding regions: the next step would be to sequence these regions in their entirety to discover even rarer mutations and evaluate their contribution to diabetes risk. We have found, though, that combining large-scale genetic studies with targeted follow-up studies in cells, which show how protein changes affect function, can help unlock some of the mysteries of how blood glucose levels are regulated in humans.
Anubha Mahajan et al., on behalf of the T2D-GENES consortium and GoT2D consortium. Identification and Functional Characterization of G6PC2 Coding Variants Influencing Glycemic Traits Define an Effector Transcript at the G6PC2-ABCB11 Locus. PLoS Genetics, published January 27, 2015. DOI: 10.1371/journal.pgen.1004876