Genetics Unzipped is the podcast from the Genetics Society - one of the oldest learned societies dedicated to promoting research, training, teaching and public engagement in all areas of genetics. Find out more and apply to join at genetics.org.uk

Dr Joe Marsh: Understanding the role of Variants of Unknown Significance in rare disease

Dr Joe Marsh

Image Courtesy of Joe Marsh & The University of Edinburgh

Within the Institute of Genetics and Cancer at the University of Edinburgh, the MRC Human Genetics Unit is tackling this huge issue of how we interpret the genome to better understand and treat disease, with researchers working hand in hand with NHS staff including doctors and the NHS genetic testing services, as well as people affected by genetic disorders. 

Most alterations found in genes that encode proteins - those are the molecules in our cells that do stuff - are classified as “variants of unknown significance”, meaning that we don’t actually know how they affect the protein, and by extension, whether it’s involved in genetic disease. In turn, this can contribute to the lengthy diagnostic odyssey faced by patients and their families. One of the researchers trying to unpick this challenge is Dr Joe Marsh, a group leader at the Human Genetics Unit. So - what do we know so far?

Joe: So what we're finding is that while there's lots of genetic disorders caused by these loss-of-function mutations that knock out normal gene activity, there's also a lot that have much more subtle effects at the protein level, so-called gain-of-function or dominant-negative variants. And in these cases what happens is the protein is still there, the protein is still doing something, but it's doing something damaging to the cells, damaging to the body, potentially toxic effects, or it's gaining some new activity. And the crucial thing about these non-loss-of-function variants is that they're a lot more difficult to detect. Because they have subtler effects at the protein level, they've often been missed by the current methodologies for screening novel variants.

Kat: So let's dig into what we're starting to understand in the genome. One of the things that you are looking into is this concept of "variants of unknown significance". So this, I guess, is where you're looking at the DNA sequence of a gene, of a protein coding gene. And you're going like, well, there's something not right in there, but what is it?

Joe: Yes, so variants of uncertain significance are one of the most important problems in genetics at the moment. With the power of sequencing comes this huge amount of information along with it, which is all these variants that we're identifying. Any person that you sequence, you're gonna find a lot of variants.

Joe: And the fact is that most of them probably do nothing but some of them can do a lot. So how can we prioritise that small fraction that might have clinically relevant effects from this huge background of genetic variation that you observe in humans?

Joe: So ideally what we would have is some kind of atlas that we could look up when we get a new variant, you look up the gene and look up the variant and see what it's likely to do. And there's actually some exciting work going on in this regard now, both in terms of computational methods, so we can use computer programs to predict the effects of all possible variants, as well as new experimental methodologies, which can, within a single experiment, measure the effects of all possible thousands or tens of thousands of variants within a single gene at once. That can give you this readout of the effects of every possible variant, which can be potentially of huge importance when it comes to the diagnosis of genetic disease.

Kat: So let's dig into these a bit, so how do these kind of advances in computation help you to understand what proteins are doing and then how these variations in the protein coding sequence actually affect what they might be up to?

Joe: So there's been some tremendous advances in computational methods for predicting variant effects over the past couple of decades. And primarily the most useful thing has been evolutionary information. So looking across evolutionary sequences of different species and seeing how that protein or the gene varies. And at positions where it can vary a lot, the variants tend not to have much of an effect in humans. Whereas at positions where it's highly constrained across evolution, the variants are much more likely to have a big effect. So as we've had more and more evolutionary sequence information available, as well as rapid advances in machine learning methodologies, these evolutionary-based variant effect predictors have improved substantially.

Joe: But they're still fundamentally limited in terms of their predictive power. They're quite good at dismissing some variants as completely benign. But they still have a tendency to overcall variants as damaging. So they'll say lots of stuff is damaging, when in fact we look at people and see that they can safely have these variants.
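To make the evolutionary idea concrete, here is a minimal sketch of how a per-position conservation score might be computed from an alignment of the same protein across species. It is an illustration only, not the method used by any particular predictor: the toy sequences are invented, the alignment is assumed to be gap-free, and the score is simply one minus the normalised Shannon entropy of each column.

```python
import math
from collections import Counter

def column_conservation(column):
    """Score one alignment column from 0 (highly variable) to 1 (identical).

    Uses Shannon entropy normalised by the maximum possible entropy
    for the 20 standard amino acids. Assumes the column has no gaps.
    """
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(20)

def conservation_profile(alignment):
    """Return one conservation score per position of an aligned set of sequences."""
    return [column_conservation(col) for col in zip(*alignment)]

# Hypothetical toy alignment: one short protein fragment from four species.
alignment = ["MKTA", "MKSA", "MRTG", "MKVA"]
profile = conservation_profile(alignment)

for position, score in enumerate(profile, start=1):
    print(f"position {position}: conservation {score:.2f}")
```

In this toy case the first position never changes across species, so it gets the highest score, and a variant there would be flagged as more likely to matter than one at the more variable third position. Real predictors combine this kind of signal with far more sequence data and machine learning.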

Kat: So now let's turn to the idea of actually being able to test all these possible variations and see which ones are actually "bad" for want of a better word. How on earth do you do that?

Joe: So in the past, experimental testing of variants has been the cornerstone of understanding genotype to phenotype relationship. But it was very slow. You could make one mutant and you could test it in a cell...

Kat: Oh God, you're triggering me from my PhD here!

Joe: Or you could make a mouse or now organoid techniques, for example, but it's very low throughput. But in the last few years, there's been these advances of these so-called Multiplex Assays of Variant Effect, or often called deep mutational scanning. So these are experiments where you can make a library of thousands or tens of thousands of variants at once. And using different types of phenotypic assays combined with deep sequencing technologies, you can measure the effects of all of these variants within a single experiment. And what it's turning out is that for certain genes, these high throughput experimental measurements of variant effects are tremendously predictive of phenotypic effects in humans.

Joe: So for certain genes, they're much better than any available computational approaches for distinguishing between benign variants as compared to disease causing variants. Now, the issue at the moment is that these experiments have only been performed on a fairly limited set of proteins within humans. So the challenge is scaling them up to the currently thousands of human disease associated genes, but you know, likely to be a huge fraction of all human disease genes.
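As a rough illustration of how a deep mutational scanning readout can be turned into per-variant scores, the sketch below compares sequencing read counts for each variant before and after a selection step and normalises the change to the wild-type sequence. This is a simplified, assumed scoring scheme rather than the analysis used by any specific lab or tool; the variant names, counts and pseudocount are all hypothetical.

```python
import math

def enrichment_scores(pre_counts, post_counts, wild_type="WT", pseudocount=0.5):
    """Return a log2 enrichment score per variant, relative to wild type.

    A score near 0 means the variant behaved like wild type in the assay;
    a strongly negative score means it was depleted during selection.
    The pseudocount avoids division by zero for variants that drop out.
    """
    wt_ratio = (post_counts[wild_type] + pseudocount) / (pre_counts[wild_type] + pseudocount)
    scores = {}
    for variant, pre in pre_counts.items():
        post = post_counts.get(variant, 0)
        ratio = (post + pseudocount) / (pre + pseudocount)
        scores[variant] = math.log2(ratio / wt_ratio)
    return scores

# Hypothetical read counts from sequencing the library before and after selection.
pre = {"WT": 1000, "A10V": 500, "G25D": 500}
post = {"WT": 2000, "A10V": 1000, "G25D": 50}
scores = enrichment_scores(pre, post)

for variant, score in scores.items():
    print(f"{variant}: {score:.2f}")
```

Here the made-up variant A10V keeps pace with wild type and scores close to zero, while G25D is strongly depleted, which is the kind of signal that would mark it out for a closer look. Dividing by the wild-type ratio also cancels out differences in overall sequencing depth between the two samples.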

Kat: And it does give me terrible flashbacks to my PhD when I was trying to knock-out like a couple of hundred base pairs next to one gene to try and find out what it did. And it took ages, and it didn't work... And now you are doing tens of thousands of these experiments at high throughput putting all these variant genes into cells and just going - does it do something bad, yes, no? If yes, then let's look at this. Automation and computation, it has just transformed biology in 20 years really?

Joe: Yeah, it's really amazing with the computational power and the amount of variant data sets that we have and these emerging deep mutational scanning data sets, I think we're on the cusp of a revolution in terms of genetic diagnosis. But the crucial thing at the moment is how we use this data. You know, these experiments are very fresh and there's not really established guidelines about how we can use them in genetic diagnosis. And the same thing with the computational predictions. They're getting much better, but they're still viewed with a lot of skepticism by clinicians, rightfully so, because, you know, we shouldn't be making a diagnosis just on the basis of a single computational prediction.

Joe: So the question is, what can we best do with this data? How can we best interpret it to make it of the most practical use for clinicians? And so there's a lot of work going on at the moment in this, and I think we're, again, about to dramatically improve the way that this information can ultimately aid genetic diagnosis.

Kat: There are lots of genes and there are thousands and thousands of variations and presumably lots of labs and groups trying to look at this, how is this data being organised and brought together?

Joe: Well, it's a difficult problem because you have all these groups around the world that are developing these experimental methodologies. But over the past couple years, an alliance has been formed. There's a group called the Atlas of Variant Effects Alliance that is trying to coordinate different researchers across the world who are doing these kinds of high throughput assays. And, for example, there's a database where people can register the targets that they're working on so that people will know not to focus on that protein, you know, so that they don't overlap.

Kat: This one's mine!

Joe: And they're sharing experimental methodologies and they're sharing computational approaches for analysing this data. And they're working on developing new standards for clinical interpretation of these high throughput experimental data. And so I think this is absolutely essential to building this atlas of variant effects that will ultimately be so important and so valuable to clinicians when it comes to improving genetic diagnosis.

Kat: That’s Dr Joe Marsh, from the MRC Human Genetics Unit at the Institute of Genetics and Cancer in Edinburgh.

Natalie Frankish: From a Diagnostic Odyssey to a Good Diagnosis

Professor Zosia Miedzybrodzka: Expanding genetic testing for rare diseases
