IMPC researchers have proposed a new gene classification system, Full Spectrum of Intolerance to Loss-of-function (FUSIL), which can be used to identify genes associated with disease.
Identifying which genes are linked to a rare disease is one of the most difficult challenges geneticists face. The low prevalence of these diseases within the population makes it difficult to research and fully understand their causes. Developments in sequencing technology are now driving major advances in the identification of genetic mutations in rare disease patients. However, it is still very difficult to determine the causative mutation among the dozens typically identified per patient, as we don’t know the functions of many of these mutated genes.
The IMPC is addressing this challenge by producing and studying knockout mice – mouse strains that have been genetically altered by switching a single gene off. The IMPC then assesses each mouse for ‘phenotype changes’ (physical and chemical abnormalities) to gain insight into the function of each gene.
In a new paper, IMPC researchers demonstrate how the most common phenotype observed – embryonic lethality, which is seen for a third of genes tested – can be used to help human geneticists prioritise genes that could be candidates for rare disease association.
A New Categorisation System
The system classifies genes based on two major factors: organismal viability and cellular viability. Viability (or the lack thereof) allows researchers to categorise whether a gene is essential or not for a cell’s or organism’s survival. If a gene isn’t essential, then a cell or organism can still live without it functioning. If it is, then the cell or organism won’t be able to survive without a functioning version of the gene.
“Loss of gene function is often referred to as a binary concept; lethal or viable,” says Violeta Muñoz-Fuentes, Biologist, Mouse Informatics at EMBL-EBI. “In this study, we show that gene essentiality is more of a spectrum ranging from cellular lethal, developmental lethal, subviable, viable with a visible phenotype, and viable without a visible phenotype.”
|Category|Description|
|---|---|
|Cellular Lethal (CL)|Genes essential for cell viability|
|Developmental Lethal (DL)|Genes essential for organism development|
|Subviable (SV)|Organism survival is less than expected|
|Viable, with significant phenotypes (VP)|Organisms fully develop with abnormal physical changes|
|Viable, with no significant phenotypes (VN)|Organisms fully develop with no abnormal physical changes|
Researchers found that genes listed as viable and subviable in the IMPC database were almost always listed as non-essential in the Project Achilles database, which records the genes that human cancer cell lines need in order to survive. Genes listed as essential there were almost always in the CL and DL categories.
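The categories above can be read as a simple decision rule. The Python sketch below is illustrative only. It assumes that CL and DL genes are both lethal when knocked out in mice and differ in whether the gene is also essential in human cell lines, and that the three remaining categories apply to genes that are non-essential in cells. The article does not spell this rule out, so the function name, the parameter names and the handling of conflicting evidence are assumptions, not the authors' implementation.

```python
from enum import Enum
from typing import Optional


class MouseViability(Enum):
    """Outcome of a knockout mouse viability screen (labels are illustrative)."""
    LETHAL = "lethal"
    SUBVIABLE = "subviable"
    VIABLE = "viable"


def fusil_category(
    mouse: MouseViability,
    cell_essential: bool,
    significant_phenotype: bool = False,
) -> Optional[str]:
    """Return a FUSIL-style category code, or None if the evidence conflicts.

    Assumed rule (not stated explicitly in the article):
      lethal in mouse + essential in cells      -> CL
      lethal in mouse + non-essential in cells  -> DL
      subviable in mouse + non-essential        -> SV
      viable in mouse + non-essential           -> VP or VN
    """
    if mouse is MouseViability.LETHAL:
        return "CL" if cell_essential else "DL"
    if cell_essential:
        # Viable or subviable in mouse but essential in cells:
        # left unclassified in this sketch.
        return None
    if mouse is MouseViability.SUBVIABLE:
        return "SV"
    return "VP" if significant_phenotype else "VN"


# Example: a gene whose knockout is lethal in mice but dispensable in cell lines.
example = fusil_category(MouseViability.LETHAL, cell_essential=False)
```

Returning `None` for discordant evidence is a design choice made here for clarity; a real pipeline would need an explicit policy for such genes.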
Diving Deeper into the Data
As a first step towards understanding the differences between CL and DL genes, researchers performed enrichment analysis tests to see which kinds of biological processes each category was more associated with. They found that CL genes were linked to important cellular processes, like DNA replication and cell division, whilst DL genes were linked to important developmental processes, like embryo development and symmetry.
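For readers unfamiliar with enrichment analysis, the idea is to ask whether a gene set contains more genes annotated with a given process than chance alone would predict. The article does not say which statistical test the authors used. The sketch below uses a one-sided hypergeometric test, a common choice for this kind of over-representation question, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(
    n_background: int,  # all genes considered
    n_annotated: int,   # background genes annotated with the process
    n_selected: int,    # genes in the category of interest (e.g. one FUSIL bin)
    n_overlap: int,     # selected genes that carry the annotation
) -> float:
    """One-sided hypergeometric p-value: P(overlap >= n_overlap) by chance."""
    if not (0 <= n_annotated <= n_background and 0 <= n_selected <= n_background):
        raise ValueError("counts are inconsistent with the background size")
    total = comb(n_background, n_selected)
    upper = min(n_annotated, n_selected)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers, purely illustrative: 10 genes, 3 annotated, 3 selected, all 3 overlap.
p = enrichment_p_value(10, 3, 3, 3)
```

In practice many processes are tested at once, so the resulting p-values would normally be corrected for multiple testing.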
The researchers also found that genes linked to Mendelian diseases (diseases caused by a mutation in a single gene) were most likely to be in the DL category. This category also contained a higher proportion of autosomal-dominant disease genes, where a mutation in only one of a gene’s two copies is sufficient to cause disease, and higher numbers of genes associated with early-onset diseases and with diseases affecting multiple bodily systems. Genes in the DL category are therefore a good target for further analysis of associations with disease.
Leading by Example
To illustrate how the FUSIL system could work in practice, the authors searched three large rare disease sequencing programmes for unsolved diagnostic cases. They narrowed findings down to nine genes that were in the DL category, had been flagged in the databases of rare disease programmes (DDD, 100KGP, CMG, denovo-db) and were not highly tolerant to mutation, making them more likely to be associated with disease. Two examples are VPS4A and TMEM63B.
VPS4A had no previously reported disease links but had been detected in two unsolved cases in the 100KGP database and one in the CMG database. These patients were intellectual disability cases, with symptoms like developmental delay, delayed motor development and eye abnormalities.
The IMPC produced mice with this gene ‘knocked out’, meaning genetically switched off so it would no longer function, which produced phenotypes like smaller embryos, abnormal brain development and spine curvature. CT scans showed further evidence of brain abnormalities. VPS4A is also known to interact with another gene, CHMP1A, which, if mutated, is known to cause ‘pontocerebellar hypoplasia type 8’, a neurodevelopmental disorder with similar phenotypes to the unsolved VPS4A cases.
TMEM63B, like VPS4A, is extremely intolerant to mutation but is not listed as being associated with any disease. One case in the DDD database and four in the 100KGP database list a TMEM63B mutation. These were also intellectual disability cases, with phenotypes of abnormal movement and brain morphology. IMPC knockout mice for this gene had phenotypes like abnormal behaviour and hyperactivity.
These findings indicate that there is a high probability that mutated versions of these genes are associated with developmental disease.
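The prioritisation described in this section can be pictured as a chain of filters. The sketch below is a hedged illustration, not the authors' pipeline. The record fields, the 0 to 1 intolerance score, its 0.9 cut-off and the gene symbols are all made up for the example, and excluding genes with a known disease link is an assumption based on the two examples above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Gene:
    symbol: str
    fusil: str                # FUSIL category code, e.g. "DL"
    known_disease_gene: bool  # already linked to a disease?
    unsolved_case_hits: int   # unsolved cases carrying a variant in the gene
    intolerance: float        # hypothetical score in [0, 1]; higher = less tolerant


def prioritise(genes: Iterable[Gene], min_intolerance: float = 0.9) -> List[Gene]:
    """Keep DL genes with no known disease link, at least one unsolved case
    and a high intolerance score, ordered by number of cases."""
    keep = [
        g for g in genes
        if g.fusil == "DL"
        and not g.known_disease_gene
        and g.unsolved_case_hits > 0
        and g.intolerance >= min_intolerance
    ]
    return sorted(keep, key=lambda g: (-g.unsolved_case_hits, g.symbol))


# Placeholder records; none of these correspond to real genes or real counts.
toy_genes = [
    Gene("GENE_A", "DL", False, 2, 0.95),
    Gene("GENE_B", "DL", True, 5, 0.99),   # dropped: already a disease gene
    Gene("GENE_C", "DL", False, 4, 0.92),
    Gene("GENE_D", "VP", False, 3, 0.97),  # dropped: not in the DL category
    Gene("GENE_E", "DL", False, 1, 0.40),  # dropped: tolerant to mutation
    Gene("GENE_F", "DL", False, 0, 0.99),  # dropped: no unsolved cases
]
shortlist = prioritise(toy_genes)
```

With these placeholder records the shortlist contains GENE_C followed by GENE_A.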
The diagnostic rate of large-scale rare disease sequencing programmes is between 20% and 40%. The majority of rare disease patients remain undiagnosed due to a lack of detection or because a previously unknown gene is disrupted, and undiagnosed patients often face physical, social and financial costs. This study furthers our understanding of rare disease genes by providing clinicians and researchers with an open-access resource which can be used to identify high-quality candidate genes, the mutations of which could cause a rare disease. With this, clinicians and researchers can more readily link candidate genes to previously undiagnosed cases, opening up future research possibilities.
“Of particular interest for application to healthcare, we demonstrate that the set of genes that are essential for organism development is particularly associated with known human developmental disorders,” says Damian Smedley, Reader in Computational Genomics at Queen Mary University of London. “This provides candidates for undiscovered causative genes for these conditions.”
Cacheiro, P., et al. (2020). Human and mouse essentiality screens as a resource for disease gene discovery. Nature Communications. Published online 31 January 2020. DOI: 10.1038/s41467-020-14284-2