Research

Background

Technological advances have continued to drive the study of biology towards the statistical and computational sciences. We are now able to differentiate and quantify biomolecules at levels previously unimaginable, allowing us to study their interactions and relationships to health and disease in an unbiased, systems-level manner.

A corollary of this transformation is that sophisticated quantitative models can now be used to tease out underlying biological insights and pathogeneses of molecular systems. In our view, an "organism" can be thought of as a system whose components are derived from its genome(s) and which interact with each other and the environment in a spatial and temporal manner. We think of these components (e.g. RNAs, proteins, mobilized DNA) as operating as part of networks with the other elements (e.g. metabolites, sunlight, micro-organisms). We therefore apply and develop concepts in graph theory, bioinformatics, epidemiology and biostatistics to understand how networks interact and what role they play in human diseases and traits.

Vision

Lead a world-class systems genomics research program focused on leveraging the latest genomic/biomolecular technologies and analytical techniques to alleviate the burden of diseases with immune and inflammatory aetiologies.

The Inouye Lab has a long track record of research at the interface of genomics, computer science and statistics. Overall, our aims are to utilise analytical tools to uncover insights into pathogenesis that change clinical practice; build local capacity via expansion of our research centre/ecosystem and synergistic degree programs; and champion Open Science, FAIR principles and rapid scientific publishing.

We operate in an extremely fast-moving research environment. What follows is intended as a broad overview; for a better snapshot of the current depth and breadth of projects, please see our publications and preprints.

Previous Research Highlights (up to 2015)

Understanding biomarkers for severe illness and mortality

A major research question of the upcoming precision medicine era is: can we utilize large-scale omics data to infer the biological processes underlying new biomarker associations? We are investigating various strategies to answer this question and have recently published a paper which identifies chronic low-grade inflammation, potentially a neutrophilic response to microbial insults, as a putative explanation for the GlycA biomarker's association with hospitalization and mortality, particularly from infections.

Individualized genomic risk and clinical utility

The clinical utility of genomic data has yet to be fully realized. We have been an early adopter of machine learning methods for genomic prediction of phenotypes and have led the design of software tools (SparSNP) and the comparison of competing methodologies to select the right approach for the right disease. As a proof of concept, we have shown that a genomic risk score for celiac disease (publicly available here) gives the best prediction among related approaches and provides clinically useful information which can be used to remodel the celiac diagnostic pathway. We've also shown that targeting the high-risk HLA-DQ2.5 subgroup using genomic prediction may be an even more viable clinical strategy.

For coronary heart disease, we've published a genomic risk score of 49,000 SNPs which can identify the 20% of men at highest lifetime risk, who develop disease 12-18 years earlier than men in the bottom 20% of risk. These high-risk individuals could be candidates for early intervention. To this end, the genomic risk score for coronary heart disease is currently being trialled internationally in studies such as Finland's GENERISK to motivate individuals to change their lifestyle and traditional cardiovascular risk factors. Genomic prediction has profound ramifications for celiac disease, coronary heart disease and other complex diseases and, if properly developed, would likely have immensely positive implications for individual risk stratification and lifestyle intervention to halt pathogenesis in its earliest stages.
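To make the idea of a genomic risk score concrete, here is a minimal sketch in Python. It assumes the common form of such a score, a weighted sum of risk-allele dosages, followed by ranking individuals to pick out the top 20%. The SNP names, weights and genotypes are hypothetical, and the code is not the SparSNP implementation or any published score.

```python
from typing import Dict, List


def genomic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per SNP).

    Assumption: a SNP missing from `dosages` contributes a dosage of 0.
    """
    return sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


def top_fraction(scores: Dict[str, float], fraction: float = 0.2) -> List[str]:
    """Return the IDs of the individuals in the top `fraction` of scores."""
    n_top = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_top]


# Hypothetical per-SNP weights (for example, log odds ratios).
weights = {"snp_a": 0.30, "snp_b": -0.10, "snp_c": 0.20}

# Hypothetical genotypes, coded as the number of risk alleles carried.
cohort = {
    "ind1": {"snp_a": 2, "snp_b": 0, "snp_c": 1},
    "ind2": {"snp_a": 0, "snp_b": 2, "snp_c": 0},
    "ind3": {"snp_a": 1, "snp_b": 1, "snp_c": 2},
    "ind4": {"snp_a": 0, "snp_b": 0, "snp_c": 1},
    "ind5": {"snp_a": 1, "snp_b": 2, "snp_c": 0},
}

scores = {ind: genomic_risk_score(g, weights) for ind, g in cohort.items()}
high_risk = top_fraction(scores, fraction=0.2)
print(high_risk)
```

In practice the weights would come from a model trained on a separate cohort, and the score would have many thousands of terms rather than three.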

Host-pathogen interactions and genomic surveillance

Humans are composed of more microbial cells than human cells. It has been widely shown that the microbial communities that live on us (the human microbiome) have a significant impact on our health and well-being; microbes are thus a key source of variation which will help to explain who gets disease and who does not. In close collaboration with Kat Holt's group and various external partners, we have shown that the airway microbiome is a determinant of respiratory disease in early life, which raises the risk of subsequent asthma. Also in collaboration with Kat Holt's group, we are joining the fight against microbial pathogens and drug resistance by designing clinically useful algorithms and software tools (e.g. SRST2) to perform rapid typing and detection of drug resistance and virulence genes. SRST2 is now in use around the world at public health agencies and hospitals in the US, UK, Canada, China, and Australia. For more info, read the BMC interview.

Powerful phenotype networks, metabolism and atherosclerosis

GWAS typically consider only one phenotype at a time, despite statistical studies showing that multiple phenotypes in theory offer more power for locus detection. Paul de Bakker and Mike led a collaboration which set out to confirm empirically what previous simulation studies had suggested. From serum metabolomics data in multiple cohorts, metabolite sub-networks were analysed in a multivariate GWAS framework to detect seven novel metabolic loci. Using whole blood, adipose, liver and aortic tissues, we further connected the transcription of the top two genes (SERPINA1 and AQP9) to their inferred metabolites and to atherosclerotic plaques in mice and humans. The study was named a top advance of 2012 by the American Heart Association.

Integrative omics and a transcriptional sub-network for IgE signalling

While GWAS have been successful in identifying loci associated with traits and diseases, understanding how genotype connects to phenotype requires integration of intermediate levels of biomolecular information. To this end, Mike led a European collaboration which was the first to integrate more than two molecular systems in humans. Using the DILGOM cohort and sophisticated network analyses, genomic profiles were integrated with whole blood transcriptomic and serum metabolomic profiles to discover a tightly co-expressed transcriptional sub-network (LL module). The LL module comprised key genes in IgE-mediated inflammation and mast cell formation and was associated with blood levels of a wide range of lipids and small molecules. To date, the LL module has been replicated in various independent datasets and has been independently shown to be a key component of the blood transcriptome-metabolome interface. Furthermore, natural human loss-of-function variants have genetically confirmed the relationship between the high-affinity IgE receptor and circulating triglyceride levels.
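As an illustration of what finding a transcriptional sub-network can involve, the sketch below builds a toy co-expression network in Python: genes are linked when the absolute Pearson correlation of their expression profiles exceeds a threshold, and modules are read off as connected components. This is a deliberately simplified stand-in, not the network analysis used in the study. The gene names, expression values and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Set


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_modules(expr: Dict[str, List[float]], threshold: float = 0.9) -> List[Set[str]]:
    """Link genes with |r| >= threshold; return components with more than one gene."""
    neighbours: Dict[str, Set[str]] = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            neighbours[g1].add(g2)
            neighbours[g2].add(g1)

    modules: List[Set[str]] = []
    seen: Set[str] = set()
    for gene in expr:
        if gene in seen:
            continue
        # Depth-first search collects every gene reachable from `gene`.
        stack, component = [gene], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        if len(component) > 1:
            modules.append(component)
    return modules


# Hypothetical expression profiles across five samples.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "gene_c": [1.1, 2.0, 2.9, 4.2, 5.0],
    "gene_d": [5.0, 1.0, 4.0, 2.0, 3.0],
    "gene_e": [3.0, 3.5, 1.0, 4.0, 2.0],
}

modules = coexpression_modules(expression, threshold=0.9)
print(modules)
```

A hard correlation threshold is the simplest possible choice; real analyses usually work with weighted networks and test each module against other data layers, such as metabolite levels.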

First generation genome-wide association studies

The Sanger Institute was and still is a central hub and catalyst for genome-wide association studies in Europe. Through dozens of collaborations, Mike led primary analysis stages (e.g. genotyping, normalisation, quality control, imputation, signal inspection) for many first-generation GWAS. From this research, more than 75 genetic loci for scores of diseases and traits were uncovered, with prime examples being TNFRSF11B and LRP5 for osteoporosis, MC4R for obesity, IL2/IL21 for celiac disease, and loci for the seven complex diseases in the WTCCC. The International HapMap 3 Consortium, where Mike co-led the genotyping, integration and quality control of the main published dataset, set the scene for second-generation GWAS of lower-frequency genetic variants.

High-throughput genotyping with Illuminus

Determining genotypes from microarrays used to be done semi-manually: researchers would sit at computers for days or weeks inspecting and fixing cluster plots like the one adjacent. In a close collaboration between the Sanger Institute and Oxford University, YY Teo, Taane Clark and Mike Inouye set out to design a rapid computational genotyping algorithm. The resulting approach and software, Illuminus, have since been used by groups worldwide and have formed a critical part of the analysis for dozens of seminal genome-wide association and population genetic studies. In a second collaboration, to understand the effects of whole-genome amplification on GWAS, YY and Mike showed that both Illuminus and imputation strategies could be used to rescue the inevitable declines in genotyping performance. These findings were important both as an early deterrent to whole-genome amplification for GWAS and for studies where amplification was a necessity, such as MalariaGEN.
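The kind of task that automated genotype calling replaces can be sketched in a few lines of Python. The example below is not the Illuminus algorithm. It is a simplified stand-in that assumes each sample has two positive allele intensities, reduces them to a single contrast value, and groups samples into three clusters with one-dimensional k-means. The function names and intensity values are hypothetical.

```python
from typing import List, Tuple

# Clusters are indexed by increasing contrast: mostly B signal, mixed, mostly A signal.
GENOTYPES = ("BB", "AB", "AA")


def contrast(a: float, b: float) -> float:
    """Map a pair of positive allele intensities to [-1, 1]; +1 means all A signal."""
    return (a - b) / (a + b)


def call_genotypes(intensities: List[Tuple[float, float]], n_iter: int = 20) -> List[str]:
    """Assign each sample to one of three genotype clusters with 1-D k-means."""
    values = [contrast(a, b) for a, b in intensities]
    centres = [-1.0, 0.0, 1.0]  # starting guesses for the BB, AB and AA clusters
    assignment = [0] * len(values)
    for _ in range(n_iter):
        # Assign every sample to its nearest cluster centre.
        assignment = [min(range(3), key=lambda k: abs(v - centres[k])) for v in values]
        # Move each centre to the mean of its members; empty clusters stay put.
        for k in range(3):
            members = [v for v, a in zip(values, assignment) if a == k]
            if members:
                centres[k] = sum(members) / len(members)
    return [GENOTYPES[a] for a in assignment]


# Hypothetical (A intensity, B intensity) pairs for six samples at one SNP.
samples = [(0.90, 0.10), (0.85, 0.15), (0.50, 0.50), (0.55, 0.45), (0.10, 0.90), (0.05, 0.95)]
calls = call_genotypes(samples)
print(calls)
```

A production caller would also need to model cluster shape and report a confidence for each call, so that ambiguous samples can be left uncalled rather than forced into a cluster.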