A resource for predicting multi-omics data from genotype data, using models trained on multi-omics cohorts such as INTERVAL.
An open repository for published polygenic scores
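For readers new to the term, a polygenic score is commonly computed as a weighted sum of effect-allele dosages. The sketch below illustrates only that idea; the variant IDs, weights, and the `polygenic_score` helper are hypothetical and are not taken from the repository.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages over variants present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(dosages[v] * weights[v] for v in shared)


# Hypothetical per-allele effect weights and one individual's dosages (0, 1 or 2).
weights = {"rs0001": 0.5, "rs0002": -0.25, "rs0003": 0.125}
dosages = {"rs0001": 2, "rs0002": 1, "rs0004": 1}

# Variants missing from either input (rs0003, rs0004) are skipped.
score = polygenic_score(dosages, weights)
print(score)
```

Skipping unmatched variants is one possible design choice; a production pipeline would usually also report how many variants were matched.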
Multiple testing correction and accurate effect size estimation for expression and epigenetic QTLs
Citation: Huang QQ, Ritchie SC, Brozynska M, Inouye M. Power, false discovery rate and Winner's Curse in eQTL studies. Nucleic Acids Research 2018.
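To make the idea of multiple testing correction concrete, here is a minimal sketch of the standard Benjamini-Hochberg false discovery rate adjustment. It is a generic procedure shown for illustration; the listing does not say which correction the tool itself applies, and the function name and p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four association tests.
pvals = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(pvals)
print(adj)
```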
Rapid permutation-based assessment of the replication or preservation of networks. Designed with gene expression, protein, metabolite, and 16S OTU networks in mind.
Citation: Ritchie SC, Watts S, Fearnley LG, Holt KE, Abraham G, Inouye M. A scalable permutation approach for replication and preservation of network modules in large datasets. Cell Systems. 2016 3(1):71-82.
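The general shape of a permutation test for module preservation can be sketched as follows: compute a statistic for the module in a second dataset, then compare it with the same statistic on random node sets of equal size. This is a simplified illustration with a single statistic (mean absolute correlation); it is not the published method, and all names and the toy correlation matrix are assumptions.

```python
import random
from itertools import combinations


def module_density(corr, nodes):
    """Mean absolute pairwise correlation among the given nodes."""
    pairs = list(combinations(nodes, 2))
    return sum(abs(corr[i][j]) for i, j in pairs) / len(pairs)


def permutation_pvalue(corr, module, n_perm=1000, seed=0):
    """One-sided test: is the module denser than random node sets of the same size?"""
    rng = random.Random(seed)
    observed = module_density(corr, module)
    all_nodes = range(len(corr))
    hits = 0
    for _ in range(n_perm):
        null_nodes = rng.sample(all_nodes, len(module))
        if module_density(corr, null_nodes) >= observed:
            hits += 1
    # Add 1 to numerator and denominator so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy test-dataset correlation matrix: nodes 0-2 form a tight module.
n = 8
corr = [[1.0 if i == j else 0.1 for j in range(n)] for i in range(n)]
for i, j in combinations(range(3), 2):
    corr[i][j] = corr[j][i] = 0.9

observed, p = permutation_pvalue(corr, module=[0, 1, 2])
print(observed, p)
```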
Fast principal component analysis (PCA) of single nucleotide polymorphism (SNP) data. FlashPCA can perform PCA on >500,000 individuals with low memory usage. It can now also perform efficient canonical correlation analysis (CCA).
Citation: Abraham G and Inouye M. Fast principal components analysis of large-scale genome-wide data. PLoS ONE. 2014 9(4):e93766.
Citation: Abraham G, Qiu Y, Inouye M. FlashPCA2: Principal components analysis of biobank-scale datasets. Bioinformatics. 2017 btx299.
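As a small illustration of what PCA on SNP data involves, the sketch below standardizes a toy genotype matrix (rows are individuals, columns are SNPs) and extracts the first principal component by plain power iteration. This is a textbook approach chosen for brevity, not the algorithm FlashPCA uses, and the genotypes and function names are invented for the example.

```python
import math


def standardize_columns(genotypes):
    """Center each SNP column and scale it to unit variance."""
    n = len(genotypes)
    columns = []
    for col in zip(*genotypes):
        mean = sum(col) / n
        sd = math.sqrt(sum((g - mean) ** 2 for g in col) / (n - 1))
        if sd == 0:
            continue  # monomorphic SNPs carry no information; drop them
        columns.append([(g - mean) / sd for g in col])
    return [list(row) for row in zip(*columns)]


def first_pc_scores(x, n_iter=200):
    """Power iteration on X^T X; returns each individual's score on PC1."""
    n, p = len(x), len(x[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        xv = [sum(row[j] * v[j] for j in range(p)) for row in x]     # X v
        w = [sum(x[i][j] * xv[i] for i in range(n)) for j in range(p)]  # X^T (X v)
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return [sum(row[j] * v[j] for j in range(p)) for row in x]


# Toy data: allele counts (0, 1, 2) for six individuals at four SNPs.
# The first three and last three individuals come from two made-up groups.
genotypes = [
    [0, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 2],
    [2, 2, 2, 1],
    [2, 1, 2, 2],
    [1, 2, 2, 1],
]
scores = first_pc_scores(standardize_columns(genotypes))
print(scores)
```

In this toy example the first principal component separates the two groups of individuals by the sign of their scores.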
A tool for fitting sparse penalized models to SNP data for the purposes of disease/phenotype prediction
Citation: Abraham G, Kowalczyk A, Zobel J, and Inouye M. SparSNP: Fast and memory efficient analysis of all SNPs for phenotype prediction. BMC Bioinformatics. 13:88 (2012)
Citation: Abraham G, Kowalczyk A, Zobel J, and Inouye M. Performance and robustness of penalized and unpenalized methods for genetic prediction of complex human disease. Genetic Epidemiology. 2013 37(2):184-95.
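To show what a sparse penalized model looks like in practice, here is a minimal lasso (L1-penalized least squares) fit by cyclic coordinate descent. It is a generic illustration of sparsity through an L1 penalty; the loss functions and optimizer that SparSNP actually uses are not described in this listing, and the data and function names below are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly 0.0 inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(x, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(x), len(x[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X beta, kept up to date
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in x]
            norm_sq = sum(c * c for c in col) / n
            if norm_sq == 0:
                continue
            # Covariance of column j with the residual that excludes beta[j].
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, lam) / norm_sq
            delta = new_beta - beta[j]
            residual = [r - c * delta for c, r in zip(col, residual)]
            beta[j] = new_beta
    return beta


# Toy design with three orthogonal predictors; only the first has a large effect.
x = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.1, 1.9, -1.9, -2.1]  # equals 2*col0 + 0.1*col1
beta = lasso_coordinate_descent(x, y, lam=0.5)
print(beta)
```

With this penalty the small effect of the second predictor is shrunk to exactly zero, which is the sparsity property that makes such models practical when the number of SNPs is very large.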
Fused Multi-task Penalized Regression: a tool to generate sparse multivariate models of phenotype networks on a set of predictors simultaneously. Scales to 1,000s of phenotypes and 100,000-1,000,000 predictors.
Citation: Abraham G, Kowalczyk A, and Inouye M. A scalable and efficient approach for explicit modeling of large multi-omic networks. (submitted)
A popular genotype calling algorithm for the Illumina platform which fits 3-component, bivariate mixture models
Citation: Teo YY*, Inouye M*, Small KS, Gwilliam R, Deloukas P, Kwiatkowski DP, Clark TG. A genotype calling algorithm for the Illumina BeadArray platform. Bioinformatics 23(20): 2741-2746. (2007)
Short-Read Sequence Typing (v2) takes Illumina sequence data, an MLST database and/or a database of gene sequences (e.g. resistance genes or virulence genes) and reports the presence of STs and/or reference genes and alleles.
SRST2 citation: Inouye M, Dashnow H, Raven L, Schultz M, Pope BJ, Tomita T, Zobel J, Holt KE. SRST2: Rapid genomic surveillance for public health and hospital microbiology labs. Genome Medicine. 2014.
SRST1 citation: Inouye M, Conway T, Zobel J, and Holt KE. Short Read Sequence Typing (SRST): multi-locus sequence types from short reads. BMC Genomics. 2012 Jul 24;13:338.
Extremely rapid, exhaustive and model-free approach for detecting epistatic interactions between SNPs
Citation: Goudey B, Rawlinson D, Wang Q, Shi F, Ferra H, Campbell RM, Stern L, Inouye M, Ong CS, Kowalczyk A. GWIS - Model-free, fast and exhaustive search for epistatic interactions in GWAS data. BMC Genomics. 2013 14(Suppl 3):S10.
Uses infrequently or rarely observed heterozygote and homozygote genotypes to determine ethnic differences between individuals and within chromosomes
Citation: McGinnis RE, Deloukas P, McLaren WM, and Inouye M. Visualizing chromosome mosaicism and detecting ethnic outliers by the method of “rare” heterozygotes and homozygotes (RHH). Human Molecular Genetics 19(13):2539-53. (2010)