Tools

  • RGStraP

    • A pipeline to calculate genetic principal components from RNA sequencing data.

  • OMICSPRED

    • A resource for predicting multi-omics data from genotype data. Models are trained on multi-omics cohorts such as INTERVAL.

  • Green Algorithms

    • An open, easy-to-use calculator for estimating the greenhouse gas emissions of any computation.

  • B4PPI

    • Benchmarking framework and software for protein-protein interaction networks.

  • The Polygenic Score (PGS) Catalog

    • An open repository for published polygenic scores.

  • PGS Catalog Calculator

    • A best-practice analysis pipeline for calculating polygenic scores on samples with imputed genotypes using existing scoring files from the PGS Catalog and/or user-defined PGS.

  • BootstrapQTL

      • Multiple testing correction and accurate effect size estimation for expression and epigenetic QTLs.

        • Citation: Huang QQ, Ritchie SC, Brozynska M, Inouye M. Power, false discovery rate and Winner's Curse in eQTL studies. Nucleic Acids Research 2018.

  • NetRep

      • Rapid permutation-based assessment of the replication or preservation of networks. Designed with gene expression, protein, metabolite, and 16S OTU networks in mind.

        • Citation: Ritchie SC, Watts S, Fearnley LG, Holt KE, Abraham G, Inouye M. A scalable permutation approach for replication and preservation of network modules in large datasets. Cell Systems. 2016 3(1):71-82.

  • flashPCA2

      • Fast principal component analysis (PCA) of single nucleotide polymorphism (SNP) data. FlashPCA can perform PCA on >500,000 individuals with low memory usage. It can now also perform efficient canonical correlation analysis (CCA).

        • Citation: Abraham G and Inouye M. Fast principal components analysis of large-scale genome-wide data. PLOS ONE. 2014. 9(4): e93766.

        • Citation: Abraham G, Qiu Y, Inouye M. Flashpca2: Principal components analysis of biobank-scale datasets. Bioinformatics. 2017 btx299.

  • SparSNP

    • A tool for fitting sparse penalized models to SNP data for disease/phenotype prediction.

      • Citation: Abraham G, Kowalczyk A, Zobel J, and Inouye M. SparSNP: Fast and memory efficient analysis of all SNPs for phenotype prediction. BMC Bioinformatics. 13:88 (2012)

      • Citation: Abraham G, Kowalczyk A, Zobel J, and Inouye M. Performance and robustness of penalized and unpenalized methods for genetic prediction of complex human disease. Genetic Epidemiology. 2013 37(2):184-95.

    • Fused Multi-task Penalized Regression: a tool to generate sparse multivariate models of phenotype networks on a set of predictors simultaneously. Scales to 1,000s of phenotypes and 100,000-1,000,000 predictors.

      • Citation: Abraham G, Kowalczyk A, and Inouye M. A scalable and efficient approach for explicit modeling of large multi-omic networks. (submitted)

    • A popular genotype calling algorithm for the Illumina platform, which fits 3-component, bivariate mixture models.

      • Citation: Teo YY*, Inouye M*, Small KS, Gwilliam R, Deloukas P, Kwiatkowski DP, Clark TG. A genotype calling algorithm for the Illumina BeadArray platform. Bioinformatics 23(20): 2741-2746. (2007)

  • SRST2

    • Short-Read Sequence Typing (v2) takes Illumina sequence data, an MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes) and reports the presence of STs and/or reference genes and alleles.

      • SRST2 citation: Inouye M, Dashnow H, Raven L, Schultz M, Pope BJ, Tomita T, Zobel J, Holt KE. SRST2: Rapid genomic surveillance for public health and hospital microbiology labs. Genome Medicine. 2014.

      • SRST1 citation: Inouye M, Conway T, Zobel J, and Holt KE. Short Read Sequence Typing (SRST): multi-locus sequence types from short reads. BMC Genomics. 2012 Jul 24;13:338.

  • GWIS

    • Extremely rapid, exhaustive and model-free approach for detecting epistatic interactions between SNPs.

      • Citation: Goudey B, Rawlinson D, Wang Q, Shi F, Ferra H, Campbell RM, Stern L, Inouye M, Ong CS, Kowalczyk A. GWIS - Model-free, fast and exhaustive search for epistatic interactions in GWAS data. BMC Genomics. 2013 14(Suppl 3):S10.

  • RHH

    • Uses infrequently or rarely observed heterozygote and homozygote genotypes to determine ethnic differences between individuals and within chromosomes.

      • Citation: McGinnis RE, Deloukas P, McLaren WM, and Inouye M. Visualizing chromosome mosaicism and detecting ethnic outliers by the method of “rare” heterozygotes and homozygotes (RHH). Human Molecular Genetics 19(13):2539-53. (2010)
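
For readers new to the polygenic score entries above (the PGS Catalog and the PGS Catalog Calculator), a polygenic score is, at its core, a weighted sum of a sample's effect-allele dosages. The sketch below illustrates only that arithmetic. It is not the Calculator's implementation; the function name, variant IDs, weights, and dosages are all hypothetical, and real pipelines also handle variant matching, strand and allele alignment, and ancestry adjustment.

```python
def polygenic_score(dosages, weights):
    """Return the weighted sum of effect-allele dosages for one sample.

    dosages: dict mapping variant ID -> effect-allele dosage (0 to 2)
    weights: dict mapping variant ID -> effect weight from a scoring file
    Variants missing from the sample are skipped (an assumption made
    here for simplicity; other conventions exist).
    """
    total = 0.0
    for variant_id, weight in weights.items():
        dosage = dosages.get(variant_id)
        if dosage is None:
            continue  # variant not genotyped or imputed for this sample
        total += weight * dosage
    return total


# Hypothetical scoring file and sample, for illustration only.
weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 0.1}
sample = {"rs1": 2.0, "rs2": 1.0}  # rs3 is absent from this sample

score = polygenic_score(sample, weights)
print(score)
```

Skipping absent variants keeps the example short. In practice, the fraction of scoring-file variants that match the target genotypes is usually reported alongside the score.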