Software and Algorithms

Coordinated Gene Activity in Pattern Sets (CoGAPS)

Bayesian non-negative matrix factorization algorithm for identifying patterns in genomics data. Implemented in the R/Bioconductor package CoGAPS.
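To illustrate the underlying factorization, here is a minimal sketch in Python. It is not the CoGAPS algorithm: CoGAPS fits the factors with a Bayesian sampler, whereas this sketch swaps in the simpler, non-Bayesian multiplicative-update rules for plain NMF. The function names, the genes-by-samples layout, and the `amplitude`/`pattern` variable names are assumptions made for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(data, n_patterns, n_iter=200, seed=0, eps=1e-9):
    """Approximate data (genes x samples) as amplitude @ pattern.

    Plain multiplicative-update NMF, used here as a stand-in for the
    Bayesian sampler (assumption: this is NOT how CoGAPS fits its model).
    """
    rng = random.Random(seed)
    n_genes, n_samples = len(data), len(data[0])
    # Strictly positive random start so the updates never get stuck at zero.
    amplitude = [[rng.random() + 0.1 for _ in range(n_patterns)] for _ in range(n_genes)]
    pattern = [[rng.random() + 0.1 for _ in range(n_samples)] for _ in range(n_patterns)]
    for _ in range(n_iter):
        # pattern <- pattern * (A^T D) / (A^T A P), element-wise
        at = transpose(amplitude)
        num = matmul(at, data)
        den = matmul(matmul(at, amplitude), pattern)
        pattern = [[p * n / (d + eps) for p, n, d in zip(pr, nr, dr)]
                   for pr, nr, dr in zip(pattern, num, den)]
        # amplitude <- amplitude * (D P^T) / (A P P^T), element-wise
        pt = transpose(pattern)
        num = matmul(data, pt)
        den = matmul(amplitude, matmul(pattern, pt))
        amplitude = [[a * n / (d + eps) for a, n, d in zip(ar, nr, dr)]
                     for ar, nr, dr in zip(amplitude, num, den)]
    return amplitude, pattern


def reconstruction_error(data, amplitude, pattern):
    """Sum of squared differences between data and amplitude @ pattern."""
    approx = matmul(amplitude, pattern)
    return sum((x - y) ** 2 for dr, ar in zip(data, approx) for x, y in zip(dr, ar))
```

Calling `nmf(data, 2)` on a small non-negative matrix returns a genes-by-patterns matrix and a patterns-by-samples matrix, both non-negative. Unlike a Bayesian fit, this sketch gives a single point estimate with no uncertainty on the factors.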

GSReg and Expression Variability Analysis (EVA)

EVA compares the dissimilarity between genomic profiles in sample groups. It can be applied to the analysis of pathway dysregulation and splice variation. EVA is implemented in the R/Bioconductor package GSReg, along with the permutation-based DIRAC algorithm.


CancerInSilico

The CancerInSilico R/Bioconductor package implements a mathematical model of cellular growth in R.

Permuted Surrogate Variable Analysis (pSVA)

pSVA removes technical artifacts in known batches from high-throughput data while preserving inter-sample heterogeneity for machine learning and unsupervised analysis. The algorithm is implemented in the R/Bioconductor package sva.


projectR

The projectR R/Bioconductor package enables transfer learning: patterns learned by unsupervised methods from source bulk and single-cell genomics data are transferred to new target genomics datasets. It is applicable to patterns from CoGAPS, NMF, clustering, and other unsupervised learning methods.
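As a rough illustration of transferring learned patterns to new data, the Python sketch below projects one target sample onto gene weights learned elsewhere by ordinary least squares. It is one simple way to do such a projection, not the package's API or implementation. The function names are hypothetical, and the sketch assumes the target sample lists the same genes, in the same order, as the learned weights.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("gene weights are collinear; projection is not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def project_sample(gene_weights, sample):
    """Least-squares pattern weights for one target sample.

    gene_weights: genes x patterns matrix learned on the source data.
    sample: expression values for the same genes in a target sample
            (assumption: genes already matched and ordered identically).
    """
    k = len(gene_weights[0])
    # Normal equations: (W^T W) p = W^T x
    gram = [[sum(row[i] * row[j] for row in gene_weights) for j in range(k)]
            for i in range(k)]
    rhs = [sum(row[i] * g for row, g in zip(gene_weights, sample)) for i in range(k)]
    return solve_linear(gram, rhs)
```

Applying `project_sample` to each column of a target dataset yields a patterns-by-samples matrix for the new data. Unconstrained least squares can return negative weights, so a non-negative solver may be preferable when the projected values must be read as pattern usage.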