Split Sequence Bloom Trees

The Experiment Discovery Problem Public databases such at the NIH Sequencing Read Archive (SRA) now contain hundreds of thousands of short-read sequencing experiments. A major challenge now is making that raw data accessible and useful for biological analysis — researchers must be able to find the relevant and related experiments…

Continue reading

Armatus – Topological Domain Finder


Recent chromosome conformation capture experiments have led to the discovery of dense, contiguous, megabase-sized topological domains that are similar across cell types and conserved across species. These domains are strongly correlated with a number of chromatin markers and have since been included in a number of analyses. However, functionally-relevant domains…

Continue reading

Sailfish RNA-seq Quantification


RNA-seq expression estimates need not take longer than a cup of coffee The quantification of gene or isoform abundance is a fundamental step in many transcriptome analysis tasks, such as determining differential expression between biological samples. Yet, estimating isoform abundance from a large set of RNA-seq reads remains a computationally…

Continue reading

Sequence Bloom Trees


Querying a short read database for a transcript of interest is a fundamental problem in biology. Yet such queries are computationally intensive and scale linearly with the size of the data being searched. This leads to a computational bottleneck in which large databases of sequencing reads are compiled but never…

Continue reading