recount2 in Nature Biotech

The recount2 study appeared today in Nature Biotechnology. recount2 provides processed and summarized expression data for over 70,000 human RNA-seq samples from the Sequence Read Archive (SRA), The Cancer Genome Atlas (TCGA), and the Genotype-Tissue Expression (GTEx) project. The associated Bioconductor package provides a convenient API for querying, downloading, and analyzing the data. Each processed study consists of metadata and phenotype data, the expression levels of genes and their underlying exons and splice junctions, and the corresponding genomic annotation. A Docker- and Jupyter-notebook-based interface provided by the JHU-based SciServer project lets users work with the recount2 data from an R prompt without downloading any of it. The work was led by Leonardo Collado-Torres and Abhinav Nellore, and was a joint effort with the Leek group and other collaborators at JHU, including Kasper Hansen and Andrew Jaffe.


Published: April 11, 2017


