Charlotte defends

Huge congratulations to Charlotte Darby, who successfully completed her Ph.D. defense on May 26th! Charlotte’s thesis, titled “Computational methods addressing genetic variation in next-generation sequencing data,” covers her work on Samovar (paper), scHLAcount (paper) and Vargas (paper), among other projects. Charlotte was co-advised by Ben and by Dr. Mike Schatz. Next, Charlotte will join Rahul…

Published: June 28, 2020

snapcount in Bioconductor

In an effort led by Software Engineer Rone Charles and Ph.D. candidate Chris Wilks, we submitted an R/Bioconductor package called snapcount, which was accepted and is now included in Bioconductor 3.11. snapcount makes it easy to query the powerful Snaptron server using a natural, accessible set of query functions. Specifically, you can query measurements for genes, exons, splice junctions…

Published: May 4, 2020

Vargas in Bioinformatics

Ph.D. candidate Charlotte Darby, extending work by former Master's student Ravi Gaddipati, published a study describing Vargas, a heuristic-free read alignment software tool. The study appeared in the journal Bioinformatics. The open source Vargas tool runs efficiently on modern SIMD and multithreaded architectures. By avoiding heuristics (rules that allow aligners to ignore certain portions of…

Published: May 4, 2020

r-index papers in JCB

Ph.D. candidate Taher Mun, together with Alan Kuhnle and co-authors, published a journal article and an accompanying software article in the Journal of Computational Biology. We demonstrate new methods for text indexing and querying using the r-index, which represents an advance on earlier methods like the RLFM index and FM index. This new method makes it…

Published: May 4, 2020

Reference flow preprint

Student Nae-Chyun Chen and colleagues just posted a new preprint describing his work on the “reference flow” alignment framework. Most sequencing data analyses start by aligning sequencing reads to a linear reference genome, made up of a single string per chromosome. But failure to account for genetic variation causes reference bias and confounding of results…

Published: March 5, 2020

FC-R2 in Genome Research

A new study describing our FC-R2 (short for “FANTOM-CAT recount2”) resource is out in Genome Research. FC-R2 is a new quantification of the recount2 summaries using the more inclusive annotation produced by the FANTOM CAGE-Associated Transcriptome (FANTOM-CAT) project. This annotation consists of over 109,000 coding and noncoding genes. By combining this annotation with the recount2 resource,…

Published: February 21, 2020

ASCOT in Nature Comms

The ASCOT study appeared in Nature Communications today. ASCOT is a new resource allowing researchers to visualize and query alternative splicing patterns in public RNA-Seq data. The resource is freely available at ascot.cs.jhu.edu. To populate ASCOT, we used Snaptron to identify splice variants across tens of thousands of bulk and single-cell RNA-Seq datasets in human…

Published: January 9, 2020

Vargas preprint

Ph.D. student Charlotte Darby and former Master's student Ravi Gaddipati posted a preprint describing their work on Vargas, a heuristic-free read alignment software tool that runs efficiently on modern SIMD and multithreaded architectures. Heuristics are rules that allow aligners to ignore certain portions of the search space that seem to contain only low-scoring alignments. Avoiding…

Published: December 21, 2019

Dashing in Genome Biology

The Dashing study, “Dashing: fast and accurate genomic distances with HyperLogLog,” authored by Daniel Baker, appeared in Genome Biology today. Dashing is a fast and accurate software tool for estimating similarities of genomes or sequencing datasets. It uses the HyperLogLog sketch together with cardinality estimation methods that specialize in set unions and intersections. Dashing sketches…

Published: December 4, 2019

Kraken 2 in Genome Biology

Our paper describing the Kraken 2 software tool for metagenomic read classification appeared in Genome Biology. Kraken 2’s memory usage is several-fold smaller than Kraken 1’s. It is also about 5-fold faster and adds a translated search mode for better sensitivity when classifying viruses. Like Kraken, Kraken 2 works well with the Bracken tool for…

Published: November 29, 2019