MetaStorm: A Computational System for Customizable Metagenomic Analysis

Lenwood Heath gave the Clavius Distinguished Lecture in the Department of Computer and Information Science at Fordham University in New York City on Thursday, November 10.  Dr. Heath is a professor in the Department of Computer Science at Virginia Tech.

MetaStorm is a joint research project with Liqing Zhang, associate professor in the Department of Computer Science at Virginia Tech, and Amy Pruden, professor in Civil & Environmental Engineering at Virginia Tech.  Metagenomics is the capture of microbial DNA sequences from environmental samples, where the environment might be, for example, soil, the ocean, or the human gut.  Importantly, such samples contain multiple species of organisms, so the DNA sequences collected originate from a variety of unknown sources.  Modern DNA sequencing produces a large number of short sequences, called reads, that are typically analyzed by using each read as a query to search a large database of known DNA or protein sequences.  The accumulated results of such searches indicate which biological entities are present in the sample and which biological functions (proteins) are performed by organisms in the sample.  Existing computational pipelines that perform these searches and the subsequent analyses tend to be inflexible.  We have developed a user-customizable analysis pipeline called MetaStorm that promises to support more targeted investigations of metagenomic data sets.  MetaStorm can assemble the reads into longer sequences, called contigs, that allow more precise identification of sequence matches.  MetaStorm also allows the user to provide her own specialized sequence database to guide the search for particular classes of genes, for example, antibiotic resistance genes.  MetaStorm is available as a free Web service where users upload their metagenomic data sets, select the desired analyses, and visualize the results in several novel ways.
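
The read-as-query idea described above can be illustrated with a toy sketch.  This is not MetaStorm's implementation, and real pipelines rely on dedicated alignment tools rather than exact substring matching.  The sketch assumes a tiny user-supplied reference database and assigns each read to the reference with which it shares the most k-mers (length-k substrings).  All names here (`profile_sample`, `resistance_gene_A`, the sequences themselves) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def classify_read(read, index, k):
    """Return the reference sharing the most k-mers with the read, or None."""
    votes = Counter()
    for kmer in kmers(read, k):
        for name in index.get(kmer, ()):
            votes[name] += 1
    if not votes:
        return None
    # Iterate names in sorted order so ties are broken alphabetically.
    return max(sorted(votes), key=votes.get)


def profile_sample(reads, references, k=4):
    """Tally how many reads are assigned to each reference sequence."""
    index = build_index(references, k)
    counts = Counter()
    for read in reads:
        hit = classify_read(read, index, k)
        counts[hit if hit is not None else "unassigned"] += 1
    return counts


# Hypothetical user-supplied database and reads (made-up sequences).
references = {
    "resistance_gene_A": "ATGGCGTACGTTAGC",
    "housekeeping_gene_B": "TTGACCGGAATCCGA",
}
reads = ["GCGTACGT", "CCGGAATC", "AAAAAAAA"]

counts = profile_sample(reads, references)
for name, n in sorted(counts.items()):
    print(name, n)
```

Swapping in a different `references` dictionary mirrors, in miniature, the idea of a user-provided specialized database: the same reads are profiled against whichever sequences the investigator cares about.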

Dr. Heath’s research interests include theoretical computer science, algorithms, graph theory, computational biology, and bioinformatics.  Dr. Heath completed a Ph.D. in computer science at the University of North Carolina, Chapel Hill, an M.S. in mathematics at the University of Chicago, and a B.S. in mathematics at the University of North Carolina, Chapel Hill.  Before joining the faculty at Virginia Tech in 1987, he was an instructor of applied mathematics and a member of the Laboratory of Computer Science at MIT.  He has supervised 11 computer science Ph.D. students to completion and currently supervises 8 computer science graduate students.  He has worked on a number of computational biology and bioinformatics projects funded by the National Science Foundation, including the current Beacon project, which captures, represents, infers, and simulates signal transduction pathways in plants.  Other projects involve computational genomics, motif finding, and machine learning.  Work in metagenomics is a natural fit for his interests in computational genomics.

 
