Improving metagenomics workflows in KBase
The Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA) Science Focus Area (SFA), managed by Dr. Paul D. Adams and led by Dr. Adam P. Arkin at Lawrence Berkeley National Laboratory (LBNL), aims to advance knowledge of microbial biology and to better understand how microbial communities affect the organisms and abiotic factors around them across multiple scales. ENIGMA studies microbial communities, from the single-molecule to the ecosystem scale, to reveal how they affect ecosystems and how environmental conditions in turn affect their function within the ecosystem.
The goal is to develop a mechanistic understanding of microbial community formation and activity in complex environments. To do this, ENIGMA is exploring interactions between microbial communities in groundwater and sediment at various locations within the Oak Ridge Field Research Site. The site has well-mapped hydrology and geology, as well as complex gradients of nutrients, stressors, and contaminants. Together, these features provide a robust framework for exploring the spatial and temporal impacts of environmental factors on microbial communities.
Collaborative Development Projects
Long-read assembly of microbial isolates and metagenomes*
Dr. Lauren Lui is the lead Co-I on a project to build pipelines for long-read assembly of microbial isolate genomes and metagenomes in KBase, alongside John-Marc Chandonia and Torben Nielsen. Learn more about Dr. Lui’s research in this Berkeley Lab Basics-2-Breakthroughs video, and about how she uses KBase in our Community Highlight.
Functionality and Tools:
- Unicycler: Assembles Illumina-only read sets (where it functions as a SPAdes optimiser) and long-read-only sets (PacBio or Nanopore). For the best possible assemblies, run a hybrid assembly with both Illumina reads and long reads. (Released)
- Polypolish: Uses short-read alignments to repair errors in long-read assemblies, typically in repeat sequences. (Beta-only)
- Filtlong: Filters a long-read set by read length (longer is better) and read identity (higher is better). (Beta-only)
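The filtering idea behind the last tool in this list (prefer longer reads with higher identity) can be illustrated with a minimal Python sketch. This is not Filtlong's actual algorithm or interface: the `Read` class, the `estimated_identity` and `filter_reads` functions, the thresholds, and the length-times-identity ranking are all assumptions made for illustration. Identity is approximated here from Phred+33 quality scores, which is only one possible estimate.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Read:
    """Hypothetical container for one FASTQ record."""
    name: str
    sequence: str
    quality: str  # Phred+33 encoded quality string (assumed encoding)


def estimated_identity(read: Read) -> float:
    """Approximate identity as 1 minus the mean per-base error probability."""
    if not read.quality:
        return 0.0
    # Phred score Q maps to an error probability of 10^(-Q/10).
    errors = [10 ** (-(ord(ch) - 33) / 10) for ch in read.quality]
    return 1.0 - sum(errors) / len(errors)


def filter_reads(reads: List[Read], min_length: int, min_identity: float,
                 keep_fraction: float = 1.0) -> List[Read]:
    """Drop reads below the thresholds, then keep the best-ranked fraction.

    The ranking score (length * identity) is an illustrative choice only.
    """
    passing = [
        r for r in reads
        if len(r.sequence) >= min_length and estimated_identity(r) >= min_identity
    ]
    ranked = sorted(
        passing,
        key=lambda r: len(r.sequence) * estimated_identity(r),
        reverse=True,
    )
    n_keep = math.ceil(len(ranked) * keep_fraction)
    return ranked[:n_keep]


if __name__ == "__main__":
    demo = [
        Read("long_good", "A" * 2000, "I" * 2000),   # Q40 bases
        Read("short_good", "A" * 500, "I" * 500),    # too short
        Read("long_poor", "A" * 3000, "#" * 3000),   # Q2 bases, low identity
        Read("mid_good", "A" * 1500, "5" * 1500),    # Q20 bases
    ]
    for read in filter_reads(demo, min_length=1000, min_identity=0.9):
        print(read.name, len(read.sequence), round(estimated_identity(read), 4))
```

In KBase itself, the filtering is configured through the app's parameters, so a sketch like this is useful mainly for reasoning about how length and identity thresholds interact before choosing settings.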
Reference-based metagenome workflow^
Functionality and Tools:
- Meta-Decoder Call Variants: Identify polymorphisms within microbiome read sequences against reference genomes (Released)
- Map Reads to Reference Sequence: Map reads to a reference genome (Released)
- Call Microbial SNPs: Identify single nucleotide polymorphism (SNP) variants based on mapped reads (Released)
- StrainFinder v1: Determine the genome sequences of strain sub-populations from allele frequencies (Released)
- FamaProfiling: Generate a functional profile of nitrogen cycle genes or universal single-copy markers for metagenomic read libraries and assembled genomes.
  - Fama Read Profiling (Released)
  - Fama Genome Profiling (Released)
Meet the Members
SFA Principal Investigator: Paul D. Adams¹,²
SFA Technical Co-Manager, Science Lead: Adam P. Arkin¹,²,³
SFA Contact: Astrid Terry¹
KBase Contact: Dylan Chivian¹,³, Elisha Wood-Charlson¹,³
Lead Developers: John-Marc Chandonia¹,³, Alexey Kazakov¹, Lauren Lui¹, Torben Nielsen¹, Anni Zhang⁴
Project Team Lead: Eric J. Alm⁴^, Lauren Lui¹*
Affiliations: ¹Lawrence Berkeley National Laboratory, ²University of California at Berkeley, ³KBase, ⁴Massachusetts Institute of Technology
- ENIGMA Project Website
- Functional and Taxonomic Profiling of MAGs (7 July 2021) – KBase tutorial webinar