A knowledgebase for predictive biology

KBase enables users to analyze, share, and collaborate using data and tools designed to help build increasingly realistic models for biological function.

What is KBase?

The Department of Energy Systems Biology Knowledgebase (KBase) is a knowledge creation and discovery environment designed for biologists and bioinformaticians.

KBase integrates a variety of data and analysis tools from the DOE and other public services into an easy-to-use platform that leverages scalable computing infrastructure and performs sophisticated systems biology analyses.

As a freely available and developer-extensible platform, KBase enables scientists to analyze their own data within the context of public data and share findings across the system. Users can perform large-scale analyses that combine multiple types of ’omics data to investigate organisms and their communities and drive discovery.

Data Sharing

KBase supports the sharing of data, workflows, and Narratives, facilitating collaboration and accelerating the pace of scientific discovery. The ultimate goal is to build a true knowledgebase for systems biology: an integrated environment where knowledge and insights are created and multiplied.

Meet the KBase Team

KBase is run by an interdisciplinary and collaborative team led by Lawrence Berkeley National Laboratory with participation from Argonne, Brookhaven, and Oak Ridge National Laboratories.

Also involved in the multi-institutional program are Cold Spring Harbor Laboratory, the University of Illinois at Urbana-Champaign, and the University of Tennessee.

Our key external partners are DOE’s Joint Genome Institute, Environmental Molecular Sciences Laboratory, Bioenergy Research Centers, and several of the Genomic Science Program’s Scientific Focus Areas (SFAs). Several university projects are also important contributors.

Adam Arkin | Lead PI
Lawrence Berkeley National Laboratory

Adam is an expert in the comparative systems and synthetic biology of microbes and is dedicated to a model-driven approach to experimental science. He is a senior faculty scientist in the Environmental Genomics and Systems Biology Division at the Lawrence Berkeley National Laboratory and he is the Dean A. Richard Newton Memorial Professor of Bioengineering at the University of California, Berkeley where he has been since 1998. He is Technical Co-Manager of the ENIGMA SFA and directs the Center for Utilization of Biological Engineering in Space. He was one of six recipients of the 2013 Ernest Orlando Lawrence Award, the Department of Energy’s highest scientific honor.

Chris Henry | PI
Argonne National Laboratory

Chris is a scientist at Argonne National Laboratory, a fellow at the University of Chicago, and an adjunct professor at Northwestern University. He is an expert in computational biology with a focus on the prediction of phenotype from genome through the use of comparative genomics, metabolic modeling, and dynamic cellular community models. He received the Jay Bailey Young Investigator Best Paper in Metabolic Engineering Award in 2012.

Bob Cottingham | PI
Oak Ridge National Laboratory

Bob has extensive experience developing computational and data management tools and systems for genetics, genomics, and systems biology research. His background in bioinformatics and management includes serving as Co-Director of the Informatics Core at the Baylor College of Medicine Human Genome Center, Operations Director of the Genome Database at Johns Hopkins University School of Medicine, and Vice President of Computing at Celltech Chiroscience, a UK biopharmaceutical company developing drugs based on gene targets. In 2008 Cottingham moved to Oak Ridge National Laboratory, where he is Group Leader for Computational & Predictive Biology.

Elisha Wood-Charlson | Engagement Lead
Lawrence Berkeley National Laboratory

Elisha M Wood-Charlson is KBase’s User Engagement Lead. She has a PhD and 10+ years of experience as a microbial ecologist focused on host-microbe-virus interactions in the marine environment. Since leaving the research bench, she has moved into the realm of scientific community engagement, with the goal of making microbiome data science more efficient through effective collaboration, building trust in online communities, and developing shared ownership throughout the scientific process.

Paramvir Dehal | Science Lead
Lawrence Berkeley National Laboratory

Shane Canon | Architect Lead
Lawrence Berkeley National Laboratory

Shane Canon is a project engineer in the Data and Analytics Services group at NERSC at Lawrence Berkeley National Lab and is a senior member of the KBase project, where he co-leads advanced development. Shane has focused his career on enabling data-intensive applications on HPC platforms and, more recently, on leveraging HPC and large-scale computing to enable bioinformatics. Shane has held a number of positions at NERSC, including leading the Technology Integration group, where he focused on the Magellan Project and other areas of strategic focus, and leading the Data Systems group, which managed the global file systems and other data systems. Shane has also served as a group leader at Oak Ridge National Laboratory, where he architected the 10 petabyte Spider filesystem. Shane holds a PhD in physics from Duke University and a BS in physics from Auburn University.

Steve Chan | DevOps Lead
Lawrence Berkeley National Laboratory

Steve Chan is an engineer who worked in academia (CMU, Stanford) and at many tech startups during the original dot-com boom before coming to Berkeley Lab. He’s a generalist, having worked as a backend and frontend developer and in operations, cybersecurity, systems programming, and e-commerce, both as a line developer and in management. At KBase he spends his time wearing management and backend/systems engineering hats.

Oliver Fiehn | Director West Coast Metabolomics Center
UC Davis Genome Center

Prof. Oliver Fiehn has pioneered developments and applications in metabolomics with over 220 publications to date, starting in 1998 as a postdoctoral scholar and from 2000 onwards as group leader at the Max-Planck Institute for Molecular Plant Physiology in Potsdam, Germany. Since 2004 he has been Professor at the UC Davis Genome Center, overseeing his research laboratory and the satellite core service laboratory in metabolomics research. Since 2012, he has served as Director of the NIH West Coast Metabolomics Center, supervising 35 staff operating 16 mass spectrometers and coordinating activities with three UC Davis satellite labs, including efforts for combined interpretation of genomics and metabolomics data. The West Coast Metabolomics Center provides the most extensive and most in-depth analysis of metabolites available today, using a range of validated protocols for fee-for-service projects and scientific collaborations.

Professor Fiehn’s research aims at understanding metabolism on a comprehensive level in human population cohorts, animal and plant models, and cells and microorganisms. In order to leverage data from these diverse sets of biological systems, his research laboratory focuses on standardizing metabolomic reports and establishing metabolomic databases and libraries, for example the MassBank of North America that hosts over 200,000 public metabolite mass spectra and BinBase, a resource of over 90,000 samples covering more than 1,900 studies. Professor Fiehn’s laboratory members develop and implement new approaches and technologies in analytical chemistry for covering the metabolome, from increasing peak capacity by ion mobility to compound identifications through cheminformatics workflows and software. He collaborates with a range of investigators for interpreting metabolomic data in human diseases through statistics, text mining and pathway-based mapping efforts. He also studies fundamental biochemical questions from metabolite damage repair to the new concept of epimetabolites, the chemical transformation of primary metabolites that gain regulatory functions in cells.

For his work, Professor Fiehn has received a range of awards including the 2014 Molecular & Cellular Proteomics Lecture Award and the 2014 Metabolomics Society Lifetime Achievement Award. He served on the Board of Directors of the Metabolomics Society from 2005-2010 and 2012-2015, organizing a range of workshops and conferences, including the 2011 Asilomar metabolomics meeting and the 2015 Metabolomics Society international conference in San Francisco that reached a record of over 1,000 participants.

Melissa A. Haendel | Director of Translational Data Science
Linus Pauling Institute

My vision is to fundamentally alter the fabric of biomedical science, utilizing my art as a data translator to weave together healthcare systems, basic science research, and patients; through development of data integration technologies, innovative communication strategies, and collaborative education and outreach.

My demonstrated success in leadership of cross-disciplinary international teams, development of applications used for rare disease diagnostics, implementation of platforms and tools for translational research, and open and reproducible science will serve me and the community at large to effect real change.

Stephen P. Long | Professor
University of Illinois

Dr. Long’s research bioengineers the photosynthesis process in crops to achieve higher productivity, sustainability, and adaptation to climate change. He heads an international project to improve the crops that feed many of the poorest in the world, which has led to the discovery of a way to engineer photosynthesis that resulted in a 20% increase in crop productivity.

Lee Ann McCue | Computational Scientist
Pacific Northwest National Laboratory

Dr. McCue’s early research focused on the development and application of comparative genomics methods for studies of transcription regulation in bacteria. Her research pioneered the phylogenetic footprinting approach for predicting transcription factor binding sites de novo using multiple bacterial genomes, and contributed to a number of algorithmic advances to the Gibbs sampling technique for multiple sequence alignment. Her research at PNNL has expanded to include the analysis of microbial populations and metagenomes for microbial ecology studies, and the application of high performance computing techniques to handle the quantity of data being generated by current sequencing technologies.

Nirav Merchant | Director, UA Data Science Institute
University of Arizona

Nirav Merchant is the Co-PI for NSF CyVerse, a national-scale cyberinfrastructure for life sciences, and NSF Jetstream, the first user-friendly, scalable cloud environment for NSF XSEDE.

He received his undergraduate degree in Industrial Engineering from the University of Pune, India, and a graduate degree in Systems and Industrial Engineering from the University of Arizona (1994).

Over the last two decades his research has been directed towards developing scalable computational platforms for supporting open science and open innovation, with emphasis on improving research productivity for geographically distributed interdisciplinary teams.

His interests include data science literacy, large-scale data management platforms, data delivery technologies, managed sensor and mobile platforms for health interventions, workforce development, and project based learning.

Julie C Mitchell | Director of Biosciences
Oak Ridge National Laboratory

Julie Mitchell is Director of the Biosciences Division at Oak Ridge National Laboratory. She has over 20 years of experience in working at the interface of quantitative and biological sciences. Mitchell’s research has focused on projects at the interface of biochemistry, data science, and high-performance computing. Her contributions to the field of computational biophysics emphasize the use of machine learning in predictive models for molecular interactions. Mitchell’s group has produced a widely utilized web server for protein-protein interaction hot spots (>80k jobs), many well-cited publications and two patents. She collaborates on ORNL projects related to protein intrinsic disorder, small molecule screening algorithms, and vaccine design.

Prior to joining ORNL, Mitchell worked as a professor of mathematics and biochemistry at the University of Wisconsin and as a principal scientist at the San Diego Supercomputer Center at UCSD. Mitchell earned a Ph.D. in mathematics at the University of California at Berkeley, and a B.A. in mathematics at San Jose State University. Mitchell was a Sloan Foundation Fellow, La Jolla Interfaces in Science Fellow and ARCS Foundation Fellow during her faculty, postdoctoral and graduate years, respectively.

Daniel Segrè | Professor
Boston University

We develop theoretical approaches and computational models for the study of complex biological networks. We are especially interested in the dynamics and evolution of metabolism, whose complex web of small-molecule transformations underlies fundamental aspects of biological organization, from energy transduction to cell-cell communication. In addition to helping understand how biological systems function and evolve, we seek to apply our methods to the design and optimization of engineered networks for bioenergy and biomedicine applications.

Rick Stevens | Associate Laboratory Director
Argonne National Laboratory

Rick Stevens is Argonne’s Associate Laboratory Director for Computing, Environment and Life Sciences.

Stevens has been at Argonne since 1982, and has served as director of the Mathematics and Computer Science Division and also as Acting Associate Laboratory Director for Physical, Biological and Computing Sciences. He is currently leader of Argonne’s Exascale Computing Initiative, and a Professor of Computer Science at the University of Chicago Physical Sciences Collegiate Division. From 2000-2004, Stevens served as Director of the National Science Foundation’s TeraGrid Project and from 1997-2001 as Chief Architect for the National Computational Science Alliance.

Stevens is interested in the development of innovative tools and techniques that enable computational scientists to solve important large-scale problems effectively on advanced scientific computers. Specifically, his research focuses on three principal areas: advanced collaboration and visualization environments, high-performance computer architectures (including Grids) and computational problems in the life sciences. In addition to his research work, Stevens teaches courses on computer architecture, collaboration technology, virtual reality, parallel computing and computational science.

Susannah Tringe | Deputy of User Programs
Joint Genome Institute

Dr. Tringe joined the JGI in 2003 as a postdoctoral fellow in Eddy Rubin’s group. During her postdoctoral tenure she developed methods for comparative analysis of metagenome data from complex microbial communities. In 2006 she took a research scientist position providing scientific support for the developing portfolio of collaborator metagenome projects, and in 2010 she became head of the Metagenome Program. Dr. Tringe also heads the Microbial Systems Group, whose work focuses on sequence-based approaches to studying microbial community assembly, function and dynamics. Major foci of these research efforts are the roles of microbial communities in wetland carbon cycling and the interactions of plants with their associated microbiomes. Dr. Tringe also serves as JGI’s Deputy of User Programs.

Kelly Wrighton | Associate Professor of Soil Microbiomes
Colorado State University

The Wrighton laboratory is a microbiome research group interested in the study of microorganisms, their genomes, and the surrounding environment. We investigate how microorganisms contribute to ecosystem processes, with a particular interest in carbon and nitrogen cycling. Our microbial research has many applications, including improving predictions of greenhouse gas emission from soils, stabilizing gastrointestinal and heart health, and enhancing energy yield and longevity from hydrocarbon systems. We integrate data from different scales, from the metabolite through genes/enzymes, to organisms, and ultimately microbial communities, to better understand microbial interactions with the biotic and abiotic environment. Please check out our laboratory webpage for more detailed information on our current research.

Looking for information on tools and resources?

Check out KBase Documentation for our Getting Started guide and information on tools in the App Catalog. KBase is a fully open source software project available on GitHub.