Introducing Collections for Data Exploration
KBase is excited to debut a new feature for exploring data called Collections. Collections are curated datasets that allow researchers to search for genomes by similarity and select data of interest to analyze in their Narratives. You can browse genomes and samples, identify matches to your genomes, and copy the data into your Narratives within the Collections interface. Search through Collections by sequence similarity or with GTDB taxonomy terms to identify hits closest to your genomes. Additionally, you can explore the functional characteristics of each Collection with ecological traits from microTrait within the interface.
With this initial beta release, we are presenting four Collections for users to explore:
- The Genome Taxonomy Database (GTDB) collection contains data from the well-known reference dataset used for standardizing microbial taxonomy based on genome phylogeny.
- The Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA) collection presents genomes of subsurface microbiomes from a DOE legacy site at the Oak Ridge Reservation contaminated by the production of nuclear materials.
- The Plant Microbe Interfaces (PMI) collection contains plant-host-associated microbial genomes representing the taxonomic and functional diversity of mutual interactions between organisms in the rhizosphere.
- The Genome Resolved Open Watersheds (GROW) collection is a publicly available genome database representing river microbiomes from across the world.
Beta Testing
Collections is a new beta feature unique to the KBase platform. You can help us improve it by submitting any issues or questions to our Help Board. While we are not yet enabling users to build and release their own Collections, we encourage you to let us know of any reference datasets that you would like to explore in the future. Reach out to us at engage@kbase.us with any suggestions or inquiries about Collections.