Bring tens to hundreds of data files into a Narrative in a single import!
Oct 20, 2022

Team Spotlight – Data Upload Project

More data? No problem!

KBase has always made it easy to upload and import data into the Narrative. Now our recent efforts to enhance this process introduce the ability to import multiple files (10s – 100s) of assorted data types into a Narrative using a single import cell. This new bulk import feature reduces the clutter of many import cells in the Narrative and saves you a ton of time by streamlining the data import process. Read more about the recent features and abilities in our bulk import release blog.

I think the bulk upload functionality is a quantum leap from the old importing process, which involved so much focused clicking to import a lot of data files. It’s also snappy and just a pleasure to use. – Sumin Wang, Software Engineer

A team of KBase software engineers and staff trained in user experience research and design, from Argonne National Laboratory (ANL), Lawrence Berkeley National Laboratory (LBNL), and Oak Ridge National Laboratory (ORNL), developed the new import workflow. Gavin Price (LBNL), a senior backend engineer, oversaw the team as the product owner and one of the delivery managers. Developers worked on both the frontend, or user-facing, components of the user interface (UI) and the backend software connecting the platform to the project servers.

User-shaped design

How does a software team re-envision a functionality that already works, but needs to be better, faster, stronger? The first stage of the project was to support the selection of multiple files of different data types in a single import. The data importing infrastructure handles each data type and file individually, so it was challenging to combine these processes into a single step and find an approach that works best for users. This is where design and user feedback (from people like you!) come in.

At multiple stages, Zach Crockett (ORNL) and Ellen Dow (LBNL) conducted user experience interviews, using interactive mockups to understand how KBase users import their data and how they interact with the new workflow and interface.

Something challenging for me was realizing I had to get rid of my own notions and think as if I were an end user. One example would be figuring out how to make the import specifications discoverable and usable – in testing, users interacted with it in an entirely different way than what I expected. – Zach Crockett, Design/UX

Guided by feedback from users, frontend developers AJ Ireland and Bill Riehl (LBNL) enabled users to select multiple files in the Staging Area. Another exciting feature is autodetection, which suggests or pre-selects the data type for you!
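To give a feel for what autodetection involves, here is a minimal sketch of suggesting a data type from a file name. It is only an illustration: the extension table, the type labels, and the function name `suggest_data_type` are all hypothetical and are not taken from the KBase codebase, which may detect types quite differently.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical extension-to-type table, for illustration only.
# The mappings KBase actually uses may differ.
SUGGESTED_TYPES = {
    ".fastq": "fastq_reads",
    ".fq": "fastq_reads",
    ".gbk": "genbank_genome",
    ".gbff": "genbank_genome",
    ".fasta": "assembly",
    ".fa": "assembly",
}

# Compression suffixes are skipped so "reads.fastq.gz" is treated like "reads.fastq".
COMPRESSION_SUFFIXES = {".gz", ".bz2", ".zip"}


def suggest_data_type(filename: str) -> Optional[str]:
    """Return a suggested data type for a file name, or None if nothing matches."""
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    # Drop trailing compression suffixes before looking at the real extension.
    while suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        suffixes.pop()
    if not suffixes:
        return None
    return SUGGESTED_TYPES.get(suffixes[-1])


if __name__ == "__main__":
    for name in ["reads_R1.fastq.gz", "genome.gbk", "notes.txt"]:
        print(name, "->", suggest_data_type(name))
```

Returning `None` for an unrecognized file mirrors the idea that the data type is a suggestion: when nothing matches, the choice is left to the user.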

Side-by-side comparison of a previous version (left) of the Staging Area next to the updated user interface (right).

Backend developers Sumin Wang (LBNL) and Boris Sadkhin (ANL) worked on the core service that handles the actual data import. Sumin worked on connecting species identification databases for some data import types, such as a GenBank genome. Boris acted as the glue behind the scenes, making sure the new features integrated with the existing system.

What about importing hundreds of files?

Once the user interface was able to support the expanded import capacity, we realized there was still a lot of clicking to import hundreds or even dozens of files. There needed to be some method to select all those files in one go. We developed the Import Specification file, which can be written as CSV, TSV, or Excel, to import hundreds of files at once. Read about how to use the Import Specification Template in our docs.

This is game changing. It removes the need for users to select every file they wish to import. 
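As a rough illustration of the idea, the sketch below writes a CSV that lists many files in one go. The column names, the `build_import_spec` function, and the overall layout are assumptions made for this example; they are not the actual Import Specification Template, so please follow the docs for the real format.

```python
import csv
import io
from typing import Iterable


def build_import_spec(file_paths: Iterable[str], data_type: str) -> str:
    """Build CSV text with one row per file to import.

    The header and columns here are hypothetical, chosen for illustration;
    they are not the real KBase Import Specification layout.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["data_type", "staging_file_path", "object_name"])
    for path in sorted(file_paths):
        # Derive an object name from the file name, minus its extensions.
        filename = path.rsplit("/", 1)[-1]
        object_name = filename.split(".", 1)[0]
        writer.writerow([data_type, path, object_name])
    return buffer.getvalue()


if __name__ == "__main__":
    staged = [f"run1/sample_{i:03d}.fastq.gz" for i in range(1, 4)]
    print(build_import_spec(staged, "fastq_reads"))
```

Generating the file from a script means that listing 300 files takes no more effort than listing three.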

What is the team most excited about? 

This new data import feature should offer users a big boost in productivity, especially when importing 100s of data files. What does this mean for future KBase processes? The team hopes that it will lead to more multi-selection abilities for running analysis Apps and continue to streamline your analyses. 

The Data Upload team would also like to recognize Steve Chan (LBNL) for his work on the Data Upload project as the delivery manager before he joined the Joint Genome Institute in fall 2021.

Ellen Dow
Lawrence Berkeley National Laboratory

Ellen G. Dow, Ph.D. is a member of the outreach, communications, and user development team. Inspired by involvement in science outreach throughout graduate school, she left the bench to gain experience in informal education and cultivate community engagement from the general public to science sectors. A molecular biologist by training, Ellen applies her research experience to support scientists and develop resources for the KBase community.