Kitware Aims Big with Project to Extend System for Data Management

Project addresses rapidly expanding needs of the data and analytics field.

The workflow interface in GoBig combines tasks for different computing architectures. In this example, the workflow computes word-count frequencies in Scala with Spark and draws a bar plot in R.

Kitware announced plans to kick off development tomorrow on a Phase II Department of Energy (DOE) Small Business Innovation Research project to simplify Big Data management. For the project, Kitware looks to extend GoBig, its open-source software system that provides Big Data analysts with access to a range of computing resources through a unified interface.

“The computational power of the current technological landscape is unprecedented, and Big Data is more complex than ever,” Jeff Baumes, an assistant director of scientific computing at Kitware, said. “GoBig offers analysts in academic, governmental, and commercial settings a flexible and extensible solution for managing Big Data.”

GoBig enables analysts to employ today’s most advanced tools and libraries, yet it does not require that analysts learn several programming languages. The modular design of the system streamlines installation and significantly reduces the costs that come with maintaining multiple computing environments.

“GoBig is a self-contained system, so it is easy to set up and configure,” Baumes said. “In addition, as GoBig is open source, it does not impose licensing fees.”

GoBig leverages high-performance computing (HPC) and high-throughput computing (HTC) resources, and it builds on components of the Resonant environment. These components include Girder, which provides GoBig with access to data from multiple storage systems, and Romanesco, which allows GoBig to manage workflows with different programming languages and compute engines.
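
As a rough illustration of the idea of chaining tasks into one workflow, the sketch below mirrors the word-count and bar-plot example from the figure caption in plain Python. It is not GoBig or Romanesco code: the function names `word_count`, `bar_plot`, and `run_workflow` are hypothetical, and both steps run in a single language here, whereas the example in the caption uses Scala with Spark for the first step and R for the second.

```python
from collections import Counter


def word_count(text):
    """Task 1: count how often each word occurs (stand-in for the Scala/Spark step)."""
    return Counter(text.lower().split())


def bar_plot(counts, width=20):
    """Task 2: render the counts as a text bar chart (stand-in for the R step)."""
    if not counts:
        return ""
    top = max(counts.values())
    lines = []
    for word, n in counts.most_common():
        # Scale each bar relative to the most frequent word.
        bar = "#" * max(1, round(width * n / top))
        lines.append(f"{word:<10} {bar} {n}")
    return "\n".join(lines)


def run_workflow(tasks, data):
    """Hypothetical runner: feed the output of each task into the next one."""
    for task in tasks:
        data = task(data)
    return data


if __name__ == "__main__":
    print(run_workflow([word_count, bar_plot], "big data big compute big plots"))
```

In a system like the one described, each task could run on a different engine; the point of the sketch is only that the analyst sees a single pipeline.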

“With GoBig, analysts no longer need to stick to just one type of data across their respective workflows,” Baumes said. “They are free to harness the power of multiple systems regardless of whether or not these systems share a common technology stack.”

During Phase I, the development team at Kitware created the GoBig code repository, which is available on GitHub. The team also tested the GoBig system on use cases in climate analysis and business intelligence. These tests demonstrated the successful execution of GoBig as an end-to-end system.

For Phase II, the team will equip GoBig to deploy on the Web. The team will also integrate new technologies with GoBig, including ParaViewWeb, which can run parallel visualization jobs. This integration will widen the scope of GoBig to fields such as simulation and modeling.

“The performance of GoBig is encouraging,” Baumes said. “We found that a major advantage of the system is that it uses both HPC and HTC resources, as some scenarios may favor one type of resource over the other.”

In developing GoBig through open practices, Kitware is furthering its commitment to promote collaboration and scientific discovery. Since 1998, Kitware has created and supported open-source solutions including CMake, the Visualization Toolkit (VTK), ParaView, and the Insight Segmentation and Registration Toolkit (ITK). To learn more about Kitware's commitment to open source and to discover how to use its products and services, please contact kitware(at)kitware(dot)com.

This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Acquisition & Assistance, under Award Number DE-SC0013252.

This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

About Kitware
Kitware is an advanced technology, research, and open-source solutions provider for research facilities, government institutions, and corporations worldwide. Founded in 1998, Kitware specializes in research and development in the areas of HPC and visualization, medical imaging, computer vision, data and analytics, and quality software process. Among its services, Kitware offers consulting and support for high-quality software solutions. Kitware is headquartered in Clifton Park, NY, with offices in Carrboro, NC; Santa Fe, NM; and Lyon, France. More information can be found on

Contact Author

Sandy McKenzie