SDSU Uses Terascala-Powered HPC Storage Appliance to Study the DNA of Viruses and Bacteria

Bioinformatics research is using Big Data to explore human health, disease, and our environment.

Terascala, the industry leader in High Performance Computing (HPC) storage management software, today announced that its TeraOS software powers the Lustre®-based parallel storage appliance at the Computational Science Research Center (CSRC) data center at San Diego State University (SDSU). The Terascala® TeraOS software keeps the HPC storage running at the high throughput required by SDSU researchers using bioinformatics to study, process, and correlate biological data.

SDSU researchers are contributing to the Human Microbiome Project, in which the National Institutes of Health collects viral and bacterial samples from healthy people and sequences their DNA. SDSU researchers then look for meaning in the DNA sequences of those viruses and bacteria. Purpose-built computational tools identify patterns in the DNA strings and classify unknown sequences. The tools also check whether the sequences appear in other organisms or people, and which contributing factors, including geography and environment, may be significant.

In addition to studying the human microbiome, SDSU researchers collect samples of microorganisms from a variety of environments, including coral reefs, fields, soil, and even the kelp forests of San Diego. Using these samples, researchers are trying to understand how widespread microbes are in populations, how diverse they are, how they function, and what purpose they serve. Innovative techniques developed at SDSU led to the discovery of a new virus that infects bacteria in the human intestinal tract. Astonishingly, this virus is present in approximately three-quarters of the world’s population.

“The biological sciences are really becoming data-driven and require large amounts of compute and high-performance storage,” explained Rob Edwards, professor of computer science at San Diego State University. “We couldn’t explore new ways of thinking about biological data, human health, disease, and our environment without large computational resources. Our bioinformatics research is using Big Data to answer biologically interesting questions that may improve our quality of life.”

“Big Data simulations are often I/O bound and consume a significant amount of time just reading and writing data. To break this bottleneck, we purchased a Terascala-powered HPC storage appliance that balances our storage requirements with our computational infrastructure,” noted Christopher Paolini, staff scientist at San Diego State University. “The plug-and-play nature of the appliance has allowed our researchers to concentrate on the science instead of system administration.”

“Big Data requires large storage and operating on it requires fast storage,” said Alan Swahn, Terascala's vice president of marketing. “The Terascala solution brings these attributes together in a turnkey appliance that is a perfect fit for the demands of bioinformatics.”

About Terascala
Terascala—the High Performance Computing (HPC) storage management company—has pioneered software that lowers total cost of ownership by managing and optimizing data, performance, and reliability. Savings are quickly realized through minimizing downtime, moving data very fast between scratch and inexpensive backup storage, and optimizing workload throughput. The company’s TeraOS software provides system-level high availability to reduce costly degraded performance and downtime. It integrates workload, network, storage, and file system monitoring, analysis, pre-emptive failure alerts, and fast failover. Terascala’s real-time analysis and phone-home support keep these highly complex systems up and running at peak performance. Support costs are minimized with no HPC or parallel file system expertise required.

Terascala and the Terascala logo are trademarks or registered trademarks of Terascala, Inc. All other brands, products, or service names may be trademarks or property of their respective holders.

Contact Author

Alison Golan

Alan Swahn
508-588-1501 220