Cambridge Semantics Shatters Previous Record of Loading and Querying ‘Trillion Triples’ by 100X

In Data Analytics Breakthrough, Smart Data Company® Establishes Key Milestone for Loading and Querying Big Diverse Data in 1.98 Hours Versus 220 Hours

Cambridge Semantics, the leading provider of graph-based Smart Data management and exploratory analytic solutions, today announced that its Anzo Graph Query Engine™ completed a load and query of one trillion triples as a Google Cloud Partner on the Google Cloud Platform in just under two hours, 100 times faster than the previous solution running the Lehigh University Benchmark (LUBM) at the same data scale.

The LUBM is an industry standard that evaluates the query performance of semantic web repositories over a large data set. A ‘triple’ consists of a subject, a predicate and an object. In the LUBM test conducted on the Google Cloud Platform on Oct. 31, 2016, Cambridge Semantics’ Anzo Graph Query Engine loaded and queried 1.065 trillion triples in 1.98 hours, surpassing the previous LUBM record of 220 hours set by Oracle in September 2014.
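
For readers who want to check the figures, the short Python sketch below models a triple and works through the arithmetic behind the headline numbers. The example identifiers (such as `ex:Student1`) are hypothetical and not taken from the benchmark data. The throughput figure is only an aggregate rate, assuming the full 1.98 hours is spread evenly over all 1.065 trillion triples. The announcement does not break out load time and query time separately.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact in a graph store: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical example triple; the names are illustrative only.
example = Triple("ex:Student1", "ex:takesCourse", "ex:Course42")

# Figures quoted in the announcement.
TRIPLES = 1.065e12   # triples loaded and queried
NEW_HOURS = 1.98     # Anzo Graph Query Engine run
OLD_HOURS = 220.0    # previous LUBM result at the same scale

# Ratio of elapsed times, and the aggregate rate over the whole run.
speedup = OLD_HOURS / NEW_HOURS
triples_per_second = TRIPLES / (NEW_HOURS * 3600)

print(f"speedup: {speedup:.0f}x")
print(f"aggregate rate: {triples_per_second / 1e6:.0f} million triples/s")
```

Under these assumptions the ratio comes out at roughly 111, consistent with the "100 times faster" claim, and the aggregate rate is on the order of 150 million triples per second.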

To place the results in context, examples of one trillion triples include:

  •     six months of worldwide Google searches
  •     133 facts for each of the 7 billion people on earth
  •     100 million facts describing all the details of each of 10,000 clinical trial studies
  •     156 facts about each device connected to the internet
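
The comparisons above are order-of-magnitude illustrations, and a few lines of Python make the underlying arithmetic explicit. The device count is not stated in the announcement; the sketch derives the number implied by "156 facts about each device", which is an inference rather than a quoted figure.

```python
ONE_TRILLION = 10**12

# 133 facts for each of 7 billion people.
people_total = 7_000_000_000 * 133

# 100 million facts for each of 10,000 clinical trial studies.
trials_total = 10_000 * 100_000_000

# Number of connected devices implied by 156 facts per device
# (derived here; the announcement does not give a device count).
implied_devices = ONE_TRILLION // 156

print(f"people:  {people_total:,} facts")
print(f"trials:  {trials_total:,} facts")
print(f"devices: about {implied_devices / 1e9:.1f} billion implied")
```

The per-person example lands slightly under one trillion (931 billion), the clinical trial example is exactly one trillion, and the device example implies roughly 6.4 billion connected devices.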

“A key challenge for semantic-based analytics has been enabling a load and query performance on very large data sets from a data lake in timeframes that offer an acceptable user experience,” said Barry Zane, vice president of engineering at Cambridge Semantics. “With the LUBM results, it’s been validated that a loading and query process that once took over a month’s worth of business hours can now be completed in less than two hours.”

“As modern data diversity and volumes grow, relational database management systems (RDBMS) are proving too inflexible, expensive and time-consuming for enterprises,” Zane said. “This benchmark record set by our Anzo Graph Query Engine signals a paradigm shift where graph-based online analytical processing (GOLAP) will find a central place in everyday business by taking on data analytics challenges of all shapes and sizes, rapidly accelerating time-to-value in data discovery and analytics.”

Cambridge Semantics’ Anzo Graph Query Engine is a clustered, in-memory graph analytics engine based on open semantic standards that enables users to develop ad hoc and interactive queries and analytics across very large interconnected rich data sets. The platform can be deployed behind the enterprise firewall on dedicated enterprise servers or, as in the case of this LUBM, provisioned automatically on cloud infrastructures such as the Google Cloud Platform.

“The LUBM results demonstrate that our Anzo Graph Query Engine can handle diverse data at big data scale while maintaining security, provenance and governance,” said Alok Prasad, president of Cambridge Semantics. “The largest enterprises can now act with speed and agility in their data integration and analytics no matter the data volume, offering a clear competitive advantage. Many of our customers are already solving problems they couldn’t address before by exploiting our ability to offer end users a way to automatically query relationship-rich, diverse data in new and unexpected ways.”

Cambridge Semantics is part of the Google for Work Partner Program as a Google Cloud Platform Technology Partner, which allows the company to extend its smart data solutions to organizations that want to use Google Cloud Platform as a flexible, large-scale data platform.

About Cambridge Semantics
Cambridge Semantics (CSI), the Smart Data Company, is an enterprise smart data management and exploratory analytics company. It enables customers and partners to rapidly build and deploy Smart Data Lake solutions based on its award-winning Anzo Smart Data Platform™ (Anzo SDP).

IT departments and business users gain better understanding and data value through the semantic linking, analysis and management of diverse data whether internal or external, structured or unstructured. The Anzo Smart Data Lake solutions are delivered with increased speed, at big data scale and at a fraction of the implementation costs of using traditional approaches.

The company is based in Boston, Massachusetts.

For more information visit http://www.cambridgesemantics.com or follow us on Facebook, LinkedIn and Twitter: @CamSemantics.

# # #
