InfiniDB Shows Off Open-Source SQL-on-Hadoop Query Engine at OSCON and Tackles the GDELT Big Data Set

Presentation at OSCON discusses how InfiniDB, a SQL-on-Hadoop solution, fills the gap left by the limited capability of many SQL-based reporting tools in real-time ad-hoc analytics for Big Data.

InfiniDB Logo

InfiniDB announced today its participation and upcoming presentation at OSCON, July 20-24 in Portland, Oregon. InfiniDB will demonstrate its open-source, high-performance analytic database, which combines MySQL ease of use with a scalable architecture that can run on-premises, in the cloud, or natively in Hadoop to deliver real-time SQL in Hadoop.

In his presentation, “Predicting Global Unrest with GDELT and SQL on Hadoop,” Jim Tommaney, CTO of InfiniDB, will explore with participants how the InfiniDB platform is empowering a new wave of analysis to discover political, educational and business insights around the world.

“There are an increasing number of global data sets that are being created, assembled and exponentially growing as more data triggers, devices and pools of information are assembled,” commented Tommaney. “But the ability to discover, analyze, learn and predict from these immense databases can be constrained if the performance is not matched with a Big Data analytic query engine such as InfiniDB. InfiniDB takes Big Data stored in Hadoop, like the GDELT database, and provides familiar and direct SQL query access to the data for fast job execution against an arbitrarily large cluster.”

The Global Database of Events, Language, and Tone (GDELT) is an initiative to construct a catalog of human societal-scale behavior and beliefs across all countries of the world. GDELT is designed to help support new theories and descriptive understandings of the behaviors and driving forces of global-scale social systems, from the micro-level of the individual through the macro-level of the entire planet. It does so by offering real-time synthesis of global societal-scale behavior into a rich quantitative database that allows real-time monitoring and analytical exploration of those trends.

Tommaney’s presentation, on Wednesday, July 23 at 10:40 am in room D135, will discuss the challenges that many data applications face once the data is put into action. These include data quality issues and data skew spanning dimensions such as time and location, which can hinder access to the underlying data patterns. Using the GDELT data set as a model, Tommaney will share a practical example of working through data readiness challenges. This starts with using a combined Hadoop and SQL-for-Hadoop architecture to prepare the data, and then leveraging the performance capabilities of InfiniDB for Hadoop to explore the data and deliver analytic insights.

Tommaney added, “We will be getting into some of the nitty-gritty details of data preparedness by looking at the GDELT data set. There are some general guidelines that organizations can use for working in a combined Hadoop + SQL-for-Hadoop environment. There are also some huge performance gains that can be achieved when exploring the capabilities of InfiniDB for SQL-on-Hadoop to deliver data insights.”

More information on InfiniDB will be available in the company’s booth # Booth X where attendees can enter a drawing with a chance to win a GoPro Hero3 White Edition. A preview of Tommaney’s talk is at . To stay up to date on the leading open-source SQL-on-Hadoop query engine, follow @InfiniDB or visit

Tweet this: News: @InfiniDB Tackles #GDELT #BigData with #OpenSource #SQL on #Hadoop query engine. @OSCON w @InfiniDB #CTO @jtommaney

About InfiniDB
InfiniDB empowers organizations to solve problems and create new solutions with powerful Big Data analytics. The company’s platform is a fourth-generation massively parallel processing (MPP) column-oriented data technology that is known for its rapid implementation, simplicity and extraordinary value. InfiniDB, InfiniDB for the Cloud, and InfiniDB for Apache™ Hadoop® are built for today’s growing enterprises. These organizations demand speed, scale and efficiency in their analytics platforms, where leveraging traditional and emerging data technologies, structures and architectures is required. InfiniDB products are licensed as GPL-2.0 with complementary consulting services, maintenance and support agreements.

For more information, to join the community, and download software, visit and follow @InfiniDB.


Contact Author

Mark Peterson
Peterson Communications
+1 (831) 626-4400