Greenplum's Internet-Scale Database Gets Big Analytics Boost with New Version


Greenplum Database 3.2 first to integrate MapReduce; new features offer significant performance enhancements for large-scale analytical processing

Greenplum, a leading provider of database software for the next generation of data warehousing and analytics, today announced general availability of Greenplum Database 3.2, the latest version of the company's high-performance database software. Greenplum Database 3.2 is the first database to include MapReduce, the parallel computing technique pioneered by Google for analyzing the web, giving Greenplum customers a wide range of powerful new capabilities for massive-scale data analytics. Greenplum Database 3.2 also introduces powerful in-database compression, adds programmable parallel analytics capabilities and offers enhanced graphical database monitoring.

Key new features in Greenplum Database 3.2 include:

  • Greenplum MapReduce: MapReduce has been proven as a technique for high-scale data analysis by Internet leaders such as Google and Yahoo; with Greenplum 3.2, this capability is available for the enterprise. Customers can now combine SQL queries and MapReduce programs into unified tasks that are executed in parallel across hundreds or thousands of cores.
  • In-Database Compression: Greenplum Database 3.2 introduces in-database compression using industry-leading compression technology. This allows customers to increase performance and dramatically reduce the space required to store their data. Customers can expect to see a 3-10x space reduction with increased effective I/O performance to match.
  • Programmable Parallel Analytics: Greenplum Database 3.2 unleashes a new level of parallel analysis capabilities for mathematicians and statisticians. For the first time, customers can use the statistical language R, or build custom functions using Greenplum's linear algebra and machine learning primitives, and run them in parallel directly against data in the database.
  • Enhanced Database Monitoring: Greenplum Database 3.2 offers a powerful new GUI and infrastructure for monitoring database performance and usage. These new capabilities seamlessly gather, store and present comprehensive details about database usage and the internals of current and historical queries, down to the iterator level, making them ideal for profiling queries and managing system utilization.
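
For readers unfamiliar with the MapReduce technique named in the first item above, the following is a minimal, illustrative sketch of the programming model in Python. It is not Greenplum's API and it runs sequentially in a single process; the function names (map_phase, shuffle, reduce_phase, run_mapreduce) and the sample records are hypothetical, chosen only to show the map, group and reduce steps that a parallel engine would distribute across many cores.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit (key, value) pairs for one input record, here (word, 1)."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group step: collect all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single result."""
    return key, sum(values)


def run_mapreduce(records):
    """Run map, group and reduce in sequence (a real engine runs them in parallel)."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical sample input: three short text records.
records = ["Greenplum adds MapReduce", "MapReduce meets SQL", "SQL at scale"]
counts = run_mapreduce(records)
```

Because each record is mapped independently and each key is reduced independently, both steps can in principle be spread across many workers, which is the property that makes the technique suited to parallel execution.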

With Greenplum Database 3.2 featuring MapReduce, Greenplum is delivering on the promise of parallel computing and the promise of multicore computing efficiency. While most software applications struggle to utilize the parallelism of multiple cores, Greenplum's massively parallel, shared-nothing architecture fully utilizes each core, with linear scalability to thousands of processors. This makes Greenplum the only open source-powered database software that can scale to support the demands of petabyte data warehousing. Greenplum also makes scalability cost-effective: Greenplum's standards-based approach enables customers to build high-performance data warehousing systems on low-cost commodity hardware, offering a new, disruptive economic model for large-scale analytics.

"Greenplum has cracked the code on multicore," said Scott Yara, president and co-founder of Greenplum. "This is the beginning of a major shift in the way companies are using commodity hardware to tackle some of the world's toughest analytical challenges against massive data sets. Greenplum Database 3.2 introduces new capabilities that make possible advanced analytics on unprecedented volumes of data."

Greenplum customers are deriving significant benefits from parallel analytics capabilities. For example, LinkedIn is using Greenplum Database to enable its "People You May Know" function. Fox Interactive Media is also using Greenplum Database to provide complex, real-time analysis in support of its advanced targeted advertising systems. O'Reilly Media is using Greenplum Database to pore over billions of records compiled from blogs and job postings to predict which new technologies are about to enter the mainstream.

"Modern data in all its forms -- including unstructured information, metadata, and the results of complex events and transactions -- demands tools that can do for enterprise BI what Google has done for Web indexing, search and analytics," said Dana Gardner, principal analyst, Interarbor Solutions. "Greenplum is advancing its parallel database with new technologies such as MapReduce to help meet the needs for Internet-scale data inference gathering on a business outcomes level."

About Greenplum
Greenplum is a data infrastructure company that is reinventing how companies gain insight and competitive advantage from their data. The company's flagship product, Greenplum Database, is built to support the next generation of data warehousing and large-scale analytics processing. Supporting SQL and MapReduce parallel processing, Greenplum Database offers industry-leading performance at a low cost for companies managing terabytes to petabytes of data. Greenplum Database is used by major global organizations including Fox Interactive Media, Nasdaq, NYSE Euronext, Reliance Communications, Skype and LinkedIn. Greenplum partners with Sun Microsystems to power the Sun Data Warehouse Appliance. For more information visit



Contact Author

Leyl Black

Paul Salazar