Greenplum to support MapReduce

CIOL Bureau

CALIFORNIA, USA: Greenplum, a provider of database software for the next generation of data warehousing and analytics, has announced support for MapReduce within its massively parallel database engine.

MapReduce is the parallel computing technique pioneered by Google for analyzing the web. Greenplum's support for it will allow customers to derive deeper insights from their own data. Early adopters of the technology include LinkedIn and O'Reilly Media.

MapReduce has been proven as a technique for high-scale data analysis by Internet leaders such as Google and Yahoo. Greenplum provides the best of both worlds - MapReduce for programmers and SQL for DBAs - and will execute both MapReduce and SQL directly within Greenplum's parallel dataflow engine, which is at the heart of the Greenplum Database.
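For readers unfamiliar with the technique, the general MapReduce pattern can be sketched in a few lines of Python. This is only an illustrative, single-process word count, not Greenplum's actual interface: the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) and the sample records are assumptions made for this example. In a real parallel engine, the map and reduce steps would run across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single total."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle, and reduce in sequence (hypothetical helper, single process)."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Made-up sample records, used purely for illustration.
counts = map_reduce(["big data big insights", "data warehouse"])
print(counts)
```

Because each record is mapped independently and each key is reduced independently, both phases can in principle be spread across many cores, which is the property the announcement relies on.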

"On its own, MapReduce is a powerful tool for data manipulation and analysis. Companies that are integrating MapReduce and SQL are increasing its applicability and giving developers and DBAs the ability to work together on a common parallel data processing infrastructure," said Curt Monash, president, Monash Research and editor of the influential blog DBMS2.

Greenplum customers have been involved in an early-access program utilizing Greenplum MapReduce for advanced analytics. For example, LinkedIn is using Greenplum Database for new, innovative social networking features such as "People You May Know" and sees Greenplum MapReduce as a way to develop compelling analytics products faster. A primary benefit of the new capability is that customers can combine SQL queries and MapReduce programs into unified tasks that are executed in parallel across hundreds or thousands of cores.

"Greenplum has seamlessly integrated MapReduce into its database, making it possible for us to access our massive dataset with standard SQL queries in combination with MapReduce programs," said Roger Magoulas, research director, O'Reilly Media. "We are finding this to be incredibly efficient because complex SQL queries can be expressed in a few lines of Perl or Python code."

"Greenplum has assembled some of the best and brightest database and distributed systems experts to build the parallel data processing technology that is at the heart of Greenplum Database. The introduction of MapReduce into our product means that customers will immediately have a wide range of new capabilities for their massive-scale data analytics, something we are uniquely qualified to bring to market," said Scott Yara, co-founder and president, Greenplum.