Inquidia Consulting Releases Parquet Output Plugin for Pentaho Data Integration

New Component Helps Hadoop Users Easily Deploy Compressed Columnar Storage Format

Inquidia Consulting has released another data integration component that makes it easy to write data into Hadoop in the advanced Parquet data format. With Inquidia’s Parquet Output Plugin for Pentaho Data Integration, Hadoop users can now easily produce Parquet output for optimal query performance on data ingested into HDFS.

Parquet is a columnar file format used in Hadoop with built-in dictionary encoding, compression, and the ability to read only the columns of interest. The increasingly popular format was first released in 2013 and has gained traction with the Hadoop user base since its launch. This adoption has been driven by the massive compression and performance benefits of Parquet over non-columnar file formats such as text or Avro.
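To make the ideas of columnar storage and dictionary encoding concrete, here is a minimal sketch in plain Python. It is only an illustration of the concepts and does not reflect Parquet's actual on-disk layout or the plugin's implementation; the sample rows and the helper names (to_columns, dictionary_encode, dictionary_decode) are hypothetical.

```python
def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists (columnar layout)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def dictionary_encode(values):
    """Replace repeated values with small integer codes plus a lookup table."""
    dictionary = []   # distinct values, in order of first appearance
    index = {}        # value -> position in the dictionary
    codes = []        # one integer code per original value
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the lookup table and the codes."""
    return [dictionary[code] for code in codes]


# Hypothetical sample data, stored row by row as a text file would hold it.
rows = [
    {"source": "Salesforce", "region": "US", "amount": 120},
    {"source": "Excel", "region": "US", "amount": 75},
    {"source": "Salesforce", "region": "EU", "amount": 310},
    {"source": "Salesforce", "region": "US", "amount": 42},
]

columns = to_columns(rows)

# A query that needs only "amount" touches one column and skips the rest.
total = sum(columns["amount"])

# Repeated strings collapse to a short dictionary and a list of small integers.
dictionary, codes = dictionary_encode(columns["source"])
```

In a row-oriented format, the same sum would require scanning every field of every row. Keeping each column's values together is also what makes dictionary encoding and compression effective, since neighboring values share a type and often repeat.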

However, despite this rise in adoption, it remained challenging to ingest data into Hadoop in the Parquet format without broad, simple-to-use Parquet support across all Hadoop frameworks.

Inquidia Labs took on this challenge and developed the Parquet Output Plugin for Pentaho Data Integration. Chris Deptula, senior architect for the Inquidia Labs project, led development of the plugin, which is currently available through the Pentaho PDI Marketplace and GitHub.

“In almost all Hadoop deployments Inquidia is working on, Parquet is being used in some form. Our clients want to be able to reap the performance benefits of Parquet without the long list of challenges to actually implementing it,” said Deptula. “Inquidia’s Parquet Plugin allows for a much more manageable data absorption from essentially any data source into Hadoop as Parquet files including sources such as Salesforce, Excel, SAP, text files, databases, and many more.”

Inquidia’s Parquet Output Plugin for Pentaho Data Integration is currently available for free in the Pentaho Marketplace and on GitHub. To learn more about the plugin, read about it here. For more information about Inquidia Consulting’s services and capabilities, and our work with Pentaho, visit http://www.inquidia.com.

About Inquidia Consulting

Inquidia is an innovative professional services firm delivering full-spectrum data engineering and analytics services that help our customers inquire, learn and take action with their data. We are passionate about data. Find out more at http://www.inquidia.com.

Contact Author

Kevin Haas
Inquidia
+1 (773) 980-6010