Data Engine Module

WPS Engine for Hadoop

Hadoop Big Data Environments

Mapr logoCloudera certified

Hadoop is an ecosystem of storage and processing components that provide a scalable, fault-tolerant, software framework for the distributed storage and processing of very large datasets on computer clusters.

WPS is able to operate with third party Hadoop big data environments including the major distributions Cloudera, Hortonworks, MapR and native Apache Hadoop. WPS is certified for use with Cloudera version 5 and later.

Hadoop Engine

The WPS Engine for Hadoop provides access to Hive and Impala data sources in a Hadoop environment via standard or pass through SQL.

Type of Access Supported?
Reading Yes
Writing Yes
Updating Yes
Creating New Tables Yes
Implicit Pass Through Support Yes
Explicit Pass Through Support Yes
Bulk Loading No

The WPS engine for Hadoop connects to a Hadoop cluster using the JDBC interface.

Interoperating with Hadoop Big Data Environments

The WPS Interop for Hadoop module provides additional language support for interoperating with a Hadoop environment. This includes a FILENAME statement for direct HDFS connections and a HADOOP procedure for executing Pig and MapReduce commands directly within a Hadoop cluster.

Dependencies and Usage

The WPS Engine for Hadoop can only be used on the supported platforms indicated in the table below.

Please enable JavaScript to get the best out of our website. You will be unable to do certain things without it, including logging in, registering or filling out our contact forms.