WPS Engine for Hadoop

Hadoop Big Data Environments

Hadoop is an ecosystem of storage and processing components that provide a scalable, fault-tolerant software framework for the distributed storage and processing of very large datasets on computer clusters.

WPS can operate with third-party Hadoop big data environments, including the major distributions Cloudera, Hortonworks and MapR, and native Apache Hadoop. WPS is certified for use with Cloudera version 5 and later.

Hadoop Engine

The WPS Engine for Hadoop provides access to Hive and Impala data sources in a Hadoop environment via standard or pass-through SQL.

Type of Access                   Supported?
Reading                          Yes
Writing                          Yes
Updating                         Yes
Creating New Tables              Yes
Implicit Pass-Through Support    Yes
Explicit Pass-Through Support    Yes
Bulk Loading                     No

The WPS Engine for Hadoop connects to a Hadoop cluster using the JDBC interface.
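For orientation, the sketch below shows the general shape of a HiveServer2-style JDBC URL, which is the kind of address a JDBC client typically uses to reach Hive or Impala. It is not taken from the WPS documentation: the function name `hive2_jdbc_url`, the host name and the database names are hypothetical, the ports are the usual defaults for each service rather than values confirmed for any particular cluster, and how WPS itself is configured to connect is not described here.

```python
# Usual default ports (an assumption about the target cluster):
# HiveServer2 listens on 10000, Impala's HiveServer2-compatible endpoint on 21050.
DEFAULT_PORTS = {"hive": 10000, "impala": 21050}


def hive2_jdbc_url(host, service="hive", database="default", port=None):
    """Build a URL of the form jdbc:hive2://host:port/database.

    Authentication and transport parameters (Kerberos, SSL and so on) are
    deliberately left out; a real cluster will usually need them.
    """
    if service not in DEFAULT_PORTS:
        raise ValueError(f"unknown service: {service!r}")
    if port is None:
        port = DEFAULT_PORTS[service]
    return f"jdbc:hive2://{host}:{port}/{database}"


# Hypothetical host and database names, for illustration only.
print(hive2_jdbc_url("hadoop-gw.example.com"))
print(hive2_jdbc_url("hadoop-gw.example.com", service="impala", database="sales"))
```

The same URL scheme is used for both services in this sketch because Impala exposes a HiveServer2-compatible endpoint; a site using a vendor-specific Impala JDBC driver would use that driver's own URL format instead.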

Interoperating with Hadoop Big Data Environments

The WPS Interop for Hadoop module provides additional language support for interoperating with a Hadoop environment. This includes a FILENAME statement for direct HDFS connections and a HADOOP procedure for executing Pig and MapReduce commands directly within a Hadoop cluster.

Dependencies and Usage

The WPS Engine for Hadoop can only be used on the supported platforms indicated in the table below.

Platform                                 Supported?
AIX on IBM Power                         Yes
Linux on ARM                             Yes
Linux on IBM Power LE (Little Endian)    Yes
Linux on x86                             Yes
macOS on x86                             Yes
Windows on x86                           Yes
z/OS on an architecture 7 machine        No
