Hadoop is an ecosystem of storage and processing components that provide a scalable, fault-tolerant, software framework for the distributed storage and processing of very large datasets on computer clusters.
WPS is able to operate with third party Hadoop big data environments including the major distributions Cloudera, Hortonworks, MapR and native Apache Hadoop. WPS is certified for use with Cloudera version 5 and later.
The WPS Engine for Hadoop provides access to Hive and Impala data sources in a Hadoop environment via standard or pass through SQL.
|Type of Access||Supported?|
|Creating New Tables|
|Implicit Pass Through Support|
|Explicit Pass Through Support|
The WPS engine for Hadoop connects to a Hadoop cluster using the JDBC interface.
The WPS Interop for Hadoop module provides additional language support for interoperating with a Hadoop environment. This includes a FILENAME statement for direct HDFS connections and a HADOOP procedure for executing Pig and MapReduce commands directly within a Hadoop cluster.
The WPS Engine for Hadoop can only be used on the supported platforms indicated in the table below.