Creating a Connection to Hive

You can create a connection to Apache Hive using a connection profile.

Context

The Apache Hive data warehouse software facilitates querying and managing large data sets residing in distributed storage. Hive provides a mechanism to project structure onto this data and to query the data using a SQL-like language called HiveQL. At the same time, this language allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.

Procedure

  1. Select a connection profile. SP204 adds the following connection profiles:
    • rs_oracle_to_hive.sql - defined for replication from Oracle to Hive.
    • rs_ase_to_hive.sql - defined for replication from ASE to Hive.
  2. Set new system environment variables for Hive:
    • RS_HIVE_TEMP_FILE_PATH – specifies the file path for the Hive temporary file. By default, it is set to the working directory of Replication Server. Ensure that the file path can be accessed by both Replication Server and Hive.
    • RS_HIVE_AUTH_TYPE – specifies the authentication type used by Hive. It accepts two values, which are not case-sensitive: SASL and NoSASL. The default value is SASL (Simple Authentication and Security Layer), which works for most Hive servers.
    In addition to setting the system variables for Hive, note the following change to an existing system environment variable. The Replication Server installer makes this change automatically.
    • Library paths for Boost, Thrift, SASL, and OpenSSL are added to LD_LIBRARY_PATH. These third-party libraries are included in the Replication Server release package and are located in these Replication Server installation directories:

      $SYBASE/REP-15_5/lib3p64/boost/lib
      $SYBASE/REP-15_5/lib3p64/thrift/lib
      $SYBASE/REP-15_5/lib3p64/cyrussasl/lib
      $SYBASE/REP-15_5/lib3p64/openssl/lib
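
The environment settings in step 2 can be checked before you start Replication Server. The following Python sketch is illustrative only and is not part of the product. The function name check_hive_environment is hypothetical, and the sketch assumes that RS_HIVE_TEMP_FILE_PATH names a directory. It confirms only that the directory exists on the local host, not that Hive can reach it.

```python
import os
from pathlib import Path

# Values documented for RS_HIVE_AUTH_TYPE; compared case-insensitively.
VALID_AUTH_TYPES = {"sasl", "nosasl"}

# Third-party library directories, relative to $SYBASE, as listed in step 2.
LIB_SUBDIRS = [
    "REP-15_5/lib3p64/boost/lib",
    "REP-15_5/lib3p64/thrift/lib",
    "REP-15_5/lib3p64/cyrussasl/lib",
    "REP-15_5/lib3p64/openssl/lib",
]


def check_hive_environment(env):
    """Return a list of problems found in env, a mapping of variable names to values.

    Hypothetical helper: an empty list means no problem was detected.
    """
    problems = []

    # The documented default is SASL when the variable is not set.
    auth_type = env.get("RS_HIVE_AUTH_TYPE", "SASL")
    if auth_type.lower() not in VALID_AUTH_TYPES:
        problems.append(f"RS_HIVE_AUTH_TYPE has unsupported value {auth_type!r}")

    # Assumption: the temporary file path is a directory. When the variable
    # is unset, the Replication Server working directory is used instead.
    temp_path = env.get("RS_HIVE_TEMP_FILE_PATH")
    if temp_path is not None and not Path(temp_path).is_dir():
        problems.append(f"RS_HIVE_TEMP_FILE_PATH is not a directory: {temp_path}")

    sybase = env.get("SYBASE")
    if sybase is None:
        problems.append("SYBASE is not set")
    else:
        # Normalize entries so that trailing slashes do not cause false alarms.
        raw_entries = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
        entries = {os.path.normpath(e) for e in raw_entries if e}
        for subdir in LIB_SUBDIRS:
            expected = os.path.normpath(os.path.join(sybase, subdir))
            if expected not in entries:
                problems.append(f"LD_LIBRARY_PATH is missing {expected}")

    return problems
```

To check the current shell environment, call check_hive_environment(os.environ) and print each returned message. The function takes the mapping as a parameter rather than reading os.environ directly, so the same check can be run against a proposed configuration before it is applied.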