You can create a connection to Apache Hive using a connection profile.
Context
The Apache Hive data warehouse software facilitates querying and managing large data sets
residing in distributed storage. Hive provides a mechanism to project structure onto
this data and query the data using a SQL-like language called HiveQL. At the same time,
HiveQL allows traditional MapReduce programmers to plug in their custom
mappers and reducers when it is inconvenient or inefficient to express this logic in
HiveQL.
Procedure
- Select a connection profile. SP204 adds the following connection profiles:
- rs_oracle_to_hive.sql - defined for replication from Oracle to Hive.
- rs_ase_to_hive.sql - defined for replication from ASE to Hive.
- Set new system environment variables for Hive:
- RS_HIVE_TEMP_FILE_PATH – specifies the file path for the Hive temporary file. By default,
it is set to the working directory of Replication Server. Ensure that
both Replication Server and Hive can access this file path.
- RS_HIVE_AUTH_TYPE – specifies the authentication type used by Hive. It accepts two
case-insensitive values: SASL and NoSASL. The default value is SASL
(Simple Authentication and Security Layer), which works for most Hive servers.
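The defaults and accepted values described above can be sketched as a small check. This is a minimal illustration, not part of Replication Server: the helper name `resolve_hive_settings` is hypothetical, and the sketch assumes that an empty variable is treated like an unset one and that RS_HIVE_TEMP_FILE_PATH names a directory.

```python
import os

# Values documented for RS_HIVE_AUTH_TYPE; matching is case-insensitive.
VALID_AUTH_TYPES = {"sasl": "SASL", "nosasl": "NoSASL"}


def resolve_hive_settings(env, working_dir):
    """Return the effective Hive settings for an environment mapping.

    Hypothetical helper for illustration only.
    """
    # Assumption: unset or empty falls back to the documented default,
    # the working directory of Replication Server.
    temp_path = env.get("RS_HIVE_TEMP_FILE_PATH") or working_dir

    # Assumption: unset or empty falls back to the documented default, SASL.
    raw_auth = env.get("RS_HIVE_AUTH_TYPE") or "SASL"
    auth_type = VALID_AUTH_TYPES.get(raw_auth.strip().lower())
    if auth_type is None:
        raise ValueError(
            f"RS_HIVE_AUTH_TYPE must be SASL or NoSASL, got {raw_auth!r}"
        )

    # Local check only: it cannot prove that Hive can reach the same path.
    if not os.path.isdir(temp_path) or not os.access(
        temp_path, os.R_OK | os.W_OK
    ):
        raise ValueError(
            f"RS_HIVE_TEMP_FILE_PATH is not an accessible directory: {temp_path}"
        )

    return {"temp_file_path": temp_path, "auth_type": auth_type}
```

Calling `resolve_hive_settings(os.environ, os.getcwd())` in the account that starts Replication Server would report the values that account sees. Access from the Hive side still has to be verified separately.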
In addition to setting the system variables for Hive, note the following change to an
existing system environment variable, which the Replication Server installer makes
automatically:
- Library paths for Boost, Thrift, SASL, and OpenSSL are added to LD_LIBRARY_PATH.
These third-party libraries are included in the Replication Server
release package, and you can find them in these Replication Server
installation directories:
$SYBASE/REP-15_5/lib3p64/boost/lib
$SYBASE/REP-15_5/lib3p64/thrift/lib
$SYBASE/REP-15_5/lib3p64/cyrussasl/lib
$SYBASE/REP-15_5/lib3p64/openssl/lib
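To confirm that the installer's change is visible in a given shell or service environment, the directories listed above can be compared against LD_LIBRARY_PATH. The following is a rough sketch under stated assumptions: the helper name `missing_library_dirs` is hypothetical, and LD_LIBRARY_PATH is assumed to be the usual colon-separated list.

```python
import posixpath

# Subdirectories listed in the documentation, relative to $SYBASE.
THIRD_PARTY_LIB_DIRS = (
    "REP-15_5/lib3p64/boost/lib",
    "REP-15_5/lib3p64/thrift/lib",
    "REP-15_5/lib3p64/cyrussasl/lib",
    "REP-15_5/lib3p64/openssl/lib",
)


def missing_library_dirs(sybase_root, ld_library_path):
    """Return the expected library directories absent from ld_library_path.

    Hypothetical helper for illustration only.
    """
    # Split on ':' and skip empty entries; normalize so that a trailing
    # slash does not cause a false mismatch.
    entries = {
        posixpath.normpath(p) for p in ld_library_path.split(":") if p
    }
    expected = [posixpath.join(sybase_root, d) for d in THIRD_PARTY_LIB_DIRS]
    return [d for d in expected if posixpath.normpath(d) not in entries]
```

For example, `missing_library_dirs(os.environ["SYBASE"], os.environ.get("LD_LIBRARY_PATH", ""))` returns an empty list when all four directories are present. The comparison is textual, so symbolic links or alternate spellings of the same directory would be reported as missing.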