...

Complete the requirements and instructions below prior to installing the CDAP components.

Software Prerequisites

You'll need this software installed:

  • Java runtime on each CDAP node and Hadoop datanode.

  • A Hadoop, HBase, Hive (and optionally Spark) environment to run against.

  • To use the ad-hoc querying capabilities of CDAP, ensure the cluster has a compatible version of Hive installed. See the section on Hadoop Compatibility.

  • If Hive will not be installed, you must disable the CDAP Explore Service, which is enabled by default. The installation instructions describe how to configure this.

  • CDAP nodes require Hadoop and HBase client installation and configuration. Note: no Hadoop services actually need to be running.

  • We recommend installing an NTP (Network Time Protocol) daemon on all nodes of the cluster, including those with CDAP components.
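
The prerequisites above can be spot-checked on a node before installing. The following is a minimal sketch, not part of the CDAP tooling: the helper `missing_commands` is hypothetical, and it assumes the client binaries are named `java`, `hadoop`, and `hbase` and are on the `PATH`.

```shell
# Hypothetical helper: print the name of every listed command that is not on PATH.
missing_commands() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\n' "$cmd"
    fi
  done
}

# Assumed binary names for the Java runtime and the Hadoop and HBase clients.
missing="$(missing_commands java hadoop hbase)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so that multiple names print on one line.
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisite commands found."
fi
```

Finding a binary on the `PATH` only shows that the client is installed; it does not verify its configuration or version compatibility.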

Java Runtime

The latest release of JDK or JRE version 1.8.xx for Linux, Windows, or Mac OS X must be installed in your environment; we recommend the Oracle JDK.

To check the Java version installed, run the command:

Code Block
$ java -version

CDAP is tested with both the Oracle JDK and OpenJDK; it may work with other JDKs, but it has not been tested with them.
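
The version check can also be scripted. This is a rough sketch under the assumption that the first line of the `java -version` banner contains `version "1.8.`, as the Oracle JDK and OpenJDK 8 builds typically print; the function name `is_java_8` is hypothetical.

```shell
# Hypothetical helper: succeed if a `java -version` banner line reports a 1.8.x runtime.
is_java_8() {
  case "$1" in
    *'version "1.8.'*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes its banner to stderr, so redirect it before reading.
  banner="$(java -version 2>&1 | head -n 1)"
  if is_java_8 "$banner"; then
    echo "Java 1.8 detected: $banner"
  else
    echo "Unexpected Java version: $banner"
  fi
else
  echo "java not found on PATH"
fi
```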

Once you have installed the JDK, you'll need to set the JAVA_HOME environment variable.
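
One possible way to set JAVA_HOME on Linux is to derive it from the resolved location of the `java` binary. This is a sketch only: the helper `java_home_from_bin` is hypothetical, and it assumes GNU `readlink -f` is available and that the binary lives under `<install root>/bin/java`.

```shell
# Hypothetical helper: strip the trailing /bin/java from a binary path
# to obtain a JAVA_HOME candidate.
java_home_from_bin() {
  printf '%s\n' "${1%/bin/java}"
}

if command -v java >/dev/null 2>&1; then
  # readlink -f (GNU coreutils) resolves symlinks such as /usr/bin/java.
  JAVA_HOME="$(java_home_from_bin "$(readlink -f "$(command -v java)")")"
  export JAVA_HOME
  echo "JAVA_HOME=$JAVA_HOME"
fi
```

On some JDK 8 packages the resolved binary sits under a `jre` subdirectory, in which case this yields the JRE directory rather than the JDK root. To make the setting persistent, add the `export` line to a shell profile.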

NTP (Network Time Protocol)

Installing NTP on RPM-based Systems using Yum

  1. Install the NTP service and dependencies:

    Code Block
    $ sudo yum install ntp ntpdate ntp-doc
    
  2. Set the service to start at reboot:

    Code Block
    $ sudo chkconfig ntpd on
    
  3. Start the NTP daemon. It will continuously adjust the system time from an upstream NTP server:

    Code Block
    $ sudo /etc/init.d/ntpd start
    
  4. Synchronize the system clock with the pool.ntp.org server pool. You should use this command only once:

    Code Block
    $ sudo ntpdate -u pool.ntp.org
    
  5. Set the hardware clock from the system clock (to prevent synchronization problems), unless running on a virtual server:

    Code Block
    $ sudo hwclock --systohc
    

Installing NTP on Debian using APT

  1. Install the NTP service and dependencies:

    Code Block
    $ sudo apt-get install ntp
    
  2. Start the NTP daemon. It will continuously adjust the system time from an upstream NTP server:

    Code Block
    $ sudo service ntp start
    
  3. Synchronize the system clock with the pool.ntp.org server pool. You should use this command only once:

    Code Block
    $ sudo ntpdate -u pool.ntp.org
    
  4. Set the hardware clock from the system clock (to prevent synchronization problems), unless running on a virtual server:

    Code Block
    $ sudo hwclock --systohc
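
The two install procedures differ mainly in the package manager command. As an illustration only, a provisioning script might pick the command by detecting which package manager is present; the function `ntp_install_cmd` is a hypothetical name, and the commands it prints are the ones given in the steps above.

```shell
# Hypothetical helper: print the NTP install command for a package manager name.
ntp_install_cmd() {
  case "$1" in
    yum)     echo "sudo yum install ntp ntpdate ntp-doc" ;;
    apt-get) echo "sudo apt-get install ntp" ;;
    *)       echo "unsupported package manager: $1" >&2; return 1 ;;
  esac
}

# Detect which of the two package managers is present and show the command.
for pm in yum apt-get; do
  if command -v "$pm" >/dev/null 2>&1; then
    echo "Run: $(ntp_install_cmd "$pm")"
    break
  fi
done
```

The sketch prints the command rather than running it, so it can be reviewed before anything is installed.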
    

NTP Troubleshooting and Configuration

  • To check the synchronization:

    Code Block
    $ ntpq -p
    
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    +173.44.32.10    18.26.4.105      2 u    5   64    1   78.786   -0.157   1.966
    *66.241.101.63   132.163.4.103    2 u    7   64    1   43.085    2.872   0.409
    +services.quadra 198.60.22.240    2 u    6   64    1   21.805    3.040   1.033
    -hydrogen.consta 200.98.196.212   2 u    7   64    1  114.250   16.011   0.873
    
  • If you need to adjust the configuration (add or delete servers, use servers closer to you, etc.):

    Code Block
    $ sudo vi /etc/ntp.conf
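
The synchronization check can be scripted by reading the `ntpq -p` output. In that output, the peer currently selected for synchronization is marked with a leading `*`, and the offset (in milliseconds) is the ninth column. The sketch below assumes that column layout; the function name `selected_peer_offset` is hypothetical.

```shell
# Hypothetical helper: read `ntpq -p` output on stdin and print the offset (ms)
# of the selected peer, i.e. the first line that starts with '*'.
# Prints nothing if no peer is currently selected.
selected_peer_offset() {
  awk '/^[[:space:]]*\*/ { print $9; exit }'
}

if command -v ntpq >/dev/null 2>&1; then
  offset="$(ntpq -p 2>/dev/null | selected_peer_offset)"
  echo "Selected peer offset (ms): ${offset:-none selected}"
fi
```

If no line is marked with `*`, the daemon has not yet selected a peer, which is normal shortly after startup.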
    

CDAP and Firewalls

In general, there cannot be a firewall between the cluster and CDAP. If a firewall is used, the cluster and certain CDAP components must sit together behind it. The following ports can be opened to provide external access:

...