...
Complete the requirements and instructions below prior to installing the CDAP components.
Software Prerequisites
You'll need this software installed:
A Java runtime on each CDAP node and Hadoop datanode.
A Hadoop, HBase, Hive (and optionally Spark) environment to run against.
To use the ad-hoc querying capabilities of CDAP, ensure the cluster has a compatible version of Hive installed. See the section on Hadoop Compatibility.
If Hive will not be installed, you will need to disable the CDAP Explore Service, which is enabled by default. The installation instructions describe how to configure this.
CDAP nodes require the Hadoop and HBase clients to be installed and configured. Note: No Hadoop services actually need to be running.
We recommend installing an NTP (Network Time Protocol) daemon on all nodes of the cluster, including those with CDAP components.
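As a quick pre-flight check for the list above, the following sketch reports which client commands are missing from a node's PATH. It assumes the clients are exposed under the launcher names `java`, `hadoop`, `hbase`, and `hive`; adjust the names if your distribution installs them differently. The helper name `missing_commands` is illustrative only.

```shell
# missing_commands CMD...: print each command name that is not found on the PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Example (launcher names are assumptions, see above):
#   missing_commands java hadoop hbase hive
# Empty output means every listed command was found.
```

Finding a launcher on the PATH only shows that the client is installed; it does not verify that the client configuration points at your cluster.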
Java Runtime
The latest JDK or JRE version 1.8.xx for Linux, Windows, or Mac OS X must be installed in your environment; we recommend the Oracle JDK.
To check the Java version installed, run the command:
$ java -version
CDAP is tested with both the Oracle JDK and OpenJDK; it may work with other JDKs, but it has not been tested with them.
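If you want to script the version check, a minimal sketch follows. It assumes the version line has the usual form `java version "1.8.0_NNN"` or `openjdk version "1.8.0_NNN"`; the helper name `is_java_8` is illustrative. Note that `java -version` writes to standard error, so the output has to be redirected before it can be inspected.

```shell
# is_java_8 LINE: succeed if LINE (the version line of `java -version`)
# reports a 1.8.x runtime.
is_java_8() {
  case "$1" in
    *'version "1.8.'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Only run the check when a java binary is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  # `java -version` prints to stderr; merge it into stdout and keep the version line.
  version_line="$(java -version 2>&1 | grep ' version "' | head -n 1)"
  if is_java_8 "$version_line"; then
    echo "OK: $version_line"
  else
    echo "Not a 1.8.x runtime: $version_line" >&2
  fi
fi
```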
Once you have installed the JDK, you'll need to set the JAVA_HOME environment variable.
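One way to set JAVA_HOME on Linux is to derive it from the `java` binary already on the PATH. The sketch below assumes a `readlink` that supports `-f` (as GNU coreutils does) and a conventional layout in which the binary lives at `<JAVA_HOME>/bin/java`; the helper name `java_home_for` is illustrative. On other platforms, or with a non-standard layout, set the variable by hand instead.

```shell
# java_home_for BIN: print the installation root for a given java binary path.
java_home_for() {
  # Resolve symlinks such as /usr/bin/java -> /etc/alternatives/java -> <JDK>/bin/java
  resolved="$(readlink -f "$1")" || return 1
  # Drop the trailing /bin/java to get the installation root
  dirname "$(dirname "$resolved")"
}

# Export JAVA_HOME for the current shell if java is on the PATH.
if command -v java >/dev/null 2>&1; then
  JAVA_HOME="$(java_home_for "$(command -v java)")"
  export JAVA_HOME
  echo "JAVA_HOME=$JAVA_HOME"
fi
```

To make the setting persistent, the `export` line would typically go into a shell profile or into the environment file used by your service manager.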
NTP (Network Time Protocol)
We recommend installing an NTP (Network Time Protocol) daemon on all nodes of the cluster, including those with CDAP components.
NTP requires that UDP port 123 be open.
If your cluster does not have access to the internet, you can run a local version of NTP by setting up a master node as an NTP server.
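As a rough illustration of the local setup just described, the helper below prints candidate `/etc/ntp.conf` lines for either role. It is a sketch under assumptions, not a tested CDAP recipe: it assumes the classic `ntpd` daemon, uses that daemon's local clock driver address `127.127.1.0` on the master, and the host name `ntp-master.example.internal` in the usage comment is hypothetical. Review the output against your own NTP version before applying it.

```shell
# local_ntp_conf ROLE [MASTER_HOST]: print candidate /etc/ntp.conf lines.
# ROLE is "master" (the node acting as the NTP server) or "client".
local_ntp_conf() {
  case "${1:-}" in
    master)
      # Fall back to this node's own clock when no upstream server is reachable
      echo "server 127.127.1.0"
      echo "fudge 127.127.1.0 stratum 10"
      ;;
    client)
      [ -n "${2:-}" ] || { echo "client role needs a master host" >&2; return 1; }
      # Point the client at the local master instead of a public pool
      echo "server $2 iburst"
      ;;
    *)
      echo "usage: local_ntp_conf master|client [host]" >&2
      return 1
      ;;
  esac
}

# Example (hypothetical host name):
#   local_ntp_conf client ntp-master.example.internal
```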
Installing NTP on RPM using Yum
Install the NTP service and dependencies:
$ sudo yum install ntp ntpdate ntp-doc
Set the service to start at reboot:
$ sudo chkconfig ntpd on
Start the NTP server. This will continuously adjust the system time from an upstream NTP server:
$ sudo /etc/init.d/ntpd start
Synchronize the system clock with a pool.ntp.org server. You should use this command only once:
$ sudo ntpdate -u pool.ntp.org
Synchronize the hardware clock (to prevent synchronization problems), unless on a virtual server:
$ sudo hwclock --systohc
Installing NTP on Debian using APT
Install the NTP service and dependencies:
$ sudo apt-get install ntp
Start the NTP server. This will continuously adjust the system time from an upstream NTP server:
$ sudo service ntp start
Synchronize the system clock with a pool.ntp.org server. You should use this command only once:
$ sudo ntpdate -u pool.ntp.org
Synchronize the hardware clock (to prevent synchronization problems), unless on a virtual server:
$ sudo hwclock --systohc
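The RPM and Debian procedures above differ only in their first steps, so they can be collected into one dry-run helper. The sketch below only prints the commands from those procedures in order; it does not execute anything. The function name `ntp_setup_commands` and the family labels `rpm` and `debian` are illustrative choices.

```shell
# ntp_setup_commands FAMILY: print (do not run) the NTP setup commands
# for FAMILY, which is either "rpm" or "debian".
ntp_setup_commands() {
  case "${1:-}" in
    rpm)
      echo "sudo yum install ntp ntpdate ntp-doc"
      echo "sudo chkconfig ntpd on"
      echo "sudo /etc/init.d/ntpd start"
      ;;
    debian)
      echo "sudo apt-get install ntp"
      echo "sudo service ntp start"
      ;;
    *)
      echo "unknown family: ${1:-}" >&2
      return 1
      ;;
  esac
  # Steps shared by both procedures; skip the hwclock step on a virtual server
  echo "sudo ntpdate -u pool.ntp.org"
  echo "sudo hwclock --systohc"
}

# Example: review the commands for an RPM-based node before running them by hand.
#   ntp_setup_commands rpm
```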
NTP Troubleshooting and Configuration
To check the synchronization:
$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+173.44.32.10    18.26.4.105      2 u    5   64    1   78.786   -0.157   1.966
*66.241.101.63   132.163.4.103    2 u    7   64    1   43.085    2.872   0.409
+services.quadra 198.60.22.240    2 u    6   64    1   21.805    3.040   1.033
-hydrogen.consta 200.98.196.212   2 u    7   64    1  114.250   16.011   0.873
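In the `ntpq -p` listing, the peer whose line starts with `*` is the one the daemon is currently synchronized to. A small sketch for pulling that peer out of the output, for example in a monitoring script, follows; the helper name `selected_peer` is illustrative, and the sketch assumes the column layout shown above.

```shell
# selected_peer: read `ntpq -p` output on stdin and print the name of the peer
# marked with '*'. Exits non-zero if no peer is marked.
selected_peer() {
  awk '
    /^\*/ { sub(/^\*/, "", $1); print $1; found = 1 }
    END   { if (!found) exit 1 }
  '
}

# Example:
#   ntpq -p | selected_peer || echo "not synchronized yet" >&2
```

Shortly after the daemon starts, there may be no `*` line at all, so a failure from this helper right after installation is not necessarily a problem.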
If you need to adjust the configuration (add or delete servers, use servers closer to you, etc.):
$ vi /etc/ntp.conf
CDAP and Firewalls
In general, your cluster configuration cannot have a firewall between the cluster and CDAP. Instead, if a firewall is used, the cluster and certain CDAP components need to be together behind the firewall. These are the ports which can be opened to provide external access:
...