There are a few things that need to be added to the CDAP installation documentation for MapR clusters.
The section "Downloading and Distributing Packages" needs its list of distributions updated. We test the latest MapR release (5.2) in our nightly builds.
The sections "HDFS Permissions" and "CDAP Configuration" say to "su hdfs", but in MapR everything runs as the mapr user. We should add a note explaining that HDFS permissions must be changed as the mapr user instead.
For distributed MapR clusters, we need to manually create the cdap user on each node in the cluster. We also need to make sure that the UID for the cdap user is the same on every node. "MapR uses each node's native operating system configuration to authenticate users and groups for access to the cluster." -MapR site
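A note in the docs could include a quick way to verify the UID requirement. The following is only a sketch: `uids_match` is a hypothetical helper, and the hostnames in the comment are placeholders. It assumes the UID of the cdap user has already been collected from each node (for example with `id -u cdap` over ssh) as one "host uid" pair per line.

```shell
# Sketch: succeed only if every node reports the same, non-empty UID for cdap.
# Input on stdin: one "host uid" pair per line.
uids_match() {
  awk '
    $2 == ""    { bad = 1 }      # a node returned no UID (user missing?)
    NR == 1     { first = $2 }   # remember the UID seen on the first node
    $2 != first { bad = 1 }      # any different UID is a mismatch
    END         { exit (NR > 0 && !bad ? 0 : 1) }
  '
}

# Example of gathering the pairs (node1..node3 are hypothetical hostnames):
#   for h in node1 node2 node3; do
#     echo "$h $(ssh "$h" id -u cdap)"
#   done | uids_match && echo "cdap UID is consistent"
```

If the check fails, the user can be recreated on the odd node with an explicit UID, for example with `useradd -u <uid> cdap`, where the UID value is whatever the other nodes already use.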
Since all of the Hadoop services run as the mapr user, the Hive DB directory (default: /user/hive) is only accessible by the mapr user. In order for CDAP to create Hive tables, we need to open up the permissions of this directory. In other distributions this is done by setting the permissions of the directory to 1777:
hadoop fs -chmod 1777 /user/hive/
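Because the command has to run as the mapr user rather than hdfs, the docs note could show it wrapped accordingly. This is a sketch under assumptions: `as_mapr` and the `DRY_RUN` variable are hypothetical names introduced here, and it assumes the invoking account is allowed to use `sudo -u mapr`.

```shell
# Sketch: run a command as the mapr user.
# With DRY_RUN=1 the command is only printed, which is handy for reviewing
# what would be executed before touching the cluster.
as_mapr() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'sudo -u mapr %s\n' "$*"
  else
    sudo -u mapr "$@"
  fi
}

# Usage (commented out so that sourcing this sketch changes nothing):
#   as_mapr hadoop fs -chmod 1777 /user/hive
```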
Under the CDAP Configuration section, we discuss configuring the ZooKeeper quorum. In the example, we show the default port 2181. The default for MapR is 5181. It would be helpful for users to see a note about the default port being different.
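To make the port difference concrete, the note could show how the quorum value changes. The helper below is a sketch only: `zk_quorum` is a hypothetical function and the hostnames are placeholders. It joins the ZooKeeper hosts into a comma-separated host:port list using whichever port is passed in.

```shell
# Sketch: build a comma-separated ZooKeeper quorum string.
# Usage: zk_quorum PORT HOST [HOST...]
# The function body runs in a subshell so its variables do not leak.
zk_quorum() (
  port="$1"
  shift
  quorum=""
  for host in "$@"; do
    # Prepend a comma only when the string already has an entry.
    quorum="${quorum:+$quorum,}${host}:${port}"
  done
  printf '%s\n' "$quorum"
)

# MapR default port (5181) instead of the usual 2181:
zk_quorum 5181 zk1.example.com zk2.example.com
```

The resulting string is what would go into the quorum setting shown in the CDAP Configuration section, in place of the 2181 example.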
If MapR is installed through the installer, the default Spark version is 2.0.1. This version is not currently supported; we currently support the latest 1.x version of Spark. In order to use Spark, it has to be installed manually through packages. http://maprdocs.mapr.com/home/AdvancedInstallation/InstallSparkonYARN.html