...
Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/.
Choose "Create cluster."
In the Advanced Options, Step 1: Software and Steps, set:
Vendor: Amazon
Release: emr-4.6.0 through emr-5.3.1
Software: Hadoop, HBase, Hive, Spark
No auto-terminate
EMR Create Cluster Wizard: Step 1: Software and Steps
In Step 2: Hardware, set:
Network: use defaults
EC2 Subnet: use defaults
Master
EC2 Instance type: m3.xlarge
Instance count: 1
Core
EC2 Instance type: m3.xlarge
Instance count: 4 (as a minimum)
Task
Instance count: 0 (not required)
EMR Create Cluster Wizard: Step 2: Hardware
In Step 3: General Cluster Settings, set:
Logging
Debugging
Termination protection (no auto-terminate)
EMR Create Cluster Wizard: Step 3: General Cluster Settings
In Step 3: General Cluster Settings, add a Bootstrap Action:
Type: Run If
Optional arguments:
instance.isMaster=true "curl https://downloads.cask.co/emr/install-6.2.0.sh | sudo bash -s"
EMR Create Cluster Wizard: Add Bootstrap Action
In Step 4: Security, keep the defaults, and then add a security group (next step).
EMR Create Cluster Wizard: Step 4: Security
In Step 4: Security, set additional EC2 Security Groups to the master node:
Master (one of the following):
A Security Group with ports 11011/11015 open; or
An SSH Tunnel
EMR Create Cluster Wizard: Assigning additional security group to master node
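If the SSH tunnel option is chosen instead of a security group, the tunnel could look roughly like the sketch below. It is not part of the original guide: it assumes the standard OpenSSH client, the default `hadoop` login user on EMR nodes, and a hypothetical key file and master host name. It only builds and prints the command, so nothing is executed.

```python
import shlex


def ssh_tunnel_command(master_dns, key_file, ports=(11011, 11015), user="hadoop"):
    """Build an ssh argv that forwards the CDAP ports from localhost to the master node."""
    argv = ["ssh", "-i", key_file, "-N"]  # -N: forward ports only, run no remote command
    for port in ports:
        # Forward local <port> to <port> on the master node itself.
        argv += ["-L", f"{port}:localhost:{port}"]
    argv.append(f"{user}@{master_dns}")
    return argv


if __name__ == "__main__":
    # Hypothetical host name and key file, for illustration only.
    cmd = ssh_tunnel_command("ec2-203-0-113-10.compute-1.amazonaws.com", "emr-key.pem")
    print(shlex.join(cmd))
```

With the tunnel running, the CDAP ports would be reachable on localhost rather than on the master node's address.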
Once the cluster is created, CDAP services will start up. Startup takes about 10 minutes after the cluster enters the Waiting state.
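For repeatable setups, the wizard settings above could in principle be expressed as a single AWS CLI call. The sketch below is an assumption-laden translation, not something the original guide provides: the log bucket, security group ID, and cluster name are hypothetical, the S3 location of the Run If bootstrap action is assumed, and the flags should be checked against `aws emr create-cluster help` before use. The script only assembles and prints the command.

```python
import json
import shlex

# Bootstrap command as given in the Bootstrap Action step.
CDAP_INSTALL = "curl https://downloads.cask.co/emr/install-6.2.0.sh | sudo bash -s"


def create_cluster_command(release="emr-5.3.1", core_count=4,
                           log_uri="s3://my-log-bucket/emr/",  # hypothetical bucket
                           master_sg="sg-0123456789abcdef0"):  # hypothetical group ID
    """Assemble an `aws emr create-cluster` argv mirroring the wizard settings."""
    instance_groups = [
        {"InstanceGroupType": "MASTER", "InstanceCount": 1, "InstanceType": "m3.xlarge"},
        {"InstanceGroupType": "CORE", "InstanceCount": core_count, "InstanceType": "m3.xlarge"},
    ]
    bootstrap = [{
        "Name": "Install CDAP on master",
        # Assumed S3 location of the Run If bootstrap action.
        "Path": "s3://elasticmapreduce/bootstrap-actions/run-if",
        "Args": ["instance.isMaster=true", CDAP_INSTALL],
    }]
    return [
        "aws", "emr", "create-cluster",
        "--name", "CDAP",
        "--release-label", release,
        "--applications", "Name=Hadoop", "Name=HBase", "Name=Hive", "Name=Spark",
        "--instance-groups", json.dumps(instance_groups),
        "--bootstrap-actions", json.dumps(bootstrap),
        "--ec2-attributes", json.dumps({"AdditionalMasterSecurityGroups": [master_sg]}),
        "--log-uri", log_uri,
        "--enable-debugging",
        "--termination-protected",
        "--no-auto-terminate",
        "--use-default-roles",
    ]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(shlex.join(create_cluster_command()))
```

Passing the structured options as JSON rather than CLI shorthand sidesteps the quoting problems caused by the pipe and spaces in the bootstrap command.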
Verification
CDAP Smoke Test
The CDAP UI may initially show errors while the CDAP YARN containers are starting up. Allow a few minutes for this.
The Administration page of the CDAP UI shows the status of the CDAP services. It can be reached at http://<cdap-host>:11011/cdap/administration, substituting for <cdap-host> the host name or IP address of the CDAP server:
...
CDAP UI: Showing started-up, Administration page.
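Because the UI can show errors for the first few minutes, a small polling script may be more convenient than reloading the page by hand. The following is a minimal sketch using only the Python standard library; the function names, retry counts, and the sample host are hypothetical, and an HTTP 200 response only indicates that the page is being served, not that every CDAP service has started.

```python
import time
import urllib.request


def admin_url(cdap_host, port=11011):
    """Return the Administration page URL for the given CDAP host."""
    return f"http://{cdap_host}:{port}/cdap/administration"


def wait_for_ui(url, attempts=30, delay=10.0, fetch=urllib.request.urlopen):
    """Poll the URL until it answers with HTTP 200, or give up after `attempts` tries."""
    for _ in range(attempts):
        try:
            with fetch(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # URLError and HTTPError are OSError subclasses: the UI is not reachable yet.
            pass
        time.sleep(delay)
    return False


if __name__ == "__main__":
    url = admin_url("localhost")  # hypothetical host, e.g. when using an SSH tunnel
    print(url, "reachable" if wait_for_ui(url, attempts=3, delay=2.0) else "not reachable yet")
```

Once the page responds, the Administration page itself remains the place to confirm that each service is reported as running.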