...

  1. Open the Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce/.

  2. Choose "Create cluster."

  3. In Advanced Options, Step 1: Software and Steps, set:

    • Vendor: Amazon

    • Release: emr-4.6.0 through emr-5.3.1

    • Software: Hadoop, HBase, Hive, Spark

    • No auto-terminate


    EMR Create Cluster Wizard: Step 1: Software and Steps
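
For readers who script cluster creation instead of using the console, the software choices above can be sketched as keyword arguments in the shape that boto3's `run_job_flow` accepts. This is a minimal sketch, not part of the original procedure: the helper names (`parse_release`, `is_supported_release`, `software_settings`) are hypothetical, and only the release range and application list come from the step above.

```python
# Supported range and applications are taken from the step above;
# all helper names are illustrative assumptions.
MIN_RELEASE = (4, 6, 0)
MAX_RELEASE = (5, 3, 1)
APPLICATIONS = ["Hadoop", "HBase", "Hive", "Spark"]


def parse_release(label):
    """Turn a label such as 'emr-5.3.1' into the tuple (5, 3, 1)."""
    prefix, _, version = label.partition("-")
    if prefix != "emr" or not version:
        raise ValueError(f"not an EMR release label: {label!r}")
    return tuple(int(part) for part in version.split("."))


def is_supported_release(label):
    """Return True when the label falls within emr-4.6.0 through emr-5.3.1."""
    return MIN_RELEASE <= parse_release(label) <= MAX_RELEASE


def software_settings(label):
    """Build the software-related keyword arguments for run_job_flow."""
    if not is_supported_release(label):
        raise ValueError(f"{label} is outside emr-4.6.0 through emr-5.3.1")
    return {
        "ReleaseLabel": label,
        "Applications": [{"Name": name} for name in APPLICATIONS],
    }
```

Checking the release label before submitting a request catches an unsupported release early, rather than after the cluster has started.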

  4. In Step 2: Hardware, set:

    • Network: use defaults

    • EC2 Subnet: use defaults

    • Master

      • EC2 Instance type: m3.xlarge

      • Instance count: 1

    • Core

      • EC2 Instance type: m3.xlarge

      • Instance count: 4 (as a minimum)

    • Task

      • Instance count: 0 (not required)

    EMR Create Cluster Wizard: Step 2: Hardware
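
The hardware layout above can be expressed as the `InstanceGroups` list that boto3's `run_job_flow` takes under its `Instances` argument. The following is a sketch under that assumption; the function name `instance_groups` and the group display names are hypothetical. No task group is emitted because the step above sets its count to 0.

```python
MIN_CORE_NODES = 4  # minimum core count from the step above


def instance_groups(core_count=MIN_CORE_NODES, instance_type="m3.xlarge"):
    """Build master and core instance groups for Instances['InstanceGroups']."""
    if core_count < MIN_CORE_NODES:
        raise ValueError(f"at least {MIN_CORE_NODES} core nodes are required")
    return [
        {
            "Name": "Master",  # display name only; hypothetical
            "InstanceRole": "MASTER",
            "InstanceType": instance_type,
            "InstanceCount": 1,
        },
        {
            "Name": "Core",  # display name only; hypothetical
            "InstanceRole": "CORE",
            "InstanceType": instance_type,
            "InstanceCount": core_count,
        },
    ]
```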

  5. In Step 3: General Cluster Settings, set:

    • Logging

    • Debugging

    • Termination protection (no auto-terminate)


    EMR Create Cluster Wizard: Step 3: General Cluster Settings
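
In a scripted setup, the three settings above map roughly onto `run_job_flow` arguments as sketched below. Several points are assumptions rather than part of the original procedure: the function name `general_settings` is hypothetical, the log location must be supplied by the caller, and the debugging step (running `state-pusher-script` through `command-runner.jar`) is the commonly used equivalent of the console's Debugging checkbox on release 4.x and later.

```python
def general_settings(log_uri):
    """Build logging, debugging, and termination settings for run_job_flow.

    log_uri is an S3 location chosen by the caller (not given in this guide).
    """
    return {
        # Logging: where EMR writes cluster logs.
        "LogUri": log_uri,
        "Instances": {
            # Termination protection.
            "TerminationProtected": True,
            # No auto-terminate: keep the cluster alive when no steps remain.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Debugging: assumed equivalent of the console checkbox.
        "Steps": [
            {
                "Name": "Setup Hadoop Debugging",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["state-pusher-script"],
                },
            }
        ],
    }
```

The nested `Instances` dictionary would need to be merged with the instance group definitions before the request is sent.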

  6. In Step 3: General Cluster Settings, add a Bootstrap Action:

    • Type: Run If

    • Optional arguments:

      instance.isMaster=true "curl https://downloads.cdap.io/emr/install-6.0.0.sh | sudo bash -s"

    EMR Create Cluster Wizard: Add Bootstrap Action
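
The optional arguments above consist of two parts: the condition `instance.isMaster=true` and a quoted command that runs only when the condition holds. The sketch below shows how the same string could be split into an argument list for a scripted `BootstrapActions` entry. It assumes that shell-style splitting matches how the console treats the field and that the Run If action lives at the S3 path shown; the function name and the action name "Install CDAP" are hypothetical.

```python
import shlex

# Assumed S3 location of the EMR "Run If" bootstrap action.
RUN_IF_PATH = "s3://elasticmapreduce/bootstrap-actions/run-if"

# The optional arguments exactly as given in the step above.
OPTIONAL_ARGUMENTS = (
    "instance.isMaster=true "
    '"curl https://downloads.cdap.io/emr/install-6.0.0.sh | sudo bash -s"'
)


def bootstrap_action(name="Install CDAP"):
    """Build one BootstrapActions entry that runs the installer on the master."""
    return {
        "Name": name,  # display name only; hypothetical
        "ScriptBootstrapAction": {
            "Path": RUN_IF_PATH,
            # shlex.split keeps the quoted command, pipe included, as one argument.
            "Args": shlex.split(OPTIONAL_ARGUMENTS),
        },
    }
```

Keeping the whole `curl ... | sudo bash -s` pipeline inside one quoted argument matters: without the quotes, the command would be broken into separate words and the pipe would not reach the shell as intended.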

  7. In Step 4: Security, keep the default settings, and then add a security group (next step).

    EMR Create Cluster Wizard: Step 4: Security

  8. In Step 4: Security, assign an additional EC2 security group to the master node:

    • Master (one of the following):

      • A Security Group with ports 11011/11015 open; or

      • An SSH Tunnel

    EMR Create Cluster Wizard: Assigning additional security group to master node
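
The two access options for the master node can be sketched as follows. The first helper builds ingress rules in the `IpPermissions` shape accepted by boto3's EC2 `authorize_security_group_ingress`; the second builds an `ssh` command that forwards both ports through a tunnel instead. The function names, the CIDR range, the key file, and the host name are hypothetical placeholders; `hadoop` is the standard login user on EMR nodes.

```python
PORTS = (11011, 11015)  # ports named in the step above


def ingress_rules(cidr):
    """Build one TCP ingress rule per port, limited to the given CIDR range."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr}],
        }
        for port in PORTS
    ]


def ssh_tunnel_command(master_dns, key_file):
    """Build an ssh argument list that forwards both ports to the master node."""
    command = ["ssh", "-i", key_file, "-N"]  # -N: forward ports, run no command
    for port in PORTS:
        command += ["-L", f"{port}:localhost:{port}"]
    command.append(f"hadoop@{master_dns}")
    return command
```

Restricting the CIDR range to known client addresses is usually preferable to opening the ports to all traffic; the tunnel avoids opening them at all.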

...