Setting up Environment

Set up command-line tools and Node

On Mac, using Homebrew:

The following steps install Homebrew, git, Maven, and Node.js.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install git
brew install maven
brew install node.js

On Linux, using apt:

sudo apt update
sudo apt-get install git
sudo apt-get install maven
sudo apt-get install nodejs


Make sure the installed git version is 1.8.x or later:

$ git --version
git version 2.8.2

Make sure the installed Node.js version is 16 or later:

$ node -v
v16.17.0
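
The two checks above can also be scripted. The following is a minimal sketch, not part of the CDAP tooling: the helper names extract_version, version_ge, and check_prereqs are hypothetical, and the minimums (1.8 for git, 16 for Node.js) are simply the ones stated above.

```shell
# Hypothetical helpers; not part of CDAP.

# Print the first dotted number in a version string,
# e.g. "git version 2.8.2" gives 2.8.2 and "v16.17.0" gives 16.17.0.
extract_version() {
  printf '%s\n' "$1" | sed 's/^[^0-9]*\([0-9][0-9.]*\).*/\1/'
}

# Succeed when dotted numeric version $1 is greater than or equal to $2.
# Missing components count as 0, so 1.8 and 1.8.0 compare equal.
# Note: plain POSIX sh has no local variables, so v1, v2, a, b are global.
version_ge() {
  v1=$1 v2=$2
  while [ -n "$v1" ] || [ -n "$v2" ]; do
    a=${v1%%.*}; b=${v2%%.*}
    [ "${a:-0}" -gt "${b:-0}" ] && return 0
    [ "${a:-0}" -lt "${b:-0}" ] && return 1
    case $v1 in *.*) v1=${v1#*.} ;; *) v1= ;; esac
    case $v2 in *.*) v2=${v2#*.} ;; *) v2= ;; esac
  done
  return 0
}

# Compare the installed tools against the minimums stated in this guide.
check_prereqs() {
  version_ge "$(extract_version "$(git --version)")" 1.8 ||
    { echo "git 1.8 or later is required" >&2; return 1; }
  version_ge "$(extract_version "$(node -v)")" 16 ||
    { echo "Node.js 16 or later is required" >&2; return 1; }
}
```

After sourcing these definitions, running check_prereqs prints nothing and returns 0 when both tools are recent enough.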

Build

Clone the CDAP repository (the Hydrator plugins are cloned in a later step):

git clone git@github.com:cdapio/cdap.git

If you see an error, try cloning over HTTPS instead:

git clone https://github.com/cdapio/cdap.git


Now build using Maven:

cd cdap
mvn clean install -DskipTests

If you get compile errors similar to this one:

Exception in thread "main" java.lang.NoSuchMethodError: java.nio.ByteBuffer.mark()Ljava/nio/ByteBuffer;
        at org.eclipse.aether.connector.basic.ChecksumCalculator.update(ChecksumCalculator.java:202)

this is likely because the installed version of Maven does not work properly with recent Java versions. Upgrading to Maven 3.6.1 will most likely fix the problem.
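
To see which Maven and Java versions are actually in use, the output of mvn -version can be condensed to one line. This is a small sketch that assumes the usual layout of that output (a first line starting with "Apache Maven" and a line starting with "Java version:"); the function name is hypothetical.

```shell
# Reads `mvn -version` output on stdin and prints the Maven and Java
# versions on a single line. Hypothetical helper; not part of CDAP.
maven_and_java_versions() {
  sed -n \
    -e 's/^Apache Maven \([0-9][0-9.]*\).*/maven=\1/p' \
    -e 's/^Java version: \([^,]*\).*/java=\1/p' |
    paste -sd ' ' -
}

# Usage: mvn -version | maven_and_java_versions
```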

If you encounter an error like this one:

[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]   
[ERROR]   The project io.cdap.cdap:cdap:6.5.0-SNAPSHOT (/workspace/cdap-build/cdap/pom.xml) has 1 error
[ERROR]     Child module /workspace/cdap-build/cdap/cdap-ui of /workspace/cdap-build/cdap/pom.xml does not exist

run the following submodule update, then rerun the build command above:

git submodule update --init --recursive --remote
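
To confirm which submodules are still missing, the output of git submodule status can be filtered. This is a minimal sketch relying on the standard status format, in which a leading "-" marks a submodule that has not been initialized; the function name is hypothetical, and paths containing spaces would be cut at the first space.

```shell
# Reads `git submodule status` output on stdin and prints the paths of
# submodules that are not initialized (lines that start with "-").
# Hypothetical helper; not part of CDAP.
uninitialized_submodules() {
  sed -n 's/^-[0-9a-f]* \([^ ]*\).*/\1/p'
}

# Usage, from the cdap directory:
#   git submodule status --recursive | uninitialized_submodules
```

An empty result suggests every submodule has been initialized.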


Please refer to Build System & CI for more details on building CDAP.

Build and run Local Standalone CDAP

Follow the steps above to build CDAP and install it to the local Maven repository. Then cd out of the cdap directory and clone hydrator-plugins:

cd ..
git clone git@github.com:cdapio/hydrator-plugins.git

If you see an access error, try cloning over HTTPS instead:

git clone https://github.com/cdapio/hydrator-plugins.git



Build the plugins:

Note: --remote pulls the latest version of every submodule rather than the versions pinned in the hydrator-plugins repository. This is intentional.

cd hydrator-plugins
git submodule update --init --recursive --remote
mvn clean install -DskipTests 

If you run into errors like this:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project google-cloud: Compilation failure: Compilation failure:
[ERROR] /Users/USR/goog/hydrator-plugins/google-cloud/src/main/java/io/cdap/plugin/gcp/bigquery/connector/BigQueryConnector.java:[45,38] cannot find symbol
[ERROR]   symbol:   class ConnectorContext
[ERROR]   location: package io.cdap.cdap.etl.api.connector

you may need to run the following command in your CDAP directory and then try to build the Hydrator plugins again:

mvn clean install -DskipTests -P templates,spark-dev


If you are using Maven 3.8.1 or later, you will run into errors like the one below, because Maven has blocked external HTTP repositories by default since version 3.8.1. To resolve this, remove the maven-default-http-blocker mirror from the <mirrors> section of the settings.xml file in your Maven installation folder (run ```mvn -version``` to find the Maven directory). Alternatively, you can downgrade to an earlier Maven version.

[ERROR] Failed to read artifact descriptor for eigenbase:eigenbase-properties:jar:1.1.4: Could not transfer artifact eigenbase:eigenbase-properties:pom:1.1.4 from/to
[ERROR] maven-default-http-blocker (http://0.0.0.0/): transfer failed for http://0.0.0.0/eigenbase/eigenbase-properties/1.1.4/eigenbase-properties-1.1.4.pom: Connect to
[ERROR] 0.0.0.0:80 [/0.0.0.0] failed: Connection refused (Connection refused) -> [Help 1]
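
Locating the settings.xml file that carries the blocker can be scripted. This is a sketch under two assumptions: that mvn -version prints a "Maven home:" line, and that the global settings file lives at conf/settings.xml under that directory. The function names are hypothetical, and the check is a plain text search, so it also matches a mirror that has been commented out.

```shell
# Hypothetical helpers; not part of Maven or CDAP.

# Reads `mvn -version` output on stdin and prints the Maven home directory.
maven_home() {
  sed -n 's/^Maven home: //p'
}

# Succeeds when the given settings.xml mentions the HTTP blocker mirror.
has_http_blocker() {
  grep -q 'maven-default-http-blocker' "$1"
}

# Print the global settings.xml path if it still contains the blocker.
report_http_blocker() {
  settings="$(mvn -version | maven_home)/conf/settings.xml"
  if has_http_blocker "$settings"; then
    echo "HTTP blocker mirror found in $settings"
  fi
}
```

Running report_http_blocker prints the file to edit, or nothing if the mirror is not present.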

Then cd out of hydrator-plugins back to the cdap directory and set the path to the plugins:

cd ../cdap
HYDRATOR_PLUGINS=../../hydrator-plugins

If you're on a Mac, use ```export HYDRATOR_PLUGINS=[path-to-hydrator-plugins]``` with the full path instead of the variable assignment above; otherwise the build won't be able to find the Hydrator plugins.
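
One way to obtain the full path without typing it by hand is to resolve it from the relative location. This is a minimal sketch, assuming hydrator-plugins was cloned next to the cdap directory as described above; abs_dir is a hypothetical helper name.

```shell
# Print the absolute path of an existing directory.
# Hypothetical helper; the subshell keeps the caller's working directory.
abs_dir() {
  (CDPATH= cd -- "$1" && pwd)
}

# Usage, from the cdap directory:
#   export HYDRATOR_PLUGINS="$(abs_dir ../hydrator-plugins)"
```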

MAVEN_OPTS="-Xmx1024m -XX:MaxPermSize=128m" mvn clean package \
    -pl cdap-standalone,cdap-app-templates/cdap-etl \
    -am -amd -DskipTests \
    -P templates,dist,release,unit-tests \
    -Dadditional.artifacts.dir=$HYDRATOR_PLUGINS
cd cdap-standalone/target
unzip cdap-sandbox-<version>-SNAPSHOT.zip 
cd cdap-sandbox-<version>-SNAPSHOT
cd bin
./cdap sandbox start
  • The UI runs on localhost:11011
  • Wrangle a sample file, build a pipeline, run it.
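
The sandbox can take a while to come up after the start command returns. The following is a small retry sketch, not part of the CDAP CLI: wait_until is a hypothetical helper, and the curl probe in the usage comment assumes the UI answers a plain HTTP request on localhost:11011 once it is ready.

```shell
# Retry a command until it succeeds:
#   wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as the command succeeds, 1 after ATTEMPTS failures.
# Hypothetical helper; not part of the CDAP CLI.
wait_until() {
  attempts=$1 delay=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    "$@" && return 0
    attempts=$((attempts - 1))
    # Sleep only if another attempt will follow.
    [ "$attempts" -gt 0 ] && sleep "$delay"
  done
  return 1
}

# Usage: poll the UI up to 30 times, 2 seconds apart.
#   wait_until 30 2 curl -fsS -o /dev/null http://localhost:11011 && echo "UI is up"
```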

To restart and stop the sandbox:

./cdap sandbox restart
./cdap sandbox stop


IDE Setup

  • Download the IntelliJ Community Edition from http://www.jetbrains.com/idea/download/ (you can also install it from Software Center by searching for IntelliJ Community Edition).
  • Import settings into IntelliJ as explained here:
    • Coding Standards
    • Set Imports: Preferences -> Code Style -> Java -> Imports. Uncheck "Use fully qualified class names in javadocs"

Creating an IntelliJ project

  • Clone the CDAP project, if not done already
  • Open IntelliJ and import the CDAP project
    1. Go to menu File -> Import Project ...
    2. Select the pom.xml under the CDAP directory
    3. Check the "Import Maven projects automatically" and "Automatically download: Sources, Documentation" boxes in the Import Project from Maven popup.
    4. Click Next and the new CDAP project will be created.

Setting up Checkstyle in IntelliJ

For more information on the rules enforced by Checkstyle, see Java Coding Standards.

In IntelliJ, do this:

  1. Go to menu IntelliJ IDEA -> Preferences…
  2. Expand the Copyright setting on the left (under Project Settings)
  3. Select Copyright Profiles and add a new Copyright Profile (there is a + button in the top-middle)
  4. Give the profile a name (e.g. Cask Apache v2)
  5. Paste the following text to the Copyright text box

    Copyright © $today.year Cask Data, Inc.
    
    Licensed under the Apache License, Version 2.0 (the "License"); you may not
    use this file except in compliance with the License. You may obtain a copy of
    the License at
    
    http://www.apache.org/licenses/LICENSE-2.0
    
    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
    WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
    License for the specific language governing permissions and limitations under
    the License.
  6. In the Allow replacing copyright if old copyright contains box, enter Copyright.
  7. Click on Copyright on the left again and add a new scope of All with the copyright profile added in the step above.
  8. Note: if there is an existing copyright in a file, and you are modifying the file (rather than completely replacing it), extend the copyright rather than replacing it:
    "Copyright © 2014" becomes "Copyright © 2014-2016" (or similar).
  9. Click "Apply" and "OK" to complete the steps.

Created in 2020 by Google Inc.