Building CDAP SDK for Windows

Overview

This document describes how to build the CDAP SDK for Windows. The CDAP SDK on Windows needs two Hadoop binaries: hadoop.dll and winutils.exe. These binaries are not shipped with any Hadoop distribution and must be built from source.

Building Hadoop on Windows

  1. Clone https://github.com/apache/hadoop.git on your Windows laptop. It is best to clone it at the root level (C:\) under a short directory name, because Windows limits the length of fully qualified paths.
  2. CDAP currently builds with Hadoop 2.3.0, so check out that version: git checkout branch/2.3.0
  3. Run set Platform=x64 (set this to win32 for 32-bit machines; however, we haven't built for 32-bit machines yet).
  4. You will also need protobuf-2.5.0 to build Hadoop. Protocol Buffers has moved to a new website, which hosts binaries only for the latest version (2.6.1). For older versions, they recommend building from source, but that is extremely difficult on Windows. Instead, use protoc-2.5.0-win32.zip, which was downloaded from a link on the older Protocol Buffers website. Extract the zip file and place its contents in a directory that is listed in the %PATH% environment variable. I put them under C:\Windows\System32.
  5. To build Hadoop, run

    mvn clean package -Pnative-win -DskipTests

  6. After the build completes, hadoop.dll and winutils.exe can be found in hadoop-common-project\hadoop-common\target\bin\.
  7. Every time the Hadoop version is updated, the two files above must be checked into the cdap repo under cdap-unit-test/src/main/resources before building the SDK.
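
The checks and the copy in the steps above could be scripted. The following is a minimal sketch in Python, not part of the CDAP or Hadoop tooling. The function names (check_prerequisites, protoc_version_output, copy_binaries) are hypothetical, and the script assumes that protoc --version prints a line of the form "libprotoc 2.5.0" and that both repositories are cloned locally.

```python
import re
import shutil
import subprocess
from pathlib import Path

# Paths taken from the steps above, relative to the root of each checkout.
HADOOP_BIN_DIR = Path("hadoop-common-project") / "hadoop-common" / "target" / "bin"
CDAP_RESOURCES_DIR = Path("cdap-unit-test") / "src" / "main" / "resources"
BINARIES = ("hadoop.dll", "winutils.exe")


def protoc_version_output():
    """Return the output of `protoc --version`, or "" if protoc is not on PATH."""
    try:
        result = subprocess.run(["protoc", "--version"],
                                capture_output=True, text=True)
    except FileNotFoundError:
        return ""
    return result.stdout


def check_prerequisites(env, version_output):
    """Return a list of problems; an empty list means the build can start.

    env is a mapping such as os.environ; version_output is the text
    printed by `protoc --version`.
    """
    problems = []
    if env.get("Platform") != "x64":
        problems.append("Platform is not set to x64")
    # Assumption: the version line looks like "libprotoc 2.5.0".
    match = re.search(r"libprotoc\s+(\S+)", version_output)
    if match is None or match.group(1) != "2.5.0":
        problems.append("protoc 2.5.0 was not found on PATH")
    return problems


def copy_binaries(hadoop_root, cdap_root):
    """Copy hadoop.dll and winutils.exe into the CDAP resources directory."""
    source_dir = Path(hadoop_root) / HADOOP_BIN_DIR
    target_dir = Path(cdap_root) / CDAP_RESOURCES_DIR
    missing = [name for name in BINARIES if not (source_dir / name).is_file()]
    if missing:
        raise FileNotFoundError("not built yet: " + ", ".join(missing))
    target_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(source_dir / name, target_dir / name))
            for name in BINARIES]
```

One possible use is to call check_prerequisites(os.environ, protoc_version_output()) before running Maven, and copy_binaries with the two checkout directories once the build has finished. Both arguments are wherever you cloned the Hadoop and cdap repos. The copied files still have to be committed by hand.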

Created in 2020 by Google Inc.