Running on Kubernetes

Introduction
This page outlines how we plan to orchestrate the execution of CDAP programs on Kubernetes.

 

Summary

Kubernetes runs workloads as containers, so a program must be packaged as a Docker image before Kubernetes can run it. The process of executing a program will therefore look like:

  1. Create a Docker image from the program jar and its dependencies.
  2. Push this image to a local image registry.
  3. Point Kubernetes to this image and execute the program.
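
The three steps above can be sketched as the command lines a launcher might issue. This is only an illustration, assuming the docker and kubectl CLIs are installed; the class name LaunchPlan, the registry address localhost:5000, the image name, and the build directory are hypothetical and not part of CDAP.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: the three launch steps expressed as CLI invocations. */
class LaunchPlan {

  /** Returns the image reference qualified with the registry address. */
  static String imageRef(String registry, String name, String tag) {
    return registry + "/" + name + ":" + tag;
  }

  /**
   * Builds the build, push, and run commands in order.
   * Nothing is executed here; the caller decides how to run them.
   */
  static List<List<String>> commands(String registry, String name, String tag, String contextDir) {
    String image = imageRef(registry, name, tag);
    return Arrays.asList(
        // 1. Create the image from a directory holding the Dockerfile, jar, and dependencies.
        Arrays.asList("docker", "build", "-t", image, contextDir),
        // 2. Upload the image to the registry named in its reference.
        Arrays.asList("docker", "push", image),
        // 3. Ask Kubernetes to run a single pod from that image.
        Arrays.asList("kubectl", "run", name, "--image=" + image, "--restart=Never"));
  }

  public static void main(String[] args) {
    // Assumed example values; print each command instead of running it.
    for (List<String> cmd : commands("localhost:5000", "my-program", "1.0", "build/docker")) {
      System.out.println(String.join(" ", cmd));
    }
  }
}
```

Keeping each command as an argument list rather than a single string avoids shell quoting problems if the commands are later passed to a subprocess API.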

 

Creating the Docker image (options)

  1. docker-java - a programmatic Java API for the Docker client: https://github.com/docker-java/docker-java.
  2. Bazel - Java-based build system that can build Docker images.
    See also: https://medium.com/bitnami-perspectives/building-docker-images-without-docker-c619061b13a9
    See also: https://blog.bazel.build/2015/07/28/docker_build.html
  3. Construct a docker command line and execute it from Java as a subprocess.
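
As a rough illustration of option 3, the sketch below generates a Dockerfile and invokes docker build through ProcessBuilder. It is not a proposed implementation: the class name DockerCliBuilder, the /app layout inside the image, the lib/ directory, and any base image passed in are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of option 3: shelling out to the docker CLI from Java. */
class DockerCliBuilder {

  /**
   * Dockerfile text that copies the program jar and a lib/ directory into the image.
   * Assumes the build context contains both the jar and a lib/ directory.
   */
  static String dockerfile(String baseImage, String jarName, String mainClass) {
    return String.join("\n",
        "FROM " + baseImage,
        "COPY " + jarName + " /app/" + jarName,
        "COPY lib/ /app/lib/",
        // Exec form: the JVM itself expands the /app/lib/* classpath wildcard.
        "ENTRYPOINT [\"java\", \"-cp\", \"/app/" + jarName + ":/app/lib/*\", \"" + mainClass + "\"]",
        "");
  }

  /** The docker build command as an argument list, so no shell quoting is needed. */
  static List<String> buildCommand(String imageTag, Path contextDir) {
    return Arrays.asList("docker", "build", "-t", imageTag, contextDir.toString());
  }

  /** Writes the Dockerfile into contextDir, runs docker build, and returns its exit code. */
  static int build(String imageTag, Path contextDir, String dockerfileText)
      throws IOException, InterruptedException {
    Files.write(contextDir.resolve("Dockerfile"), dockerfileText.getBytes(StandardCharsets.UTF_8));
    Process process = new ProcessBuilder(buildCommand(imageTag, contextDir))
        .inheritIO() // stream docker's output to this process's stdout and stderr
        .start();
    return process.waitFor();
  }
}
```

A caller would check the returned exit code, where a non-zero value indicates that docker build failed. This approach requires the docker CLI and a reachable Docker daemon on the machine that builds the image.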

 

Hosting the Docker image (options)

  1. Docker Registry - a stateless server-side application used for storing and distributing Docker images.
  2. Docker Hub - might be too heavyweight and reliant on external services for our use case.
  3. Quay (from CoreOS) - not free or open source, so not high on the list.
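
If option 1 is chosen, the registry can be queried over its HTTP API (version 2) to check which images and tags it holds. The following is a minimal sketch, assuming a registry reachable over plain HTTP at an address such as localhost:5000; the class name RegistryClient and the repository names used with it are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

/** Hypothetical sketch: read-only queries against a Docker Registry (HTTP API v2). */
class RegistryClient {

  /** URI that lists all repositories known to the registry. */
  static URI catalogUri(String registryHostPort) {
    return URI.create("http://" + registryHostPort + "/v2/_catalog");
  }

  /** URI that lists the tags stored for one repository. */
  static URI tagsListUri(String registryHostPort, String repository) {
    return URI.create("http://" + registryHostPort + "/v2/" + repository + "/tags/list");
  }

  /**
   * Performs a GET and returns the response body as text (JSON for these endpoints).
   * Throws IOException for error statuses, for example 404 when the repository is unknown.
   */
  static String fetch(URI uri) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
    conn.setRequestMethod("GET");
    try (InputStream in = conn.getInputStream();
         Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name())) {
      scanner.useDelimiter("\\A"); // read the whole stream as a single token
      return scanner.hasNext() ? scanner.next() : "";
    } finally {
      conn.disconnect();
    }
  }
}
```

For example, RegistryClient.fetch(RegistryClient.tagsListUri("localhost:5000", "my-program")) would return the tag list for a repository named my-program, provided a registry is actually running at that address.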

 

Miscellaneous

 

TODO:

  • Collect some measurements around building a Docker image.
  • How can Kubernetes be the runtime under the Twill API, instead of YARN? What are the issues with this integration? What in the Twill API can't be supported?
  • Is there a programmatic (or at least RESTful) API that covers what the Kubernetes command-line tool does?
  • How can CDAP master talk to the Kubernetes master to get program status (or any of the Kubernetes interactions)?
  • How long does it take to run a CDAP program from a Docker image, with and without a base image that contains as much of the common content as possible?
  • How can we leverage functionality in Kubernetes to avoid a dependency on ZooKeeper? Or should we just use etcd regardless of whether we're using Kubernetes or not?
  • Do we need provisioner hooks? For instance, to kick off an instance of Docker Registry after provisioning a Kubernetes cluster?
  • Research the relative difficulty of use of YARN vs Kubernetes and of ZooKeeper vs etcd.
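
On the questions about a programmatic API and program status: the Kubernetes API server exposes a REST API, and a pod's status.phase field takes one of the values Pending, Running, Succeeded, Failed, or Unknown. The sketch below shows how a pod phase might be translated into a program status. The class PodStatusSketch and the ProgramState enum are hypothetical and do not correspond to existing CDAP types; how the CDAP master would authenticate and connect to the API server is left open.

```java
/** Hypothetical sketch: mapping a Kubernetes pod phase to a program state. */
class PodStatusSketch {

  /** Illustrative states only; not an existing CDAP enum. */
  enum ProgramState { STARTING, RUNNING, COMPLETED, FAILED, UNKNOWN }

  /** REST path of a single pod, relative to the API server's base URL. */
  static String podPath(String namespace, String podName) {
    return "/api/v1/namespaces/" + namespace + "/pods/" + podName;
  }

  /** Translates the status.phase value of a pod into a program state. */
  static ProgramState fromPodPhase(String phase) {
    if (phase == null) {
      return ProgramState.UNKNOWN;
    }
    switch (phase) {
      case "Pending":
        return ProgramState.STARTING;  // accepted, but containers are not all running yet
      case "Running":
        return ProgramState.RUNNING;
      case "Succeeded":
        return ProgramState.COMPLETED; // all containers exited successfully
      case "Failed":
        return ProgramState.FAILED;    // at least one container terminated in failure
      default:
        return ProgramState.UNKNOWN;   // includes the "Unknown" phase
    }
  }
}
```

A status check would then amount to a GET on podPath("default", "my-program") against the API server, followed by reading status.phase from the returned JSON.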

 

 


Created in 2020 by Google Inc.