Introduction
This page outlines how we plan to orchestrate the execution of CDAP programs on Kubernetes.
Summary
Kubernetes runs workloads as containers started from container images, so a Docker image must exist before a program can run. Executing a program will therefore look like:
- Create a Docker image from the program jar and its dependencies.
- Push this image to a local registry.
- Point Kubernetes to this image and execute the program.
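As a rough illustration of the last step, the sketch below renders a minimal Kubernetes Job manifest that points at an already pushed image. It assumes a Job is the resource type used to run a program, which this page has not decided; the class name, job name, and image reference are hypothetical.

```java
public final class ProgramJobManifest {

    private ProgramJobManifest() { }

    /**
     * Renders a minimal batch/v1 Job manifest that runs the given image once.
     * Both arguments are illustrative; nothing here is an existing CDAP API.
     */
    public static String render(String jobName, String image) {
        return String.join("\n",
            "apiVersion: batch/v1",
            "kind: Job",
            "metadata:",
            "  name: " + jobName,
            "spec:",
            "  backoffLimit: 0",          // do not retry a failed program run
            "  template:",
            "    spec:",
            "      restartPolicy: Never", // Jobs require Never or OnFailure
            "      containers:",
            "      - name: " + jobName,
            "        image: " + image,
            "");
    }

    public static void main(String[] args) {
        // Hypothetical image name hosted in a registry on localhost:5000.
        System.out.print(render("wordcount-program",
            "localhost:5000/cdap/wordcount-program:1.0"));
    }
}
```

The printed manifest could be saved to a file and submitted with kubectl apply -f, or sent through a Kubernetes client library instead.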
Creating the Docker image (options)
Options to explore:
- docker-java - a programmatic Java API for the Docker client: https://github.com/docker-java/docker-java
- Bazel - Java-based build system that can build Docker images.
See also: https://medium.com/bitnami-perspectives/building-docker-images-without-docker-c619061b13a9
See also: https://blog.bazel.build/2015/07/28/docker_build.html
- Construct a docker command string and leverage shell utilities from Java.
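The option of driving the docker command line from Java could look roughly like the sketch below. It uses the standard ProcessBuilder class and passes the command as an argument list rather than one string, which avoids shell quoting problems. The class name and image tag are hypothetical, and the build context is assumed to already contain a Dockerfile.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public final class DockerCliBuilder {

    private DockerCliBuilder() { }

    /** Builds the argument list for: docker build -t <imageTag> <contextDir> */
    public static List<String> buildCommand(String imageTag, String contextDir) {
        return Arrays.asList("docker", "build", "-t", imageTag, contextDir);
    }

    /** Runs the command, forwards its output to this process, and returns the exit code. */
    public static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
            .inheritIO()   // show docker's stdout/stderr in our own console
            .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical tag; "." is assumed to hold the Dockerfile and program jar.
        List<String> command = buildCommand("cdap/wordcount-program:1.0", ".");
        System.out.println(String.join(" ", command));
        // Uncomment on a machine that has the Docker CLI installed:
        // int exitCode = run(command);
    }
}
```

This approach needs the Docker CLI and a reachable Docker daemon on the machine that builds the image, which is the main trade-off against the other two options.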
Hosting the Docker image (options)
- Docker Registry - a stateless server-side application used for storing and distributing Docker images.
- Docker Hub - might be too heavyweight and reliant on external services for our use case.
- Quay (from CoreOS) - not free or open source, so not high on the list.
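If the self-hosted Docker Registry option is chosen, pushing an image means tagging it with the registry's host and port and then pushing that tag. The sketch below only assembles the two docker commands. It assumes a registry listening on localhost:5000 (the registry's default port), and the class and image names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public final class RegistryPush {

    private RegistryPush() { }

    /** Prefixes a local image name with the registry address, e.g. host:port/name:tag. */
    public static String qualify(String registryHost, String localImage) {
        return registryHost + "/" + localImage;
    }

    /** Returns the docker tag and docker push argument lists, in execution order. */
    public static List<List<String>> pushCommands(String registryHost, String localImage) {
        String remoteImage = qualify(registryHost, localImage);
        return Arrays.asList(
            Arrays.asList("docker", "tag", localImage, remoteImage),
            Arrays.asList("docker", "push", remoteImage));
    }

    public static void main(String[] args) {
        // Assumed registry address and hypothetical image name.
        for (List<String> command : pushCommands("localhost:5000", "cdap/wordcount-program:1.0")) {
            System.out.println(String.join(" ", command));
        }
    }
}
```

Kubernetes nodes would then pull the image by the same qualified name, so the registry address has to be reachable from every node, not only from the machine that pushed.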
Miscellaneous
There is an experimental project that supports running Spark programs on Kubernetes. Its documentation warns: "The feature set is currently limited and not well-tested. This should not be used in production environments." See https://apache-spark-on-k8s.github.io/userdocs/running-on-kubernetes-cloud.html
...