Goal

Ability to map a namespace/app/program and CDAP user to a Kerberos principal, and execute user operations with a particular principal.

Checklist

User stories documented (Ali)
User stories reviewed (Nitin)
Design documented (Ali)
Design reviewed (Andreas/Terence)
Feature merged (Ali)
Blog post

User Stories

As a CDAP admin, I would like to map a namespace, application, program, or schedule to a Kerberos principal. When CDAP applications are submitted to YARN, they should run as that user.
As a CDAP admin, I should be able to map a CDAP user/principal to a Kerberos principal, for the same reasons as the previous story.
As a CDAP application developer, my application should access HDFS, HBase, Hive, and other resources as myself, instead of the global 'cdap' (or other, configured) user.

User-facing Design

To support mapping a namespace to a "service account user" and principal, we will need to accept and store this custom mapping during the namespace create operation. This mapping will be stored in NamespaceConfig, which currently stores custom YARN queue names and is used by NamespaceMeta. We will add an additional field to it that defines the principal to be used under that namespace.
It is not yet clear how the credentials will be configured by the user; this will be fleshed out as implementation continues. One possibility is to require that the user configure a Kerberos keytab file located on HDFS:

NamespaceConfig

/**
 * Represents the configuration of a namespace. This class needs to be GSON serializable.
 */
public class NamespaceConfig {
   
  ...

  @SerializedName("user.principal")
  private final String userPrincipal;

 
  // location (on HDFS) of the credentials for the above principal
  @SerializedName("keytab.file")
  private final String keytabFileLocation;
  ...

}

We also need to support a mapping at the application and program (and possibly schedule) level.
In order to do this, there are two options:

Store this metadata in app metastore on its own and have additional handlers to allow the user to set/get these.
Allow the user to define a custom 'preference' that will dictate this mapping. For instance, the user setting a 'user.principal' preference at the application level will make it so that the application will be run with the specified principal.

I am leaning towards option #1, because it keeps the configuration of principal and keytab location separate from, and independent of, other user preferences (which are available as runtime arguments in programs).
Pending: I will add more details to how the user will interact with instance-, app-, program-, and schedule-level configuration later.

Resolution of principal
When a program is launched, the principal to be used will be determined based upon configuration at the following levels. Whichever level it is found at first will be used:

Schedule
Program
Application
Namespace
CDAP instance

For example, if a schedule has an associated principal, and the application also has an associated principal, the schedule-level setting will be used.
If there is no schedule-level, program-level, or app-level setting, but there is a namespace-level setting, then the namespace-level setting will be used.
In the previous example, if no namespace-level setting was defined, it would default to the configuration at the CDAP instance level. Pending - what happens if not even this is defined? Should this be required?
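
The lookup order above can be sketched as follows. This is a minimal illustration, not the actual CDAP implementation: the Level enum and PrincipalResolver class are hypothetical names, and the sketch assumes the configured principals have already been collected into a map keyed by level. The undefined-everywhere case is returned as an empty Optional, since the design leaves that behavior open.

```java
import java.util.Map;
import java.util.Optional;

/** Levels at which a principal may be configured, most specific first (hypothetical type). */
enum Level { SCHEDULE, PROGRAM, APPLICATION, NAMESPACE, INSTANCE }

/** Hypothetical resolver: picks the principal from the most specific level that defines one. */
final class PrincipalResolver {
  private PrincipalResolver() { }

  static Optional<String> resolve(Map<Level, String> configured) {
    // Level.values() returns the constants in declaration order,
    // which matches the precedence order, so the first hit wins.
    for (Level level : Level.values()) {
      String principal = configured.get(level);
      if (principal != null && !principal.isEmpty()) {
        return Optional.of(principal);
      }
    }
    // Nothing is configured at any level; what should happen here is still pending.
    return Optional.empty();
  }
}
```

With a principal configured at both the schedule and application levels, resolve returns the schedule-level one; with only a namespace-level principal, it returns that one.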

Implementation Design

User-launched programs

Hadoop's UserGroupInformation class has the following method:

// Log a user in from a keytab file.
public static UserGroupInformation loginUserFromKeytabAndReturnUGI(String user, String path) throws IOException;

With this, we can impersonate the user with the following steps:

When CDAP master launches a program in YARN via Twill, we will resolve the user and path to the keytab file based upon the user-provided configuration.
Using these credentials, we will login with the above UGI method.
We can then use UserGroupInformation#doAs to execute actions (such as submitting the YARN application via Twill), and the appropriate delegation tokens will get added to the YARN application.
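
The three steps above might look roughly like the sketch below. To keep it self-contained, it does not depend on Hadoop: LoggedInUser and KeytabLogin are stand-in interfaces for UserGroupInformation and its loginUserFromKeytabAndReturnUGI method, and ImpersonationInfo and ImpersonatedLauncher are hypothetical names. It only illustrates the control flow, under those assumptions.

```java
import java.util.concurrent.Callable;

/** Stand-in for Hadoop's UserGroupInformation; only the doAs idea is modeled. */
interface LoggedInUser {
  <T> T doAs(Callable<T> action) throws Exception;
}

/** Stand-in for UserGroupInformation.loginUserFromKeytabAndReturnUGI(user, path). */
interface KeytabLogin {
  LoggedInUser login(String principal, String keytabPath) throws Exception;
}

/** Hypothetical holder for the resolved principal and keytab location. */
final class ImpersonationInfo {
  final String principal;
  final String keytabPath;

  ImpersonationInfo(String principal, String keytabPath) {
    this.principal = principal;
    this.keytabPath = keytabPath;
  }
}

/** Hypothetical launcher that runs a submission as the configured user. */
final class ImpersonatedLauncher {
  private final KeytabLogin keytabLogin;

  ImpersonatedLauncher(KeytabLogin keytabLogin) {
    this.keytabLogin = keytabLogin;
  }

  <T> T launch(ImpersonationInfo info, Callable<T> submit) throws Exception {
    // Step 1 happens before this call: info holds the resolved principal and keytab.
    // Step 2: log in with the resolved credentials.
    LoggedInUser user = keytabLogin.login(info.principal, info.keytabPath);
    // Step 3: run the submission (for example, the Twill launch) as that user.
    return user.doAs(submit);
  }
}
```

In a real implementation, the KeytabLogin role would presumably be filled by the UGI method shown earlier, and the submit callable would wrap the Twill submission.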

Schedule-launched programs

A similar approach can be used for programs launched by a scheduler. The only difference is that the principal and credentials would be resolved by the scheduler instead.

System-executed operations on user data (dataset admin ops and namespace ops)

When the CDAP system performs dataset operations (create/delete/truncate/upgrade hbase tables, for instance), it is acting on user datasets. Because of this and the fact that we do not want the cdap system user to have superuser privileges, we need to impersonate users when executing these dataset admin operations.
To implement this, we'll have a DelegatingDatasetAdmin which will perform all of its operations for a particular UGI.
StorageProviderNamespaceAdmin will also have to perform all of its operations for a particular UGI (i.e. namespace create and namespace delete).
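
One way to picture DelegatingDatasetAdmin is as a thin wrapper that forwards every call to the real admin inside an impersonation scope. The sketch below is an assumption about its shape, not the actual class: the DatasetAdmin interface here is a simplified stand-in with only three operations and no checked exceptions, and Impersonator is a hypothetical abstraction over UserGroupInformation#doAs.

```java
/** Simplified stand-in for a dataset admin interface (hypothetical subset of operations). */
interface DatasetAdmin {
  void create();
  void truncate();
  void drop();
}

/** Hypothetical abstraction for "run this as the configured user". */
interface Impersonator {
  void runAs(Runnable action);
}

/** Forwards every admin operation to the delegate while impersonating the configured user. */
final class DelegatingDatasetAdmin implements DatasetAdmin {
  private final DatasetAdmin delegate;
  private final Impersonator impersonator;

  DelegatingDatasetAdmin(DatasetAdmin delegate, Impersonator impersonator) {
    this.delegate = delegate;
    this.impersonator = impersonator;
  }

  @Override
  public void create() {
    impersonator.runAs(delegate::create);
  }

  @Override
  public void truncate() {
    impersonator.runAs(delegate::truncate);
  }

  @Override
  public void drop() {
    impersonator.runAs(delegate::drop);
  }
}
```

Keeping the impersonation in a wrapper means the underlying admin implementations would not need to know which user they run as. StorageProviderNamespaceAdmin could plausibly follow the same pattern.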

Upgrade Tool changes (TBD)

Very likely, the upgrade tool will also have to follow a pattern similar to that of the dataset op executor service.
Other miscellaneous tools that interact with user data: Flowlet pending metrics corrector, Flowlet queue inspector.

Streams (TBD)

StreamWriters are system code but write to user streams, so these writes should also be impersonated.
It is not yet determined how impersonation will work here, but the above approach cannot be used in this case.
A design for this will be fleshed out later. A couple of things to consider when thinking about the design:

Multiple delegation tokens in a StreamWriter, in order to handle multiple users' streams?
What is the cost of switching users from the StreamWriters (performance impact)?
Running Writers in separate containers, to avoid cost of switching?

Launching of flows (TBD)

When a flow program is launched for the first time, CDAP Master will create an HBase table in the user's namespace to track pending events of queues (which events a particular flowlet has processed, and which are unprocessed). During execution of the flow's flowlets, the flowlets will read and update this table. Because of this, the HBase table should be created by the user that launches the flow, or at least be readable and writable by that user.
The design of the necessary implementation for this has not been fleshed out either, and will come later.

Explore Queries (TBD)

Explore queries are initiated by the CDAP user and operate on user data, even though they are launched from a system container. Because of that, impersonation will also need to be implemented for explore queries.

The design of the necessary implementation for this has not been fleshed out either, and will come later.

Brief summary of overall changes

During program runtime, CDAP master will impersonate a user and launch the YARN app, so that CDAP programs run as various users. Because these users will not have access to system tables, they will go through CDAP system services for writing to system tables (run records, lineage, usage, workflow token).
During namespace operations (create/delete), dataset service will perform the namespace create and delete operations (HBase namespace, HDFS directories, explore database), while impersonating the configured user.
During dataset admin operations (create/delete/truncate), dataset op executor service will perform the operations while impersonating the configured user.
(to be finalized) Stream admin operations as well as stream writing operations will have to happen while impersonating the configured user.
(to be finalized) Explore queries launched will have to happen while impersonating the configured user.
(to be finalized) Artifact deployment will also need to impersonate the user, when deploying artifact in user scope.

Note: any time a system service wishes to impersonate a user, it will need to look up the configured principal/keytab, localize the keytab from the distributed file system, and create a UGI based upon this keytab. A caching mechanism for these UGIs would be useful.
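
A caching mechanism along those lines could be sketched as below. This is only an illustration under stated assumptions: ExpiringCache is a hypothetical name, the loader function stands in for the lookup, keytab localization, and login sequence, and the time-based expiry is an assumption added here because Kerberos credentials have a limited lifetime. The clock is injected so that the expiry behavior can be exercised without waiting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical cache that reloads a value once its time-to-live has elapsed. */
final class ExpiringCache<K, V> {

  /** A cached value together with the time at which it stops being valid. */
  private static final class Entry<V> {
    final V value;
    final long expiresAtMillis;

    Entry(V value, long expiresAtMillis) {
      this.value = value;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
  private final Function<K, V> loader;
  private final long ttlMillis;
  private final LongSupplier clock;

  ExpiringCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
    this.loader = loader;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  V get(K key) {
    long now = clock.getAsLong();
    // compute() is atomic per key, so concurrent callers for the same key trigger a single load.
    Entry<V> entry = entries.compute(key, (k, existing) ->
        existing != null && existing.expiresAtMillis > now
            ? existing
            : new Entry<>(loader.apply(k), now + ttlMillis));
    return entry.value;
  }
}
```

In this sketch the key would be the principal (or a principal/keytab pair) and the value the resulting UGI; System::currentTimeMillis can serve as the clock outside of tests.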

Problems Encountered

User applications writing to CDAP System tables

One of the aspects of impersonation that we did not consider is that YARN applications corresponding to a CDAP program will no longer have the permissions of the 'cdap' system user. For instance, if the program is configured to be launched as user 'joe', it is not guaranteed that 'joe' has access to the 'cdap_system' HBase namespace or to system tables. However, the YARN application still (currently) writes to system tables.
Here are examples of when a user program writes to system tables:

Update run records (started, stopped, etc).
Updates to workflow token
Lineage and usage registry updates upon calls to getDataset and addInput / addOutput
Reading config store, for resolving runtime arguments, when launching programs from workflows

One possible solution to this is to still launch the YARN applications as the 'cdap' system user and only execute user code within it as the impersonated user (i.e. 'joe'). However, this is not feasible because even when control is passed to user code, writes to system tables can still happen, for instance when the user calls getDataset or updateWorkflowToken.
An alternate solution is to expose a service in master (app-fabric?) that exposes functionality for specific writes to system tables. This would be a service that the user yarn application would call whenever it wants to record run record data, updates to workflow token, lineage, etc.
There are certainly downsides to this:

Inefficiency - it would be more efficient for the client calling this service to directly make the writes/updates to hbase
Potential bottleneck - if there are N workflows, each updating workflow token, this service would be a bottleneck for all of them

Any thoughts on this approach, or workable alternatives to this, are welcome.

Pending Questions

How will admins configure multiple keytabs (for the various configured principals)?
Should we restrict updates to particular fields of the NamespaceConfig? Making it a 'final' configuration may simplify edge cases of the implementation, and will also reduce runtime failures. For instance, if user changes the principal of a namespace, the user would have to ensure that this new principal has all the appropriate permissions.
When launching jobs through twill, staging directory is always cdap/twill/...; Do we need to change twill to pass in staging dir through prepareRun?
If a user is logged into cdap as 'ali', shouldn't we run the YARN app as user 'ali', instead of the mapping configured on the namespace/app/etc.?
Programs launched by workflow: how will the appropriate principal be used for the launched programs (MapReduce, Spark, Custom Action, etc.)?