Run-time Impersonation support for Pipelines

Introduction

The purpose of this document is to capture the requirements and implementation details for adding run-time impersonation support to CDAP pipelines developed through the UI.

Requirements

Here are the requirements for this feature:

  1. In NiFi, users can provide a Kerberos principal name and a path to a keytab in the flow/processor properties, which are used during execution to impersonate the user. In a similar fashion, a user should be able to provide the principal name and keytab location as arguments/properties to CDAP pipelines at run-time, which should then be used for impersonation.
  2. Users should have the flexibility to provide new principal and keytab properties for a pipeline through the UI as well as the REST API.
  3. It should be possible to run the same CDAP pipeline again with different values of Kerberos principal and keytab properties (possibly by the same user).

Implementation

  1. The user provides the pipeline impersonation information as run-time arguments ('system.runtime.keytab.path', 'system.runtime.principal.name') through the ‘Run’ option on the UI, and then runs the pipeline.
    Execution reaches the createUGI() API in DefaultUGIProvider, where we check whether the entityId is of type ProgramRunId and extract all pipeline run-time arguments as a Map.
  2. We then check whether the impersonation properties above are present in the map.
  3. If the run-time impersonation properties are present, we create a UGI using the API UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab) and return it. The application is then impersonated using the provided run-time principal/keytab.
  4. If either or both of the run-time impersonation properties are absent, CDAP falls back to the pre-existing behavior.
  5. To authorize a user to use a Kerberos principal, authorization checks have been added to the ProgramLifecycleService APIs run() and start(). Using an AuthorizationEnforcer instance, we check whether the current user has 'admin' privilege on the Kerberos principal specified in the run-time arguments. If so, the pipeline is run; otherwise an exception is thrown to the caller and the pipeline fails.
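
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the actual CDAP code: it has no Hadoop or CDAP dependencies, and the class RuntimeImpersonationSketch, the ImpersonationInfo holder, the fromRuntimeArgs() and ensureAdminOnPrincipal() helpers, and the in-memory grants map are all hypothetical names introduced for this example. Only the two run-time argument keys come from this document. Treating an empty value the same as a missing one is also an assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

public class RuntimeImpersonationSketch {

  // Run-time argument keys as named in this document.
  static final String KEYTAB_PATH_KEY = "system.runtime.keytab.path";
  static final String PRINCIPAL_NAME_KEY = "system.runtime.principal.name";

  /** Hypothetical holder for a principal/keytab pair. */
  static final class ImpersonationInfo {
    final String principal;
    final String keytabPath;

    ImpersonationInfo(String principal, String keytabPath) {
      this.principal = principal;
      this.keytabPath = keytabPath;
    }
  }

  /**
   * Steps 2-4: returns the run-time impersonation info only when BOTH
   * properties are present. An empty Optional means the caller should
   * fall back to the pre-existing behavior.
   * Assumption: an empty string is treated as absent.
   */
  static Optional<ImpersonationInfo> fromRuntimeArgs(Map<String, String> runtimeArgs) {
    String principal = runtimeArgs.get(PRINCIPAL_NAME_KEY);
    String keytab = runtimeArgs.get(KEYTAB_PATH_KEY);
    if (principal == null || principal.isEmpty() || keytab == null || keytab.isEmpty()) {
      return Optional.empty();
    }
    // The real flow would create the UGI here, e.g. with
    // UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab).
    return Optional.of(new ImpersonationInfo(principal, keytab));
  }

  /**
   * Step 5: hypothetical stand-in for the authorization check. The map
   * models which principals each user holds 'admin' privilege on.
   */
  static void ensureAdminOnPrincipal(String user, String principal,
                                     Map<String, Set<String>> adminGrants) {
    Set<String> granted = adminGrants.getOrDefault(user, Collections.<String>emptySet());
    if (!granted.contains(principal)) {
      throw new SecurityException(
          "User '" + user + "' lacks admin privilege on principal '" + principal + "'");
    }
  }

  public static void main(String[] args) {
    // Example values only; the principal and path are made up.
    Map<String, String> runtimeArgs = new HashMap<>();
    runtimeArgs.put(PRINCIPAL_NAME_KEY, "etl-user@EXAMPLE.COM");
    runtimeArgs.put(KEYTAB_PATH_KEY, "/etc/security/keytabs/etl-user.keytab");

    Map<String, Set<String>> grants = new HashMap<>();
    grants.put("alice", Collections.singleton("etl-user@EXAMPLE.COM"));

    Optional<ImpersonationInfo> info = fromRuntimeArgs(runtimeArgs);
    if (info.isPresent()) {
      // Authorization happens before the run is started.
      ensureAdminOnPrincipal("alice", info.get().principal, grants);
      System.out.println("Run as " + info.get().principal);
    } else {
      System.out.println("Fall back to pre-existing impersonation behavior");
    }
  }
}
```

Returning an empty Optional for the fallback case keeps the "either or both absent" rule in one place, so the same pipeline can be started again with a different principal/keytab pair simply by passing different run-time arguments.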

Created in 2020 by Google Inc.