Security-Impersonation-Namespace Mapping Scenarios
Overview
This page documents the security scenarios supported in CDAP 3.5. The scenarios below apply to the following combinations of security features:
Authorization
Authorization + Namespace Mapping
Authorization + Impersonation
Authorization + Impersonation + Namespace mapping
NOTE: In this document:
EntityA --> EntityB indicates a call (method call or RPC) from EntityA to EntityB
Monospace indicates an operation (either a method call or an RPC)
Bold superscript indicates the RPC transport
Bold blue indicates a userId being set or read
Bold green indicates impersonation
Bold red indicates an exit with failure
NOTE: This document also assumes that the Authorizer extension is Apache Sentry, and therefore calls out Thrift as the communication mechanism.
REST APIs
Publicly routed REST APIs in AppFabric Service
Application Deployment
Applications with non-existing dataset
Client --> Router HTTP: deployApp(artifact, appConfig)
Router --> AppFabric HTTP: deployApp(artifact, appConfig, SecurityRequestContext.userId)
AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
AppFabric --> AppFabric: doAs(namespace, deploy(jar, config))
AppFabric --> DatasetServiceClient: createDataset()
DatasetServiceClient --> DatasetService HTTP: createDataset(ds, Header(CDAP-UserId=SecurityRequestContext.userId))
DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
DatasetService --> Authorizer Thrift: revoke(ds); grant(ds, SecurityRequestContext.userId, ALL)
DatasetService --> DatasetOpExecutor HTTP: success = doAs(namespace, createDataset(ds))
DatasetService --> Authorizer Thrift: !success ? revoke(ds)
DatasetService --> AppFabric --> Router --> Client HTTP: result
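The grant-before-create ordering above (grant privileges to the requesting user up front, delegate the physical creation, and revoke again if it fails) can be sketched as follows. All names here are illustrative stand-ins, not real CDAP classes:

```python
# Minimal sketch of the grant-before-create pattern used by DatasetService.
class Authorizer:
    def __init__(self):
        self.acl = {}  # entity -> {(user, action), ...}

    def grant(self, entity, user, action):
        self.acl.setdefault(entity, set()).add((user, action))

    def revoke(self, entity):
        self.acl.pop(entity, None)

def create_dataset(ds, user, authorizer, op_executor):
    # revoke any stale privileges, then grant ALL to the creating user
    authorizer.revoke(ds)
    authorizer.grant(ds, user, "ALL")
    success = op_executor(ds)      # stands in for doAs(namespace, createDataset(ds))
    if not success:
        authorizer.revoke(ds)      # roll back privileges on failure
    return success

auth = Authorizer()
assert create_dataset("ns1.purchases", "alice", auth, lambda ds: True)
assert ("alice", "ALL") in auth.acl["ns1.purchases"]
assert not create_dataset("ns1.bad", "alice", auth, lambda ds: False)
assert "ns1.bad" not in auth.acl   # privileges rolled back
```

Granting before creating ensures the creator is never left without privileges on a dataset that exists; the compensating revoke keeps the ACL consistent when creation fails.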
Applications with existing dataset
Client --> Router HTTP: deployApp(artifact, appConfig)
Router --> AppFabric HTTP: deployApp(artifact, appConfig, SecurityRequestContext.userId)
AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
AppFabric --> AppFabric: doAs(namespace, deploy(jar, config))
AppFabric --> DatasetServiceClient: !compatibleUpdate ? IncompatibleException
DatasetServiceClient --> DatasetService HTTP: update(ds, Header(CDAP-UserId=SecurityRequestContext.userId))
DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
DatasetService --> DatasetService: success = update(ds)
DatasetService --> AppFabric --> Router --> Client HTTP: result
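When the dataset already exists, the redeploy path fails fast on an incompatible spec change before any update is attempted. A minimal sketch of that ordering, with hypothetical names (the real compatibility check compares full dataset specs, not just the type):

```python
# Fail-fast redeploy sketch: incompatibility is checked before authorization
# and before any update is applied.
class IncompatibleException(Exception):
    pass

class UnauthorizedException(Exception):
    pass

def redeploy_dataset(existing_type, new_type, user, authorized, update):
    if new_type != existing_type:                 # stand-in compatibility check
        raise IncompatibleException(f"{existing_type} -> {new_type}")
    if not authorized(user):
        raise UnauthorizedException(user)
    return update()                               # success = update(ds)

assert redeploy_dataset("table", "table", "alice", lambda u: u == "alice", lambda: True)
try:
    redeploy_dataset("table", "fileSet", "alice", lambda u: True, lambda: True)
    raise AssertionError("expected IncompatibleException")
except IncompatibleException:
    pass
```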
Applications with non-existing streams
Applications with existing streams
Namespace Creation
Client --> Router HTTP: createNamespace(nsName, nsConfig)
Router --> AppFabric HTTP: createNamespace(nsName, nsConfig, SecurityRequestContext.userId)
AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
AppFabric --> Authorizer Thrift: grant(namespace, SecurityRequestContext.userId, ALL)
AppFabric --> DatasetServiceClient: getDataset(app.meta)
DatasetServiceClient --> DatasetService HTTP: getDataset(app.meta, Header(CDAP-UserId=Principal.SYSTEM))
DatasetService --> AuthEnforcer: result = filter(dataset, SecurityRequestContext.userId) (this will always be non-empty, because of the system principal)
DatasetService --> DatasetServiceClient HTTP --> AppFabric: MDS
AppFabric --> MDS: store(namespace)
AppFabric --> StorageProviderNsAdmin: result = doAs(nsName, createNamespace(namespaceMeta)) (this will only check for access for custom mappings, but will create otherwise)
AppFabric --> AppFabric: !result ? revoke(namespace) && NamespaceCannotBeCreatedException
AppFabric --> Router --> Client HTTP: result
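The ordering in this flow (grant first, store the namespace metadata, then create the storage-provider namespace as the impersonated user, revoking the grant on failure) can be sketched as below. All names are stand-ins, not real CDAP classes:

```python
# Sketch of the namespace-creation ordering with rollback of the grant.
class NamespaceCannotBeCreatedException(Exception):
    pass

def create_namespace(ns, user, acl, mds, create_storage):
    acl.setdefault(ns, set()).add((user, "ALL"))   # grant(namespace, user, ALL)
    mds.add(ns)                                    # store(namespace) in MDS
    if not create_storage(ns):                     # doAs(nsName, createNamespace(...))
        acl.pop(ns, None)                          # revoke(namespace) on failure
        raise NamespaceCannotBeCreatedException(ns)

acl, mds = {}, set()
create_namespace("dev", "alice", acl, mds, lambda ns: True)
assert "dev" in mds and ("alice", "ALL") in acl["dev"]
try:
    create_namespace("bad", "alice", acl, mds, lambda ns: False)
except NamespaceCannotBeCreatedException:
    pass
assert "bad" not in acl                            # grant rolled back
```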
Namespace Deletion
Client --> Router HTTP: deleteNamespace(nsName)
Router --> AppFabric HTTP: deleteNamespace(nsName, SecurityRequestContext.userId)
AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
AppFabric --> Authorizer Thrift: revoke(namespace, SecurityRequestContext.userId, ALL)
AppFabric --> DatasetServiceClient: getDataset(app.meta)
DatasetServiceClient --> DatasetService HTTP: getDataset(app.meta, Header(CDAP-UserId=Principal.SYSTEM))
DatasetService --> AuthEnforcer: result = filter(dataset, SecurityRequestContext.userId) (this will always be non-empty, because of the system principal)
DatasetService --> DatasetServiceClient HTTP --> AppFabric: MDS
AppFabric --> MDS: delete(namespace)
AppFabric --> StorageProviderNsAdmin: result = doAs(nsName, delete(namespaceMeta)) (this will only check for access for custom mappings, but will delete otherwise)
AppFabric --> Authorizer Thrift: revoke(namespace, SecurityRequestContext.userId, ALL)
AppFabric --> AppFabric: !result ? NamespaceCannotBeDeletedException
AppFabric --> Router --> Client HTTP: result
Publicly routed REST APIs in Dataset Service
Create
Client --> Router HTTP: createDataset(dataset, type, properties)
Router --> DatasetService HTTP: createDataset(dataset, type, properties, SecurityRequestContext.userId)
DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
DatasetService --> Authorizer Thrift: revoke(dataset); grant(dataset, SecurityRequestContext.userId, ALL)
DatasetService --> DatasetOpExecutor HTTP: success = doAs(namespace, createDataset(dataset))
DatasetService --> Authorizer Thrift: !success ? revoke(dataset)
DatasetService --> Router --> Client HTTP: result
List
Client --> Router HTTP: listDatasets(namespace)
Router --> DatasetService HTTP: listDatasets(namespace, SecurityRequestContext.userId)
DatasetService --> AuthEnforcer: result = filter(datasetsInNamespace, SecurityRequestContext.userId)
DatasetService --> Router --> Client HTTP: result
Get
Client --> Router HTTP: getDataset(dataset)
Router --> DatasetService HTTP: dataset = getDataset(dataset, SecurityRequestContext.userId)
DatasetService --> AuthEnforcer: result = filter(dataset, SecurityRequestContext.userId)
DatasetService --> Router --> Client HTTP: result.isEmpty ? UnauthorizedException
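List and get use a filter-based visibility check rather than a boolean authorized() check: results are filtered down to the entities the caller can see, and an empty result on a get maps to UnauthorizedException. A minimal sketch with illustrative names (not real CDAP APIs):

```python
# Filter-based visibility: list returns only visible entities; get raises
# UnauthorizedException when the filtered result is empty.
class UnauthorizedException(Exception):
    pass

def filter_visible(entities, user, acl):
    return [e for e in entities if (e, user) in acl]

def get_dataset(ds, user, acl):
    result = filter_visible([ds], user, acl)
    if not result:                 # result.isEmpty ? UnauthorizedException
        raise UnauthorizedException(ds)
    return result[0]

acl = {("ns1.purchases", "alice")}
assert filter_visible(["ns1.purchases", "ns1.history"], "alice", acl) == ["ns1.purchases"]
assert get_dataset("ns1.purchases", "alice", acl) == "ns1.purchases"
```

Filtering (instead of failing outright) lets list endpoints succeed for every caller while revealing only the entities that caller is entitled to see.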
Update
Client --> Router HTTP: updateDataset(dataset, type, properties)
Router --> DatasetService HTTP: updateDataset(dataset, type, properties, SecurityRequestContext.userId)
DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
DatasetService --> DatasetService: result = update(dataset, type, properties)
DatasetService --> Router --> Client HTTP: result
Truncate
Client --> Router HTTP: truncate(dataset)
Router --> DatasetService HTTP: truncate(ds, SecurityRequestContext.userId)
DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
DatasetService --> DatasetOpExecutor HTTP: result = doAs(namespace, truncate(dataset))
DatasetService --> Router --> Client HTTP: result
Drop
Client --> Router HTTP: drop(dataset)
Router --> DatasetService HTTP: drop(dataset, SecurityRequestContext.userId)
DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
DatasetService --> DatasetOpExecutor HTTP: result = doAs(namespace, drop(dataset))
DatasetService --> Authorizer Thrift: revoke(dataset)
DatasetService --> Router --> Client HTTP: result
Upgrade
Client --> Router HTTP: upgrade(dataset)
Router --> DatasetService HTTP: upgrade(dataset, SecurityRequestContext.userId)
DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
DatasetService --> DatasetOpExecutor HTTP: result = doAs(namespace, upgrade(dataset))
DatasetService --> Router --> Client HTTP: result
Publicly routed REST APIs in Stream Service
Program Runtime
Access datasets, streams and secure keys
During program runtime, users can access datasets, streams, and secure keys through program APIs (MapReduce/Spark/Flows) or through the Dataset APIs (getDataset).
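When namespace mapping and impersonation are in play, the storage operations behind these accesses run doAs() the namespace's mapped user, while authorization is still enforced against the requesting user's id. A sketch of that pattern (the user mapping and context manager below are illustrative, not CDAP APIs):

```python
# Sketch of impersonated access: operations inside do_as() run as the
# namespace's mapped user; the original identity is restored afterwards.
from contextlib import contextmanager

NAMESPACE_USER = {"ns1": "ns1_service_user"}   # hypothetical custom mapping
current_user = ["cdap"]                        # effective OS-level identity

@contextmanager
def do_as(namespace):
    previous = current_user[0]
    current_user[0] = NAMESPACE_USER.get(namespace, previous)
    try:
        yield
    finally:
        current_user[0] = previous             # always restore the caller

with do_as("ns1"):
    assert current_user[0] == "ns1_service_user"   # storage ops run as mapped user
assert current_user[0] == "cdap"                   # context restored afterwards
```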
Administer datasets, streams and secure keys
During program runtime, users can administer datasets, streams, and secure keys via the Admin APIs.
Update system metadata
During program runtimes, CDAP performs various system operations for:
Recording Audit
Recording Lineage
Recording Usage
Recording Run Records
Namespace Lookup
Authorization Enforcement
Explore
Access datasets and streams
Users can execute Hive SELECT (for BatchReadable datasets) and INSERT (for BatchWritable datasets) queries via Explore to access data in datasets and streams.
Administer datasets and streams
Create operations on datasets and streams can create tables in Hive if Explore is enabled. Similarly, delete operations can drop the corresponding tables, and truncate operations can truncate them.
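The dataset-admin-to-Hive-DDL mapping described above can be illustrated as follows. The table-naming convention shown is an assumption for the example, not necessarily the exact convention CDAP uses:

```python
# Illustrative mapping from dataset admin operations to the Hive DDL that
# Explore would issue when Explore is enabled.
def explore_ddl(op, dataset):
    table = "dataset_" + dataset.replace(".", "_")   # assumed naming scheme
    return {
        "create":   f"CREATE EXTERNAL TABLE {table} ...",
        "drop":     f"DROP TABLE IF EXISTS {table}",
        "truncate": f"TRUNCATE TABLE {table}",
    }[op]

assert explore_ddl("drop", "ns1.purchases") == "DROP TABLE IF EXISTS dataset_ns1_purchases"
assert explore_ddl("truncate", "ns1.purchases") == "TRUNCATE TABLE dataset_ns1_purchases"
```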
Authorization Cache Updates
Scratch Pad
a) Authorization
b) Auth + NS
c) Auth + Impersonation
d) Auth + Impersonation +NS
1. Application deploy -> Create DS and Streams
2. Program Run -> Creating DS and Streams
3. Program Run -> Access DS and Streams
4. Explore -> Access Dataset (Explore can insert to DS): INSERT on SELECT
5. REST APIs -> Create DS and Streams
6. REST APIs -> Access DS and Streams
7. Program -> Access System DS for System metadata recording
NOTE: Replace Create with Create, Delete, and Truncate. All of the admin ops should be accounted for.
8. Create namespace
9. Delete namespace