Authorization Policy Pushdown

CDAP does not currently push authorization policy grants and revokes down to storage providers. As a result, when a user is granted READ or WRITE access on an existing dataset, the corresponding permissions are not updated in the storage provider. The same applies when privileges are revoked.

A newly applied authorization policy is enforced when the dataset is accessed through CDAP, but not when it is accessed directly in the storage provider. If the same permissions must also hold in the storage provider, they have to be set there manually.
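For a dataset backed by an HDFS directory, the manual step could be scripted along the following lines. This is a minimal sketch, not a CDAP feature: it only builds `hdfs dfs -setfacl` command lines and does not run them. The user name, the directory path, and the mapping from CDAP privileges to HDFS permission bits are assumptions made for illustration, and HDFS ACLs must be enabled on the cluster for such commands to succeed.

```python
import shlex

# Assumed mapping from CDAP privileges to HDFS permission bits on a directory.
# This mapping is illustrative only; adjust it to your own policy.
_PERMISSION_BITS = {
    frozenset({"READ"}): "r-x",
    frozenset({"WRITE"}): "-wx",
    frozenset({"READ", "WRITE"}): "rwx",
}


def grant_command(user, path, privileges):
    """Build the argv for adding an HDFS ACL entry that mirrors a CDAP grant."""
    key = frozenset(privileges)
    if key not in _PERMISSION_BITS:
        raise ValueError(f"unsupported privileges: {sorted(key)}")
    spec = f"user:{user}:{_PERMISSION_BITS[key]}"
    return ["hdfs", "dfs", "-setfacl", "-R", "-m", spec, path]


def revoke_command(user, path):
    """Build the argv for removing the user's HDFS ACL entry after a CDAP revoke."""
    return ["hdfs", "dfs", "-setfacl", "-R", "-x", f"user:{user}", path]


if __name__ == "__main__":
    # Hypothetical user and directory; substitute the directory your dataset uses.
    print(shlex.join(grant_command("alice", "/data/ns2/myfileset", {"READ", "WRITE"})))
    print(shlex.join(revoke_command("alice", "/data/ns2/myfileset")))
```

The resulting argument lists can be passed to `subprocess.run` by an administrator who has sufficient rights on the target directory.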

This limitation has larger implications when cross-namespace dataset access is used. When a dataset is accessed from a different namespace, CDAP currently presumes that the accessing user has already been granted permissions on the dataset in the storage provider.

For example, if a program in the namespace ns1 tries to access a fileset in the namespace ns2, the user running the program should be granted the appropriate privileges (READ, WRITE, or both) on the fileset. Additionally, the user needs to be granted appropriate permissions on the HDFS directory that the fileset points to. When impersonation is used in the program's namespace, this user is the impersonated user; otherwise, it is the user that the CDAP Master runs as.
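The rule for deciding which user needs both sets of grants can be summarized in a short sketch. The `Namespace` class, its `impersonated_user` field, and the helper functions below are hypothetical names used only to model the rule; they are not CDAP APIs, and the user names and HDFS path are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Namespace:
    """Hypothetical model of a namespace; not a CDAP class."""
    name: str
    impersonated_user: Optional[str] = None  # None means impersonation is not used


def effective_user(program_namespace, cdap_master_user):
    """Return the user whose grants matter for a program in this namespace."""
    if program_namespace.impersonated_user is not None:
        return program_namespace.impersonated_user
    return cdap_master_user


def required_grants(program_namespace, cdap_master_user, fileset, hdfs_dir, privileges):
    """List what must be granted before the cross-namespace access can work."""
    user = effective_user(program_namespace, cdap_master_user)
    return {
        "user": user,
        # Granted through CDAP authorization on the fileset itself.
        "cdap_privileges": {fileset: sorted(privileges)},
        # Granted manually in HDFS, because CDAP does not push the policy down.
        "hdfs_directory": hdfs_dir,
    }


if __name__ == "__main__":
    ns1 = Namespace("ns1", impersonated_user="alice")
    print(required_grants(ns1, "cdap", "ns2.myfileset", "/data/ns2/myfileset", {"READ"}))
```

With impersonation configured for ns1, the grants are needed for the impersonated user; without it, the same function reports the user that the CDAP Master runs as.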

Created in 2020 by Google Inc.