docs/plugins.md
Plugins are a way to extend the basic DataHub functionality in a custom manner.
Currently, DataHub formally supports 2 types of plugins: authentication plugins and authorization plugins.
Note: This is in <b>BETA</b> version
It is recommended that you do not do this unless you really know what you are doing.
A custom authentication plugin makes it possible to authenticate DataHub users against any Identity Management System. Choose your Identity Management System and write a custom authentication plugin as described in this section.
Currently, custom authenticators cannot be used to authenticate users of DataHub's web UI. This is because the DataHub web app expects the presence of 2 special cookies PLAY_SESSION and actor which are explicitly set by the server when a login action is performed. Instead, custom authenticators are useful for authenticating API requests to DataHub's backend (GMS), and can stand in addition to the default Authentication performed by DataHub, which is based on DataHub-minted access tokens.
The sample authenticator implementation can be found at Authenticator Sample
Add datahub-auth-api as compileOnly dependency: Maven coordinates of datahub-auth-api can be found at Maven
Example of gradle dependency is given below.
dependencies {
    def auth_api = 'io.acryl:datahub-auth-api:0.9.3-3rc3'
    compileOnly "${auth_api}"
    testImplementation "${auth_api}"
}
Implement the Authenticator interface: Refer to the Authenticator Sample.
<details> <summary>Sample class which implements the Authenticator interface</summary>

public class GoogleAuthenticator implements Authenticator {

    @Override
    public void init(@Nonnull Map<String, Object> authenticatorConfig, @Nullable AuthenticatorContext context) {
        // Plugin initialization code goes here.
        // DataHub calls this method at boot time.
    }

    @Nullable
    @Override
    public Authentication authenticate(@Nonnull AuthenticationRequest authenticationRequest)
            throws AuthenticationException {
        // DataHub calls this method whenever an authentication decision needs to be made.
        // Authenticate the request and return an Authentication.
        // Placeholder so the skeleton compiles; replace it with real logic.
        throw new UnsupportedOperationException("Not implemented yet");
    }
}

</details>
Use getResourceAsStream to read files: If your plugin reads any configuration file, such as a properties, YAML, JSON, or XML file, use this.getClass().getClassLoader().getResourceAsStream("<file-name>") to read that file from the DataHub GMS plugin's class-path. For the DataHub GMS resource look-up behavior, refer to the Plugin Installation section. Sample code for getResourceAsStream is available in the sample Authenticator plugin TestAuthenticator.java.
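As a minimal sketch of this step, the helper below loads a properties file through a class loader and tolerates a missing resource. The class name `PluginResourceReader` and the file name used in the note after the block are hypothetical and are not part of the DataHub API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper for reading a properties file from the plugin's class-path. */
final class PluginResourceReader {

    /**
     * Loads the named properties file through the given class loader.
     * Returns an empty Properties object when the resource is not found,
     * because getResourceAsStream returns null in that case.
     */
    static Properties loadProperties(ClassLoader loader, String fileName) {
        Properties props = new Properties();
        // try-with-resources skips closing when the stream is null
        try (InputStream in = loader.getResourceAsStream(fileName)) {
            if (in != null) {
                props.load(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read " + fileName, e);
        }
        return props;
    }
}
```

Inside a plugin's `init` method this could be called as `PluginResourceReader.loadProperties(this.getClass().getClassLoader(), "my-plugin.properties")`, where `my-plugin.properties` is a placeholder file name.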
Bundle your Jar: Use the com.gradleup.shadow Gradle plugin to create an uber jar.
To see an example of building an uber jar, refer to the build.gradle file of the Apache Ranger Plugin (apache-ranger-plugin).
Exclude signature files as shown in the shadowJar task below.
apply plugin: 'com.gradleup.shadow';
shadowJar {
    // Exclude files related to jar signatures
    exclude "META-INF/*.RSA", "META-INF/*.SF","META-INF/*.DSA"
}
Refer to the Plugin Installation section to install the plugin in the DataHub environment.
By default, authentication is disabled in DataHub GMS.
Follow the steps below to enable GMS authentication:
Download docker-compose.quickstart.yml: Download the Docker Compose file docker-compose.quickstart.yml.
Set environment variable: Set the METADATA_SERVICE_AUTH_ENABLED environment variable to true.
Redeploy DataHub GMS: Use the quickstart command below to redeploy DataHub GMS.
datahub docker quickstart -f docker-compose.quickstart.yml
Note: This is in <b>BETA</b> version
It is recommended that you do not do this unless you really know what you are doing.
A custom authorization plugin makes it possible to authorize DataHub users against any Access Management System. Choose your Access Management System and write a custom authorization plugin as described in this section.
The sample authorizer implementation can be found at Authorizer Sample
Add datahub-auth-api as compileOnly dependency: Maven coordinates of datahub-auth-api can be found at Maven
Example of gradle dependency is given below.
dependencies {
    def auth_api = 'io.acryl:datahub-auth-api:0.9.3-3rc3'
    compileOnly "${auth_api}"
    testImplementation "${auth_api}"
}
Implement the Authorizer interface: Refer to the Authorizer Sample.
<details> <summary>Sample class which implements the Authorizer interface</summary>

public class ApacheRangerAuthorizer implements Authorizer {

    @Override
    public void init(@Nonnull Map<String, Object> authorizerConfig, @Nonnull AuthorizerContext ctx) {
        // Plugin initialization code goes here.
        // DataHub calls this method at boot time.
    }

    @Override
    public AuthorizationResult authorize(@Nonnull AuthorizationRequest request) {
        // DataHub calls this method whenever an authorization decision needs to be made.
        // Authorize the request and return an AuthorizationResult.
        // Placeholder so the skeleton compiles; replace it with real logic.
        throw new UnsupportedOperationException("Not implemented yet");
    }

    @Override
    public AuthorizedActors authorizedActors(String privilege, Optional<ResourceSpec> resourceSpec) {
        // Documentation for this method is still to be added.
        // Placeholder so the skeleton compiles; replace it with real logic.
        throw new UnsupportedOperationException("Not implemented yet");
    }
}

</details>
Use getResourceAsStream to read files: If your plugin reads any configuration file, such as a properties, YAML, JSON, or XML file, use this.getClass().getClassLoader().getResourceAsStream("<file-name>") to read that file from the DataHub GMS plugin's class-path. For the DataHub GMS resource look-up behavior, refer to the Plugin Installation section. Sample code for getResourceAsStream is available in the sample Authenticator plugin TestAuthenticator.java.
Bundle your Jar: Use the com.gradleup.shadow Gradle plugin to create an uber jar.
To see an example of building an uber jar, refer to the build.gradle file of the Apache Ranger Plugin (apache-ranger-plugin).
Exclude signature files as shown in the shadowJar task below.
apply plugin: 'com.gradleup.shadow';
shadowJar {
    // Exclude files related to jar signatures
    exclude "META-INF/*.RSA", "META-INF/*.SF","META-INF/*.DSA"
}
Install the Plugin: Refer to the section [Plugin Installation](#plugin_installation) to install the plugin in the DataHub environment.
DataHub's GMS service searches for plugins in the container's local directory /etc/datahub/plugins/auth/. This location is referred to as plugin-base-directory hereafter.
For Docker, docker-compose is set to mount the ${HOME}/.datahub directory to the /etc/datahub directory within the GMS containers.
Follow the steps below to install plugins:
Let's assume you have created an uber jar for an authorizer plugin, the jar name is apache-ranger-authorizer.jar, and the class com.abc.RangerAuthorizer implements the Authorizer interface.
Create a plugin configuration file: Create a config.yml file at ${HOME}/.datahub/plugins/auth/. For more detail on configuration, refer to the Config Detail section.
Create a plugin directory: Create the plugin directory apache-ranger-authorizer. This directory is referred to as plugin-home hereafter.
mkdir -p ${HOME}/.datahub/plugins/auth/apache-ranger-authorizer
Copy plugin jar to plugin-home: Copy apache-ranger-authorizer.jar to plugin-home
cp apache-ranger-authorizer.jar ${HOME}/.datahub/plugins/auth/apache-ranger-authorizer
Update plugin configuration file: Add the entry below to the config.yml file. The plugin can take any arbitrary configuration under the "configs" block; in our example, it holds a username and a password.
plugins:
  - name: "apache-ranger-authorizer"
    type: "authorizer"
    enabled: "true"
    params:
      className: "com.abc.RangerAuthorizer"
      configs:
        username: "foo"
        password: "fake"
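The sketch below shows one way a plugin might read these values during initialization. It assumes that the map handed to the plugin's `init` method carries the entries of the "configs" block; the class `RangerPluginSettings` is hypothetical and only illustrates the lookup.

```java
import java.util.Map;

/** Hypothetical holder for the values found under the "configs" block. */
final class RangerPluginSettings {
    private final String username;
    private final String password;

    private RangerPluginSettings(String username, String password) {
        this.username = username;
        this.password = password;
    }

    /** Builds the settings from a config map, failing fast when a key is missing. */
    static RangerPluginSettings fromConfig(Map<String, Object> config) {
        return new RangerPluginSettings(required(config, "username"), required(config, "password"));
    }

    private static String required(Map<String, Object> config, String key) {
        Object value = config.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing plugin config key: " + key);
        }
        return value.toString();
    }

    String getUsername() {
        return username;
    }

    String getPassword() {
        return password;
    }
}
```

Failing fast on a missing key surfaces a misconfigured config.yml at startup rather than on the first request.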
Restart datahub-gms container:
On startup, the DataHub GMS service performs the steps below:
1. Load config.yml.
2. Select the plugins where enabled is set to true.
3. Look for a directory matching the plugin name in plugin-base-directory. In this case it is /etc/datahub/plugins/auth/apache-ranger-authorizer/; this directory becomes plugin-home.
4. Look for the jar given in the params.jarFileName attribute; otherwise look for a jar named <plugin-name>.jar. In this case it is /etc/datahub/plugins/auth/apache-ranger-authorizer/apache-ranger-authorizer.jar.
5. Load the class given in the params.className attribute from the jar; here, load class com.abc.RangerAuthorizer from apache-ranger-authorizer.jar.
6. Call the init method of the plugin.

On a getResourceAsStream call, the DataHub GMS service looks for the resource in the order below:
- Look in the plugin-home directory; if found, return the resource as an InputStream.
- Return null, as the requested resource is not found.

By default, authentication is disabled in DataHub GMS. Follow the section Enable GMS Authentication to enable authentication.
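The jar lookup performed at startup can be sketched as follows. This is an illustration of the documented rules and not DataHub's actual implementation; `PluginJarLocator` and its method names are hypothetical.

```java
import java.nio.file.Path;

/** Hypothetical sketch of how plugin-home and the plugin jar path are derived. */
final class PluginJarLocator {

    /** plugin-home is the directory named after the plugin inside plugin-base-directory. */
    static Path pluginHome(Path pluginBaseDirectory, String pluginName) {
        return pluginBaseDirectory.resolve(pluginName);
    }

    /** Uses params.jarFileName when it is set, otherwise falls back to <plugin-name>.jar. */
    static Path jarPath(Path pluginBaseDirectory, String pluginName, String jarFileName) {
        String fileName = (jarFileName == null || jarFileName.isEmpty())
                ? pluginName + ".jar"
                : jarFileName;
        return pluginHome(pluginBaseDirectory, pluginName).resolve(fileName);
    }
}
```

With the base directory /etc/datahub/plugins/auth, the plugin name apache-ranger-authorizer, and no jarFileName, `jarPath` yields the apache-ranger-authorizer.jar location quoted in this section.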
Helm support is coming soon.
A sample config.yml can be found at config.yml.
config.yml structure:
| Field | Required | Type | Default | Description |
|---|---|---|---|---|
| plugins[].name | ✅ | string | | name of the plugin |
| plugins[].type | ✅ | enum[authenticator, authorizer] | | type of plugin, possible values are authenticator or authorizer |
| plugins[].enabled | ✅ | boolean | | whether this plugin is enabled or disabled. DataHub GMS does not process a disabled plugin |
| plugins[].params.className | ✅ | string | | fully qualified class name of the Authenticator or Authorizer implementation |
| plugins[].params.jarFileName | | string | plugins[].name.jar | jar file name in plugin-home |
| plugins[].params.configs | | map<string,object> | empty map | runtime configuration required for the plugin |
plugins[] is an array of plugins, where you can define multiple authenticator and authorizer plugins. Each plugin name must be unique within the plugins array.
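The uniqueness rule can be checked before deployment with a small helper like the one below. This is only a sketch of such a check and not the way DataHub validates config.yml; `PluginNameValidator` is a hypothetical name, and the plugin names are assumed to have been read from the file already.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical pre-deployment check for the unique-name rule of plugins[]. */
final class PluginNameValidator {

    /** Returns each plugin name that occurs more than once, in order of first repetition. */
    static List<String> duplicateNames(List<String> pluginNames) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String name : pluginNames) {
            // Set.add returns false when the name was already present
            if (!seen.add(name) && !duplicates.contains(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }
}
```

An empty result means every name in the list is unique.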
Adhere to below plugin access control to keep your plugin forward compatible.
- Read and write files in the plugin-home directory only. Refer to step 2 of Plugin Installation for the plugin-home definition.

All other access is forbidden for the plugin.
Disclaimer: In the BETA version, your plugin can access any port and can read from or write to any location on the file system. However, you should implement the plugin according to the access permissions above to keep it compatible with upcoming releases of DataHub.
If you have any custom Authentication or Authorization plugin defined in the authorization or authentication section of application.yaml, migrate it using the steps below.
Implement Plugin: For an Authentication Plugin, follow the steps in Implementing an Authentication Plugin; for an Authorization Plugin, follow the steps in Implementing an Authorization Plugin.
Install Plugin: Install the plugins by following the steps in Plugin Installation. Here you need to map the configuration from application.yaml to the configuration in config.yml. This mapping is described below.
Mapping for Authenticators
a. In config.yml set plugins[].type to authenticator
b. authentication.authenticators[].type is mapped to plugins[].params.className
c. authentication.authenticators[].configs is mapped to plugins[].params.configs
Example Authenticator Plugin configuration in config.yml
plugins:
  - name: "apache-ranger-authenticator"
    type: "authenticator"
    enabled: "true"
    params:
      className: "com.abc.RangerAuthenticator"
      configs:
        username: "foo"
        password: "fake"
Mapping for Authorizer
a. In config.yml set plugins[].type to authorizer
b. authorization.authorizers[].type is mapped to plugins[].params.className
c. authorization.authorizers[].configs is mapped to plugins[].params.configs
Example Authorizer Plugin configuration in config.yml
plugins:
  - name: "apache-ranger-authorizer"
    type: "authorizer"
    enabled: "true"
    params:
      className: "com.abc.RangerAuthorizer"
      configs:
        username: "foo"
        password: "fake"
Move any other configuration files of your plugin to the plugin-home directory. The plugin-home directory is described in the Plugin Installation section.
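The field mapping for authenticators and authorizers described in this section can be sketched as a small transformation over already-parsed configuration maps. `LegacyPluginMapper` is a hypothetical helper. The plugin name and type are passed in by the caller because the legacy entry carries neither, and setting enabled to true is an assumption made for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that builds one plugins[] entry from a legacy application.yaml entry. */
final class LegacyPluginMapper {

    /**
     * Maps the legacy "type" to params.className and the legacy "configs" to params.configs.
     * pluginType is expected to be "authenticator" or "authorizer".
     */
    static Map<String, Object> toPluginEntry(String pluginName, String pluginType, Map<String, Object> legacyEntry) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("className", legacyEntry.get("type"));
        Object configs = legacyEntry.get("configs");
        // configs is optional; fall back to an empty map
        params.put("configs", configs != null ? configs : Collections.emptyMap());

        Map<String, Object> plugin = new LinkedHashMap<>();
        plugin.put("name", pluginName);
        plugin.put("type", pluginType);
        plugin.put("enabled", true); // assumption: a migrated plugin should be active
        plugin.put("params", params);
        return plugin;
    }
}
```

The returned map mirrors the structure of a single entry under plugins in config.yml and can be serialized with any YAML library.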