SageMaker Node

Overview

Amazon SageMaker is a fully managed machine learning service. With Amazon SageMaker, data scientists and developers can quickly build and train machine learning models, and then deploy them into a production-ready hosted environment.

Amazon SageMaker Model Building Pipelines is a tool for building machine learning pipelines that take advantage of direct SageMaker integration.

For users who work with both big data and machine learning, the SageMaker task plugin helps connect big data workflows with SageMaker usage scenarios.

DolphinScheduler SageMaker task plugin features are as follows:

  • Start a SageMaker pipeline execution, then poll its status continuously until the pipeline finishes.
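The start-and-poll behaviour described above can be sketched in Python as follows. This is an illustration only, not the plugin's actual implementation (the plugin itself is written in Java). The names `run_pipeline`, `poll_seconds` and the injectable `sleep` are hypothetical; `start_pipeline_execution`, `describe_pipeline_execution` and the status values are those of the SageMaker API as exposed by boto3.

```python
import time

# Terminal values of PipelineExecutionStatus in the SageMaker API.
# "Executing" and "Stopping" are transient, so polling continues on them.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Stopped"}


def run_pipeline(client, request, poll_seconds=10.0, sleep=time.sleep):
    """Start a pipeline execution and block until it reaches a terminal status.

    `client` is any object that exposes start_pipeline_execution and
    describe_pipeline_execution, for example boto3.client("sagemaker").
    `request` holds the StartPipelineExecution parameters as a dict.
    """
    arn = client.start_pipeline_execution(**request)["PipelineExecutionArn"]
    while True:
        response = client.describe_pipeline_execution(PipelineExecutionArn=arn)
        status = response["PipelineExecutionStatus"]
        if status in TERMINAL_STATUSES:
            return arn, status
        sleep(poll_seconds)  # wait before asking again
```

Assuming boto3 is installed and credentials are configured, a call such as `run_pipeline(boto3.client("sagemaker"), {"PipelineName": "my-pipeline"})` would return the execution ARN and its final status; `my-pipeline` is a placeholder name.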

Create Task

  • Click Project Management -> Project Name -> Workflow Definition, then click the "Create Workflow" button to enter the DAG editing page.
  • Drag the SageMaker task node from the toolbar onto the canvas.

Task Example

Here are the parameters specific to the SageMaker plugin:

  • SagemakerRequestJson: Request parameters of StartPipelineExecution, in JSON form; see also the AWS API reference.
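As a rough illustration of what SagemakerRequestJson might contain, the sketch below builds a request in Python and prints it as JSON. The field names are those of the StartPipelineExecution API; every value (pipeline name, display name, parameter name, instance type) is a hypothetical placeholder, and only `PipelineName` is strictly needed to identify the pipeline.

```python
import json

# Hypothetical values: replace them with the names defined in your own pipeline.
request = {
    "PipelineName": "AbalonePipeline",
    "PipelineExecutionDisplayName": "abalone-nightly-run",
    "PipelineExecutionDescription": "Triggered from a DolphinScheduler workflow",
    "PipelineParameters": [
        {"Name": "ProcessingInstanceType", "Value": "ml.m5.xlarge"},
    ],
    "ParallelismConfiguration": {"MaxParallelExecutionSteps": 1},
}

# The printed text is what would be pasted into the SagemakerRequestJson field.
sagemaker_request_json = json.dumps(request, indent=2)
print(sagemaker_request_json)
```

Generating the JSON from a dict rather than writing it by hand avoids quoting and trailing-comma mistakes in the task form.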

The task plugin is shown as follows:

Environment Preparation

Some AWS configuration is required. Modify the following fields in the file aws.yaml:

yaml
sagemaker:
  # The AWS credentials provider type. Supported: AWSStaticCredentialsProvider, InstanceProfileCredentialsProvider
  # AWSStaticCredentialsProvider: use the access key and secret key to authenticate
  # InstanceProfileCredentialsProvider: use the IAM role to authenticate
  credentials.provider.type: AWSStaticCredentialsProvider
  access.key.id: <access.key.id>
  access.key.secret: <access.key.secret>
  region: <region>
  endpoint: <endpoint>
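To make the difference between the two provider types concrete, the sketch below maps the settings above onto the keyword arguments of `boto3.client`. It is an analogy under stated assumptions, not how DolphinScheduler reads aws.yaml: the helper `boto3_client_kwargs` is hypothetical, and the config is assumed to be already loaded into a flat dict with the same keys as the YAML.

```python
def boto3_client_kwargs(cfg):
    """Translate the `sagemaker` section of aws.yaml into boto3.client kwargs.

    `cfg` is assumed to be a dict whose keys match the YAML keys shown above.
    """
    kwargs = {"region_name": cfg["region"]}
    if cfg.get("endpoint"):
        kwargs["endpoint_url"] = cfg["endpoint"]

    provider = cfg["credentials.provider.type"]
    if provider == "AWSStaticCredentialsProvider":
        # Static credentials: the key pair is passed explicitly.
        kwargs["aws_access_key_id"] = cfg["access.key.id"]
        kwargs["aws_secret_access_key"] = cfg["access.key.secret"]
    elif provider == "InstanceProfileCredentialsProvider":
        # IAM role: pass no keys, so the SDK's default credential chain
        # can pick up the role attached to the instance.
        pass
    else:
        raise ValueError(f"unsupported credentials provider: {provider}")
    return kwargs
```

With boto3 installed, `boto3.client("sagemaker", **boto3_client_kwargs(cfg))` would then create a client using either the static key pair or the instance's IAM role.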