AWS Entity Resolution Program

Overview

This AWS Entity Resolution basics scenario shows how to use an AWS SDK to integrate and deduplicate data from multiple sources with the service's machine learning-based matching. The program walks through setting up AWS resources, uploading structured data, defining schema mappings, creating a matching workflow, and running a matching job.

Note: See the specification document for a complete list of operations.

Features

  1. Uses AWS CloudFormation to create necessary resources:
     • AWS Glue Data Catalog table
     • AWS IAM role
     • AWS S3 bucket
     • AWS Entity Resolution Schema
  2. Uploads sample JSON and CSV data to S3
  3. Creates schema mappings for JSON and CSV datasets
  4. Creates and starts an Entity Resolution matching workflow
  5. Retrieves job details and schema mappings
  6. Lists available schema mappings
  7. Tags AWS resources for better organization
  8. Views the results of the workflow
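The core of the feature list (create a schema mapping, create a workflow, start a job, then check on it) can be sketched in Python as follows. This is a minimal illustration, not the scenario's actual implementation: the function names, the column names (`id`, `name`, `email`), and the polling defaults are assumptions made for the example. The `client` argument is expected to be a boto3 `entityresolution` client, and the ARNs and S3 path are supplied by the caller.

```python
import time


def build_schema_fields():
    """Map input columns to Entity Resolution attribute types.

    The column names here are hypothetical; use the columns in your own data.
    """
    return [
        {"fieldName": "id", "type": "UNIQUE_ID"},
        {"fieldName": "name", "type": "NAME", "matchKey": "name"},
        {"fieldName": "email", "type": "EMAIL_ADDRESS", "matchKey": "email"},
    ]


def run_matching_job(client, schema_name, workflow_name,
                     input_arn, output_s3_path, role_arn):
    """Create a schema mapping and a workflow, start a job, return its ID."""
    fields = build_schema_fields()
    client.create_schema_mapping(
        schemaName=schema_name,
        mappedInputFields=fields,
    )
    client.create_matching_workflow(
        workflowName=workflow_name,
        inputSourceConfig=[{
            "inputSourceARN": input_arn,  # e.g. the AWS Glue table ARN
            "schemaName": schema_name,
            "applyNormalization": False,
        }],
        outputSourceConfig=[{
            "outputS3Path": output_s3_path,
            # Write every mapped column to the output.
            "output": [{"name": f["fieldName"]} for f in fields],
            "applyNormalization": False,
        }],
        resolutionTechniques={"resolutionType": "ML_MATCHING"},
        roleArn=role_arn,
    )
    return client.start_matching_job(workflowName=workflow_name)["jobId"]


def wait_for_job(client, workflow_name, job_id, poll_seconds=30, max_polls=60):
    """Poll the job until it reaches SUCCEEDED or FAILED, then return the status."""
    for _ in range(max_polls):
        status = client.get_matching_job(
            workflowName=workflow_name, jobId=job_id
        )["status"]
        if status in ("SUCCEEDED", "FAILED"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")
```

To try it against a real account, pass `boto3.client("entityresolution")` as `client`. Taking the client as a parameter keeps the functions easy to exercise with a stub.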

Resources

This basics scenario requires an IAM role that has permissions to work with the AWS Entity Resolution service, an AWS Glue database, and an S3 bucket. A CDK script, which deploys these resources as an AWS CloudFormation stack, is provided to create them. See the resources README file.
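After the resources are deployed, the program needs their names and ARNs. The sketch below shows one way to look them up in Python, assuming the stack publishes them as CloudFormation outputs. The helper names and any output keys you pass in are hypothetical; check the resources README for the names the script actually uses. The `cfn_client` argument is expected to be a boto3 `cloudformation` client.

```python
def stack_outputs(cfn_client, stack_name):
    """Return a stack's outputs as a {OutputKey: OutputValue} dict."""
    response = cfn_client.describe_stacks(StackName=stack_name)
    outputs = response["Stacks"][0].get("Outputs", [])
    return {o["OutputKey"]: o["OutputValue"] for o in outputs}


def require_outputs(outputs, keys):
    """Pick the requested keys, raising KeyError if any are missing."""
    missing = [k for k in keys if k not in outputs]
    if missing:
        raise KeyError(f"Stack is missing expected outputs: {missing}")
    return {k: outputs[k] for k in keys}
```

Failing early on a missing output gives a clearer error than a later service call rejected for an empty ARN.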

Implementations

This scenario example will be implemented in the following languages:

  • Java
  • Python
  • Kotlin

Additional Reading

Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. SPDX-License-Identifier: Apache-2.0