Shows how to use the AWS SDK for Python (Boto3) with Amazon Textract to detect text, form, and table elements in a document image. The input image and Textract output are shown in a Tkinter application that lets you explore the detected elements.
The asynchronous APIs used in this example require an Amazon S3 bucket to contain
input images, an Amazon SNS topic to publish notifications, and an Amazon SQS queue
that the application can poll for notification messages. These resources are managed by
an AWS CloudFormation stack that is defined in the accompanying setup.yaml file.
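The asynchronous flow described above can be sketched as follows. This is a minimal illustration, not the code in textract_wrapper.py: the function names `start_text_detection` and `wait_for_job` are hypothetical, the clients are passed in as parameters, and the sketch assumes the queue receives the standard Amazon SNS JSON envelope (raw message delivery turned off).

```python
import json


def start_text_detection(textract, bucket, key, topic_arn, role_arn):
    """Start an asynchronous text detection job and return its job ID."""
    response = textract.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
        NotificationChannel={"SNSTopicArn": topic_arn, "RoleArn": role_arn},
    )
    return response["JobId"]


def wait_for_job(sqs, queue_url, job_id, max_polls=10):
    """Poll the queue for a notification about job_id and return its status.

    Returns None if no matching notification arrives within max_polls polls.
    """
    for _ in range(max_polls):
        response = sqs.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=10
        )
        for message in response.get("Messages", []):
            # Assumption: SNS wraps the Textract notification in a JSON
            # envelope, with the notification as a JSON string under "Message".
            envelope = json.loads(message["Body"])
            notification = json.loads(envelope["Message"])
            if notification.get("JobId") == job_id:
                # Remove the handled notification so it is not delivered again.
                sqs.delete_message(
                    QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
                )
                return notification.get("Status")
    return None
```

With Boto3, the clients would typically come from `boto3.client("textract")` and `boto3.client("sqs")`. When the returned status is `SUCCEEDED`, the detected blocks can be retrieved with `get_document_text_detection(JobId=job_id)`.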
Deploy prerequisite resources by running the example script with the deploy flag at
a command prompt.
python textract_demo_launcher.py deploy
Run the usage example with the demo flag at a command prompt.
python textract_demo_launcher.py demo
Destroy example resources by running the script with the destroy flag at a command
prompt.
python textract_demo_launcher.py destroy
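The launcher's implementation is not reproduced in this README. As a rough sketch of what deploying and destroying a CloudFormation stack with Boto3 can look like, the hypothetical `deploy` and `destroy` functions below take a CloudFormation client, create or delete the stack, and wait for the operation to finish. The `CAPABILITY_NAMED_IAM` capability is an assumption; it is needed only if the template creates named IAM resources.

```python
def deploy(cloudformation, stack_name, template_body):
    """Create the stack, wait for completion, and return its outputs as a dict."""
    cloudformation.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        # Assumption: the template defines IAM resources, so the caller
        # must acknowledge that capability explicitly.
        Capabilities=["CAPABILITY_NAMED_IAM"],
    )
    cloudformation.get_waiter("stack_create_complete").wait(StackName=stack_name)
    stack = cloudformation.describe_stacks(StackName=stack_name)["Stacks"][0]
    return {
        output["OutputKey"]: output["OutputValue"]
        for output in stack.get("Outputs", [])
    }


def destroy(cloudformation, stack_name):
    """Delete the stack and wait until deletion completes."""
    cloudformation.delete_stack(StackName=stack_name)
    cloudformation.get_waiter("stack_delete_complete").wait(StackName=stack_name)
```

A caller would pass `boto3.client("cloudformation")` and the contents of setup.yaml as `template_body`. The stack name is the caller's choice.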
The example contains the following files.
textract_app.py
A Tkinter application that displays document images, starts Textract synchronous and asynchronous detection processes, and shows the hierarchy of detected elements. Elements can be clicked to explore the hierarchy and draw bounding polygons on the input image.
textract_demo_launcher.py
Launches the Textract demo.
The deploy option deploys the prerequisite resources defined in the setup.yaml CloudFormation stack.
The demo option shows the Tkinter application.
The destroy option destroys the prerequisite resources.
textract_wrapper.py
Wraps Textract, Amazon S3, Amazon SNS, and Amazon SQS functions that are used by the application.
setup.yaml
Contains a CloudFormation template that is used to create the resources needed for the demo.
The setup.yaml file was built from the
AWS Cloud Development Kit (AWS CDK)
source script here:
/resources/cdk/textract_example_s3_sns_sqs/setup.ts.
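To draw the bounding polygons that textract_app.py shows on the input image, the polygon points that Textract returns must be scaled to the displayed image. Textract reports each point as `X` and `Y` ratios of the page width and height. The hypothetical helper below is one way to do the conversion; it is a sketch, not the application's own code.

```python
def polygon_to_pixels(polygon, image_width, image_height):
    """Scale Textract's normalized polygon points to pixel coordinates.

    polygon is the list found under Geometry.Polygon in a Textract block,
    where each point holds X and Y as ratios of the page size. The result
    is a flat [x0, y0, x1, y1, ...] list, which is the form that
    tkinter.Canvas.create_polygon accepts.
    """
    coords = []
    for point in polygon:
        coords.append(point["X"] * image_width)
        coords.append(point["Y"] * image_height)
    return coords
```

On a Tkinter canvas, the result could be drawn with `canvas.create_polygon(coords, outline="red", fill="")`, where `coords` is the returned list.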
The unit tests in this module use the botocore Stubber. The Stubber captures requests before they are sent to AWS, and returns a mocked response. To run all of the tests, run the following command in your [GitHub root]/python/cross_service/textract_explorer folder.
python -m pytest
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: Apache-2.0