scenarios/basics/redshift/SPECIFICATION.md
This SDK getting started scenario demonstrates how to interact with Amazon Redshift using the AWS SDK. It demonstrates various tasks such as creating a Redshift cluster, verifying its readiness, creating a table, populating the table with data, executing SQL queries, and finally cleaning up resources. Its purpose is to demonstrate how to get up and running with Amazon Redshift and the AWS SDK.
The only required resource for this SDK getting started scenario is the Movie.json file. This can be located in the aws-doc-sdk-examples\resources\sample_files folder.
The following user input is required for this SDK getting started scenario:
All SQL statements that include user input or variable data MUST use parameterized queries to prevent SQL injection vulnerabilities. This applies to:
Example of parameterized query usage:
SELECT * FROM Movies WHERE year = 2013SELECT * FROM Movies WHERE year = :year with a parameter binding for :yearThis security best practice ensures that user input is properly escaped and prevents malicious SQL code injection.
This program is intended for users not familiar with the Redshift SDK to easily get up an running. The logic is to show use of redshiftClient.describeClustersPaginator().
The following shows the output of the program in the console.
Cluster identifier: redshift-cluster-1 status = available
Cluster identifier: redshift-cluster-wf status = available
The SDK getting started scenario executes the following steps:
redshiftClient.createCluster().redshiftClient.describeClusters.redshiftDataClient.listDatabasesPaginator().redshiftDataClient.executeStatement().redshiftDataClient.executeStatement() to insert movie data.redshiftDataClient.executeStatement(), to retrieve movies released in the specified year. The result set is displayed in the program.redshiftClient.modifyCluster().redshiftClient.deleteCluster().The following shows the output of the program in the console.
--------------------------------------------------------------------------------
Welcome to the Amazon Redshift SDK Getting Started scenario.
This Java program demonstrates how to interact with Amazon Redshift by using the AWS SDK for Java (v2).
Amazon Redshift is a fully managed, petabyte-scale data warehouse service hosted in the cloud.
The program's primary functionalities include cluster creation, verification of cluster readiness,
list databases, table creation, data population within the table, and execution of SQL statements.
Furthermore, it demonstrates the process of querying data from the Movie table.
Upon completion of the program, all AWS resources are cleaned up.
Lets get started...
Please enter your user name (default is awsuser)
--------------------------------------------------------------------------------
Please enter your user password (default is AwsUser1000)
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
A Redshift cluster refers to the collection of computing resources and storage that work together to process and analyze large volumes of data.
Enter a cluster id value (default is redshift-cluster-movies):
Created cluster redshift-cluster-movies
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Wait until redshift-cluster-movies is available.
Press Enter to continue...
Waiting for cluster to become available. This may take a few mins.
Elapsed Time: 00:02 - Waiting for cluster...
Elapsed Time: 00:07 - Waiting for cluster...
Elapsed Time: 00:12 - Waiting for cluster...
Elapsed Time: 00:18 - Waiting for cluster...
Elapsed Time: 00:23 - Waiting for cluster...
Elapsed Time: 00:28 - Waiting for cluster...
Elapsed Time: 00:33 - Waiting for cluster...
Elapsed Time: 00:39 - Waiting for cluster...
Elapsed Time: 00:44 - Waiting for cluster...
Elapsed Time: 00:49 - Waiting for cluster...
Cluster is available! Total Elapsed Time: 00:54
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
When you created redshift-cluster-movies, the dev database is created by default and used in this scenario.
To create a custom database, you need to have a CREATEDB privilege.
For more information, see the documentation here: https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_DATABASE.html.
Press Enter to continue...
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
List databases in redshift-cluster-movies
Press Enter to continue...
The database name is : awsdatacatalog
The database name is : dev
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Now you will create a table named Movies.
Press Enter to continue...
Table created: Movies
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Populate the Movies table using the Movies.json file.
Specify the number of records you would like to add to the Movies Table.
Please enter a value between 50 and 200.
Enter a value: 50
Inserted: Rush (2013)
Inserted: Prisoners (2013)
Inserted: The Hunger Games: Catching Fire (2013)
Inserted: Thor: The Dark World (2013)
Inserted: This Is the End (2013)
Inserted: Despicable Me 2 (2013)
50 records were added to the Movies table.
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Query the Movies table by year. Enter a value between 2012-2014.
Enter a year: 2013
The identifier of the statement is 9af7e953-4b23-4cc6-a7d0-d6f6ab30988c
...PICKED
...STARTED
...STARTED
...STARTED
...FINISHED
The statement is finished!
The Movie title field is Rush
The Movie title field is Prisoners
The Movie title field is The Hunger Games: Catching Fire
The Movie title field is Thor: The Dark World
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Now you will modify the Redshift cluster.
Press Enter to continue...
The modified cluster was successfully modified and has wed:07:30-wed:08:00 as the maintenance window
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Would you like to delete the Amazon Redshift cluster? (y/n)
The redshift-cluster-movies was deleted
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
This concludes the Amazon Redshift SDK Getting Started scenario.
--------------------------------------------------------------------------------
The following table lists the exceptions that should be caught and handled for each action in the scenario:
| Action | Exception | Handling |
|---|---|---|
createCluster | ClusterAlreadyExistsFault | Notify the user that a cluster with this identifier already exists and exit. |
describeClusters | ClusterNotFoundFault | Notify the user that the specified cluster was not found. |
listDatabases | ValidationException | Notify the user that the cluster is not available or parameters are invalid. |
executeStatement | ValidationException | Notify the user of SQL syntax errors or invalid query parameters. |
describeStatement | ResourceNotFoundException | Notify the user that the statement ID was not found. |
getStatementResult | ResourceNotFoundException | Notify the user that results are not available for the statement ID. |
modifyCluster | ClusterNotFoundFault | Notify the user that the cluster to modify was not found. |
deleteCluster | ClusterNotFoundFault | Notify the user that the cluster to delete was not found. |
The following table describes the metadata used in this SDK Getting Started Scenario.
| action | metadata file | metadata key |
|---|---|---|
deleteCluster | redshift_metadata.yaml | redshift_DeleteCluster |
addrecord | redshift_metadata.yaml | redshift_Insert |
describeStatement | redshift_metadata.yaml | redshift_DescribeStatement |
modifyCluster | redshift_metadata.yaml | redshift_ModifyCluster |
querymoviesd | redshift_metadata.yaml | redshift_Query |
getStatementResult | redshift_metadata.yaml | redshift_GetStatementResult |
describeClusters | redshift_metadata.yaml | redshift_DescribeClusters |
createTable | redshift_metadata.yaml | redshift_CreateTable |
createCluster | redshift_metadata.yaml | redshift_CreateCluster |
describeClustersPaginator | redshift_metadata.yaml | redshift_Hello |
scenario | redshift_metadata.yaml | redshift_Scenario |