Presto is a distributed SQL query engine optimized for ad-hoc analysis at interactive speed. It supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions. It has a connector architecture to query data from many data sources.
This document describes how to set up Presto to query YugabyteDB's YCQL tables.
Follow the Quick start instructions to run a local YugabyteDB cluster. Then test YugabyteDB's Cassandra-compatible API (YCQL) as documented, and confirm that a Cassandra-compatible service is running on localhost:9042. Make sure that you have created the keyspace and table, and inserted the sample data, as described there.
Detailed setup steps are in the Presto documentation. The following are the minimal steps for getting started:
$ wget https://repo1.maven.org/maven2/io/prestosql/presto-server/309/presto-server-309.tar.gz
$ tar xvf presto-server-309.tar.gz
$ cd presto-server-309
$ mkdir etc
$ mkdir etc/catalog
$ mkdir data
$ cat > etc/node.properties
node.environment=test
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=/Users/<username>/presto-server-309/data
Press Ctrl-D after you have pasted the file contents.
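If you prefer not to paste into `cat` and press Ctrl-D, the same file can be written non-interactively with a here-document. This is a minimal sketch, assuming you run it from inside the unpacked presto-server-309 directory. `PRESTO_HOME` is a hypothetical helper variable for this example only; Presto itself does not read it.

```shell
# Assumption: the current directory is the unpacked presto-server-309 directory.
# PRESTO_HOME is a hypothetical helper variable, not a Presto setting.
PRESTO_HOME="${PRESTO_HOME:-$PWD}"
mkdir -p "$PRESTO_HOME/etc" "$PRESTO_HOME/data"

# The unquoted EOF delimiter lets the shell expand $PRESTO_HOME in the body,
# so node.data-dir points at the data directory created above.
cat > "$PRESTO_HOME/etc/node.properties" <<EOF
node.environment=test
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=$PRESTO_HOME/data
EOF
```

The other configuration files in this section can be written the same way; use a quoted delimiter (`<<'EOF'`) when the body contains nothing the shell should expand.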
$ cat > etc/jvm.config
-server
-Xmx6G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
Press Ctrl-D after you have pasted the file contents.
$ cat > etc/config.properties
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=4GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://localhost:8080
Press Ctrl-D after you have pasted the file contents.
$ cat > etc/log.properties
io.prestosql=INFO
Press Ctrl-D after you have pasted the file contents.
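Before moving on, it can help to confirm that all four files created so far exist and are not empty, because an interrupted paste leaves an empty file behind. The following is a small sketch of such a check; `check_presto_etc` is a hypothetical helper written for this example, not part of Presto.

```shell
# check_presto_etc DIR
# Hypothetical helper: reports each expected config file in DIR that is
# missing or empty. Returns 0 if all four are present, 1 otherwise.
check_presto_etc() {
  etc_dir="$1"
  missing=0
  for f in node.properties jvm.config config.properties log.properties; do
    if [ ! -s "$etc_dir/$f" ]; then
      echo "missing or empty: $etc_dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage, from the presto-server-309 directory:
check_presto_etc etc || echo "Fix the files listed above before starting Presto."
```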
Create the Cassandra catalog properties file in the etc/catalog directory.
Detailed instructions are in the Presto documentation for the Cassandra connector.
$ cat > etc/catalog/cassandra.properties
connector.name=cassandra
cassandra.contact-points=127.0.0.1
Press Ctrl-D after you have pasted the file contents.
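If your YCQL service is not on the default host or port, the catalog file can be generated from variables. This sketch assumes you run it from the presto-server-309 directory. `YCQL_HOST` and `YCQL_PORT` are hypothetical variable names chosen for this example. `cassandra.native-protocol-port` is the Cassandra connector's port property, which defaults to 9042, so setting it is optional for the local setup described here.

```shell
# Hypothetical helper variables; override them in the environment if needed.
YCQL_HOST="${YCQL_HOST:-127.0.0.1}"
YCQL_PORT="${YCQL_PORT:-9042}"

mkdir -p etc/catalog

# Write the catalog definition; the file name (cassandra) becomes the
# catalog name used with --catalog on the Presto CLI.
cat > etc/catalog/cassandra.properties <<EOF
connector.name=cassandra
cassandra.contact-points=$YCQL_HOST
cassandra.native-protocol-port=$YCQL_PORT
EOF
```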
$ cd ~/presto-server-309/bin
$ wget https://repo1.maven.org/maven2/io/prestosql/presto-cli/309/presto-cli-309-executable.jar
Rename the JAR file to presto and make it executable. The JAR is self-executing, so it can be run directly as a command.
$ mv presto-cli-309-executable.jar presto && chmod +x presto
$ cd ~/presto-server-309
To run in foreground mode:
$ ./bin/launcher run
To run in background mode:
$ ./bin/launcher start
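In background mode the launcher returns before the server is ready to accept queries. One way to wait is to poll the coordinator's HTTP port, as in the sketch below. It assumes `curl` is installed and that the server's `/v1/info` endpoint returns JSON containing `"starting": false` once startup has finished; treat that response shape as an assumption and check it against your Presto version. `wait_for_presto` is a hypothetical helper name.

```shell
# wait_for_presto [BASE_URL] [ATTEMPTS]
# Hypothetical helper: polls BASE_URL/v1/info about once per second until the
# response contains "starting": false, or until ATTEMPTS polls have failed.
# Returns 0 when the server reports ready, 1 otherwise.
wait_for_presto() {
  url="${1:-http://localhost:8080}"
  remaining="${2:-60}"
  while [ "$remaining" -gt 0 ]; do
    # -s: no progress output, -f: fail on HTTP errors, --max-time: cap each poll.
    if curl -sf --max-time 2 "$url/v1/info" 2>/dev/null \
        | grep -q '"starting" *: *false'; then
      return 0
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
  return 1
}
```

For example, `./bin/launcher start && wait_for_presto http://localhost:8080 60 && echo "Presto is ready"`. If the wait fails, `./bin/launcher status` and the server log under the data directory are the first places to look.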
Use the presto CLI to run ad-hoc queries:
$ ./bin/presto --server localhost:8080 --catalog cassandra --schema default
Switch to the myapp keyspace, which Presto exposes as a schema:
presto:default> use myapp;
USE
Show the tables available:
presto:myapp> show tables;
Table
-------
stock_market
(1 row)
Describe a particular table:
presto:myapp> describe stock_market;
    Column     |  Type   | Extra | Comment
---------------+---------+-------+---------
 stock_symbol  | varchar |       |
 ts            | varchar |       |
 current_price | real    |       |
(3 rows)
Query the table with a filter:
presto:myapp> select * from stock_market where stock_symbol = 'AAPL';
 stock_symbol |         ts          | current_price
--------------+---------------------+---------------
 AAPL         | 2017-10-26 09:00:00 |        157.41
 AAPL         | 2017-10-26 10:00:00 |         157.0
(2 rows)
Run an aggregate query:
presto:myapp> select stock_symbol, avg(current_price) from stock_market group by stock_symbol;
 stock_symbol |  _col1
--------------+---------
 GOOG         | 972.235
 AAPL         | 157.205
 FB           | 170.365
(3 rows)
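The interactive queries above can also be run in batch mode, which is convenient for scripts. The sketch below wraps the CLI's `--execute` and `--output-format` options in a small function. `run_query` and `PRESTO_CLI` are hypothetical names for this example, and the server address, catalog, and schema are the ones used earlier in this document.

```shell
# Hypothetical helper variable: path to the presto CLI downloaded earlier.
PRESTO_CLI="${PRESTO_CLI:-./bin/presto}"

# run_query SQL
# Hypothetical helper: runs one statement non-interactively against the
# cassandra catalog and myapp schema, printing CSV with a header row.
run_query() {
  "$PRESTO_CLI" --server localhost:8080 \
    --catalog cassandra --schema myapp \
    --output-format CSV_HEADER \
    --execute "$1"
}

# Only call the CLI if it is present, so the snippet is safe to source anywhere.
if [ -x "$PRESTO_CLI" ]; then
  run_query "select stock_symbol, avg(current_price) from stock_market group by stock_symbol"
else
  echo "presto CLI not found at $PRESTO_CLI" >&2
fi
```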