docs/sql-ref-syntax-aux-cache-cache-table.md
The CACHE TABLE statement caches the contents of a table or the output of a query with the given storage level. If a query is cached, a temp view is created for that query.
This reduces scanning of the original files in future queries.
Note: Cached data is shared across all Spark sessions on the cluster.
CACHE [ LAZY ] TABLE table_identifier
[ OPTIONS ( 'storageLevel' [ = ] value ) ] [ [ AS ] query ]
LAZY
Only cache the table when it is first used, instead of immediately.
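The difference between the default (eager) behavior and LAZY can be sketched as follows. This is a minimal illustration that assumes a table named testData already exists; the name is borrowed from the example on this page.

```sql
-- Default (eager): the table is cached as soon as this statement runs.
CACHE TABLE testData;

-- LAZY: the statement only registers the table for caching;
-- the data is cached the first time the table is used.
CACHE LAZY TABLE testData;
```

The two statements are alternatives. LAZY is typically a reasonable choice when it is not certain that the table will be queried at all.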
table_identifier
Specifies the table or view name to be cached. The table or view name may be optionally qualified with a database name.
Syntax: [ database_name. ] table_name
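As an illustration of the two naming forms, the sketch below assumes a hypothetical database salesDb containing a hypothetical table orders; neither name comes from this page.

```sql
-- Unqualified name: resolved against the current database.
CACHE TABLE orders;

-- Qualified name: the database is stated explicitly.
CACHE TABLE salesDb.orders;
```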
OPTIONS ( 'storageLevel' [ = ] value )
OPTIONS clause with a storageLevel key and value pair. A warning is issued when a key other than storageLevel is used. The valid options for storageLevel are:
NONE
DISK_ONLY
DISK_ONLY_2
DISK_ONLY_3
MEMORY_ONLY
MEMORY_ONLY_2
MEMORY_ONLY_SER
MEMORY_ONLY_SER_2
MEMORY_AND_DISK
MEMORY_AND_DISK_2
MEMORY_AND_DISK_SER
MEMORY_AND_DISK_SER_2
OFF_HEAP
An Exception is thrown when an invalid value is set for storageLevel. If storageLevel is not explicitly set using OPTIONS clause, the default storageLevel is set to MEMORY_AND_DISK.
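The OPTIONS clause can be written with or without the equals sign, as the syntax above indicates. The following sketch assumes a table named testData exists (the name is taken from the example on this page); the storage levels chosen are arbitrary picks from the list of valid values.

```sql
-- Key and value separated by whitespace only.
CACHE TABLE testData OPTIONS ('storageLevel' 'MEMORY_ONLY');

-- Key and value separated by the optional equals sign.
CACHE TABLE testData OPTIONS ('storageLevel' = 'DISK_ONLY');

-- No OPTIONS clause: the default storage level, MEMORY_AND_DISK, applies.
CACHE TABLE testData;
```

These statements are alternatives shown side by side, not a sequence intended to be run together.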
query
A query that produces the rows to be cached. It can be in one of the following formats:
SELECT statement
TABLE statement
FROM statement
Example:
CACHE TABLE testCache OPTIONS ('storageLevel' 'DISK_ONLY') SELECT * FROM testData;
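The three query formats listed above might look like the following. This is a hedged sketch: testData is the table from the example on this page, while cachedSelect, cachedTable, and cachedFrom are hypothetical names for the temp views that caching a query creates.

```sql
-- SELECT statement, written with the optional AS keyword.
CACHE TABLE cachedSelect AS SELECT * FROM testData;

-- TABLE statement: caches all rows of the named table.
CACHE TABLE cachedTable AS TABLE testData;

-- FROM statement: the FROM clause comes first, followed by SELECT.
CACHE TABLE cachedFrom AS FROM testData SELECT *;
```

Each statement caches the query result and creates a temp view under the given name, so later queries can read from cachedSelect, cachedTable, or cachedFrom directly.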