docs/en/overview/Use-Cases.md
Many leading companies around the world run Alluxio in production to extract value from their data. Some of them are listed on our Powered-By page. This section introduces some of the most common Alluxio use cases.
Many organizations run analytics and machine learning workloads (Spark, Presto, Hive, TensorFlow, etc.) on object storage in the public cloud (AWS S3, Google Cloud, or Microsoft Azure).
Though cloud object stores are often more cost-effective, easier to use, and easier to scale, there are some challenges:
Alluxio addresses these challenges by providing intelligent multi-tiered caching and metadata management. Deploying Alluxio on the compute cluster helps:
See this example use case from Electronic Arts.
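To make the idea of multi-tiered caching more concrete, here is a minimal toy sketch in Python. It is not Alluxio's implementation or API. The class `TieredReadThroughCache`, its two tiers (a small "memory" tier backed by a larger "SSD" tier), the LRU eviction policy, and the dictionary standing in for a remote object store are all assumptions made for illustration only.

```python
from collections import OrderedDict


class TieredReadThroughCache:
    """Toy two-tier read-through cache in front of a slow remote store.

    Hypothetical illustration only: tier names, capacities, and the LRU
    policy are assumptions, not a description of Alluxio internals.
    """

    def __init__(self, fetch_remote, mem_capacity, ssd_capacity):
        self._fetch_remote = fetch_remote  # callable: key -> value
        self._mem = OrderedDict()          # fast tier, least recently used first
        self._ssd = OrderedDict()          # slower tier, least recently used first
        self._mem_capacity = mem_capacity
        self._ssd_capacity = ssd_capacity
        self.remote_reads = 0              # number of reads that hit the remote store

    def read(self, key):
        if key in self._mem:
            # Hit in the fast tier: refresh recency and return.
            self._mem.move_to_end(key)
            return self._mem[key]
        if key in self._ssd:
            # Hit in the slower tier: promote the entry to the fast tier.
            value = self._ssd.pop(key)
        else:
            # Miss in both tiers: read through to the remote store.
            value = self._fetch_remote(key)
            self.remote_reads += 1
        self._put_mem(key, value)
        return value

    def _put_mem(self, key, value):
        self._mem[key] = value
        self._mem.move_to_end(key)
        if len(self._mem) > self._mem_capacity:
            # Demote the least recently used entry to the slower tier.
            old_key, old_value = self._mem.popitem(last=False)
            self._put_ssd(old_key, old_value)

    def _put_ssd(self, key, value):
        self._ssd[key] = value
        self._ssd.move_to_end(key)
        if len(self._ssd) > self._ssd_capacity:
            # Drop the least recently used entry; it can be re-fetched later.
            self._ssd.popitem(last=False)


# A dictionary stands in for the remote object store in this sketch.
object_store = {"a": b"1", "b": b"2", "c": b"3"}
cache = TieredReadThroughCache(object_store.__getitem__, mem_capacity=1, ssd_capacity=1)
for key in ["a", "b", "a", "c", "a"]:
    cache.read(key)
print(cache.remote_reads)
```

In this run, only the first read of each key reaches the simulated object store. The repeated reads of `"a"` are served from the slower local tier and promoted back to the fast tier, which is the general effect a cache deployed next to the compute aims for.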
Running data-driven applications on top of an object store deployed on-premises brings the following challenges:
Alluxio solves these problems by providing caching and API translation. Deploying Alluxio on the application side brings:
See this example use case from DBS.
As more organizations migrate to the cloud, a common intermediate step is to use compute resources in the cloud while retrieving data from on-premises data sources. However, this hybrid architecture brings the following problems:
Alluxio provides "zero-copy" cloud bursting, which lets compute engines in the cloud access on-premises data without keeping a persistent copy in the cloud that must be periodically synchronized with the original. This brings the following benefits:
See this example use case from Walmart.
Another hybrid cloud architecture is to access cloud storage from a private datacenter. Using this architecture usually causes the following problems:
Alluxio solves these problems by acting as a hybrid cloud storage gateway that lets on-premises compute work on data in the cloud. When deployed alongside the on-premises compute, Alluxio manages the compute cluster’s storage and provides data locality to applications, achieving:
See this example use case from Comcast.
Many organizations maintain satellite compute clusters that are independent of their main data cluster for the purposes of performance, security, or resource isolation. These satellite clusters need to access data remotely from the main cluster, which is challenging because:
Alluxio can be deployed on the compute nodes in the satellite cluster and configured to connect to the main data cluster, serving as one logical copy of data. Thus: