{{<srcdiagram href="https://docs.google.com/presentation/d/1lEajQyVZLhmHRKmBxunf1LucWkQkrJ3rIthoHxZvyQc/edit#slide=id.g22bc5dd47b0_0_18">}}
Internet and cloud technologies have revolutionized the way people interact and operate. The cloud makes it possible to distribute and replicate data across multiple geographies. Accessing and maintaining that globally distributed data demands a new class of global application.
The reasons for making your applications global are the same as those for adopting a distributed database:
- **Business continuity and disaster recovery**. Although public clouds have come a long way since the inception of AWS in 2006, region and zone outages are still fairly common, happening once or twice a year (see, for example, AWS outages and Google outages). To provide uninterrupted service to your users, you need to run your applications in multiple locations.
- **Data residency for compliance**. To comply with data residency laws (for example, the GDPR), you need to ensure that the data of citizens is stored on servers located in their country. This means that you need to design your applications to split data across geographies accordingly.
- **Moving data closer to users**. When designing an application with global reach (for example, email, e-commerce, or broadcasting events like the Olympics), you need to take into account where your users are located. If your application is hosted in data centers located in the US, users in Europe might encounter high latency when trying to access your application. To provide the best user experience, you need to run your applications closer to your users.
Running applications in multiple data centers with data split across them is not a trivial task. When designing global applications, you need to answer questions such as:
To help you answer these questions, use the following architectural concepts to choose a suitable design pattern for your application.
Depending on where the application instances run and which ones are active, choose from the following application architectures:
Depending on whether the application instances operate on the entire dataset or just a subset, and how the application moves on a fault domain failure, choose from the following availability architectures:
Depending on whether the application should read the latest data or stale data, choose from the following data access architectures:
Use the following matrix to choose a design pattern, based on the architectures described in the preceding section.
| Pattern Type | Follow the Application | Geo-Local Dataset |
|---|---|---|
| Single Active | Global database, Active-active single-master | N/A |
| Multi Active | Global database, Duplicate indexes | Active-active multi-master |
| Partitioned Multi Active | Latency-optimized geo-partitioning | Locality-optimized geo-partitioning |
| Data Access Architecture | Consistent Reads, Follower Reads, Read Replicas | |
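
The matrix can also be read as a simple lookup keyed by application architecture and availability architecture. The following is a minimal sketch of that reading, not part of any YugabyteDB API: the names `PATTERN_MATRIX`, `DATA_ACCESS_PATTERNS`, and `candidate_patterns` and the key spellings are hypothetical, and the cell contents reflect one interpretation of the matrix rows.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding of the matrix: (application architecture,
# availability architecture) -> candidate design patterns.
# An empty list stands for "N/A".
PATTERN_MATRIX: Dict[Tuple[str, str], List[str]] = {
    ("single-active", "follow-the-application"): [
        "Global database",
        "Active-active single-master",
    ],
    ("single-active", "geo-local-dataset"): [],
    ("multi-active", "follow-the-application"): [
        "Global database",
        "Duplicate indexes",
    ],
    ("multi-active", "geo-local-dataset"): ["Active-active multi-master"],
    ("partitioned-multi-active", "follow-the-application"): [
        "Latency-optimized geo-partitioning",
    ],
    ("partitioned-multi-active", "geo-local-dataset"): [
        "Locality-optimized geo-partitioning",
    ],
}

# Data access patterns are chosen separately from the two dimensions above.
DATA_ACCESS_PATTERNS: List[str] = [
    "Consistent Reads",
    "Follower Reads",
    "Read Replicas",
]


def candidate_patterns(application: str, availability: str) -> List[str]:
    """Return the design patterns listed for one cell of the matrix."""
    key = (application.lower(), availability.lower())
    if key not in PATTERN_MATRIX:
        raise ValueError(f"unknown architecture combination: {key}")
    # Return a copy so callers cannot mutate the shared matrix.
    return list(PATTERN_MATRIX[key])


if __name__ == "__main__":
    print(candidate_patterns("multi-active", "geo-local-dataset"))
    print(DATA_ACCESS_PATTERNS)
```

Keeping the data access patterns in a separate list mirrors the matrix, where they occupy their own row rather than a cell tied to one availability architecture.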
The following table summarizes the design patterns that you can use for your applications. Use these proven patterns to address common problems and accelerate your application development.
{{<table>}}
| Pattern Name | Description |
|---|---|
|Global database| {{<header Level="6">}} Single database spread across multiple regions {{</header>}} A database spread across multiple (3 or more) regions or zones. On failure, a replica in another region/zone will be promoted to leader in seconds, without any loss of data. Applications read from the source of truth, possibly with higher latencies.|
|Duplicate indexes| {{<header Level="6">}} Consistent data everywhere {{</header>}} Set up covering indexes with the same schema as the table in multiple regions to read immediately consistent data locally.|
|Active‑active single‑master| {{<header Level="6">}} Secondary database that can serve reads {{</header>}} Set up a second cluster that gets populated asynchronously and can start serving data in case the primary fails. Can also be used for blue-green deployment testing.|
|Active‑active multi‑master| {{<header Level="6">}} Two clusters serving data together {{</header>}} Spans two or more regions and provides low read and write latencies. Failover is manual, a few seconds of data can be lost on failure (non-zero RPO), and some caveats apply to transactional guarantees.|
|Latency‑optimized geo‑partitioning| {{<header Level="6">}} Fast local access {{</header>}} Partition your data and place it such that the data belonging to nearby users can be accessed faster.|
|Locality‑optimized geo‑partitioning| {{<header Level="6">}} Local law compliance {{</header>}} Partition your data and place it such that the rows belonging to different users are located in their respective countries.|
|Follower Reads | {{<header Level="6">}} Fast, stale reads {{</header>}} Read from local followers instead of going to the leaders in a different region.|
|Read Replicas | {{<header Level="6">}} Fast reads from a read-only cluster {{</header>}} Set up a separate cluster of just followers to perform local reads instead of going to the leaders in a different region.|
{{</table>}}
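
As an illustration of the Follower Reads pattern, the sketch below builds the session-level statements an application might issue before running stale, local reads. It assumes the YSQL session parameters `yb_read_from_followers` and `yb_follower_read_staleness_ms` and a read-only transaction mode. Check the parameter names and allowed staleness values against the YSQL reference for your version. The helper name `follower_read_session_sql` is hypothetical, and the code only produces SQL text; it does not connect to a database.

```python
from typing import List


def follower_read_session_sql(staleness_ms: int = 30000) -> List[str]:
    """Return YSQL statements that opt a session into follower reads.

    staleness_ms is the maximum staleness the application tolerates,
    in milliseconds (assumed parameter: yb_follower_read_staleness_ms).
    """
    if isinstance(staleness_ms, bool) or not isinstance(staleness_ms, int):
        raise ValueError("staleness_ms must be an integer")
    if staleness_ms <= 0:
        raise ValueError("staleness_ms must be positive")
    return [
        # Follower reads are assumed to apply only to read-only transactions.
        "SET default_transaction_read_only = true;",
        # Allow reads to be served by the nearest follower instead of the leader.
        "SET yb_read_from_followers = true;",
        # Bound how stale the returned data may be.
        f"SET yb_follower_read_staleness_ms = {staleness_ms};",
    ]


if __name__ == "__main__":
    for statement in follower_read_session_sql(staleness_ms=30000):
        print(statement)
```

An application would send these statements once per connection, then run its read queries. Writes and reads that must see the latest data would use a separate connection without these settings.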
All the illustrations in this section use the following legend to represent tablet leaders and followers, cloud regions and zones, and applications.