Introduction
In today’s global digital economy, application downtime translates directly into lost revenue, damaged brand reputation, and frustrated users. A single-region deployment — no matter how well-architected — represents a single point of failure. Natural disasters, network outages, or cloud provider regional failures can take your entire application offline.
In this tutorial, you’ll learn how to build a multi-region AWS architecture using AWS Route53 (for global DNS routing with health checks and failover) and CloudFront (for content delivery and edge caching). By the end, you’ll have a production-ready, resilient architecture that automatically routes users to the nearest healthy region.
Prerequisites: An AWS account with administrative access,
What Is Multi-Region Architecture?
A multi-region architecture deploys your application workload across two or more geographically distinct AWS regions. This provides:
- High Availability (HA): If one region goes down, traffic is rerouted to a healthy region automatically.
- Disaster Recovery (DR): With data continuously replicated to a second region, RPO (Recovery Point Objective) and RTO (Recovery Time Objective) can be measured in minutes rather than hours.
- Global Latency Optimization: Users are routed to the nearest region, reducing round-trip time.
- Compliance & Data Sovereignty: Keep data in specific geographic boundaries for regulatory requirements.
The core services that enable this architecture are:
Amazon Route53 — A highly available and scalable DNS web service with latency-based routing and health checks that automatically fail over between regions.
Amazon CloudFront — A global content delivery network (CDN) that caches content at 600+ edge locations and can origin-failover between primary and secondary regions.
Amazon S3 — Cross-Region Replication (CRR) for stateful data and static assets.
Architecture Overview
The following diagram shows the complete multi-region architecture we’ll implement. Users connect via Route53 with latency-based routing, which directs them to the closest healthy region. CloudFront sits in front as a CDN layer, caching static content at edge locations and providing origin failover. S3 Cross-Region Replication (CRR) keeps static assets and state synchronized between regions.
Key Architectural Decision: Latency-based routing at Route53 works best when combined with health checks. If a region’s health check fails, Route53 automatically excludes that region from DNS responses, and users are routed to the next-best healthy region without having to do anything themselves.
Step 1: Setup — Configure the Base Infrastructure
We’ll use Terraform to provision our infrastructure. First, create the provider configuration and the foundational networking components.
1.1 Provider Configuration
# providers.tf
terraform {
required_version = ">= 1.5"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
alias = "primary"
region = "us-east-1"
}
provider "aws" {
alias = "secondary"
region = "eu-west-1"
}
1.2 S3 Buckets with Cross-Region Replication
# s3-crr.tf
resource "aws_s3_bucket" "primary" {
provider = aws.primary
bucket = "myapp-static-assets-primary"
}
resource "aws_s3_bucket" "secondary" {
provider = aws.secondary
bucket = "myapp-static-assets-secondary"
}
resource "aws_s3_bucket_replication_configuration" "crr" {
provider = aws.primary
bucket = aws_s3_bucket.primary.id
role = aws_iam_role.s3_replication.arn
rule {
status = "Enabled"
destination {
bucket = aws_s3_bucket.secondary.arn
storage_class = "STANDARD"
}
}
}
Note: CRR requires S3 Versioning on both the source and destination buckets, so enable it before applying the replication configuration. Versioning also preserves earlier object versions, so an object that is accidentally deleted or overwritten in the primary region can still be recovered.
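Since replication depends on versioning, a minimal sketch of the versioning resources might look like the following. The file name and resource labels are illustrative and not part of the original configuration; the bucket references match the buckets defined above.

```hcl
# s3-versioning.tf (sketch; file name and resource labels are assumptions)
resource "aws_s3_bucket_versioning" "primary" {
  provider = aws.primary
  bucket   = aws_s3_bucket.primary.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_versioning" "secondary" {
  provider = aws.secondary
  bucket   = aws_s3_bucket.secondary.id

  versioning_configuration {
    status = "Enabled"
  }
}
```

Adding depends_on = [aws_s3_bucket_versioning.primary, aws_s3_bucket_versioning.secondary] to the replication configuration should make Terraform enable versioning before it attempts to configure CRR.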
Step 2: Deploy Application in Both Regions
Now we’ll deploy identical application stacks in both regions. Each stack consists of a VPC, an Application Load Balancer (ALB), and an ECS Fargate service running your application containers. Using Terraform modules keeps this DRY.
# app-stack.tf
module "app_primary" {
source = "./modules/app-stack"
providers = { aws = aws.primary }
region = "us-east-1"
vpc_cidr = "10.0.0.0/16"
app_port = 8080
}
module "app_secondary" {
source = "./modules/app-stack"
providers = { aws = aws.secondary }
region = "eu-west-1"
vpc_cidr = "10.1.0.0/16"
app_port = 8080
}
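The module calls above pass three inputs, and later steps read module.app_primary.alb_dns_name and module.app_primary.alb_zone_id. The module's interface is not shown in this tutorial, so the following is only a sketch of what it could look like. The file names and the aws_lb.app resource label are assumptions.

```hcl
# modules/app-stack/variables.tf (sketch)
variable "region" {
  description = "AWS region this stack is deployed into"
  type        = string
}

variable "vpc_cidr" {
  description = "CIDR block for the stack's VPC"
  type        = string
}

variable "app_port" {
  description = "Port the application containers listen on"
  type        = number
  default     = 8080
}

# modules/app-stack/outputs.tf (sketch)
# aws_lb.app is a hypothetical label for the ALB created inside the module.
output "alb_dns_name" {
  description = "DNS name of the ALB, consumed by Route53 and CloudFront"
  value       = aws_lb.app.dns_name
}

output "alb_zone_id" {
  description = "Hosted zone ID of the ALB, needed for Route53 alias records"
  value       = aws_lb.app.zone_id
}
```

Keeping the CIDR ranges distinct per region, as in the module calls above, leaves room for VPC peering between the two stacks later.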
Key Configuration for the ALB Health Check
# modules/app-stack/main.tf (excerpt)
resource "aws_lb_target_group" "app" {
name = "app-tg-${var.region}"
port = var.app_port
protocol = "HTTP"
vpc_id = aws_vpc.main.id
health_check {
enabled = true
healthy_threshold = 2
unhealthy_threshold = 3
interval = 10
path = "/health"
timeout = 5
}
}
Warning: Your application’s /health endpoint must check dependencies (database connectivity, cache reachability, etc.) and return HTTP 200 only when truly healthy. A shallow “OK” response will defeat the purpose of multi-region failover.
Step 3: Configure Route53 Latency-Based Routing
This is the heart of our multi-region setup. Route53’s latency-based routing directs each user to the AWS region that provides the lowest latency for that user. Combined with health checks, it automatically fails over when a region becomes unhealthy.
# route53.tf
resource "aws_route53_health_check" "primary" {
provider = aws.primary
fqdn = module.app_primary.alb_dns_name
port = 443
type = "HTTPS"
resource_path = "/health"
failure_threshold = 3
request_interval = 10
measure_latency = true
tags = { Name = "primary-region-health-check" }
}
resource "aws_route53_health_check" "secondary" {
provider = aws.secondary
fqdn = module.app_secondary.alb_dns_name
port = 443
type = "HTTPS"
resource_path = "/health"
failure_threshold = 3
request_interval = 10
tags = { Name = "secondary-region-health-check" }
}
resource "aws_route53_record" "app" {
zone_id = aws_route53_zone.main.zone_id
name = "app.mydomain.com"
type = "A"
set_identifier = "primary"
latency_routing_policy {
region = "us-east-1"
}
alias {
name = module.app_primary.alb_dns_name
zone_id = module.app_primary.alb_zone_id
evaluate_target_health = true
}
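health_check_id = aws_route53_health_check.primary.id
# A record set accepts exactly one routing policy. Latency routing combined with
# a health check already fails over, so no failover_routing_policy block is used.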
}
resource "aws_route53_record" "app_secondary" {
zone_id = aws_route53_zone.main.zone_id
name = "app.mydomain.com"
type = "A"
set_identifier = "secondary"
latency_routing_policy {
region = "eu-west-1"
}
alias {
name = module.app_secondary.alb_dns_name
zone_id = module.app_secondary.alb_zone_id
evaluate_target_health = true
}
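health_check_id = aws_route53_health_check.secondary.id
# A record set accepts exactly one routing policy. Latency routing combined with
# a health check already fails over, so no failover_routing_policy block is used.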
}
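Both records reference aws_route53_zone.main, which is not defined anywhere in this tutorial. A minimal sketch, assuming the hosted zone for mydomain.com is managed in the same Terraform configuration, could be:

```hcl
# route53-zone.tf (sketch; file name and output name are assumptions)
resource "aws_route53_zone" "main" {
  provider = aws.primary
  name     = "mydomain.com"
}

# Name servers to configure at the domain registrar so the zone is authoritative.
output "name_servers" {
  value = aws_route53_zone.main.name_servers
}
```

If the hosted zone already exists, a data "aws_route53_zone" lookup is probably the better fit. In that case the records would reference data.aws_route53_zone.main.zone_id instead.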
Step 4: Configure CloudFront with Origin Failover
CloudFront adds an additional layer of resilience. With origin failover, CloudFront automatically retries a request against the secondary origin of an origin group when the primary origin returns one of the configured HTTP 5xx status codes or cannot be reached. Failover applies to GET, HEAD, and OPTIONS requests, and it catches failures that occur between Route53’s health checks.
# cloudfront.tf
resource "aws_cloudfront_distribution" "cdn" {
enabled = true
price_class = "PriceClass_All"
aliases = ["app.mydomain.com"]
origin_group {
origin_id = "app-failover-group"
failover_criteria {
status_codes = [500, 502, 503, 504]
}
# An origin group takes exactly two members; the first one listed is the primary.
member {
origin_id = "primary-region"
}
member {
origin_id = "secondary-region"
}
}
origin {
domain_name = module.app_primary.alb_dns_name
origin_id = "primary-region"
custom_origin_config {
http_port = 80
https_port = 443
origin_protocol_policy = "https-only"
origin_ssl_protocols = ["TLSv1.2"]
}
}
origin {
domain_name = module.app_secondary.alb_dns_name
origin_id = "secondary-region"
custom_origin_config {
http_port = 80
https_port = 443
origin_protocol_policy = "https-only"
origin_ssl_protocols = ["TLSv1.2"]
}
}
default_cache_behavior {
# Target the origin group (not a single origin) so that failover applies here.
target_origin_id = "app-failover-group"
viewer_protocol_policy = "redirect-to-https"
# A behavior that targets an origin group can only allow GET, HEAD, and OPTIONS.
# Route write requests through a separate behavior that targets a single origin.
allowed_methods = ["GET", "HEAD", "OPTIONS"]
cached_methods = ["GET", "HEAD"]
forwarded_values {
query_string = true
cookies {
forward = "all"
}
}
min_ttl = 0
default_ttl = 60
max_ttl = 300
}
# Required block; "none" applies no geographic restrictions.
restrictions {
geo_restriction {
restriction_type = "none"
}
}
viewer_certificate {
acm_certificate_arn = aws_acm_certificate.cdn.arn
ssl_support_method = "sni-only"
}
}
Pro Tip: Set min_ttl to 0 for dynamic API responses so CloudFront doesn’t serve stale content during failover. Use a separate cache behavior with longer TTLs for static assets (/static/*, /images/*).
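A separate cache behavior for static assets could be sketched as follows. The TTL values are illustrative assumptions, and the block assumes static assets are served from the same origin as the rest of the application. It belongs inside the aws_cloudfront_distribution resource, alongside default_cache_behavior.

```hcl
# Nested inside resource "aws_cloudfront_distribution" "cdn" { ... }
ordered_cache_behavior {
  path_pattern           = "/static/*"
  target_origin_id       = "primary-region" # any origin or origin group ID defined in the distribution
  viewer_protocol_policy = "redirect-to-https"
  allowed_methods        = ["GET", "HEAD"]
  cached_methods         = ["GET", "HEAD"]
  compress               = true

  # Static assets rarely vary by query string or cookie, so forward neither.
  forwarded_values {
    query_string = false
    cookies {
      forward = "none"
    }
  }

  # Illustrative TTLs: one day by default, capped at one week.
  min_ttl     = 0
  default_ttl = 86400
  max_ttl     = 604800
}
```

A second ordered_cache_behavior block with path_pattern = "/images/*" would cover the other path mentioned above. CloudFront evaluates ordered behaviors in the order they are listed and falls back to the default behavior when none match.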
Step 5: Database Strategy — Multi-Region RDS
Stateful services like databases are the hardest part of multi-region architecture. Here’s a practical approach using RDS:
- Primary Region: RDS Multi-AZ deployment (synchronous replication within region for high availability).
- Secondary Region: RDS Read Replica (asynchronous cross-region replication for disaster recovery).
- Application Logic: Writes go to primary; reads can go to the read replica for latency improvement.
- Failover Plan: Promote the read replica to a standalone instance during a region failure.
# rds.tf
resource "aws_db_instance" "primary" {
provider = aws.primary
identifier = "myapp-db-primary"
engine = "postgres"
engine_version = "16.3"
instance_class = "db.r6g.large"
multi_az = true
db_name = "myapp"
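# "admin" is a reserved word for RDS PostgreSQL, so use a different master username.
username = "myapp_admin"
# Let RDS generate the master password and store it in Secrets Manager.
manage_master_user_password = true
# Example size in GiB; a storage size is required when creating a new instance.
allocated_storage = 100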
backup_retention_period = 7
storage_encrypted = true
}
resource "aws_db_instance" "read_replica" {
provider = aws.secondary
identifier = "myapp-db-secondary"
replicate_source_db = aws_db_instance.primary.arn
instance_class = "db.r6g.large"
backup_retention_period = 7
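storage_encrypted = true
# A cross-region replica of an encrypted source needs a KMS key from the destination region.
# aws_kms_key.rds_secondary is a placeholder for a key created with the aws.secondary provider.
kms_key_id = aws_kms_key.rds_secondary.arn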
}
Step 6: Verification — Testing the Failover
After applying your Terraform configuration with terraform apply, it’s time to validate the setup. Here’s how to test each layer:
6.1 Test DNS Resolution
# Query Route53 to verify latency-based routing
dig app.mydomain.com
# Check which region you're being routed to
curl -v https://app.mydomain.com/api/region
6.2 Simulate a Region Failure
# Block all inbound traffic to the primary region's subnets (simulate outage).
# NACL rules are evaluated in ascending order, so the deny rule uses a number
# lower than any existing allow rule.
aws ec2 create-network-acl-entry \
--region us-east-1 \
--network-acl-id $(aws ec2 describe-network-acls \
--region us-east-1 \
--filters Name=tag:Name,Values=myapp-app-stack \
--query 'NetworkAcls[0].NetworkAclId' \
--output text) \
--ingress --rule-number 1 \
--protocol -1 --cidr-block 0.0.0.0/0 \
--rule-action deny
# Wait for three failed checks (3 x 10 s) plus the 60-second TTL of cached alias answers
sleep 90
# Verify failover — should now resolve to eu-west-1
dig app.mydomain.com
# Application should still be accessible
curl -s -o /dev/null -w "%{http_code}" https://app.mydomain.com
6.3 Monitor Health Check Status
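# Check Route53 health check status.
# list-health-checks does not return tags, so filter on the monitored hostname instead.
# PRIMARY_ALB_DNS is a placeholder for the DNS name of the primary ALB.
aws route53 get-health-check-status \
--health-check-id $(aws route53 list-health-checks \
--query "HealthChecks[?HealthCheckConfig.FullyQualifiedDomainName=='${PRIMARY_ALB_DNS}'].Id" \
--output text)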
Important: Always test failover in a staging environment first. Real user traffic should never be the first to exercise your disaster recovery plan. Schedule quarterly “Game Days” where you deliberately fail a region to validate your architecture.
Conclusion
Building a multi-region architecture on AWS with Route53 and CloudFront is one of the most effective ways to ensure your application remains highly available and resilient to regional outages. In this tutorial, you learned:
- How Route53 latency-based routing with health checks provides automatic DNS-level failover between regions.
- How CloudFront origin failover catches application-level errors at the CDN edge.
- How S3 Cross-Region Replication keeps static assets synchronized.
- A practical database strategy using RDS Multi-AZ and cross-region read replicas.
- How to verify and test the entire failover mechanism end-to-end.
Next Steps:
- Implement CloudWatch alarms to notify your team when failover events occur.
- Set up AWS Lambda automation to promote the read replica during automated failover.
- Explore Global Accelerator for TCP/UDP workloads that can’t use DNS-based routing.
- Consider AWS Resilience Hub to assess and improve your overall resilience posture.
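As a starting point for the CloudWatch alarm suggested above, the sketch below alarms on the Route53 health check defined in Step 3. The SNS topic, its name, the alarm name, and the file name are assumptions introduced for illustration.

```hcl
# alarms.tf (sketch)
# Route53 publishes health check metrics to CloudWatch in us-east-1 only,
# which is the region of the aws.primary provider in this tutorial.
resource "aws_sns_topic" "failover_alerts" {
  provider = aws.primary
  name     = "myapp-failover-alerts" # hypothetical topic name
}

resource "aws_cloudwatch_metric_alarm" "primary_unhealthy" {
  provider            = aws.primary
  alarm_name          = "myapp-primary-region-unhealthy"
  alarm_description   = "Route53 health check for the primary region is failing"
  namespace           = "AWS/Route53"
  metric_name         = "HealthCheckStatus" # 1 = healthy, 0 = unhealthy
  statistic           = "Minimum"
  period              = 60
  evaluation_periods  = 1
  comparison_operator = "LessThanThreshold"
  threshold           = 1

  dimensions = {
    HealthCheckId = aws_route53_health_check.primary.id
  }

  alarm_actions = [aws_sns_topic.failover_alerts.arn]
}
```

Subscribing an email address or chat webhook to the topic would deliver the notification. A matching alarm for the secondary health check would follow the same pattern.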
Remember: Multi-region architecture roughly doubles your infrastructure costs and adds operational complexity. Only implement it for workloads where the business impact of downtime justifies the additional expense. For many applications, a well-designed single-region Multi-AZ deployment provides sufficient availability at a fraction of the cost.