Building Multi-Region AWS Architectures with Route53 and CloudFront for High Availability

Introduction

In today’s global digital economy, application downtime translates directly into lost revenue, damaged brand reputation, and frustrated users. A single-region deployment — no matter how well-architected — represents a single point of failure. Natural disasters, network outages, or cloud provider regional failures can take your entire application offline.

In this tutorial, you’ll learn how to build a multi-region AWS architecture using Amazon Route53 (for global DNS routing with health checks and failover) and Amazon CloudFront (for content delivery and edge caching). By the end, you’ll have a resilient baseline architecture that automatically routes users to the nearest healthy region.

Prerequisites: An AWS account with administrative access, Terraform installed (v1.5+), and basic familiarity with VPCs, EC2, and DNS concepts.

What Is Multi-Region Architecture?

A multi-region architecture deploys your application workload across two or more geographically distinct AWS regions. This provides:

  • High Availability (HA): If one region goes down, traffic is rerouted to a healthy region automatically.
  • Disaster Recovery (DR): RPO (Recovery Point Objective) and RTO (Recovery Time Objective) can be brought down to minutes rather than hours, bounded by replication lag and the time failover takes.
  • Global Latency Optimization: Users are routed to the nearest region, reducing round-trip time.
  • Compliance & Data Sovereignty: Keep data in specific geographic boundaries for regulatory requirements.

The core services that enable this architecture are:

  • Amazon Route53: A highly available and scalable DNS web service with latency-based routing and health checks that automatically fail over between regions.
  • Amazon CloudFront: A global content delivery network (CDN) that caches content at 600+ edge locations and can fail over between a primary and a secondary origin.
  • Amazon S3: Cross-Region Replication (CRR) for stateful data and static assets.


[Figure: Core multi-region components. Route53 (latency routing + health checks), CloudFront (CDN + origin failover), and S3 Cross-Region Replication (stateful data sync) sit in front of a primary region (us-east-1) and a secondary region (eu-west-1), each running an Application Load Balancer with EC2 / ECS / Lambda. Traffic between the regions follows latency-based failover.]

Architecture Overview

The following diagram shows the complete multi-region architecture we’ll implement. Users connect via Route53 with latency-based routing, which directs them to the closest healthy region. CloudFront sits in front as a CDN layer, caching static content at edge locations and providing origin failover. S3 Cross-Region Replication (CRR) keeps static assets and state synchronized between regions.

[Figure: Multi-region architecture, request flow. Global users issue a DNS query to Route53 (latency-based routing + health checks) and are routed to a CloudFront edge (edge caching + origin failover). The primary origin is us-east-1 (ALB → ECS app + API, RDS Multi-AZ + ElastiCache, primary S3 bucket); the failover origin is eu-west-1 (ALB → ECS app + API, RDS read replica + ElastiCache, secondary S3 bucket). S3 Cross-Region Replication (CRR) links the two buckets.]

Key Architectural Decision: Latency-based routing at Route53 works best when combined with health checks. If a region’s health check fails, Route53 automatically excludes that region from DNS responses — users are routed to the next best healthy region without any action from the user.

Step 1: Setup — Configure the Base Infrastructure

We’ll use Terraform to provision our infrastructure. First, create the provider configuration and the foundational networking components.

1.1 Provider Configuration

# providers.tf
terraform {
  required_version = ">= 1.5"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  alias  = "primary"
  region = "us-east-1"
}

provider "aws" {
  alias  = "secondary"
  region = "eu-west-1"
}

1.2 S3 Buckets with Cross-Region Replication

# s3-crr.tf
resource "aws_s3_bucket" "primary" {
  provider = aws.primary
  bucket   = "myapp-static-assets-primary"
}

resource "aws_s3_bucket" "secondary" {
  provider = aws.secondary
  bucket   = "myapp-static-assets-secondary"
}

resource "aws_s3_bucket_replication_configuration" "crr" {
  provider   = aws.primary
  bucket     = aws_s3_bucket.primary.id
  role       = aws_iam_role.s3_replication.arn

  rule {
    status = "Enabled"
    destination {
      bucket        = aws_s3_bucket.secondary.arn
      storage_class = "STANDARD"
    }
  }
}

Requirement: S3 Versioning must be enabled on both the source and the destination bucket before CRR can be configured. With versioning on, deleting an object adds a delete marker rather than erasing earlier versions, so an accidentally deleted object can still be recovered from its previous versions.
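
The replication configuration above relies on two things the snippet does not show: versioning on both buckets and the aws_iam_role.s3_replication role it references. Here is a minimal sketch of both. Every name except s3_replication is an assumption made for illustration, and the permission list follows the general set AWS documents for replication, so adjust it to your setup (KMS-encrypted objects, for instance, need additional permissions).

```hcl
# s3-crr-support.tf (sketch; names other than s3_replication are assumptions)
resource "aws_s3_bucket_versioning" "primary" {
  provider = aws.primary
  bucket   = aws_s3_bucket.primary.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_versioning" "secondary" {
  provider = aws.secondary
  bucket   = aws_s3_bucket.secondary.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Trust policy: lets the S3 service assume the replication role.
data "aws_iam_policy_document" "s3_assume" {
  provider = aws.primary
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["s3.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "s3_replication" {
  provider           = aws.primary
  name               = "myapp-s3-replication"
  assume_role_policy = data.aws_iam_policy_document.s3_assume.json
}

# Read from the source bucket, write replicas to the destination bucket.
data "aws_iam_policy_document" "s3_replication" {
  provider = aws.primary
  statement {
    actions   = ["s3:GetReplicationConfiguration", "s3:ListBucket"]
    resources = [aws_s3_bucket.primary.arn]
  }
  statement {
    actions = [
      "s3:GetObjectVersionForReplication",
      "s3:GetObjectVersionAcl",
      "s3:GetObjectVersionTagging",
    ]
    resources = ["${aws_s3_bucket.primary.arn}/*"]
  }
  statement {
    actions   = ["s3:ReplicateObject", "s3:ReplicateDelete", "s3:ReplicateTags"]
    resources = ["${aws_s3_bucket.secondary.arn}/*"]
  }
}

resource "aws_iam_role_policy" "s3_replication" {
  provider = aws.primary
  name     = "myapp-s3-replication"
  role     = aws_iam_role.s3_replication.id
  policy   = data.aws_iam_policy_document.s3_replication.json
}
```

If you adopt this, it is worth adding depends_on = [aws_s3_bucket_versioning.primary, aws_s3_bucket_versioning.secondary] to the replication resource so Terraform enables versioning before it attempts to configure CRR.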

Step 2: Deploy Application in Both Regions

Now we’ll deploy identical application stacks in both regions. Each stack consists of a VPC, an Application Load Balancer (ALB), and an ECS Fargate service running your application containers. Using Terraform modules keeps this DRY.

# app-stack.tf
module "app_primary" {
  source   = "./modules/app-stack"
  providers = { aws = aws.primary }
  region   = "us-east-1"
  vpc_cidr = "10.0.0.0/16"
  app_port = 8080
}

module "app_secondary" {
  source   = "./modules/app-stack"
  providers = { aws = aws.secondary }
  region   = "eu-west-1"
  vpc_cidr = "10.1.0.0/16"
  app_port = 8080
}
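
The two module calls above imply a small interface: three input variables, plus the alb_dns_name and alb_zone_id outputs that later steps read. The module's internals are not shown in this tutorial, so the following is only a sketch of what that interface might look like. In particular, aws_lb.app is an assumed resource name inside the module.

```hcl
# modules/app-stack/versions.tf (sketch)
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# modules/app-stack/variables.tf (sketch)
variable "region" {
  type        = string
  description = "Region name, used for naming resources in this stack"
}

variable "vpc_cidr" {
  type        = string
  description = "CIDR block for the stack's VPC; must not overlap across regions"
}

variable "app_port" {
  type        = number
  description = "Port the application container listens on"
  default     = 8080
}

# modules/app-stack/outputs.tf (sketch)
# aws_lb.app is an assumed name for the ALB defined inside the module.
output "alb_dns_name" {
  value = aws_lb.app.dns_name
}

output "alb_zone_id" {
  value = aws_lb.app.zone_id
}
```

Keeping the CIDR ranges distinct per region, as the module calls do, leaves room for VPC peering or Transit Gateway later without renumbering.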

Key Configuration for the ALB Health Check

# modules/app-stack/main.tf (excerpt)
resource "aws_lb_target_group" "app" {
  name        = "app-tg-${var.region}"
  port        = var.app_port
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id
  target_type = "ip" # Fargate tasks use awsvpc networking and register by IP

  health_check {
    enabled             = true
    healthy_threshold   = 2
    unhealthy_threshold = 3
    interval            = 10
    path                = "/health"
    timeout             = 5
    matcher             = "200" # only HTTP 200 counts as healthy
  }
}
Warning: Your application /health endpoint must check dependencies (database connectivity, cache reachability, etc.) and return HTTP 200 only when truly healthy. A shallow “OK” response will defeat the purpose of multi-region failover.

Step 3: Configure Route53 Latency-Based Routing

This is the heart of our multi-region setup. Route53’s latency-based routing directs each user to the AWS region that provides the lowest latency for that user. Combined with health checks, it automatically fails over when a region becomes unhealthy.

# route53.tf
resource "aws_route53_health_check" "primary" {
  provider                = aws.primary
  fqdn                    = module.app_primary.alb_dns_name
  port                    = 443
  type                    = "HTTPS"
  resource_path           = "/health"
  failure_threshold       = 3
  request_interval        = 10
  measure_latency         = true
  
  tags = { Name = "primary-region-health-check" }
}

resource "aws_route53_health_check" "secondary" {
  provider          = aws.secondary
  fqdn              = module.app_secondary.alb_dns_name
  port              = 443
  type              = "HTTPS"
  resource_path     = "/health"
  failure_threshold = 3
  request_interval  = 10
  
  tags = { Name = "secondary-region-health-check" }
}

# A Route53 record can carry only one routing policy, so these records use
# latency-based routing alone. A record whose health check fails is dropped
# from the candidates, which gives failover without a failover_routing_policy.
resource "aws_route53_record" "app" {
  provider = aws.primary # Route53 is global; either aliased provider works
  zone_id  = aws_route53_zone.main.zone_id
  name     = "app.mydomain.com"
  type     = "A"

  set_identifier = "primary"
  latency_routing_policy {
    region = "us-east-1"
  }
  alias {
    name                   = module.app_primary.alb_dns_name
    zone_id                = module.app_primary.alb_zone_id
    evaluate_target_health = true
  }
  health_check_id = aws_route53_health_check.primary.id
}

resource "aws_route53_record" "app_secondary" {
  provider = aws.primary
  zone_id  = aws_route53_zone.main.zone_id
  name     = "app.mydomain.com"
  type     = "A"

  set_identifier = "secondary"
  latency_routing_policy {
    region = "eu-west-1"
  }
  alias {
    name                   = module.app_secondary.alb_dns_name
    zone_id                = module.app_secondary.alb_zone_id
    evaluate_target_health = true
  }
  health_check_id = aws_route53_health_check.secondary.id
}
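
The records above reference aws_route53_zone.main, which none of the snippets define. Below is a minimal sketch, assuming the zone for mydomain.com is created in this same configuration. The two outputs are optional additions (their names are assumptions) that make the health check IDs easy to read from the command line later.

```hcl
# route53-zone.tf (sketch)
resource "aws_route53_zone" "main" {
  provider = aws.primary
  name     = "mydomain.com"
}

# Assumed output names; handy for CLI checks against the health checks.
output "primary_health_check_id" {
  value = aws_route53_health_check.primary.id
}

output "secondary_health_check_id" {
  value = aws_route53_health_check.secondary.id
}
```

If the hosted zone already exists, a data "aws_route53_zone" lookup by name is the usual alternative; the records would then reference data.aws_route53_zone.main.zone_id instead.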


[Figure: Route53 failover decision flow. A user's DNS query for app.mydomain.com triggers a health evaluation. If both regions are healthy, latency-based routing returns the lowest-latency region. If not, only the healthy region is returned. The user then connects to that region.]

Step 4: Configure CloudFront with Origin Failover

CloudFront adds another layer of resilience. With origin failover, CloudFront retries a request against the secondary origin of an origin group when the primary origin returns one of the configured HTTP 5xx status codes or cannot be reached. This catches failures that fall between Route53 health check intervals.

# cloudfront.tf
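resource "aws_cloudfront_distribution" "cdn" {
  provider    = aws.primary # CloudFront is global; any configured provider works
  enabled     = true
  price_class = "PriceClass_All"
  aliases     = ["app.mydomain.com"]

  # The origin group pairs both regions: the first member is the primary
  # origin, the second is the failover origin.
  origin_group {
    origin_id = "app-failover-group"
    failover_criteria {
      status_codes = [500, 502, 503, 504]
    }
    member {
      origin_id = "primary-region"
    }
    member {
      origin_id = "secondary-region"
    }
  }

  origin {
    domain_name = module.app_primary.alb_dns_name
    origin_id   = "primary-region"
    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
  }

  origin {
    domain_name = module.app_secondary.alb_dns_name
    origin_id   = "secondary-region"
    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
  }

  default_cache_behavior {
    # Target the origin group, not a single origin, so failover applies.
    target_origin_id       = "app-failover-group"
    viewer_protocol_policy = "redirect-to-https"
    # Origin failover covers only GET, HEAD, and OPTIONS requests. Route
    # write methods through a separate behavior that targets one origin.
    allowed_methods = ["GET", "HEAD", "OPTIONS"]
    cached_methods  = ["GET", "HEAD"]

    forwarded_values {
      query_string = true
      cookies {
        forward = "all"
      }
    }
    min_ttl     = 0
    default_ttl = 60
    max_ttl     = 300
  }

  # Required block; "none" leaves the distribution open to all countries.
  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    # CloudFront only accepts ACM certificates issued in us-east-1.
    acm_certificate_arn = aws_acm_certificate.cdn.arn
    ssl_support_method  = "sni-only"
  }
}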
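
The distribution references aws_acm_certificate.cdn, which is not defined in the snippets above. CloudFront only accepts ACM certificates from us-east-1, so the certificate has to use the primary provider. This is a minimal sketch using DNS validation. The validation record and validation resource names are assumptions, and it presumes the aws_route53_zone.main zone is managed in this configuration.

```hcl
# acm.tf (sketch; cdn_cert_validation is an assumed name)
resource "aws_acm_certificate" "cdn" {
  provider          = aws.primary # must be us-east-1 for CloudFront
  domain_name       = "app.mydomain.com"
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# One DNS record per domain name that ACM asks us to prove control of.
resource "aws_route53_record" "cdn_cert_validation" {
  provider = aws.primary
  for_each = {
    for dvo in aws_acm_certificate.cdn.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }

  allow_overwrite = true
  zone_id         = aws_route53_zone.main.zone_id
  name            = each.value.name
  type            = each.value.type
  records         = [each.value.record]
  ttl             = 60
}

# Blocks until ACM reports the certificate as issued.
resource "aws_acm_certificate_validation" "cdn" {
  provider                = aws.primary
  certificate_arn         = aws_acm_certificate.cdn.arn
  validation_record_fqdns = [for r in aws_route53_record.cdn_cert_validation : r.fqdn]
}
```

Pointing the distribution's acm_certificate_arn at aws_acm_certificate_validation.cdn.certificate_arn makes Terraform wait for issuance before it creates the distribution.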

Pro Tip: Set min_ttl to 0 for dynamic API responses so CloudFront doesn’t serve stale content during failover. Use a separate cache behavior with longer TTLs for static assets (/static/*, /images/*).
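
The separate cache behavior for static assets mentioned in the tip could look roughly like this. It is a fragment that belongs inside the aws_cloudfront_distribution resource, and the TTL values are illustrative assumptions rather than recommendations. The target_origin_id must match an origin (or origin group) ID defined in the same distribution.

```hcl
# Fragment: place inside resource "aws_cloudfront_distribution" "cdn" { ... }
ordered_cache_behavior {
  path_pattern           = "/static/*"
  target_origin_id       = "primary-region"
  viewer_protocol_policy = "redirect-to-https"
  allowed_methods        = ["GET", "HEAD"]
  cached_methods         = ["GET", "HEAD"]
  compress               = true

  # Static assets rarely vary by query string or cookie, so forwarding
  # neither keeps the cache key small and the hit ratio high.
  forwarded_values {
    query_string = false
    cookies {
      forward = "none"
    }
  }

  min_ttl     = 0
  default_ttl = 86400    # 1 day (assumed value)
  max_ttl     = 31536000 # 1 year (assumed value)
}
```

A second block with path_pattern = "/images/*" would cover the other path from the tip. Ordered behaviors are matched in the order they are listed, before the default behavior.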

Step 5: Database Strategy — Multi-Region RDS

Stateful services like databases are the hardest part of multi-region architecture. Here’s a practical approach using Amazon RDS:

  1. Primary Region: RDS Multi-AZ deployment (synchronous replication within region for high availability).
  2. Secondary Region: RDS Read Replica (asynchronous cross-region replication for disaster recovery).
  3. Application Logic: Writes go to primary; reads can go to the read replica for latency improvement.
  4. Failover Plan: Promote the read replica to a standalone instance during a region failure.
# rds.tf
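resource "aws_db_instance" "primary" {
  provider                    = aws.primary
  identifier                  = "myapp-db-primary"
  engine                      = "postgres"
  engine_version              = "16.3"
  instance_class              = "db.r6g.large"
  allocated_storage           = 100 # GiB; assumed size, adjust to your workload
  multi_az                    = true
  db_name                     = "myapp"
  username                    = "myapp_admin" # "admin" is reserved in RDS for PostgreSQL
  manage_master_user_password = true          # password is generated and kept in Secrets Manager
  backup_retention_period     = 7             # a replica source needs automated backups
  storage_encrypted           = true
}

resource "aws_db_instance" "read_replica" {
  provider                = aws.secondary
  identifier              = "myapp-db-secondary"
  replicate_source_db     = aws_db_instance.primary.arn # cross-region replicas take the source ARN
  instance_class          = "db.r6g.large"
  backup_retention_period = 7
  storage_encrypted       = true
  # An encrypted cross-region replica needs a KMS key from the destination
  # region. aws_kms_key.replica is an assumed resource, not defined above.
  kms_key_id = aws_kms_key.replica.arn
}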
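
An encrypted read replica in another region cannot reuse the source region's KMS key, because KMS keys are regional. One option is to create a dedicated key in eu-west-1, as in this minimal sketch. The resource name replica and the settings shown are assumptions.

```hcl
# kms.tf (sketch; "replica" is an assumed resource name)
resource "aws_kms_key" "replica" {
  provider                = aws.secondary # key must live in the replica's region
  description             = "Encrypts the cross-region RDS read replica"
  deletion_window_in_days = 30
  enable_key_rotation     = true
}
```

The replica would then set kms_key_id = aws_kms_key.replica.arn alongside storage_encrypted = true.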

Step 6: Verification — Testing the Failover

After applying your Terraform configuration with terraform apply, it’s time to validate the setup. Here’s how to test each layer:

6.1 Test DNS Resolution

# Query Route53 to verify latency-based routing
dig app.mydomain.com

# Check which region you're being routed to
curl -v https://app.mydomain.com/api/region

6.2 Simulate a Region Failure

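# Look up the network ACL for the primary stack
ACL_ID=$(aws ec2 describe-network-acls \
  --region us-east-1 \
  --filters Name=tag:Name,Values=myapp-app-stack \
  --query 'NetworkAcls[0].NetworkAclId' \
  --output text)

# Block all inbound traffic to the primary VPC (simulate outage).
# NACL rules are evaluated in ascending order, so the deny rule needs a lower
# number than any existing allow rule (the default NACL's allow rule is 100).
aws ec2 create-network-acl-entry \
  --region us-east-1 \
  --network-acl-id "$ACL_ID" \
  --ingress --rule-number 1 \
  --protocol -1 --cidr-block 0.0.0.0/0 \
  --rule-action deny

# Wait for 3 failed checks at a 10-second interval, plus DNS TTL expiry
sleep 90

# Verify failover: the answer should now contain only eu-west-1 ALB addresses
dig app.mydomain.com

# Application should still be accessible
curl -s -o /dev/null -w "%{http_code}" https://app.mydomain.com

# Restore the primary region once the test is done
aws ec2 delete-network-acl-entry \
  --region us-east-1 \
  --network-acl-id "$ACL_ID" \
  --ingress --rule-number 1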

6.3 Monitor Health Check Status

# Check Route53 health check status.
# list-health-checks does not return tags, so the ID is read from Terraform.
# Assumes an output named primary_health_check_id that exposes
# aws_route53_health_check.primary.id.
aws route53 get-health-check-status \
  --health-check-id "$(terraform output -raw primary_health_check_id)"
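
Polling the status by hand is fine during a test, but an alarm is more useful in day-to-day operation. Here is a minimal sketch of a CloudWatch alarm on the primary health check. The SNS topic and both resource names are assumptions. Route53 publishes health check metrics only in us-east-1, which is why the primary provider is used.

```hcl
# monitoring.tf (sketch; topic and alarm names are assumptions)
resource "aws_sns_topic" "failover_alerts" {
  provider = aws.primary
  name     = "myapp-failover-alerts"
}

resource "aws_cloudwatch_metric_alarm" "primary_unhealthy" {
  provider   = aws.primary # Route53 metrics are only available in us-east-1
  alarm_name = "primary-region-health-check-failed"

  namespace   = "AWS/Route53"
  metric_name = "HealthCheckStatus" # 1 = healthy, 0 = unhealthy
  dimensions = {
    HealthCheckId = aws_route53_health_check.primary.id
  }

  statistic           = "Minimum"
  period              = 60
  evaluation_periods  = 1
  comparison_operator = "LessThanThreshold"
  threshold           = 1

  alarm_description = "Primary region failed its Route53 health check"
  alarm_actions     = [aws_sns_topic.failover_alerts.arn]
}
```

Subscribing an email address or a chat webhook to the topic is left out here; a matching alarm for the secondary health check follows the same pattern.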

Important: Always test failover in a staging environment first. Real user traffic should never be the first to exercise your disaster recovery plan. Schedule quarterly “Game Days” where you deliberately fail a region to validate your architecture.

Conclusion

Building a multi-region architecture on AWS with Route53 and CloudFront is one of the most effective ways to ensure your application remains highly available and resilient to regional outages. In this tutorial, you learned:

  • How Route53 latency-based routing with health checks provides automatic DNS-level failover between regions.
  • How CloudFront origin failover catches application-level errors at the CDN edge.
  • How S3 Cross-Region Replication keeps static assets synchronized.
  • A practical database strategy using RDS Multi-AZ and cross-region read replicas.
  • How to verify and test the entire failover mechanism end-to-end.

Next Steps:

  • Implement Amazon CloudWatch alarms to notify your team when failover events occur.
  • Set up AWS Lambda automation to promote the read replica during automated failover.
  • Explore Global Accelerator for TCP/UDP workloads that can’t use DNS-based routing.
  • Consider AWS Resilience Hub to assess and improve your overall resilience posture.
Remember: Running a full stack in a second region can roughly double your infrastructure costs, and cross-region data transfer adds to the bill. Only implement it for workloads where the business impact of downtime justifies the additional expense. For many applications, a well-designed single-region Multi-AZ deployment provides sufficient availability at a fraction of the cost.
