AWS SAP-C02 Drill: Multi-Region API Failover - The Disaster Recovery Trade-off Analysis

Author: Jeff Taakey
21+ Year Enterprise Architect | AWS SAA/SAP & Multi-Cloud Expert.

Jeff’s Insights

“Unlike generic exam dumps, Jeff’s Insights is designed to make you think like a Real-World Production Architect. We dissect this scenario by analyzing the strategic trade-offs required to balance operational reliability, security, and long-term cost across multi-service deployments.”

While preparing for the AWS SAP-C02, many candidates get confused by API Gateway endpoint types and multi-region deployment patterns. In the real world, this is fundamentally a decision about RPO/RTO requirements vs. operational complexity and cost. Let’s drill into a simulated scenario.

The Architecture Drill (Simulated Question)

Scenario

SkyMetrics Inc., a meteorological data provider, operates a REST API serving real-time weather analytics to enterprise clients across North America. The API infrastructure runs on Amazon API Gateway with custom domain analytics.skymetrics.io managed through Route 53. Each API endpoint invokes a dedicated AWS Lambda function, and all telemetry data resides in an Amazon DynamoDB table in us-east-1.

The CTO has mandated a cross-region disaster recovery capability following a recent outage that affected their primary region. The solution must ensure automatic failover to a secondary AWS region while maintaining data consistency and minimizing client-side configuration changes.

The Requirement:

Design a multi-region failover architecture for the REST API that:

  • Ensures automatic DNS-based failover
  • Maintains data consistency across regions
  • Requires no client application changes
  • Minimizes RTO (Recovery Time Objective)

The Options

  • A) Deploy a new set of Lambda functions in a secondary region; Update the API Gateway API to use an edge-optimized endpoint with Lambda functions from both regions as targets; Convert the DynamoDB table to a global table.

  • B) Deploy a new API Gateway API and Lambda functions in a secondary region; Modify the Route 53 DNS record to a multivalue answer record; Add both API Gateway APIs to the answer list; Enable target health checks; Convert the DynamoDB table to a global table.

  • C) Deploy a new API Gateway API and Lambda functions in a secondary region; Modify the Route 53 DNS record to a failover record; Enable target health checks; Convert the DynamoDB table to a global table.

  • D) Deploy a new API Gateway API in a secondary region; Modify Lambda functions to be global functions; Modify the Route 53 DNS record to a multivalue answer record; Add both API Gateway APIs to the answer list; Enable target health checks; Convert the DynamoDB table to a global table.


Correct Answer

Option C.


The Architect’s Analysis

Correct Answer

Option C — Regional API Gateway + Route 53 Failover + DynamoDB Global Tables.

The Winning Logic

This solution represents the optimal balance between disaster recovery capability, cost efficiency, and operational simplicity:

  1. Complete Regional Independence: Deploying a full API Gateway API + Lambda stack in the secondary region ensures no cross-region dependencies during failover. The primary region failure doesn’t impact secondary region operation.

  2. DNS-Based Failover: Route 53 failover routing policy provides automatic, health-check-driven DNS failover. Primary endpoint serves all traffic during normal operations; secondary activates only upon health check failure. This is the standard DR pattern for API workloads.

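  3. Data Layer Consistency: DynamoDB Global Tables provide multi-region active-active replication, typically within a second, so the secondary region has near-current data when it takes over. Replication is asynchronous and eventually consistent, so writes made in the last moments before a regional failure may not have reached the replica; that window is the effective RPO.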

  4. Cost Optimization: Unlike active-active patterns, this keeps the secondary region in warm standby mode. You pay for:

    • Route 53 health checks and the requests they send to the standby API (API Gateway REST APIs have no fixed monthly fee; they bill per request)
    • Lambda provisioned concurrency (optional, to avoid cold starts at failover)
    • DynamoDB global table replication (replicated write capacity plus inter-region data transfer)

    But you don’t pay API Gateway request charges for client traffic on the secondary until failover occurs.

  5. Zero Client Impact: Custom domain with Route 53 means clients always call analytics.skymetrics.io—DNS handles the regional resolution transparently.
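
The DNS piece of the winning logic can be sketched as a Route 53 change batch. This is a minimal illustration, not the author's implementation: it only builds the payload and does not call AWS. The alias DNS names, hosted zone IDs, and health check ID are hypothetical placeholders; the field names follow the Route 53 `ChangeResourceRecordSets` request shape.

```python
# Sketch only: builds the request payload, does not call AWS.
# All DNS names, zone IDs, and the health check ID are hypothetical placeholders.

def build_failover_change_batch(domain, primary, secondary, primary_health_check_id):
    """Return a Route 53 ChangeBatch with PRIMARY and SECONDARY alias records."""

    def alias_record(role, target, health_check_id=None):
        record = {
            "Name": domain,
            "Type": "A",
            "SetIdentifier": f"{role.lower()}-{target['region']}",
            "Failover": role,  # "PRIMARY" or "SECONDARY"
            "AliasTarget": {
                "HostedZoneId": target["hosted_zone_id"],
                "DNSName": target["dns_name"],
                "EvaluateTargetHealth": True,
            },
        }
        if health_check_id is not None:
            # The explicit health check on the primary is what triggers failover.
            record["HealthCheckId"] = health_check_id
        return {"Action": "UPSERT", "ResourceRecordSet": record}

    return {
        "Comment": "Failover routing for the regional API Gateway custom domains",
        "Changes": [
            alias_record("PRIMARY", primary, primary_health_check_id),
            alias_record("SECONDARY", secondary),
        ],
    }


primary = {
    "region": "us-east-1",
    "hosted_zone_id": "ZEXAMPLEPRIMARY",  # placeholder
    "dns_name": "d-primary.execute-api.us-east-1.amazonaws.com",  # placeholder
}
secondary = {
    "region": "us-west-2",
    "hosted_zone_id": "ZEXAMPLESECONDARY",  # placeholder
    "dns_name": "d-secondary.execute-api.us-west-2.amazonaws.com",  # placeholder
}

batch = build_failover_change_batch(
    "analytics.skymetrics.io", primary, secondary, "hc-primary-example"
)
# With boto3 installed and credentials configured, this payload would be passed as
# the ChangeBatch argument of boto3.client("route53").change_resource_record_sets(...).
```

Both records share the same name, so clients keep calling analytics.skymetrics.io and only the answer changes when the primary health check fails.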

The Trap (Distractor Analysis)

Why not Option A?

  • Fatal Misconception: An edge-optimized endpoint routes client requests through the nearest CloudFront point of presence to reduce connection latency, but the API still lives in one region and invokes backend integrations in that region. It adds no caching by default and no cross-region failover, so a us-east-1 outage still takes the API down.
  • Lambda functions are regional resources, and each API Gateway method has a single integration target; it cannot fail over between functions in two regions.
  • This option fundamentally misunderstands API Gateway endpoint types.

Why not Option B?

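  • Multivalue answer routing returns up to eight healthy records, chosen at random, for each query. It is designed for simple active-active load distribution, not primary/standby failover.
  • Even with health checks attached, both regions serve traffic during normal operation, so there is no designated primary. Without health checks, Route 53 treats every record as healthy and keeps returning the failed region about 50% of the time (assuming two values).
  • Multivalue answer records cannot be alias records, and "evaluate target health" is an alias-record setting, so the target health checks in this option do not fit the record type.
  • For DR scenarios requiring automatic primary-to-standby failover, use failover routing with health checks.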
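
To make the difference concrete, here is a toy simulation of the two routing policies. It is a simplified model under stated assumptions, not Route 53's actual resolver logic: each DNS answer is reduced to the single region a client ends up calling, and the `health_checked` flag is a hypothetical switch for whether health checks are attached to the multivalue records.

```python
import random
from collections import Counter

# Two records for the same name, one per region (toy model).
RECORDS = [
    {"region": "us-east-1", "role": "PRIMARY"},
    {"region": "us-west-2", "role": "SECONDARY"},
]
ALL_REGIONS = {"us-east-1", "us-west-2"}


def failover_answer(healthy_regions):
    """Failover policy: answer with the primary while it is healthy, else the secondary."""
    primary, secondary = RECORDS
    if primary["region"] in healthy_regions:
        return primary["region"]
    return secondary["region"]


def multivalue_answer(healthy_regions, rng, health_checked=True):
    """Multivalue policy: pick at random among records considered healthy."""
    if health_checked:
        pool = [r for r in RECORDS if r["region"] in healthy_regions] or RECORDS
    else:
        pool = RECORDS  # without health checks every record counts as healthy
    return rng.choice(pool)["region"]


def share(answers):
    """Fraction of answers per region."""
    counts = Counter(answers)
    total = sum(counts.values())
    return {region: counts[region] / total for region in counts}


rng = random.Random(42)
N = 10_000
outage = {"us-west-2"}  # us-east-1 is down

failover_normal = share(failover_answer(ALL_REGIONS) for _ in range(N))
failover_outage = share(failover_answer(outage) for _ in range(N))
multivalue_normal = share(multivalue_answer(ALL_REGIONS, rng) for _ in range(N))
multivalue_unchecked_outage = share(
    multivalue_answer(outage, rng, health_checked=False) for _ in range(N)
)
```

In this model, failover routing sends everything to us-east-1 until it fails and then everything to us-west-2, while multivalue routing splits traffic roughly evenly during normal operation and, without health checks, keeps sending about half of it to the failed region.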

Why not Option D?

  • “Global Lambda functions” don’t exist. Lambda is a regional service. While you can use Lambda@Edge (which runs at CloudFront edge locations), it’s designed for lightweight request/response manipulation, not as a replacement for regional API backends.
  • Same multivalue routing issue as Option B: traffic is spread across both regions instead of failing over from a primary to a standby.
  • This option contains a conceptual error that should immediately disqualify it.

The Architect Blueprint

graph TB
    subgraph "Client Layer"
        Client[Enterprise API Clients]
    end

    subgraph "DNS Layer - Route 53"
        R53[analytics.skymetrics.io<br/>Failover Routing Policy]
        HC1[Health Check - Primary]
        HC2[Health Check - Secondary]
    end

    subgraph "Primary Region - us-east-1"
        APIGW1[API Gateway<br/>Regional Endpoint]
        Lambda1A[Lambda: GetWeather]
        Lambda1B[Lambda: GetForecast]
        Lambda1C[Lambda: GetAlerts]
        DDB1[(DynamoDB Table<br/>Global Table - Primary)]
    end

    subgraph "Secondary Region - us-west-2"
        APIGW2[API Gateway<br/>Regional Endpoint]
        Lambda2A[Lambda: GetWeather]
        Lambda2B[Lambda: GetForecast]
        Lambda2C[Lambda: GetAlerts]
        DDB2[(DynamoDB Table<br/>Global Table - Replica)]
    end

    Client -->|DNS Query| R53
    R53 -->|Primary Healthy| APIGW1
    R53 -.->|Primary Failed| APIGW2
    HC1 -.->|Monitor| APIGW1
    HC2 -.->|Monitor| APIGW2

    APIGW1 --> Lambda1A
    APIGW1 --> Lambda1B
    APIGW1 --> Lambda1C
    Lambda1A --> DDB1
    Lambda1B --> DDB1
    Lambda1C --> DDB1

    APIGW2 --> Lambda2A
    APIGW2 --> Lambda2B
    APIGW2 --> Lambda2C
    Lambda2A --> DDB2
    Lambda2B --> DDB2
    Lambda2C --> DDB2

    DDB1 <-.->|Bi-directional Replication| DDB2

    style APIGW1 fill:#FF9900,stroke:#232F3E,stroke-width:2px
    style APIGW2 fill:#FF9900,stroke:#232F3E,stroke-width:2px
    style DDB1 fill:#4053D6,stroke:#232F3E,stroke-width:2px
    style DDB2 fill:#4053D6,stroke:#232F3E,stroke-width:2px
    style R53 fill:#8C4FFF,stroke:#232F3E,stroke-width:2px

Diagram Note: Under normal operations, Route 53 directs all traffic to us-east-1 based on health check status. Upon primary region failure, DNS automatically resolves to us-west-2, while DynamoDB Global Tables keep the us-west-2 replica current through asynchronous, bi-directional replication.
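
Bi-directional replication raises the question of what happens when both regions write the same item. DynamoDB global tables reconcile concurrent writes with a last-writer-wins rule; the sketch below is a deliberately simplified model of that idea, not DynamoDB's internal implementation. The replica dictionaries, the `station#42` key, and the integer timestamps are all illustrative assumptions.

```python
# Toy model of last-writer-wins reconciliation between two replicas.
# Each replica maps an item key to a (timestamp, value) pair.

def apply_write(replica, key, value, timestamp):
    """Keep the write only if it is newer than what the replica already holds."""
    current = replica.get(key)
    if current is None or timestamp > current[0]:
        replica[key] = (timestamp, value)


def replicate(source, target):
    """Ship every item from source to target, letting the newer write win."""
    for key, (timestamp, value) in source.items():
        apply_write(target, key, value, timestamp)


east, west = {}, {}

# The same item is written in both regions before replication catches up.
apply_write(east, "station#42", {"temp_c": 18.5}, timestamp=100)
apply_write(west, "station#42", {"temp_c": 19.1}, timestamp=105)

# Replication runs in both directions.
replicate(east, west)
replicate(west, east)

converged = east == west
winner = east["station#42"][1]
```

After replication both replicas hold the later write. The earlier write is silently discarded, which is worth remembering when the standby region starts accepting writes during a failover.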


The Decision Matrix

| Option | Est. Complexity | Est. Monthly Cost (10M Requests) | Pros | Cons |
|---|---|---|---|---|
| A | Medium | N/A - Architecturally Invalid | ❌ Conceptual error: edge-optimized endpoints can’t route to multi-region Lambda | Cannot achieve multi-region failover; fundamental misunderstanding of API Gateway capabilities |
| B | Medium | $420/month (dual active API Gateway + Global Tables) | ✅ Full regional stack deployment; ✅ DynamoDB global replication | ❌ Multivalue routing is active-active load distribution, not primary/standby failover; ❌ Without health checks, ~50% of answers still point at the failed region; ❌ Higher cost due to dual-active API requests |
| C | Medium | $240/month (primary active + warm standby) | ✅ True automatic DNS failover; ✅ Cost-efficient warm standby; ✅ Health-check driven; ✅ Industry-standard DR pattern | Requires ~60s DNS TTL propagation for failover (acceptable for most DR scenarios) |
| D | High | N/A - Architecturally Invalid | ❌ “Global Lambda” is not a valid AWS service concept | Same multivalue routing issues as B; adds non-existent service dependencies |

Cost Breakdown (Option C):

  • API Gateway: $3.50/million requests × 10M = $35 (primary only under normal ops)
  • Lambda: ~$0.20 per 1M requests (128MB, 200ms avg) × 10M = $2
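  • Lambda Compute: ~$160/month (about 9.6 million GB-seconds at $0.0000166667 per GB-second, which assumes larger or longer-running functions than the 128 MB, 200 ms profile above)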
  • DynamoDB Global Tables: ~$25/month (write replication for 100 WCU)
  • Route 53: $0.50/month (hosted zone) + $0.50 (health checks)
  • Data Transfer: ~$10/month (inter-region DynamoDB replication)
  • Secondary Region (Standby): $7 (API Gateway and Lambda charges for health-check requests; API Gateway has no fixed monthly fee)

Total: ~$240/month vs. Option B’s ~$420/month (due to dual active-active API invocations).
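
The arithmetic behind the Option C total can be checked with a few lines of Python. The line items are the article's own estimates, not quoted AWS prices. The one added assumption is the per-GB-second Lambda rate of $0.0000166667, used to show what the 128 MB, 200 ms profile alone would cost in compute.

```python
# Line items from the Option C cost breakdown (USD per month, 10M requests).
LINE_ITEMS = {
    "api_gateway_requests": 3.50 * 10,   # $3.50 per million requests
    "lambda_requests": 0.20 * 10,        # $0.20 per million requests
    "lambda_compute": 160.00,
    "dynamodb_global_tables": 25.00,
    "route53": 0.50 + 0.50,              # hosted zone + health checks
    "data_transfer": 10.00,
    "secondary_standby": 7.00,
}

OPTION_B_ESTIMATE = 420.00  # the article's figure for the active-active option

total_option_c = sum(LINE_ITEMS.values())
monthly_saving = OPTION_B_ESTIMATE - total_option_c


def lambda_compute_cost(requests, memory_gb, duration_s, price_per_gb_second=0.0000166667):
    """Compute GB-seconds and cost for a given request profile (assumed rate)."""
    gb_seconds = requests * memory_gb * duration_s
    return gb_seconds, gb_seconds * price_per_gb_second


# The 128 MB / 200 ms profile at 10M requests, for comparison with the $160 line item.
profile_gb_seconds, profile_cost = lambda_compute_cost(10_000_000, 0.125, 0.2)
```

The line items sum to $240, a $180 monthly difference from the Option B estimate. The 128 MB, 200 ms profile works out to about 250,000 GB-seconds, or roughly $4 of compute, so the $160 line item implies a considerably heavier workload.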


Real-World Application (Practitioner Insight)

Exam Rule

For the SAP-C02 exam, when you see:

  • “Multi-region API failover” + “automatic” → Think Route 53 Failover Routing
  • “Edge-optimized endpoint” → Understand it routes clients through CloudFront edge locations for lower latency, NOT multi-region backend routing
  • “Multivalue answer” → Recognize it’s for simple load distribution, NOT DR failover
  • DynamoDB cross-region DR → Always use Global Tables (bi-directional, automatic)

Real World

In production at SkyMetrics-scale companies, we’d layer additional considerations:

  1. Active-Active vs. Active-Passive Decision:

    • If clients are truly global (EU + US), consider geoproximity routing with active-active regions for latency optimization
    • Current solution (failover) is optimized for North America with DR, not global latency
  2. RTO Optimization:

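    • Route 53 health checks run every 30s (standard) or every 10s (fast, at extra cost)
    • An endpoint is marked unhealthy only after several consecutive failed checks (the failure threshold, 3 by default), so failover time ≈ check interval × failure threshold + DNS TTL: roughly 90s with fast checks and a 60s TTL, or 150s with standard checks
    • For faster failover that does not depend on DNS caching, consider AWS Global Accelerator (adds ~$0.025/hour + data transfer); note that it targets load balancers, EC2 instances, and Elastic IPs, so the regional APIs would need to sit behind a supported endpoint type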
  3. Lambda Cold Start Mitigation:

    • Secondary region Lambda functions will have cold starts during failover
    • Use Provisioned Concurrency (adds ~$15/month per function) for critical endpoints
    • Or accept 1-3s cold start latency as acceptable DR trade-off
  4. Cost Governance:

    • Implement CloudWatch Alarms on secondary region API Gateway invocations
    • Alert if secondary is receiving traffic during non-failover (indicates DNS misconfiguration)
    • Use AWS Cost Anomaly Detection to catch unexpected global table replication costs
  5. Testing Discipline:

    • Schedule quarterly DR drills by failing primary health check manually
    • Test not just failover, but fail-back to primary (often forgotten)
    • Validate DynamoDB global table replication lag under load
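
The RTO reasoning above can be turned into a quick estimate. This is a rough sketch under stated assumptions: Route 53 marks an endpoint unhealthy after `failure_threshold` consecutive failed checks (3 by default), and resolvers honour the record TTL. The `/health` path is a hypothetical endpoint; the dictionary keys follow the Route 53 `HealthCheckConfig` request shape.

```python
def estimated_failover_seconds(request_interval_s, failure_threshold, dns_ttl_s):
    """Approximate time from outage to clients resolving the standby region."""
    detection_s = request_interval_s * failure_threshold  # consecutive failed checks
    return detection_s + dns_ttl_s                        # plus cached DNS answers expiring


def health_check_config(fqdn, path, request_interval_s=30, failure_threshold=3):
    """Build a Route 53 HealthCheckConfig payload (not submitted to AWS here)."""
    if request_interval_s not in (10, 30):
        raise ValueError("Route 53 supports request intervals of 10 or 30 seconds")
    return {
        "Type": "HTTPS",
        "FullyQualifiedDomainName": fqdn,
        "ResourcePath": path,  # hypothetical health endpoint
        "Port": 443,
        "RequestInterval": request_interval_s,
        "FailureThreshold": failure_threshold,
    }


standard_rto = estimated_failover_seconds(30, 3, 60)
fast_rto = estimated_failover_seconds(10, 3, 60)
fast_check = health_check_config(
    "analytics.skymetrics.io", "/health", request_interval_s=10
)
```

With a 60-second TTL, the estimate is 150 seconds for standard checks and 90 seconds for fast checks. Lowering the failure threshold shortens detection but makes the check more sensitive to transient errors, which is a useful thing to measure during DR drills.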

The exam tests your knowledge of service capabilities. The real world tests your ability to balance cost, risk, and operational burden within business constraints.


Disclaimer

This is a study note based on simulated scenarios for the AWS SAP-C02 exam. It is not an official question from AWS or any certification body. All company names, scenarios, and technical implementations are fictional and designed for educational purposes.
