"We're multi-AZ" is one of those reassuring lines that engineering leaders use in customer security reviews, in board meetings, and in their own heads — sometimes without anyone on the team having actually tested what happens when an AZ fails. Multi-region is the next escalation, often invoked the same way. Both come with a cost, both with a complexity tax, and one of them is honest about which one you're getting.
The three HA postures
- Single-AZ. One availability zone. Cheapest. Lose the AZ, lose the service. Acceptable for non-critical workloads.
- Multi-AZ active-active. Capacity in 2–3 AZs in a single region. Tolerates AZ-level outages. Most common production posture.
- Multi-region. Capacity in 2+ regions, typically active-passive or active-active behind global routing. Tolerates region-level outages, including regional control-plane events.
Cost across three postures
Same workload in every row: a 20-task ECS Fargate fleet, a 100 GB RDS Aurora database, and 2 TB of monthly data transfer. Figures come from a pinpole canvas simulation:
| Posture | Compute | Database | Cross-AZ/region transfer | Total |
|---|---|---|---|---|
| Single-AZ | $1,400 | $280 | $0 | $1,680/mo |
| Multi-AZ (2) | $1,400 | $560 (Multi-AZ Aurora) | $640 (cross-AZ replicas) | $2,600/mo |
| Multi-AZ (3) | $1,400 | $840 | $1,100 | $3,340/mo |
| Multi-region active-passive | $2,800 (warm) | $840 (cross-region replica) | $1,400 + $1,400 inter-region | $6,440/mo |
| Multi-region active-active | $2,800 | $1,400 | $2,800 | $7,000/mo |
The number that surprises people is the cross-AZ data transfer. In dense microservice architectures, every call between services in different AZs costs $0.01/GB in each direction. Because a single external request fans out into several internal hops, a service mesh serving 2 TB/month can easily route 6+ TB cross-AZ if you aren't careful.
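As a rough illustration of that arithmetic, the sketch below prices a given volume of cross-AZ mesh traffic. It assumes decimal terabytes (1 TB = 1,000 GB) and reads "each direction" as a charge on both the sending and the receiving side, so the effective rate is double the per-direction rate. The function name is hypothetical, and the result covers mesh traffic only, so it is not meant to reproduce the transfer column in the table.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes


def cross_az_cost(cross_az_tb: float, rate_per_gb_each_direction: float = 0.01) -> float:
    """Monthly cross-AZ transfer cost in dollars.

    Assumes each gigabyte is billed once on the way out of the source AZ
    and once on the way into the destination AZ.
    """
    return cross_az_tb * GB_PER_TB * rate_per_gb_each_direction * 2


# 6 TB of cross-AZ mesh traffic per month
print(f"${cross_az_cost(6):,.2f}/mo")
```

Retries and replication traffic are not modeled here; they would add to the volume passed in.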
Cross-AZ data transfer: the silent killer
What teams don't notice
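A 50-service microservice mesh with random AZ placement routes ~70% of inter-service calls cross-AZ by default: with three AZs, a randomly placed callee lands in a different AZ from its caller about two times out of three. Each direction bills, and retries compound the cost.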
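The cross-AZ share can be sanity-checked with a few lines of Python. This is a minimal sketch under a simplifying assumption: every service is placed in one AZ uniformly at random, and every caller/callee pair is equally likely to talk. Function names and the trial count are illustrative.

```python
import random


def expected_cross_az_fraction(num_azs: int) -> float:
    """Chance that a callee sits in a different AZ than its caller,
    assuming both are placed uniformly at random."""
    return 1 - 1 / num_azs


def simulated_cross_az_fraction(num_services=50, num_azs=3, trials=200, seed=0):
    """Monte Carlo check: place services at random, then count the share of
    ordered caller -> callee pairs that straddle two AZs."""
    rng = random.Random(seed)
    cross = total = 0
    for _ in range(trials):
        placement = [rng.randrange(num_azs) for _ in range(num_services)]
        for i, caller_az in enumerate(placement):
            for j, callee_az in enumerate(placement):
                if i == j:
                    continue  # a service calling itself never leaves its AZ
                total += 1
                cross += caller_az != callee_az
    return cross / total


print(f"2 AZs, analytic:  {expected_cross_az_fraction(2):.0%}")
print(f"3 AZs, analytic:  {expected_cross_az_fraction(3):.0%}")
print(f"3 AZs, simulated: {simulated_cross_az_fraction():.0%}")
```

The analytic value is 50% for two AZs and about 67% for three. Real meshes have uneven call graphs, so the actual share depends on which services are chatty and where they land.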
The fix
AZ-aware service discovery (AWS Cloud Map, Istio locality load balancing). Pin chatty services to the same AZ. Accept slightly weaker AZ-failure isolation in exchange for a lower bill.
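The core idea behind AZ-aware routing is small enough to sketch. The code below is not the Cloud Map or Istio API; `Endpoint` and `pick_endpoint` are hypothetical names, and the AZ labels are placeholders. It prefers a healthy endpoint in the caller's own AZ and falls back to any healthy endpoint only when the local AZ has none.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    address: str
    az: str
    healthy: bool = True


def pick_endpoint(endpoints, caller_az):
    """Prefer a healthy endpoint in the caller's AZ; otherwise cross AZs."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        raise LookupError("no healthy endpoints available")
    local = [e for e in healthy if e.az == caller_az]
    # Cross-AZ traffic (and its transfer charge) happens only on fallback.
    return random.choice(local or healthy)


endpoints = [
    Endpoint("10.0.1.5", "az-a"),
    Endpoint("10.0.2.7", "az-b"),
]
print(pick_endpoint(endpoints, caller_az="az-a").address)
```

The fallback is where the isolation trade-off shows up: if the local copies are overloaded rather than unhealthy, this policy keeps sending them traffic, so a production setup would typically also weigh load or capacity per AZ.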
The honest assessment
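For many B2B SaaS workloads, well-architected Multi-AZ is enough. In the example above, multi-region roughly doubles the three-AZ bill ($6,440 to $7,000/mo versus $3,340/mo), and at larger scale it becomes a five-figure-per-month line item, all for an availability requirement most products don't actually have.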
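To put numbers on that assessment, the totals from the cost table can be recomputed and compared against the three-AZ posture. The figures are copied from the table; the choice of Multi-AZ (3) as the baseline is an assumption made for illustration.

```python
# (compute, database, cross-AZ/region transfer) in dollars per month,
# copied from the cost table above.
postures = {
    "Single-AZ": (1400, 280, 0),
    "Multi-AZ (2)": (1400, 560, 640),
    "Multi-AZ (3)": (1400, 840, 1100),
    "Multi-region active-passive": (2800, 840, 1400 + 1400),
    "Multi-region active-active": (2800, 1400, 2800),
}

totals = {name: sum(parts) for name, parts in postures.items()}
baseline = totals["Multi-AZ (3)"]  # assumed comparison point

for name, total in totals.items():
    print(f"{name:30s} ${total:>5,}/mo  {total / baseline:.2f}x Multi-AZ (3)")
```

Swapping in your own line items gives a quick read on how much of the jump to multi-region comes from duplicated compute versus transfer.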
When each posture is actually right
- Single-AZ — dev, staging, batch, anywhere RTO/RPO of "a few hours" is fine.
- Multi-AZ — production for almost everything. Tolerates the failure modes that actually happen in AWS.
- Multi-region — regulated industries with RTO <30 minutes, global products serving multiple geographies, or genuinely critical infrastructure. Not "we want to be safe."
When did your team last actually exercise an AZ failover? Multi-AZ that has never been tested is a configuration claim, not a reliability property. If you've paid for HA for two years without testing it, you've bought an expensive label.
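A paper check is a reasonable first step before a real drill, though it does not replace one. The sketch below asks whether the remaining AZs still hold enough tasks to serve peak load after any single AZ disappears. The function name, AZ labels, and the peak requirement are hypothetical; only the 20-task fleet size comes from the example workload.

```python
def survives_az_loss(tasks_per_az: dict[str, int], required_tasks: int) -> dict[str, bool]:
    """For each AZ, report whether the fleet still meets `required_tasks`
    if that AZ is lost and nothing is rescheduled."""
    total = sum(tasks_per_az.values())
    return {
        az: total - count >= required_tasks
        for az, count in tasks_per_az.items()
    }


# 20 tasks split across two AZs, with an assumed peak need of 14 tasks.
two_az = {"az-a": 10, "az-b": 10}
print(survives_az_loss(two_az, required_tasks=14))

# The same 20 tasks spread across three AZs.
three_az = {"az-a": 7, "az-b": 7, "az-c": 6}
print(survives_az_loss(three_az, required_tasks=14))
```

Under these assumed numbers the two-AZ layout is left with 10 tasks after a loss and falls short, while the three-AZ layout is left with 13 or 14, so it holds only if the smallest AZ is the one that fails. A real failover test also covers what this ignores: rescheduling time, database promotion, and client reconnects.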
Simulating HA cost on pinpole
The pinpole canvas models AZ and region placement explicitly. Drag services into AZs, configure replicas, and the cost simulator surfaces the per-AZ data-transfer cost alongside compute and database. Toggle to multi-region and see the inter-region replication line appear. Most teams discover their actual HA cost in the first run.
HA you don't test is a line on the invoice, not a property of the system.
Simulate Multi-AZ and Multi-Region postures on the canvas. See the cross-AZ cost surface before you commit.