Every team that has ever survived a Black Friday or a successful product launch has a war story. Every team that hasn't survived one also has a war story. The difference between the two groups isn't talent or budget — it's whether anyone simulated the spike before it arrived. Steady-state load tests catch the wrong class of problem. Spikes catch the right one.
What spike traffic actually breaks
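- Lambda regional concurrency limit. Default 1,000 per region. A 5,000-RPS spike on functions with a 200 ms average duration needs about 1,000 concurrent executions (5,000 × 0.2 s), which is the entire default quota, so it hits the wall in seconds.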
- API Gateway throttling defaults. 10,000 RPS default per account per region, shared across every API in that region. Often left untouched. Burst on top of sustained traffic blows through it.
- ALB pre-warming. ALBs scale on a slope of minutes, not seconds. A 10× spike that lands in 30 seconds will see 5xxs before scaling catches up.
- Downstream cascades. The frontend scales; the auth service doesn't. The auth service starts throttling; clients retry; the retry storm consumes capacity that should be serving real traffic.
- Connection pool exhaustion. The Lambda or container fleet scales; each new instance opens a connection to RDS; you blow the connection limit and the database goes hostile.
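The arithmetic behind the concurrency and retry-storm bullets can be sketched in a few lines. This is a back-of-envelope model, not a description of how AWS meters anything: the function names are hypothetical, and the retry model assumes every failed attempt is retried immediately and fails with the same probability.

```python
def required_concurrency(rps: float, avg_duration_s: float) -> float:
    """Little's law: requests in flight = arrival rate x average time in system."""
    return rps * avg_duration_s


def retry_amplified_rps(base_rps: float, failure_rate: float, max_retries: int) -> float:
    """Offered load when every failed attempt is retried immediately.

    Assumption: each attempt fails independently with the same probability f,
    so expected attempts per request are 1 + f + f^2 + ... + f^max_retries.
    """
    attempts_per_request = sum(failure_rate ** k for k in range(max_retries + 1))
    return base_rps * attempts_per_request


if __name__ == "__main__":
    # 5,000 RPS at 200 ms lands on the default 1,000-execution quota.
    print(required_concurrency(5_000, 0.2))
    # The same traffic with a function slowed to 1 s needs five times as much.
    print(required_concurrency(5_000, 1.0))
    # Half of all attempts failing, two retries each: 1.75x the offered load.
    print(retry_amplified_rps(5_000, 0.5, 2))
```

The second call is the one that matters during a spike: concurrency demand scales with duration, so a slow downstream dependency can exhaust the quota without any increase in traffic.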
Four spike patterns worth simulating
Sale-launch spike
0→10× in <30 seconds, sustained 10× for 15 minutes, then taper. Tests instant-scale capability of every layer.
Email-blast wave
Multiple 3× spikes over an hour as email batches deliver. Tests retry storm tolerance and warm-pool stability.
Geographic cascade
Wave moving from one timezone to the next. Tests regional scaling and connection pool churn.
Influencer flash
0→50× in 60 seconds, sustained 5 minutes, gone. Tests circuit breakers and graceful-degradation paths.
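For teams that want to replay these shapes in their own load generator, three of the patterns can be written down as simple multiplier curves. This is a minimal sketch: the names are hypothetical, the curves start at 1× baseline rather than zero, and the taper lengths and email batch timings are assumptions, since the descriptions above do not pin them down. The geographic cascade is left out because it depends on the regional traffic split.

```python
# A profile is a list of (seconds since start, multiple of baseline RPS) points.
Profile = list[tuple[float, float]]

# Ramp to 10x in 30 s, hold 15 min, then taper over an assumed 5 min.
SALE_LAUNCH: Profile = [(0, 1), (30, 10), (930, 10), (1230, 1)]
# Ramp to 50x in 60 s, hold 5 min, then vanish over an assumed 30 s.
INFLUENCER_FLASH: Profile = [(0, 1), (60, 50), (360, 50), (390, 1)]


def multiplier_at(profile: Profile, t: float) -> float:
    """Linearly interpolate the traffic multiplier at time t (seconds)."""
    if t <= profile[0][0]:
        return float(profile[0][1])
    if t >= profile[-1][0]:
        return float(profile[-1][1])
    for (t0, m0), (t1, m1) in zip(profile, profile[1:]):
        if t0 <= t <= t1:
            return m0 + (m1 - m0) * (t - t0) / (t1 - t0)
    raise ValueError("profile points must be sorted by time")


def email_blast_multiplier(t: float, batch_every_s: float = 900.0,
                           batch_lasts_s: float = 180.0, peak: float = 3.0) -> float:
    """Square 3x pulses over one hour; batch spacing and length are assumed."""
    if 0 <= t < 3600 and (t % batch_every_s) < batch_lasts_s:
        return peak
    return 1.0


def rps_at(profile: Profile, t: float, baseline_rps: float) -> float:
    """Target request rate at time t for a given baseline."""
    return baseline_rps * multiplier_at(profile, t)
```

Sampling `rps_at(SALE_LAUNCH, t, baseline_rps)` once per second gives a schedule most load tools can consume as a list of target rates.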
A simulation walkthrough
Take a typical e-commerce checkout: CloudFront → API Gateway → Lambda → DynamoDB + RDS (orders) + Stripe (external). On the pinpole canvas, set the traffic source to Spike at 5,000 RPS peak from a 200 RPS baseline. Run the simulation.
What surfaces:
- Lambda concurrency hits 1,000 within 12 seconds. API Gateway starts returning 429s on overflow.
- DynamoDB shows throttling on a hot partition. Order write throughput is bottlenecked by a single partition key (likely the customer ID).
- RDS connection count spikes as Lambda scales. The simulator surfaces "approaching connection limit at 850/1000."
- Stripe call latency grows as the external service throttles. Lambda functions hang waiting for the response, consuming concurrency, accelerating the death spiral.
None of this happens in steady-state load tests. All of it happens in production launches.
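To see why the order of failure shifts when Stripe slows down, it helps to express each layer's limit as the request rate at which it saturates. The sketch below is a toy model and not the pinpole simulator: the `Layer` type is hypothetical, the limits are the defaults quoted in this article, and it assumes every in-flight Lambda execution holds exactly one RDS connection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    limit: float          # capacity in the layer's own unit
    units_per_rps: float  # units consumed per request/second of traffic

    @property
    def breaking_rps(self) -> float:
        """Request rate at which this layer runs out of capacity."""
        return self.limit / self.units_per_rps


def breaking_order(layers: list[Layer]) -> list[Layer]:
    """Layers sorted by the request rate at which they saturate, lowest first."""
    return sorted(layers, key=lambda layer: layer.breaking_rps)


def checkout_layers(lambda_duration_s: float) -> list[Layer]:
    """Assumed limits for the checkout path; duration drives concurrency."""
    return [
        Layer("API Gateway throttle", limit=10_000, units_per_rps=1.0),
        Layer("Lambda concurrency", limit=1_000, units_per_rps=lambda_duration_s),
        # Assumption: one RDS connection per concurrent execution.
        Layer("RDS connections", limit=1_000, units_per_rps=lambda_duration_s),
    ]


if __name__ == "__main__":
    for duration_s in (0.2, 1.0):  # healthy Stripe vs. degraded Stripe
        print(f"Lambda duration {duration_s}s")
        for layer in breaking_order(checkout_layers(duration_s)):
            print(f"  {layer.name}: saturates near {layer.breaking_rps:,.0f} RPS")
```

Under these assumptions Lambda and RDS saturate together, well before API Gateway, and a fivefold increase in function duration cuts the sustainable request rate by the same factor.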
What you actually fix before the launch
- Request a regional Lambda concurrency limit raise (it's free, takes ~24 hours).
- Pre-warm Provisioned Concurrency to the floor of the expected spike. Auto-scale on top.
- Configure API Gateway throttling explicitly. Don't rely on regional defaults.
- Put RDS Proxy in front of the database to pool connections, and move Stripe-style external calls behind a queue with an explicit DLQ.
- Add timeouts and circuit breakers on every external call. The default Lambda timeout (3 seconds) is rarely the right number for downstream-aware design.
- Pre-warm ALBs by ramping traffic over the 30 minutes before the launch.
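The circuit-breaker item is the one most often left as a good intention, so here is one way it might look. This is a minimal, single-threaded sketch with hypothetical class names and default thresholds, not a production library; it assumes the wrapped call already enforces its own timeout and raises when that timeout expires.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock          # injectable so tests can control time
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                # Fail fast instead of holding a Lambda execution open.
                raise CircuitOpenError("circuit open; failing fast")
            # Reset window elapsed: fall through and allow one trial call.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens it.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

In the checkout example, the Stripe call would be wrapped as `breaker.call(charge_card, order)`, where `charge_card` is a hypothetical function that sets an explicit request timeout. While the circuit is open, the handler returns a degraded response immediately instead of consuming concurrency waiting on a throttled dependency.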
No major launch is approved until a Spike simulation passes at 1.5× the expected peak. If it fails, the launch slips. The cost of one delayed launch is small. The cost of one failed launch, in lost revenue, refunds, and reputational damage, is the next quarter's roadmap.
Simulating spikes on pinpole
The canvas Spike traffic pattern propagates the spike through every node in the architecture and surfaces failure modes per-service. You see exactly where the architecture breaks, in what order, and at what RPS. The AI recommendations engine flags the fix for each. Run, fix, re-run. By the time the launch arrives, every prevented failure is documentation, not a war story.
Black Friday is a deterministic problem. Simulate it before it simulates you.