
What happens when your app goes viral — simulated

PinPole Engineering · April 2026

Your app gets featured on Hacker News. Or Product Hunt. Or someone with 2 million followers tweets about it. Traffic goes from 100 requests per second to 100,000 in under an hour.

Your architecture was designed for 100 RPS. Maybe 1,000 on a good day. Not 100,000. What breaks?

We simulated this exact scenario using PinPole's spike traffic pattern against a standard serverless architecture. Here's the timeline of what happens as traffic ramps from 100 to 100,000 RPS.

The architecture under test

API Gateway, Lambda (512MB, 10s timeout, 1,000 concurrency), DynamoDB (on-demand), SQS, and a second Lambda for async processing. Default configurations. The architecture most startups ship with.
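For reference, here is roughly what that baseline looks like as an AWS CDK stack. This is a sketch, not the exact stack we simulated; the runtime, handler paths, and construct names are placeholders.

```typescript
import { Stack, StackProps, Duration, aws_lambda as lambda, aws_apigateway as apigw,
         aws_dynamodb as dynamodb, aws_sqs as sqs, aws_lambda_event_sources as sources } from 'aws-cdk-lib';
import { Construct } from 'constructs';

export class BaselineStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Synchronous API handler: 512 MB, 10 s timeout, sharing the default 1,000 account concurrency.
    const apiFn = new lambda.Function(this, 'ApiFn', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'api.handler',
      code: lambda.Code.fromAsset('lambda'),
      memorySize: 512,
      timeout: Duration.seconds(10),
    });

    // API Gateway proxies every route to the handler.
    const api = new apigw.LambdaRestApi(this, 'Api', { handler: apiFn });

    // On-demand DynamoDB table.
    const table = new dynamodb.Table(this, 'Table', {
      partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
    });
    table.grantReadWriteData(apiFn);

    // SQS queue feeding a second Lambda for async work.
    const queue = new sqs.Queue(this, 'WorkQueue');
    const workerFn = new lambda.Function(this, 'WorkerFn', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'worker.handler',
      code: lambda.Code.fromAsset('lambda'),
    });
    workerFn.addEventSource(new sources.SqsEventSource(queue));
  }
}
```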

The failure timeline

100 RPS
Everything works. Lambda concurrency at 30/1,000. DynamoDB at 2% capacity. Cost: $47/month. This is where you live normally.
1,000 RPS
Still fine. Lambda concurrency at 300/1,000. DynamoDB's on-demand mode handles it smoothly. Cost jumps to $340/month. Your scaling looks good.
3,500 RPS
First warning. Lambda concurrency at 1,050/1,000. You've exceeded the limit. Throttling begins. 5% of requests are rejected. API Gateway returns 429 errors. Your users see loading spinners.
10,000 RPS
Lambda is overwhelmed. 3,000 concurrent executions needed, only 1,000 allowed. 67% of requests throttled. API Gateway timeout errors cascade. SQS queue depth at 45,000 and growing. The async Lambda can't keep up with the backlog.
50,000 RPS
DynamoDB on-demand throttles. Even on-demand has burst limits. Write throttling on hot partition keys. The errors compound: Lambda retries push concurrency higher, which causes more throttling, which causes more retries.
100,000 RPS
Complete cascade failure. API Gateway returning 5xx on 89% of requests. SQS queue depth at 2.3 million. Lambda throttled at 90%. Estimated monthly cost at this rate: $28,400. Your viral moment is a production outage.
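If you want to sanity-check the concurrency numbers in that timeline: required Lambda concurrency is roughly arrival rate times average request duration (Little's law). Assuming an average duration of about 300 ms, which is what the figures above imply:

```typescript
// Required Lambda concurrency ≈ arrival rate × average request duration (Little's law).
const limit = 1_000;              // default account concurrency
const avgDurationSeconds = 0.3;   // assumed ~300 ms per request, implied by the numbers above

for (const rps of [100, 1_000, 3_500, 10_000]) {
  const needed = rps * avgDurationSeconds;
  const throttledShare = Math.max(0, 1 - limit / needed);
  console.log(`${rps} RPS -> ~${Math.round(needed)} concurrent, ~${Math.round(throttledShare * 100)}% throttled`);
}
// 100 RPS   -> ~30 concurrent,   ~0% throttled
// 1000 RPS  -> ~300 concurrent,  ~0% throttled
// 3500 RPS  -> ~1050 concurrent, ~5% throttled
// 10000 RPS -> ~3000 concurrent, ~67% throttled
```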

The before and after

Error rate: 89% before the fixes → 0.2% after
SQS backlog: 2.3M before the fixes → 340 after

Five fixes that make it survivable

PinPole's recommendation engine flagged five changes. We applied each one and re-simulated:

1. Increase Lambda concurrency to 30,000

Request a concurrency limit increase from AWS; the default 1,000 is a soft limit, and increases are typically approved within hours. Enable provisioned concurrency for the first 5,000 concurrent executions to eliminate cold starts during the spike.
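In CDK terms, the provisioned-concurrency half of this fix is a one-construct change inside the stack sketched above; the account-level limit increase itself goes through AWS Service Quotas, not code. apiFn is the API handler from the baseline sketch.

```typescript
// The jump from 1,000 to 30,000 account-level concurrency is a Service Quotas request, not code.
// Provisioned concurrency keeps 5,000 execution environments warm so the spike hits no cold starts.
new lambda.Alias(this, 'LiveAlias', {
  aliasName: 'live',
  version: apiFn.currentVersion,
  provisionedConcurrentExecutions: 5_000,
});
```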

2. Add API Gateway throttling at 80,000 RPS

Set a throttle limit slightly below your backend's capacity. Better to return 429 (rate limited) to 20% of users than 500 (server error) to 89%. Rate limiting is graceful degradation.
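Roughly, the baseline's LambdaRestApi definition gains a deployOptions block; the burst limit below is our assumption, not a number from the simulation.

```typescript
// Shed load at the edge: anything above 80,000 RPS gets a 429 instead of cascading into 500s.
const api = new apigw.LambdaRestApi(this, 'Api', {
  handler: apiFn,
  deployOptions: {
    throttlingRateLimit: 80_000,   // steady-state requests per second
    throttlingBurstLimit: 10_000,  // assumed burst allowance
  },
});
```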

3. Enable DynamoDB DAX caching

Add a DAX cluster for read-heavy access patterns. This reduces DynamoDB read load by 80%+ and eliminates hot-partition throttling for reads.
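A hedged CDK sketch of the cluster side, assuming an existing VPC (vpc) and the baseline table (table); the node type and replication factor are placeholder choices.

```typescript
import { aws_dax as dax, aws_iam as iam } from 'aws-cdk-lib'; // added to the stack's imports

// Role DAX assumes to read and write the backing table.
const daxRole = new iam.Role(this, 'DaxRole', {
  assumedBy: new iam.ServicePrincipal('dax.amazonaws.com'),
});
table.grantReadWriteData(daxRole);

// Three-node cluster in the app's private subnets.
const daxSubnets = new dax.CfnSubnetGroup(this, 'DaxSubnets', {
  subnetIds: vpc.privateSubnets.map(s => s.subnetId),
});
new dax.CfnCluster(this, 'DaxCluster', {
  iamRoleArn: daxRole.roleArn,
  nodeType: 'dax.r5.large',
  replicationFactor: 3,
  subnetGroupName: daxSubnets.ref,
});
```

The application then sends reads through the DAX SDK client endpoint rather than the plain DynamoDB endpoint (which also means the Lambda runs inside that VPC); writes pass through to the table.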

4. Add CloudFront in front of API Gateway

Cache static and semi-static responses. At 100k RPS, even a 10-second cache TTL reduces origin load dramatically.
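Something like the following, assuming api is the REST API from the baseline sketch; the TTL bounds are illustrative.

```typescript
import { aws_cloudfront as cloudfront, aws_cloudfront_origins as origins } from 'aws-cdk-lib'; // added imports

// Even a 10-second TTL on cacheable responses collapses most of the origin load at 100k RPS.
new cloudfront.Distribution(this, 'Cdn', {
  defaultBehavior: {
    origin: new origins.RestApiOrigin(api),
    cachePolicy: new cloudfront.CachePolicy(this, 'ShortTtlCache', {
      minTtl: Duration.seconds(0),
      defaultTtl: Duration.seconds(10),
      maxTtl: Duration.seconds(30),
    }),
  },
});
```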

5. Increase SQS Lambda batch size and concurrency

The async processor was falling behind because it was processing one message at a time. Batch size of 10 with reserved concurrency of 5,000 clears the backlog in minutes instead of hours.
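In the CDK sketch, the worker function and its SQS event source would change roughly like this; the reserved concurrency of 5,000 assumes the account limit increase from fix 1 is already in place.

```typescript
// Larger batches and high reserved concurrency let the async worker drain the backlog quickly.
const workerFn = new lambda.Function(this, 'WorkerFn', {
  runtime: lambda.Runtime.NODEJS_18_X,
  handler: 'worker.handler',
  code: lambda.Code.fromAsset('lambda'),
  reservedConcurrentExecutions: 5_000,    // assumes the fix-1 account limit increase is in place
});
workerFn.addEventSource(new sources.SqsEventSource(queue, {
  batchSize: 10,                          // up to 10 messages per invocation
  maxBatchingWindow: Duration.seconds(1), // assumed short window to keep latency low
}));
```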

The lesson: Your architecture doesn't need to handle 100,000 RPS today. But you need to know exactly what will break when that traffic shows up, and have a plan to fix each breaking point. PinPole gives you that plan before the viral moment arrives.

Simulate your viral moment

Build your architecture. Run a spike simulation at 100x your current traffic. Find the breaking points. Apply the fixes. Re-simulate. Know that when the moment comes, your infrastructure will survive it.

Find your breaking points before your users do

Spike simulation from 10 RPS to 100M RPS. No deployed infrastructure needed.

Start for free →