Run traffic at 100M RPS against your architecture design.
A real-time tick engine updates every node and edge on your canvas every 100 milliseconds while traffic runs. Stateful service models track Lambda concurrency and cold starts, Kinesis shards, DynamoDB capacity, and SQS queue depth — before a single resource exists.
Not a calculator. A running system.
Simulation isn't a spreadsheet estimate — it's a live loop that pushes traffic through your actual canvas topology and updates every node while it runs.
A 100ms loop updates every node and edge on the canvas while the simulation runs — load, latency, errors, and cost tick live.
Logarithmic RPS slider from quiet dev environments to global-scale production traffic.
Set direct RPS, or express load as total requests per day, week, or month — Pinpole converts it to steady RPS.
Control the simulation mid-run. Edit traffic configuration while paused, then resume against the new profile.
Per-tick state for Lambda concurrency and cold starts, Kinesis shards, DynamoDB capacity, SQS queue depth, and more.
ELB, Route 53, CloudFront, and Global Accelerator model round-robin and failover routing across your graph.
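Two of the features above lend themselves to a short sketch: converting a request total into steady RPS, and tracking per-tick state such as queue depth. The code below is a minimal illustration, not Pinpole's engine. The function names, the 30-day month, and the simple arrive-then-drain queue model are all assumptions made for the example.

```python
from dataclasses import dataclass

TICK_SECONDS = 0.1  # the 100 ms tick interval described above
SECONDS_PER_DAY = 24 * 60 * 60

# Assumed period lengths; the 30-day month is an illustrative choice.
PERIOD_SECONDS = {
    "day": SECONDS_PER_DAY,
    "week": 7 * SECONDS_PER_DAY,
    "month": 30 * SECONDS_PER_DAY,
}


def steady_rps(total_requests: float, period: str) -> float:
    """Spread a request total evenly over the period and return requests per second."""
    if period not in PERIOD_SECONDS:
        raise ValueError(f"unknown period: {period}")
    return total_requests / PERIOD_SECONDS[period]


@dataclass
class QueueState:
    depth: float = 0.0  # messages currently waiting in the queue


def tick(state: QueueState, arrival_rps: float, drain_rps: float,
         dt: float = TICK_SECONDS) -> float:
    """Advance the queue by one tick and return the messages processed in it."""
    arrived = arrival_rps * dt
    # Consumers can process at most drain_rps * dt messages per tick.
    processed = min(state.depth + arrived, drain_rps * dt)
    state.depth += arrived - processed
    return processed


def simulate_queue(arrival_rps: float, drain_rps: float, seconds: float) -> float:
    """Run the tick loop for the given duration and return the final queue depth."""
    state = QueueState()
    for _ in range(round(seconds / TICK_SECONDS)):
        tick(state, arrival_rps, drain_rps)
    return state.depth
```

Under these assumptions, 2,592,000,000 requests per month works out to 1,000 RPS, and `simulate_queue(steady_rps(2_592_000_000, "month"), 800, 10)` shows a backlog of roughly 2,000 messages building up in ten seconds when consumers drain only 800 per second.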
Model the load your architecture will actually see.
Steady state, organic growth, launch spikes, and daily cycles — each pattern surfaces different failure modes.
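As a rough sketch, each of the four patterns in this section can be written as a function from elapsed time to RPS. The function names and parameters below are hypothetical, and the cosine shape for the oscillating pattern is an assumption; the page only says that load oscillates between baseline and peak.

```python
import math


def steady(t: float, target_rps: float) -> float:
    """Steady state: fixed RPS for the whole run."""
    return target_rps


def ramp(t: float, target_rps: float, duration: float) -> float:
    """Organic growth: linear increase from zero to target over `duration` seconds."""
    if duration <= 0:
        return target_rps
    fraction = min(max(t / duration, 0.0), 1.0)
    return target_rps * fraction


def spike(t: float, baseline_rps: float, peak_rps: float, spike_at: float) -> float:
    """Launch moment: instantaneous jump from baseline to peak at `spike_at` seconds."""
    return peak_rps if t >= spike_at else baseline_rps


def wave(t: float, baseline_rps: float, peak_rps: float, period: float) -> float:
    """Daily cycle: oscillate between baseline and peak once per `period` seconds."""
    midpoint = (baseline_rps + peak_rps) / 2
    amplitude = (peak_rps - baseline_rps) / 2
    # Start at the baseline (t = 0) and reach the peak half a period later.
    return midpoint - amplitude * math.cos(2 * math.pi * t / period)
```

Sampling any of these once per tick gives the arrival rate that the rest of a simulation would consume.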
Steady-state load
Fixed RPS validates baseline behavior at your expected operating throughput. Size quotas, validate limits, and model baseline cost.
Sizing · quota validation · baseline cost
Organic growth
Load increases linearly from zero to target RPS. Models growth and validates auto-scaling trigger points before you commit.
Capacity planning · scaling validation
Launch & viral moments
Instantaneous jump to peak RPS. Stress-tests concurrency limits, queue overflow, and cold-start cascades — the pattern that finds production incidents.
Launch prep · concurrency testing
Daily & batch cycles
Oscillating load between baseline and peak. Models payroll windows, Black Friday cycles, and periodic batch workloads.
Auto-scaling · on-demand vs provisioned
See bottlenecks where they happen: on the canvas.
Every node shows load %, sparklines, RPS, latency, error rate, and health status while the simulation runs. Edges animate with live throughput labels. Alerts fire with actionable suggestions.
Load %, 6-second sparklines, RPS, latency, error rate, and status — healthy, scaling, throttled, or error.
Critical throttling alerts and high-load or elevated-error warnings, each with actionable suggestions.
P50 / P95 / P99 computed across the whole graph, not just individual services.
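One way to read "across the whole graph" is to pool latency samples from every node before taking percentiles, rather than reporting each service separately. The sketch below assumes that reading and uses the nearest-rank method; both are illustrative choices, and the function names are hypothetical.

```python
import math


def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def graph_percentiles(per_node_latencies: dict[str, list[float]]) -> dict[str, float]:
    """Pool latency samples (ms) from all nodes and report P50, P95, and P99."""
    pooled = [ms for samples in per_node_latencies.values() for ms in samples]
    return {f"p{p}": percentile(pooled, p) for p in (50, 95, 99)}
```

A graph-wide P99 computed this way can be dominated by one slow service, which is the kind of signal a per-service view can hide.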
Real cost under real load — not calculator guesses.
The cost model runs inside the simulation, so your estimate reflects your actual traffic pattern, not a vendor calculator's steady-state assumption.
Live $/sec, $/hour, $/day, and $/month projections update in the Simulation panel as the run progresses.
Expandable list sorted by load — each node shows cost per second and service-specific metrics.
Dedicated cost handlers for Lambda, API Gateway, DynamoDB, SQS, SNS, S3, ElastiCache, CloudFront, ELB, Kinesis, and more.
Token and unit pricing for foundation models and AgentCore services, so AI workload costs are modeled like any other service.
Derived from your current RPS so the dollar figure always has request-volume context.
Spike and wave patterns produce different bills than constant load. The estimate reflects the pattern you chose.
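The projections and the pattern-dependent bill described above come down to simple arithmetic, sketched here under stated assumptions: a flat per-request price, a 30-day month, and a 100 ms integration step. None of these names or prices come from Pinpole or from any provider's price list.

```python
from typing import Callable

TICK_SECONDS = 0.1  # assumed integration step, matching a 100 ms tick
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR
SECONDS_PER_MONTH = 30 * SECONDS_PER_DAY  # assumption: 30-day month


def project(cost_per_second: float) -> dict[str, float]:
    """Scale a live $/sec figure to $/hour, $/day, and $/month."""
    return {
        "per_second": cost_per_second,
        "per_hour": cost_per_second * SECONDS_PER_HOUR,
        "per_day": cost_per_second * SECONDS_PER_DAY,
        "per_month": cost_per_second * SECONDS_PER_MONTH,
    }


def run_cost(rps_at: Callable[[float], float], seconds: float,
             price_per_request: float) -> float:
    """Sum request cost tick by tick for a traffic pattern given as rps_at(t)."""
    total = 0.0
    for i in range(round(seconds / TICK_SECONDS)):
        t = i * TICK_SECONDS
        total += rps_at(t) * TICK_SECONDS * price_per_request
    return total
```

Passing a spike or oscillating `rps_at` function instead of a constant one changes the total, which is why an estimate tied to the chosen pattern can differ from a steady-state figure.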
Dedicated tick processors, service by service.
Each of these has its own behavioral model in the live engine: capacity, throttling, and pricing. Anything else runs through a generic passthrough model.
Recommendations ranked by severity.
After each simulation, Pinpole analyzes your architecture and returns findings with specific services, problems, and fixes — apply with one click.
Lambda / event-processor — Concurrency limit (1,000) will be exceeded at 1,240 RPS. Recommended: increase reserved concurrency to 8,000.
Firestore / events-collection — Write rate at peak exceeds hot-key limits. Estimated additional cost: $620/mo. Recommended: shard document paths.
Azure Functions / async-handler — Cold-start latency may affect p99 by ~340ms at peak. Consider Premium plan for latency-sensitive paths.
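A finding like the Lambda example above can be sanity-checked with Little's law: average in-flight requests equal arrival rate times average time per request. The sketch below is not Pinpole's model; it ignores cold starts and bursts, and the average duration used in the usage note is a hypothetical value chosen only so the numbers line up with the example.

```python
def required_concurrency(rps: float, avg_duration_s: float) -> float:
    """Little's law estimate of concurrent executions at a given request rate."""
    return rps * avg_duration_s


def max_rps(concurrency_limit: float, avg_duration_s: float) -> float:
    """Request rate at which the estimated concurrency reaches the limit."""
    if avg_duration_s <= 0:
        raise ValueError("average duration must be positive")
    return concurrency_limit / avg_duration_s
```

With an assumed average duration of about 0.806 seconds (1000 / 1240), `max_rps(1000, 1000 / 1240)` comes out at roughly 1,240 RPS, the point where a 1,000-concurrency limit would be reached.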
Every recommendation can be applied with one click — Pinpole patches node config or adds new services (an SQS buffer, a CloudFront CDN) and wires the edges for you. See the full Optimization page for how the recommendation engine works.
Every run is evidence you can share.
Runs are saved with traffic config, duration, peak RPS, cost breakdown, node metrics, and recommendations.
Your last 50 runs are mirrored to the browser, so recent evidence is always one click away.
Share a tokenized interactive simulation canvas via URL — no login needed to view. Used by MCP and external tools.
AI agents can run batch simulations through the API and return an inline simulation canvas directly in chat.
Capture the animated simulation as a GIF for demos, docs, and architecture review decks.
Each run stores the exact canvas state, so you can compare versions and roll back regressions.
Cost forecasting
Live monthly cost estimates update as the simulation runs: per service, under load, not from a static calculator.
Failure analysis
Find timeout chains, throttling limits, and queue saturation before your users do.
Execution history
Every simulation is logged with a full architecture snapshot. Compare runs and share evidence in architecture reviews.
Stop guessing.
Start simulating.
No infrastructure spend · No cloud account required to simulate