Recommendations engine - part of the PinPole optimize workflow

Architecture warnings, fixed before
you spend a dollar on AWS.

After a simulation run, PinPole's AI engine analyses your architecture and returns a prioritised set of findings - missing services, misconfigured concurrency, structural risks - each with a rationale, an expected impact, and a one-click implementation button.

Try it free · Read the docs
No AWS account required · 3 AI calls/month on Free · Unlimited on Pro, Team, Enterprise · 14-day free trial - no card required
Enable Provisioned Concurrency - reduce cold starts by 90%
Add CloudFront distribution - absorb cacheable reads at edge
Implement SQS async processing - decouple write path from Lambda
Add circuit breaker pattern - prevent cascade failures from DynamoDB throttling
Add DynamoDB DAX caching layer - reduce read latency by up to 10×
Configure Lambda auto scaling - concurrency limits grow with load
Implement exponential backoff - resilient retries on downstream degradation

Run a simulation.
Get Recommendations.
Refresh as you iterate.

The AI engine reads your architecture canvas and the live simulation results together. It doesn't scan a static diagram - it sees what actually happened under load and surfaces findings specific to your services, your configuration, and your traffic pattern.

01
Run your simulation
Set your target RPS and traffic pattern - Constant, Ramp, Spike, or Wave - and start the simulation. The panel shows live per-node health, current RPS, elapsed time, and a running monthly cost estimate. Let it stabilise before moving to recommendations.
02
Select Get Recommendations
Hit Get Recommendations in the simulation panel. The engine analyses your canvas and simulation results and returns a ranked set of findings - typically 4–8 items for a new architecture. Each finding includes a severity level, a category, a full rationale, and an expected impact.
03
Expand, read, implement
Expand any recommendation to read the full rationale before accepting. When ready, click Implement - the change is applied to the canvas immediately. New services are added and wired. Configuration changes are applied to the relevant node.
04
Re-simulate and Refresh Recommendations
After implementing a change, re-run the simulation to confirm the expected effect on latency, cost, and health. Then use Refresh Recommendations to re-analyse the updated architecture. New findings may emerge that were not visible before - the engine works against the current state.
Simulation - 1.0K RPS · Constant
0.5s elapsed
Current RPS
1.0K
Est. Cost
$12,029
/mo
Node Metrics
api-gateway healthy  1,021 RPS · 10ms
cloudfront healthy  1,042 RPS · 16.7ms
lambda healthy  0 RPS · 156.3ms
ec2 healthy  0 RPS · 10ms
rds healthy  0 RPS · 10ms
Get Recommendations

Prioritised findings.
One-click implementation.

Every recommendation is categorised, severity-ranked, and expandable. Read the full rationale, review the expected impact, then implement directly from the panel - no manual canvas edits required.

Refresh Recommendations (6)
Recommendations 6
⚠ WARNING modify config
Enable Provisioned Concurrency for Request Processor Lambda

The Request Processor Lambda experiences a significant number of cold starts (1,488), which increases latency to 156.3ms. Enabling provisioned concurrency keeps a set number of Lambda instances initialised and ready to respond, reducing cold-start latency and improving response times during traffic spikes.

Impact
Reduces latency and cold start delays for Request Processor Lambda, improving overall API responsiveness during spikes.
Expected
Reduce latency by up to 50%, decrease cold starts by 90%
+ 2 more recommendations
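The figures on a card like this can be sanity-checked with a simple blended-latency model. The sketch below is illustrative, not PinPole's internal calculation - the request count, warm latency, and cold-start latency are assumed values chosen to roughly reproduce the 156.3ms figure:

```python
def avg_latency_ms(requests, cold_starts, warm_ms, cold_ms):
    """Average latency as a traffic-weighted blend of warm and cold invocations."""
    cold_frac = cold_starts / requests
    return cold_frac * cold_ms + (1 - cold_frac) * warm_ms

# Assumed window: 8,000 requests, 1,488 of them cold (matching the card),
# with ~10ms warm path and ~800ms cold start.
before = avg_latency_ms(8_000, 1_488, warm_ms=10, cold_ms=800)  # ≈ 156.9 ms
# Provisioned concurrency keeps instances initialised: model a 90% cold-start cut.
after = avg_latency_ms(8_000, 149, warm_ms=10, cold_ms=800)     # ≈ 24.7 ms
```

Even under these rough assumptions, cutting cold starts by 90% removes most of the blended latency - which is why the card's "reduce latency by up to 50%" estimate is plausible.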

Recommendations are returned in severity order - address WARNINGs before INFOs. Each card shows enough context to act, with the full rationale one expand away. The Implement button applies the change immediately: services are added, wires are drawn, configuration is updated.

modify config
Configuration adjustments
Concurrency settings, memory allocation, caching TTLs, timeouts, SQS visibility windows - changes to existing node configuration.
add service
Missing services
Services that would materially improve performance, resilience, or cost. CloudFront for edge caching, DAX for DynamoDB reads, SQS for write path decoupling.
architecture
Structural changes
Circuit breakers, async decoupling, retry patterns, fan-out topology - changes to how services connect and communicate.
scaling
Auto-scaling gaps
Missing or misconfigured auto-scaling policies that will cause services to hit hard concurrency or throughput limits under increasing load.
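The retry patterns proposed under the architecture category follow a standard shape: capped exponential backoff with jitter. A minimal sketch (the function name and defaults are illustrative, not PinPole's API):

```python
import random

def backoff_delays(attempts, base_s=0.1, cap_s=10.0, seed=None):
    """Capped exponential backoff with full jitter.

    Each retry waits a random duration between 0 and min(cap, base * 2^attempt),
    which spreads retries out and avoids synchronised retry storms.
    """
    rng = random.Random(seed)
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap_s, base_s * 2 ** attempt)
        delays.append(rng.uniform(0, ceiling))  # full jitter
    return delays
```

The jitter matters as much as the exponent: without it, every client that failed at the same moment retries at the same moment, re-creating the spike that caused the downstream degradation.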

WARNING first.
Then INFO.

Every finding is assigned a severity level that signals urgency. The ordering is deliberate - address WARNINGs before INFOs, and re-simulate between each change.

WARNING

Flags a configuration that poses a real failure risk or a significant cost inefficiency at the simulated load. These findings require action before deployment - they are not advisory.

Examples:
Lambda throttling at target RPS
API Gateway as single point of failure
Missing CloudFront on high-read API
Lambda concurrency limit will be exceeded
INFO

An improvement opportunity that is not blocking but should be addressed before production - these findings represent meaningful performance or cost gains that are not urgent at current load.

Examples:
Lambda provisioned concurrency not set
DynamoDB on-demand mode suboptimal
SQS visibility timeout too short
No exponential backoff on retry paths
Apply one recommendation at a time. Batch-applying all recommendations in one pass obscures which change had the most impact. Apply WARNING items, re-simulate, then review INFO items in the new state. Refreshing recommendations after each re-simulation ensures the engine is always analysing the current architecture.
Typical Lambda API optimization sequence
01
Add CloudFront WARNING
Reduces API Gateway load and p99 latency. Absorbs cacheable reads at edge before they reach Lambda.
02
Introduce Circuit Breaker WARNING
Prevents Lambda cascade failures when downstream services (DynamoDB, RDS) degrade under load.
03
Enable Provisioned Concurrency INFO
Pre-initialises Lambda execution environments, eliminating cold start latency under spike traffic.
04
Implement SQS Async Processing INFO
Decouples the write path. API Gateway accepts requests into the queue; Lambda processes at its own pace.
05
Configure Auto Scaling INFO
Ensures concurrency limits scale with load. Required for architectures serving variable or unpredictable traffic.
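The circuit breaker in step 02 is a small state machine - closed, open, half-open. A generic illustration of the pattern (not PinPole's implementation), where the breaker fails fast while open and allows a single probe call after a cool-down:

```python
import time

class CircuitBreaker:
    """Trip open after `max_failures` consecutive errors; probe again after `reset_s`."""

    def __init__(self, max_failures=3, reset_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_s = reset_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # timestamp when the circuit tripped, or None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_s:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one probe through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # trip the circuit
            raise
        self.failures = 0  # any success closes the circuit
        return result
```

While the circuit is open, Lambda stops hammering the degraded DynamoDB or RDS dependency and returns errors immediately - which is what prevents the cascade.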

Read the rationale.
Click Implement.
Re-simulate.

No manual canvas edits. No digging through documentation to figure out how to wire a new service. The Implement button applies the recommendation directly - new services appear on the canvas, connections are drawn, and affected node configuration is updated.

What Implement does

The canvas updates
automatically.

Each recommendation maps to a specific, reversible change on the canvas. Implement applies it immediately - the change is visible in the canvas, and the simulation state is reset so your next run reflects the updated architecture.

Add service recommendations - a new service node appears on the canvas, wired to the correct upstream and downstream services according to the recommended topology.
Modify config recommendations - the relevant node's configuration is updated in place. Open the Node Configuration panel to review the applied change.
Architecture recommendations - structural changes are applied: new connection types are added, existing wires may be re-routed, and protective patterns (circuit breakers, retry logic) are configured on the affected nodes.
After implementing any recommendation, re-run the simulation before accepting the next one. Confirm the expected effect on latency, cost, and health - then use Refresh Recommendations to re-analyse the updated state.
⚠ WARNING add service
Add CloudFront distribution in front of API Gateway
API Gateway is receiving the full 1,041 RPS directly. A CloudFront distribution at the edge will absorb cacheable requests, reduce API Gateway load, and lower end-user latency through edge PoP routing.
Expected: Reduce API Gateway load by ~60%, latency by 40ms at edge
Implement
↓  canvas updated
Applied to canvas
CloudFront node added to canvas
Wired: CloudFront → API Gateway
Latency routing tag applied to node
PriceClass_100 configured
New service added & wired
For add service recommendations, the new node appears on the canvas connected to the correct upstream and downstream services. No manual wiring required.
Node configuration updated
For modify config recommendations, the affected node's properties are updated in place. Open Node Configuration to inspect the change before re-running.
Architecture restructured
For architecture recommendations, connections are re-routed, protective patterns are applied, and async paths are introduced - exactly as specified in the recommendation rationale.

Get more from
every recommendation cycle.

A few habits that make the optimize loop measurably more effective - drawn from the recommendation patterns that emerge most frequently in real simulation sessions.

01 -
One recommendation at a time
Apply a single recommendation, re-simulate, then review the next. Batch-applying all recommendations obscures which change had the most impact, making future debugging harder and the iteration history less useful.
02 -
Read the rationale before implementing
Expand each recommendation and read the full rationale. Recommendations are contextual - understanding why a change is being proposed helps you assess whether the trade-off is appropriate for your specific use case and traffic profile.
03 -
Use Refresh Recommendations after canvas changes
The engine analyses the current state of your canvas. After significant changes - whether from implementing a recommendation or making manual edits - use Refresh Recommendations to ensure the next set of findings reflects the updated architecture.
04 -
Dismiss irrelevant recommendations explicitly
If a recommendation doesn't apply to your use case, dismiss it rather than leaving it open. This keeps your recommendation history accurate and ensures future Refresh calls surface only relevant findings - not repeated noise from items you've already considered.
05 -
Run clean simulations after Lambda changes
After changing Lambda concurrency settings, stop the simulation fully and start a new run rather than resuming. Concurrency changes are applied at simulation initialisation - resuming a paused run will not pick up the new values, making the subsequent recommendation analysis unreliable.
06 -
Do not skip recommendations on a zero-alert run
A clean simulation with no WARNING alerts confirms the architecture handles load without breaching limits. It does not mean the architecture is optimized. Proceed to Recommendations even when alerts are zero - INFO-level findings often surface meaningful cost and latency improvements that aren't visible in the health indicators.
Plan availability

Recommendations are available on all plans. Free includes 3 AI calls per month. Pro, Team, and Enterprise include unlimited calls. Additional AI credits are available as add-ons on any paid plan.

3 calls / mo - Free
Unlimited calls / mo - Pro+
$0.03 per call add-on

Your next architecture
already has warnings.
Find them first.

Design on the canvas, run a simulation, and get recommendations before a dollar is spent on AWS. Every finding comes with a one-click fix.

No AWS account required to simulate · 14-day free trial on all paid plans · Cancel anytime