Event-Driven Streaming Engineering · April 2026 · 9 min read AWS

SQS vs Kinesis vs EventBridge vs MSK: simulating event-driven architectures at 100K and 1M events/sec

Staff Engineer, Platform Real-time analytics SaaS April 2026

Pick the wrong messaging system for an event-driven workload and you do not find out at design review. You find out three months in, at 02:00, when the cost line on the bill quietly doubles or a downstream consumer falls hopelessly behind. SQS, Kinesis, EventBridge and MSK all do something that looks superficially similar — but at 100K events per second their economics, ordering guarantees, and operational profiles diverge sharply.

What each tool is actually for

Cost at 100K and 1M events/sec

The figures below come from a pinpole canvas simulation with a 1 KB average event size, single-consumer-group fan-out, and 7-day retention where applicable:

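| Service             | 100K events/sec | 1M events/sec | Ordering      | Replay        |
|---------------------|-----------------|---------------|---------------|---------------|
| SQS Standard        | ~$1,040/mo      | ~$10,400/mo   | None          | No            |
| Kinesis (on-demand) | ~$2,300/mo      | ~$23,000/mo   | Per-shard     | Yes           |
| EventBridge         | ~$2,600/mo      | ~$26,000/mo   | None          | 7-day archive |
| MSK (m7g.large × 3) | ~$680/mo        | ~$3,400/mo    | Per-partition | Yes           |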

MSK looks dramatically cheaper at high volume, and on raw infrastructure it is. The gap closes once you factor in the operational tax: someone has to actually run Kafka. For most teams at 100K events/sec, SQS or Kinesis is still the right answer.

When each one actually wins

SQS wins

Work distribution between services, retry-heavy workloads, idempotent consumers, no ordering requirement. Cheapest at low-to-medium throughput, zero ops.
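
The "idempotent consumers" requirement matters because SQS Standard delivers at least once, so the same message can arrive twice. A minimal sketch of a deduplicating wrapper follows. The `id` field name and the in-memory set are assumptions for illustration; a real consumer would track processed IDs in a durable store.

```python
from typing import Callable, Dict, Set


class IdempotentConsumer:
    """Wraps a handler so a redelivered message is applied at most once."""

    def __init__(self, handler: Callable[[Dict], None]) -> None:
        self._handler = handler
        # Assumption: in-memory tracking, lost on restart. Illustrative only.
        self._seen: Set[str] = set()

    def process(self, message: Dict) -> bool:
        """Return True if the handler ran, False if the message was a duplicate."""
        message_id = message["id"]  # hypothetical field name
        if message_id in self._seen:
            return False
        self._handler(message)
        # Record the ID only after the handler succeeds, so a failed
        # attempt is retried on redelivery instead of being dropped.
        self._seen.add(message_id)
        return True
```

Recording the ID after the handler returns is a deliberate choice: if the handler raises, the message stays unprocessed and a retry can still apply it.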

Kinesis wins

Real-time analytics, CDC pipelines, anything that needs replay or fan-out to multiple consumer groups, ordered per-key.
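
"Ordered per-key" follows from how Kinesis routes records: the partition key is hashed with MD5 into a 128-bit value, and each shard owns a range of that hash space. The sketch below assumes the shards split the hash space evenly, which is not guaranteed after resharding; `shard_for_key` is a hypothetical helper, not an AWS API.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    # Scale the 128-bit hash into [0, shard_count).
    return hash_key * shard_count // HASH_SPACE
```

Every record with the same key lands on the same shard, which is what preserves its order. It is also why one very busy key cannot be spread across shards.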

EventBridge wins

SaaS event ingest, cross-account orchestration, when you want declarative routing rules without writing code. Not a high-throughput pipe.

MSK wins

Sustained millions of events/sec, existing Kafka skills on the team, advanced semantics (exactly-once, transactional). Highest fixed-cost floor, lowest marginal cost per event.

Three expensive traps the simulation catches

  1. Using SQS for streaming. Engineers treat SQS as a stream because it is the default they know. At fan-out time they realise there is no replay, and rebuild on Kinesis at 4× the cost they would have paid up front.
  2. Using EventBridge as a high-throughput bus. EventBridge's per-event cost ($1 per million) looks fine until you hit a million events per minute. That is roughly 43.2 billion events in a 30-day month, or about $43k/mo on routing alone.
  3. Under-sharding Kinesis. Provisioned Kinesis with too few shards looks cheap and works fine until a hot partition key pushes a single shard past its write limit (1 MB/s or 1,000 records/s) and it starts throttling. Pre-deployment spike simulation surfaces this; production discovers it during a launch.
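
The arithmetic behind traps 2 and 3 is short enough to check by hand. The sketch below assumes a 30-day month, the $1 per million EventBridge price quoted above, and the provisioned Kinesis per-shard write limits of 1 MB/s and 1,000 records/s (taking 1 MB as 1,000,000 bytes). The function names are illustrative, not part of any AWS SDK.

```python
import math

SECONDS_PER_MONTH = 60 * 60 * 24 * 30   # assumption: 30-day month
SHARD_MAX_RECORDS_PER_SEC = 1_000        # provisioned per-shard write limit
SHARD_MAX_BYTES_PER_SEC = 1_000_000      # 1 MB/s, taken as decimal megabytes


def eventbridge_monthly_cost(events_per_sec: float,
                             usd_per_million: float = 1.0) -> float:
    """Per-event routing cost for a month at a steady event rate."""
    events_per_month = events_per_sec * SECONDS_PER_MONTH
    return events_per_month / 1_000_000 * usd_per_million


def min_kinesis_shards(events_per_sec: float, avg_event_bytes: int) -> int:
    """Lower bound on shards, assuming perfectly even key distribution."""
    by_records = math.ceil(events_per_sec / SHARD_MAX_RECORDS_PER_SEC)
    by_bytes = math.ceil(events_per_sec * avg_event_bytes / SHARD_MAX_BYTES_PER_SEC)
    return max(by_records, by_bytes, 1)


def single_key_exceeds_shard(key_events_per_sec: float,
                             avg_event_bytes: int) -> bool:
    """True if one partition key alone would overrun a shard's write limit."""
    return (key_events_per_sec > SHARD_MAX_RECORDS_PER_SEC
            or key_events_per_sec * avg_event_bytes > SHARD_MAX_BYTES_PER_SEC)
```

At a million events per minute, `eventbridge_monthly_cost(1_000_000 / 60)` comes out at roughly 43,200, which matches the $43k/mo figure. `min_kinesis_shards` is only a floor: if `single_key_exceeds_shard` is true for your busiest key, adding shards will not stop the throttling, because that key still maps to one shard.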

A heuristic

"Do I need to replay this stream in three months?" is the single highest-signal question. If the answer is yes, you want Kinesis or MSK. If no, SQS is almost always the cheaper and simpler answer. EventBridge is a routing layer, not a pipe.
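
As a rough sketch, the heuristic can be written as a small decision function. The parameter names and the tie-break between Kinesis and MSK (existing Kafka skills on the team) are simplifying assumptions drawn from the sections above, not a complete selection procedure.

```python
def pick_messaging_service(needs_replay: bool,
                           needs_declarative_routing: bool = False,
                           team_runs_kafka: bool = False) -> str:
    """First-pass choice of messaging layer; a simplification, not a rule."""
    if needs_replay:
        # Replay rules out SQS; MSK only pays off if someone can run Kafka.
        return "MSK" if team_runs_kafka else "Kinesis"
    if needs_declarative_routing:
        # EventBridge is a routing layer, not a high-throughput pipe.
        return "EventBridge"
    return "SQS"
```

Replay is checked first because it is the one requirement that cannot be retrofitted cheaply; the other inputs only refine the answer.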

Simulating it yourself

On the pinpole canvas, drop the producer, the messaging service, and the consumer. Set the event rate, average size, retention, and fan-out. The simulator returns per-service throughput, latency through the pipe, and live monthly cost. Swap the messaging node to compare. The "right tool" usually announces itself within three runs.

The wrong messaging layer is one of the most expensive AWS mistakes.

Compare SQS, Kinesis, EventBridge and MSK at your real event rate before you commit a single line of producer code.
