tool
Apache Kafka
Kafka is a log. Not a queue, and not a routing broker in the RabbitMQ sense: it is a distributed, append-only commit log that consumers read at their own pace and can rewind.
The model
- Producers append messages to topics, partitioned for parallelism.
- Messages are written to an append-only log and stay there for the retention window (7 days, 30 days, forever — configurable).
- Consumers track their own offset (position in the log, per partition). Any consumer can rewind at any time, as far back as retention allows.
- The log is the history. New consumers can join later and replay everything.
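The model above fits in a few lines of Python. This is a toy in-memory sketch, not a client for a real cluster: `Topic` and `Consumer` are made-up names, and `crc32` stands in for Kafka's actual partitioner, which uses a different hash.

```python
import zlib
from collections import defaultdict


class Topic:
    """Toy in-memory topic: a fixed set of append-only partition logs."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Same key -> same partition, so ordering holds per key.
        # crc32 is a stand-in; Kafka's default partitioner hashes differently.
        p = zlib.crc32(key.encode()) % len(self.partitions)
        self.partitions[p].append(value)
        return p, len(self.partitions[p]) - 1  # (partition, offset)


class Consumer:
    """Tracks its own position per partition; reading never mutates the log."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)  # partition -> next offset to read

    def poll(self, partition, max_records=10):
        start = self.offsets[partition]
        records = self.topic.partitions[partition][start:start + max_records]
        self.offsets[partition] = start + len(records)
        return records

    def seek(self, partition, offset):
        # Rewinding is just moving the consumer's own pointer.
        self.offsets[partition] = offset


topic = Topic("orders", num_partitions=1)
for i in range(5):
    topic.append("customer-42", f"event-{i}")

billing = Consumer(topic)
first_read = billing.poll(0)   # reads all five events
billing.seek(0, 0)             # rewind to the start
replayed = billing.poll(0)     # reads the same five again

analytics = Consumer(topic)    # joins late, still sees full history
late_read = analytics.poll(0)
```

Nothing is deleted on read. That is the whole difference from a queue: consumption is a pointer move on the consumer side, so `billing` and `analytics` never interfere with each other.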
When to reach for Kafka
- Event sourcing — the log is the source of truth.
- Stream processing — Kafka Streams, ksqlDB, Flink integrations.
- Inter-team data pipelines — Team A produces; Team B, C, D each consume independently with their own offsets.
- Replay-as-a-feature — ship a new fraud-detection service Tuesday, reset its offset to 30 days ago Wednesday, let it catch up on a month of history by lunchtime. No re-emission needed.
- High throughput — millions of messages per second at the high end.
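The replay case comes down to a timestamp-to-offset lookup: find the first record at or after the target time and start reading there. Real clients expose this lookup (the Java consumer's `offsetsForTimes`, for example). The sketch below only illustrates the idea on a single partition, with a hypothetical `offset_for_time` helper, and assumes record timestamps are sorted in log order. It also only works if the topic's retention covers the window you want to replay.

```python
from bisect import bisect_left
from datetime import datetime, timedelta, timezone


def offset_for_time(timestamps, target):
    """Earliest offset whose timestamp is >= target, or None if there is none.

    Assumes `timestamps` is sorted ascending (one entry per offset).
    """
    i = bisect_left(timestamps, target)
    return i if i < len(timestamps) else None


now = datetime(2025, 6, 4, 12, 0, tzinfo=timezone.utc)

# One record per day for the last 60 days, oldest first:
# offset 0 is 59 days old, offset 59 is from today.
log_timestamps = [now - timedelta(days=d) for d in range(59, -1, -1)]

start = offset_for_time(log_timestamps, now - timedelta(days=30))
to_replay = len(log_timestamps) - start
```

The new service seeks to `start` and reads forward like any other consumer. The producers are not involved at all.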
The catch
Kafka has real operational weight:
- A full Kafka deployment (brokers, KRaft controllers or, before 4.0, a ZooKeeper ensemble, plus schema registry and monitoring) is a system to operate.
- Running it for 3,500 messages a day isn’t an architecture decision — it’s a resume decision.
- You pay for it every time someone has to learn it, tune it, or debug it at 4 AM.
Hosted alternatives such as Confluent Cloud, AWS MSK, and Redpanda Cloud remove most of the ops pain, but the conceptual surface area (partitions, consumer groups, offsets, retention) stays.
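The 3,500-a-day figure is worth turning into a rate. A quick back-of-the-envelope check, taking "millions of messages per second" at its lower bound of one million:

```python
SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day = 3_500
rate = messages_per_day / SECONDS_PER_DAY   # messages per second
gap = SECONDS_PER_DAY / messages_per_day    # seconds between messages

high_end = 1_000_000  # lower bound of "millions per second"
headroom = high_end / rate

print(f"{rate:.3f} msg/s, one message every {gap:.1f} s")
print(f"unused headroom: about {headroom:,.0f}x")
```

That is about 0.04 messages per second, or one message roughly every 25 seconds, against a system built for a load tens of millions of times larger.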
Recent update (2025)
Kafka 4.0 introduced share groups (KIP-932, shipped as early access in that release), which give you queue-style consumption natively. So the old “Kafka can’t do queues” line is out of date, but the log is still the reason to reach for it. If you don’t need replay, you probably don’t need Kafka.
Delivery semantics
Kafka advertises “exactly-once semantics” but with a big asterisk: that guarantee only covers what happens inside the Kafka cluster (producer-to-broker writes plus transactional writes across topics). The moment your consumer writes to an external database or calls an external API, you’re back to idempotent processing on the consumer side. See delivery-guarantees and idempotency.
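One common way to get idempotent processing at that boundary is to record the record's coordinates (topic, partition, offset) in the same database transaction as the side effect. A minimal sketch with SQLite, assuming a hypothetical payments consumer; the table names, the `handle` function, and the message shape are illustrative, not part of any Kafka API.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, cents INTEGER NOT NULL)")
db.execute(
    "CREATE TABLE processed (topic TEXT, part INTEGER, off INTEGER, "
    "PRIMARY KEY (topic, part, off))"
)
db.commit()


def handle(db, topic, partition, offset, account, cents):
    """Apply a payment once, even if the same record is delivered again."""
    try:
        with db:  # one transaction: marker and side effect commit or roll back together
            # Fails with IntegrityError if this record was already processed.
            db.execute(
                "INSERT INTO processed VALUES (?, ?, ?)", (topic, partition, offset)
            )
            db.execute(
                "INSERT OR IGNORE INTO balances (account, cents) VALUES (?, 0)",
                (account,),
            )
            db.execute(
                "UPDATE balances SET cents = cents + ? WHERE account = ?",
                (cents, account),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: already applied, skip it


results = [
    handle(db, "payments", 0, 17, "alice", 500),
    handle(db, "payments", 0, 17, "alice", 500),  # redelivered after a crash or rebalance
    handle(db, "payments", 0, 18, "alice", 250),
]
balance = db.execute(
    "SELECT cents FROM balances WHERE account = 'alice'"
).fetchone()[0]
```

The second delivery of offset 17 is rejected by the primary key, so the balance ends at 750 rather than 1,250. This only works because the marker and the balance update live in the same database; an external API call has no such transaction and needs its own idempotency key.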
When NOT to pick Kafka
- You don’t need replay → use sqs or rabbitmq.
- You need rich per-message routing → use rabbitmq.
- You don’t want to run a cluster → use sqs or a managed RabbitMQ.
See also
- rabbitmq — broker with rich routing, no log
- sqs — managed queue, no replay
- delivery-guarantees — the “exactly-once” asterisk
- idempotency — required at the consumer-to-external-system boundary
- back-pressure — Kafka’s consumer-pull model is inherently back-pressured but partition lag still needs alerting
- bullmq, pgmq — what kula actually uses; Kafka is heavy for personal-scale projects