Real-time data delivery: API, MQTT and streaming patterns
"Real-time" is one of the most over-requested and least-defined requirements in data projects. This guide separates genuine real-time needs from the rest, explains the main delivery patterns, and shows how to design a feed that is reliable as well as fast.
Do you actually need real-time?
Latency should follow the decision. A weekly review does not need a live feed; grid balancing, fraud detection or fleet routing do. Real-time delivery costs more to build and operate, so the first question is what decision the data drives and how quickly it must react.
The main delivery patterns
- Request APIs: the consumer pulls data on demand; simple and well-understood.
- MQTT and lightweight messaging: efficient publish/subscribe for high-volume telemetry and IoT.
- Event streaming: continuous, ordered event delivery for systems that react to changes.
- Webhooks: push notifications on events, a pragmatic middle ground.
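To make the publish/subscribe pattern concrete, here is a minimal in-memory sketch in Python. It is not a real MQTT client or broker: `InMemoryBroker` and `topic_matches` are hypothetical names used only for illustration. The matcher follows the MQTT topic-filter wildcards (`+` for one level, `#` for all remaining levels) in simplified form, ignoring details such as `$`-prefixed topics and filter validation.

```python
from typing import Callable

Handler = Callable[[str, str], None]


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT-style matching: '+' is one level, '#' is the rest."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard is only valid as the last filter level;
            # it matches the remaining levels, including none at all.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class InMemoryBroker:
    """Hypothetical stand-in for a broker: routes messages to subscribers."""

    def __init__(self) -> None:
        self._subscriptions: list[tuple[str, Handler]] = []

    def subscribe(self, topic_filter: str, handler: Handler) -> None:
        self._subscriptions.append((topic_filter, handler))

    def publish(self, topic: str, payload: str) -> int:
        """Deliver to every matching subscriber; return the delivery count."""
        delivered = 0
        for topic_filter, handler in self._subscriptions:
            if topic_matches(topic_filter, topic):
                handler(topic, payload)
                delivered += 1
        return delivered


broker = InMemoryBroker()
received: list[tuple[str, str]] = []
broker.subscribe("plant/+/temperature", lambda t, p: received.append((t, p)))
broker.publish("plant/line1/temperature", "21.5")  # delivered
broker.publish("plant/line1/pressure", "1.02")     # no matching subscriber
```

The publisher never addresses a consumer directly. That decoupling is what distinguishes this pattern from a request API, where the consumer decides when to pull.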
Near-real-time vs real-time
True real-time (sub-second to seconds) is rarer than assumed; near-real-time (seconds to minutes) covers most operational needs at much lower cost and complexity. Being explicit about the latency target avoids over-engineering.
Designing a reliable feed
Speed is worthless without reliability. A robust feed needs defined delivery guarantees, ordering and de-duplication where it matters, back-pressure handling, schema versioning, monitoring and alerting, and a replay or backfill path for gaps. These are the difference between a demo and production.
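As a rough illustration of ordering, de-duplication and gap detection, the sketch below assumes the producer stamps each message with a sequence number starting at 1. That convention, and the `SequencedConsumer` class itself, are assumptions made for this example rather than features of any particular protocol.

```python
from dataclasses import dataclass, field


@dataclass
class SequencedConsumer:
    """Releases payloads in sequence order, drops duplicates, reports gaps."""

    next_seq: int = 1
    delivered: list[str] = field(default_factory=list)
    pending: dict[int, str] = field(default_factory=dict)

    def receive(self, seq: int, payload: str) -> None:
        # Anything already released or already buffered is a duplicate.
        if seq < self.next_seq or seq in self.pending:
            return
        self.pending[seq] = payload
        # Release the longest run of consecutive messages now available.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1

    def missing(self) -> list[int]:
        """Sequence numbers to request from a replay or backfill path."""
        if not self.pending:
            return []
        highest = max(self.pending)
        return [s for s in range(self.next_seq, highest) if s not in self.pending]


consumer = SequencedConsumer()
consumer.receive(1, "a")
consumer.receive(3, "c")   # arrives early, held back
consumer.receive(3, "c")   # duplicate, ignored
gap = consumer.missing()   # sequence 2 has not arrived yet
consumer.receive(2, "b")   # fills the gap, releasing "b" then "c"
```

The list returned by `missing()` is what a consumer would ask the replay path to resend. Without that path, the buffered messages behind a gap would never be released.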
Quality in motion
Streaming data still needs validation. Checks run continuously on the stream rather than once per file, and acceptance criteria cover freshness and completeness over time windows rather than per delivery.
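A windowed acceptance check might look like the sketch below. The thresholds (95% completeness, 30 seconds of staleness), the function name `check_window` and the assumption that the expected message count per window is known in advance are all illustrative choices, not values from any standard.

```python
from dataclasses import dataclass


@dataclass
class WindowReport:
    completeness: float   # received / expected messages in the window
    staleness_s: float    # age of the newest event at the end of the window
    passed: bool


def check_window(
    event_times: list[float],
    window_end: float,
    expected_count: int,
    min_completeness: float = 0.95,   # assumed threshold
    max_staleness_s: float = 30.0,    # assumed threshold
) -> WindowReport:
    """Evaluate one time window of already de-duplicated event timestamps."""
    received = len(event_times)
    completeness = received / expected_count if expected_count else 1.0
    # With no events at all, the window is infinitely stale.
    staleness = window_end - max(event_times) if event_times else float("inf")
    passed = completeness >= min_completeness and staleness <= max_staleness_s
    return WindowReport(completeness, staleness, passed)


# A sensor expected to report every 10 seconds, checked over a 60-second window.
healthy = check_window([1.0, 11.0, 21.0, 31.0, 41.0, 51.0], 60.0, 6)
degraded = check_window([1.0, 11.0, 21.0], 60.0, 6)
```

Here `healthy` passes, while `degraded` fails on both criteria: half the expected messages arrived, and the newest one is 39 seconds old.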
Sourcing and governance
Source latency sets a floor on your delivery latency: a feed cannot be fresher than its source. Where streams carry personal data, aggregation or anonymisation must happen in the pipeline, and security practices aligned with NIS2 and ISO/IEC 27001 principles apply to live infrastructure.
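One simple form of in-pipeline aggregation is to publish only per-group counts and to suppress groups that are too small. The sketch below assumes records arrive as `(zone, user_id)` pairs and uses a minimum group size of 5. The record shape, the threshold and the function name are hypothetical, and whether this level of aggregation is sufficient for a given dataset is a separate assessment.

```python
from collections import defaultdict


def aggregate_distinct_users(
    records: list[tuple[str, str]],
    min_group_size: int = 5,  # assumed suppression threshold
) -> dict[str, int]:
    """Count distinct users per zone and drop zones below the threshold."""
    users_by_zone: dict[str, set[str]] = defaultdict(set)
    for zone, user_id in records:
        users_by_zone[zone].add(user_id)
    # Only the counts leave this function; user identifiers never do.
    return {
        zone: len(users)
        for zone, users in users_by_zone.items()
        if len(users) >= min_group_size
    }


records = [("A", f"user{i}") for i in range(5)]
records.append(("A", "user0"))               # repeat visit, counted once
records += [("B", "user8"), ("B", "user9")]  # too few users to publish
published = aggregate_distinct_users(records)
```

With this input, zone A is published with a count of 5 and zone B is suppressed.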
Choosing a pattern by use case
Match the delivery pattern to how the consumer reacts. A dashboard refreshed each minute is well served by a request API or frequent micro-batches; a fraud or telemetry system that must react to each event needs streaming or MQTT; a system that should be notified of specific changes fits webhooks. The mistake is defaulting to “real-time streaming” for everything, which adds cost and operational burden the use case may not justify.
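The matching described above can be written down as a small decision function. This is only a sketch of the reasoning in this section: the `Reaction` categories and the `high_volume_devices` flag are names invented for the example, and a real decision would weigh more factors.

```python
from enum import Enum


class Reaction(Enum):
    PERIODIC_REFRESH = "periodic_refresh"   # e.g. a dashboard refreshed each minute
    EVERY_EVENT = "every_event"             # e.g. fraud detection or telemetry
    SPECIFIC_CHANGES = "specific_changes"   # e.g. notify when a status changes


def suggest_pattern(reaction: Reaction, high_volume_devices: bool = False) -> str:
    """Map how the consumer reacts to a candidate delivery pattern."""
    if reaction is Reaction.PERIODIC_REFRESH:
        return "request API or micro-batches"
    if reaction is Reaction.SPECIFIC_CHANGES:
        return "webhooks"
    # Per-event reaction: MQTT suits high-volume device telemetry,
    # event streaming suits systems reacting to ordered changes.
    return "MQTT" if high_volume_devices else "event streaming"


dashboard = suggest_pattern(Reaction.PERIODIC_REFRESH)
fleet = suggest_pattern(Reaction.EVERY_EVENT, high_volume_devices=True)
```

Nothing in the function defaults to streaming. The streaming branch is reached only when the consumer genuinely reacts to every event.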
Operational resilience
Low latency is worthless if the feed is fragile. Production-grade real-time delivery needs defined delivery guarantees, ordering and de-duplication where correctness depends on them, back-pressure handling for load spikes, schema versioning, monitoring and alerting, and a replay path so a consumer can recover after an outage without data loss. These are the difference between a demo and a feed an operational system can depend on.
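Back-pressure handling can be sketched with a bounded buffer between producer and consumer. The version below rejects new items when full and counts the rejections, so the producer can slow down or shed load deliberately. The `BoundedBuffer` class and its reject-newest policy are assumptions for illustration; dropping the oldest item or blocking the producer are equally valid policies, depending on the feed.

```python
from collections import deque
from typing import Optional


class BoundedBuffer:
    """Fixed-capacity FIFO buffer that signals when it is full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.rejected = 0  # worth exposing to monitoring and alerting
        self._items: deque[str] = deque()

    def offer(self, item: str) -> bool:
        """Return False when full so the producer can back off."""
        if len(self._items) >= self.capacity:
            self.rejected += 1
            return False
        self._items.append(item)
        return True

    def poll(self) -> Optional[str]:
        """Return the oldest buffered item, or None when empty."""
        return self._items.popleft() if self._items else None


buffer = BoundedBuffer(capacity=2)
accepted = [buffer.offer(m) for m in ("m1", "m2", "m3")]  # third is rejected
first = buffer.poll()             # the consumer drains one item
retry_ok = buffer.offer("m3")     # now there is room again
```

A rejected message is not lost as long as the producer retries it or the consumer later recovers it through the replay path.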
Key takeaways
- Let latency follow the decision; most needs are near-real-time, not real-time.
- Choose the pattern (API, MQTT, streaming, webhooks) to fit the use case.
- Reliability (ordering, replay, monitoring) matters as much as speed.
- You cannot deliver fresher than the source allows.
Sources & further reading
- OASIS: MQTT specification.
- Apache Kafka and CNCF: streaming and event-driven patterns.
- European Commission: The Data Act (real-time access to connected-product data).
- EUR-Lex: Regulation (EU) 2016/679 (GDPR).