Jeff’s Note #
Unlike generic exam dumps, ADH analyzes this scenario through the lens of a Real-World Lead Developer.
For DVA-C02 candidates, the confusion often lies in choosing the right AWS streaming service to support multiple parallel consumers reliably. In production, this is about understanding exactly how playback, parallel processing, and data duplication management work with streaming APIs versus queued or event-driven services. Let’s drill down.
The Certification Drill (Simulated Question) #
Scenario #
Zenovia Analytics is developing a data ingestion platform that captures user behavior events from millions of devices in real-time. This data must be processed simultaneously by several application servers running on Amazon EC2 instances to perform different analyses concurrently. If any processing node crashes or needs to restart, it must resume processing without losing any data. Zenovia also plans to add more processors soon, so minimizing duplicated data consumption across these processors is a priority.
The Requirement: #
Which AWS streaming service should Zenovia use to meet these goals: parallel, real-time data processing on multiple EC2 instances, with reliable checkpoint/restart abilities, minimal duplicate data processing, and easy horizontal scaling?
The Options #
- A) Publish the data to Amazon Simple Queue Service (Amazon SQS).
- B) Publish the data to Amazon Kinesis Data Firehose.
- C) Publish the data to Amazon EventBridge.
- D) Publish the data to Amazon Kinesis Data Streams.
Correct Answer #
D) Publish the data to Amazon Kinesis Data Streams.
Quick Insight: The Developer Imperative #
- Kinesis Data Streams provides ordered, replayable data consumption, with multiple consumer applications reading the same shard in parallel.
- SQS uses a competing-consumers model: each message is typically processed by exactly one consumer, not by multiple processors in parallel.
- Firehose focuses on delivering data to destination stores, not on real-time, multi-consumer parallel processing.
- EventBridge is an event router, not a service for high-throughput stream-data processing.
Content Locked: The Expert Analysis #
You’ve identified the answer. But do you know the implementation details that separate a Junior from a Senior?
The Expert’s Analysis #
Correct Answer #
Option D - Amazon Kinesis Data Streams
The Winning Logic #
Kinesis Data Streams supports multiple consumer applications that read the same data stream independently and at their own pace. Each EC2-based processing application maintains its own shard iterator and checkpoint position, so it can resume after an interruption without data loss. Duplicate processing stays minimal because records remain in the shards until the retention period expires: every consumer can replay data when needed, but its checkpoint ensures each record is normally processed only once per application. This fits Zenovia’s need for parallel, stateful, real-time stream processing and horizontal scalability.
- As a Lead Developer, you’d use the Kinesis Client Library (KCL) or enhanced fan-out consumers to efficiently manage parallel reads and checkpointing.
- The stream model supports exactly this use case in a way that message queues and event buses do not.
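The checkpoint-and-resume behavior described above can be sketched in plain Python. This is a conceptual simulation only, not the KCL API: the `stream` list stands in for a shard, the `checkpoints` dict stands in for the DynamoDB lease table the KCL maintains, and all names are illustrative assumptions.

```python
# Conceptual sketch (not the real KCL API) of how per-consumer checkpoints
# let independent applications read the same stream and resume after a crash.

stream = [f"record-{i}" for i in range(10)]  # stands in for a Kinesis shard

checkpoints = {}  # stands in for the KCL's DynamoDB checkpoint table

def process_batch(app_name, batch_size):
    """Read from this app's own checkpoint, process, then checkpoint."""
    start = checkpoints.get(app_name, 0)
    batch = stream[start:start + batch_size]
    checkpoints[app_name] = start + len(batch)  # checkpoint after processing
    return batch

# Two applications consume the same records independently.
a1 = process_batch("analytics", 4)  # records 0-3
f1 = process_batch("fraud", 7)      # records 0-6, unaffected by "analytics"

# "analytics" crashes and restarts: it resumes from its own checkpoint,
# so no records are lost and none are re-processed.
a2 = process_batch("analytics", 4)  # records 4-7
```

Note how each application's progress is fully independent: that is the property SQS's competing-consumers model cannot give you.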
The Trap (Distractor Analysis): #
- Why not A (SQS)?
  SQS is designed for point-to-point messaging with competing consumers: each message is delivered to only one consumer. It cannot natively let multiple parallel processors consume the same message independently; replicating messages manually invites duplication and loss.
- Why not B (Kinesis Data Firehose)?
  Firehose is a fully managed delivery service that loads data into destinations such as S3 or Redshift. It is not designed for real-time parallel consumption by multiple custom processing applications.
- Why not C (EventBridge)?
  EventBridge routes events to targets for event-driven workflows, but it is not optimized for high-throughput, continuous stream processing with replay and checkpoint control.
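To see why SQS's competing-consumers model disqualifies option A, consider a minimal simulation (illustrative names only, not the SQS API): once one server receives a message, the other server never sees it.

```python
from collections import deque

# Illustrative sketch of why SQS does not fit: competing consumers share one
# queue, and each message is delivered to exactly one of them.
queue = deque(["evt-1", "evt-2", "evt-3", "evt-4"])

def receive(consumer_id, count):
    """Each receive removes messages from the shared queue."""
    return [queue.popleft() for _ in range(min(count, len(queue)))]

seen_by_a = receive("server-a", 2)  # gets evt-1, evt-2
seen_by_b = receive("server-b", 2)  # gets evt-3, evt-4; never sees evt-1/evt-2

# server-b cannot run its own analysis over evt-1 and evt-2. With a Kinesis
# data stream, both servers would read all four records independently.
```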
The Technical Blueprint #
# List available Kinesis data streams with the AWS CLI
aws kinesis list-streams
# Describe the stream to inspect its shards (the unit of parallelism)
aws kinesis describe-stream --stream-name user-behavior-stream
# Typical KCL consumer config enables checkpointing to DynamoDB,
# allowing processors on EC2 instances to resume accurately after restarts.
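Under the hood, a raw (non-KCL) consumer drives a GetShardIterator → GetRecords loop and checkpoints the last sequence number it processed. The sketch below imitates that loop in pure Python with no AWS calls; in a real application you would use boto3's `get_shard_iterator` and `get_records`, and the two helper functions here are simplified stand-ins for those APIs.

```python
# Conceptual stand-in for the Kinesis read loop:
# GetShardIterator -> GetRecords -> NextShardIterator, with checkpointing.

shard = [{"SequenceNumber": str(i), "Data": f"event-{i}"} for i in range(6)]

def get_shard_iterator(after_sequence=None):
    """AFTER_SEQUENCE_NUMBER semantics: resume just past the checkpoint,
    or at the start of the shard if no checkpoint exists yet."""
    return 0 if after_sequence is None else int(after_sequence) + 1

def get_records(iterator, limit):
    """Return a batch of records plus the next iterator position."""
    records = shard[iterator:iterator + limit]
    return records, iterator + len(records)

# Read loop with checkpointing, as the KCL performs on your behalf.
checkpoint = None
it = get_shard_iterator(checkpoint)
collected = []
while True:
    records, it = get_records(it, limit=2)
    if not records:
        break
    collected.extend(r["Data"] for r in records)
    checkpoint = records[-1]["SequenceNumber"]  # checkpoint last processed
```

If the processor restarts, calling `get_shard_iterator(checkpoint)` picks up exactly where it left off, which is the replay/resume guarantee the scenario requires.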
The Comparative Analysis (Developer Focus) #
| Option | API Complexity | Performance | Use Case |
|---|---|---|---|
| A) SQS | Simple Send/Receive APIs | Good for one-to-one message delivery | Competing consumers, no easy parallel reading |
| B) Firehose | Managed Delivery with no consumer API | High throughput, no consumer read API | ETL/Analytics data delivery to S3/Redshift |
| C) EventBridge | Rich Event Routing API | Highly scalable event routing | Event bus for decoupled event-driven designs |
| D) Kinesis Data Streams | Moderate, requires KCL or SDK | Real-time, ordered, replayable streams | Parallel stream processing with checkpointing |
Real-World Application (Practitioner Insight) #
Exam Rule #
For the exam, always pick Kinesis Data Streams when you see multiple consumers needing reliable replay and checkpointing for real-time streaming data.
Real World #
In production, if you only had one consumer or needed to simplify ingestion to storage, Firehose might be easier. But for multi-consumer parallel processing scenarios with guaranteed checkpointing and minimal duplication, Kinesis Data Streams is the clear choice.
Stop Guessing, Start Mastering #
Disclaimer #
This is a study note based on simulated scenarios for the DVA-C02 exam.