Jeff’s Note #
Unlike generic exam dumps, ADH analyzes this scenario through the lens of a Real-World Lead Developer.
For AWS DVA-C02 candidates, the confusion often lies in how to effectively monitor and alert on custom application-level metrics within serverless architectures. In production, this is about knowing exactly how to instrument Lambda functions to track and signal error thresholds without excessive latency or operational overhead. Let’s drill down.
The Certification Drill (Simulated Question) #
Scenario #
NeoRetail, an online retail startup, built a serverless order fulfillment application using AWS Lambda. Their Lambda function continuously processes customer orders and calls a third-party payment service API to complete transactions. During stress testing, developers noticed that this external payment API occasionally responds with timeouts or errors. NeoRetail expects that some payment attempts will fail intermittently.
The operations team requires near real-time alerts—only if the payment API error rate exceeds 5% of total payment attempts within any one-hour window. The team also wants to reuse an existing Amazon SNS topic already configured to notify support engineers.
The Requirement: #
Design an efficient and scalable solution to measure payment API failures, calculate error rate hourly, and trigger alerts through the existing SNS topic only when errors surpass 5%.
The Options #
- A) Log payment API call results from Lambda into CloudWatch Logs. Use CloudWatch Logs Insights queries to analyze logs. Schedule Lambda to periodically check logs and publish notifications to the existing SNS topic when errors exceed threshold.
- B) Emit custom CloudWatch metrics from the Lambda function that count payment API failures. Set up a CloudWatch alarm on this metric to send notification to the existing SNS topic when the error rate crosses 5%.
- C) Publish all payment API call results to a new SNS topic. Subscribe support team members directly to this new topic for failure notifications.
- D) Store API call outcomes in Amazon S3. Schedule Amazon Athena queries to calculate hourly error rate. Configure Athena to send alerts to the existing SNS topic when error percentage exceeds 5%.
Correct Answer #
B
Quick Insight: The Developer Monitoring Imperative #
The key differentiator here is the use of native, custom CloudWatch metrics to track errors in near real-time, combined with CloudWatch alarms directly integrated with SNS for alerting. This approach ensures lower latency and operational overhead compared to relying on logs or batch analytics systems.
The Expert’s Analysis #
Correct Answer #
Option B
The Winning Logic #
Publishing two custom CloudWatch metrics from the Lambda function, one for payment API failures and one for total attempts, lets CloudWatch aggregate both counts in near real time. A CloudWatch alarm can then use metric math to compute the error rate (failures divided by attempts) over a one-hour period and notify the existing SNS topic as soon as that rate exceeds 5%.
- This approach leverages AWS native monitoring tools designed for low latency alerting.
- Instrumentation within Lambda ensures metrics are recorded per invocation.
- CloudWatch Alarms automatically handle evaluation periods and thresholds.
- Direct SNS integration avoids adding extra polling layers or batch processing delays.
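
The alarm described above can be sketched as a set of `PutMetricAlarm` parameters. This is a minimal sketch, not NeoRetail's actual configuration: the metric names and namespace are the ones used in the handler snippet later in this note, while the alarm name, the function name, and the SNS topic ARN are placeholders chosen for illustration. It also uses a single clock-aligned one-hour period; a rolling window would need shorter periods and a different evaluation setup.

```javascript
// Builds PutMetricAlarm parameters for a 5% hourly payment error-rate alarm.
// Assumptions: metrics are published as PaymentApiErrors / PaymentApiCalls in
// the NeoRetail/Payment namespace; the alarm name is a made-up placeholder.
function buildErrorRateAlarmParams(snsTopicArn, functionName) {
  const dimensions = [{ Name: 'FunctionName', Value: functionName }];

  // One-hour Sum of a metric, used only as an input to the math expression.
  const hourlySum = (id, metricName) => ({
    Id: id,
    ReturnData: false,
    MetricStat: {
      Metric: {
        Namespace: 'NeoRetail/Payment',
        MetricName: metricName,
        Dimensions: dimensions,
      },
      Period: 3600,
      Stat: 'Sum',
    },
  });

  return {
    AlarmName: 'PaymentApiErrorRateAbove5Percent', // hypothetical name
    ComparisonOperator: 'GreaterThanThreshold',
    Threshold: 5,
    EvaluationPeriods: 1,
    TreatMissingData: 'notBreaching', // hours with no data should not alert
    AlarmActions: [snsTopicArn], // the existing SNS topic
    Metrics: [
      hourlySum('errors', 'PaymentApiErrors'),
      hourlySum('calls', 'PaymentApiCalls'),
      {
        Id: 'errorRate',
        Expression: '100 * errors / calls',
        Label: 'Payment API error rate (%)',
        ReturnData: true, // the alarm evaluates this expression only
      },
    ],
  };
}
```

With the AWS SDK for JavaScript v2, the returned object could be passed to `new AWS.CloudWatch().putMetricAlarm(params).promise()`. In practice the same definition would more likely live in an infrastructure template than in application code.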
The Trap (Distractor Analysis): #
- Why not A? While CloudWatch Logs Insights is powerful for ad-hoc queries, scheduling a Lambda to query logs periodically adds latency and complexity. It’s not ideal for near real-time alerting.
- Why not C? Creating a separate SNS topic to publish all results floods support with raw data rather than aggregated error rate alerts. It does not fulfill the threshold requirement and duplicates resources unnecessarily.
- Why not D? Storing logs in S3 and using Athena queries introduces delays inherent in batch queries and complicates alerting setup. This does not meet the requirement for near real-time notifications.
The Technical Blueprint #
B) For Developer (Code Snippet to Publish Custom Metrics) #
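```javascript
const AWS = require('aws-sdk'); // AWS SDK for JavaScript v2
const cloudwatch = new AWS.CloudWatch();

// callPaymentAPI is assumed to be defined elsewhere and to resolve to true
// when the payment succeeds.
exports.handler = async (event) => {
  let paymentSuccess = false;
  try {
    paymentSuccess = await callPaymentAPI(event);
  } catch (err) {
    // A timeout or thrown error still counts as a failed payment attempt,
    // so the metrics below are published either way.
    console.error('Payment API call failed', err);
  }

  // Both metrics share the same dimension so the alarm can combine them.
  const dimensions = [
    { Name: 'FunctionName', Value: process.env.AWS_LAMBDA_FUNCTION_NAME },
  ];

  const metricData = [{
    MetricName: 'PaymentApiErrors',
    Dimensions: dimensions,
    Unit: 'Count',
    Value: paymentSuccess ? 0 : 1, // 1 per failed attempt
  }, {
    MetricName: 'PaymentApiCalls',
    Dimensions: dimensions,
    Unit: 'Count',
    Value: 1, // 1 per attempt, the denominator of the error rate
  }];

  await cloudwatch.putMetricData({
    Namespace: 'NeoRetail/Payment',
    MetricData: metricData,
  }).promise();

  if (!paymentSuccess) {
    // additional error handling
  }
};
```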
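
The handler above makes a synchronous `PutMetricData` call on every invocation, which adds an API round trip to each order. One alternative worth knowing, not part of the exam answer itself, is the CloudWatch Embedded Metric Format (EMF): the function writes a structured JSON log line and CloudWatch extracts the metrics from the log stream asynchronously. The sketch below is illustrative only; `buildPaymentMetricsLog` is a hypothetical helper name, and the metric names and namespace are carried over from the handler above.

```javascript
// Builds one EMF log line carrying both payment metrics for a single attempt.
// buildPaymentMetricsLog is a hypothetical helper, not an AWS API.
function buildPaymentMetricsLog(paymentSuccess, functionName, timestampMs = Date.now()) {
  return JSON.stringify({
    _aws: {
      Timestamp: timestampMs, // milliseconds since the Unix epoch
      CloudWatchMetrics: [{
        Namespace: 'NeoRetail/Payment',
        Dimensions: [['FunctionName']], // refers to the root-level key below
        Metrics: [
          { Name: 'PaymentApiErrors', Unit: 'Count' },
          { Name: 'PaymentApiCalls', Unit: 'Count' },
        ],
      }],
    },
    FunctionName: functionName, // dimension value
    PaymentApiErrors: paymentSuccess ? 0 : 1, // metric value
    PaymentApiCalls: 1, // metric value
  });
}

// Inside the handler, writing the line to stdout would stand in for the
// putMetricData call, for example:
//   process.stdout.write(buildPaymentMetricsLog(paymentSuccess,
//     process.env.AWS_LAMBDA_FUNCTION_NAME) + '\n');
```

The trade-off is that metrics arrive via log ingestion rather than a direct API call, so the alarm definition stays the same while the per-invocation cost of publishing drops.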
The Comparative Analysis (Developer-Focused) #
| Option | API Complexity | Performance | Use Case |
|---|---|---|---|
| A | Moderate (CloudWatch Logs Insights APIs) | Higher latency due to periodic polling | Ad-hoc log analysis, not ideal for near real-time alerts |
| B | Low (CloudWatch PutMetricData API) | Low latency, near real-time metrics | Best for real-time monitoring and alerting |
| C | Low (SNS Publish API) | High noise, no aggregation | Raw notification, no error rate calculation |
| D | High (Athena Queries, S3 Operations) | High latency, batch processing | Post-mortem analytics, not real-time |
Real-World Application (Practitioner Insight) #
Exam Rule #
“For the exam, always pick CloudWatch custom metrics + alarms when you see threshold-based monitoring and alerting requirements tied to Lambda function outcomes.”
Real World #
“In production, additional refinements may include metric math expressions in CloudWatch alarms for calculating percentages, and integrating AWS X-Ray for deeper tracing on failure reasons.”
Disclaimer
This is a study note based on simulated scenarios for the AWS DVA-C02 exam.