AWS DVA-C02 Drill: Lambda Custom Metrics - Precision Throughput Measurement Beyond Native Metrics

Jeff Taakey
21+ Year Enterprise Architect | AWS SAA/SAP & Multi-Cloud Expert.

Jeff’s Note

“Unlike generic exam dumps, ADH analyzes this scenario through the lens of a Real-World Lead Developer.”

“For DVA-C02 candidates, the confusion often lies in mistaking native Lambda metrics (Invocations, Duration, ConcurrentExecutions) as sufficient for custom business logic throughput. In production, this is about knowing exactly when to instrument your code with PutMetricData API calls versus relying on CloudWatch’s automatic metrics. Let’s drill down.”

The Certification Drill (Simulated Question)

Scenario

StreamFlow Analytics operates a fleet management platform that processes telemetry data from autonomous delivery drones. An AWS Lambda function receives JSON payloads from AWS IoT Core containing GPS coordinates, battery levels, and flight status updates. The engineering team must prove compliance with a contractual SLA that guarantees processing of 10,000 messages per minute during peak hours.

The challenge: The Lambda function performs several operations that should NOT count toward SLA throughput:

  • Cold start initialization (loading ML models into memory)
  • Connection pooling setup
  • Post-processing audit log writes to S3

The development team needs a near real-time dashboard showing only the core message processing rate—specifically, the count of messages successfully parsed and validated per time window.

The Requirement

Implement a monitoring solution that:

  1. Measures throughput based exclusively on message receipt and processing completion
  2. Excludes initialization and post-processing steps from measurement
  3. Provides near real-time visibility (sub-minute granularity)
  4. Enables CloudWatch dashboard creation for SLA tracking

The Options

  • A) Use the Lambda function’s ConcurrentExecutions metric in Amazon CloudWatch to measure the throughput.
  • B) Modify the application to log the calculated throughput to Amazon CloudWatch Logs. Use Amazon EventBridge to invoke a separate Lambda function to process the logs on a schedule.
  • C) Modify the application to publish custom Amazon CloudWatch metrics when the Lambda function receives and processes each message. Use the metrics to calculate the throughput.
  • D) Use the Lambda function’s Invocations metric and Duration metric to calculate the throughput in Amazon CloudWatch.

Correct Answer

Option C.

Quick Insight: The Developer Control Imperative

For DVA-C02: This question tests your understanding of code-level instrumentation. Native Lambda metrics measure infrastructure-level events (invocations, concurrent executions), but they cannot distinguish between your business logic steps. The PutMetricData API gives you surgical precision to emit metrics exactly where you define success in your application code—after message validation, not after cold starts or S3 writes.


You’ve identified the answer. But do you know the implementation details that separate a Junior from a Senior?


The Expert’s Analysis

Correct Answer

Option C: Modify the application to publish custom Amazon CloudWatch metrics when the Lambda function receives and processes each message. Use the metrics to calculate the throughput.

The Winning Logic

This solution is correct because:

  1. Surgical Precision: By embedding cloudwatch.put_metric_data() (Python SDK) or PutMetricDataCommand (JavaScript SDK v3) calls after message validation and before post-processing, you create explicit measurement boundaries that align with your SLA definition.

  2. Native CloudWatch Integration: Custom metrics automatically support:

    • Sub-minute resolution (1-second granularity with high-resolution metrics)
    • Math expressions for throughput calculation (for example, m1 / PERIOD(m1) * 60, where m1 is the MessagesProcessed Sum series)
    • Direct alarm integration for SLA breach detection (see the alarm sketch after this list)
  3. Developer Control: Unlike passive metrics (Invocations, Duration), you instrument exactly the code paths that matter:

    # Pseudocode placement
    def lambda_handler(event, context):
        # Cold start logic runs here (NOT measured)
    
        for record in event['Records']:
            message = parse_iot_message(record)
            validate_schema(message)
    
            # ✅ EMIT METRIC HERE - after validation, before S3 write
            cloudwatch.put_metric_data(
                Namespace='StreamFlow/IoT',
                MetricData=[{
                    'MetricName': 'MessagesProcessed',
                    'Value': 1,
                    'Unit': 'Count',
                    'Timestamp': datetime.utcnow()
                }]
            )
    
        write_audit_logs_to_s3()  # Post-processing (NOT measured)
    
  4. Near Real-Time: With 1-second resolution custom metrics, CloudWatch dashboards update within seconds, not minutes.
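
As a hedged sketch of the alarm integration mentioned in point 2 (the alarm name, thresholds, and SNS topic ARN below are illustrative assumptions, not part of the scenario), a Boto3 call like this would page the team whenever the per-minute MessagesProcessed sum falls below the 10,000-message SLA:

import boto3

cloudwatch = boto3.client('cloudwatch')

# Illustrative SLA breach alarm on the custom metric; the SNS topic ARN is a placeholder
cloudwatch.put_metric_alarm(
    AlarmName='StreamFlow-SLA-Throughput-Breach',
    Namespace='StreamFlow/IoT',
    MetricName='MessagesProcessed',
    Dimensions=[{'Name': 'Environment', 'Value': 'Production'}],
    Statistic='Sum',
    Period=60,                      # evaluate the per-minute sum
    EvaluationPeriods=3,            # require 3 consecutive breaching minutes
    Threshold=10000,
    ComparisonOperator='LessThanThreshold',
    TreatMissingData='breaching',   # silence during peak hours counts as a breach
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:sla-alerts']
)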

The Trap (Distractor Analysis)

Why not Option A (ConcurrentExecutions)?

  • Measures parallelism, not throughput: This metric shows how many function instances are running simultaneously, but tells you nothing about message count. A single execution could process 1 message or 1,000 messages in a batch.
  • No correlation to business logic: High concurrency during cold starts would falsely inflate your metric when no messages are actually being processed.

Why not Option B (Logs + EventBridge)?

  • Unacceptable latency: CloudWatch Logs Insights queries or EventBridge scheduled rules introduce 1-5 minute delays, violating the “near real-time” requirement.
  • Over-engineered: Parsing logs with a separate Lambda function adds cost, complexity, and failure points. This is the “duct tape” solution when you don’t know about PutMetricData.
  • Hidden costs: Log ingestion charges + EventBridge invocation costs + secondary Lambda costs exceed the $0.01 per 1,000 custom metric PUT requests.

Why not Option D (Invocations + Duration)?

  • False arithmetic: Combining Invocations and Duration tells you how many times the function ran and how long each run took, not how many messages were processed. With SQS batch processing (10 messages per invocation), an invocation-based count would understate throughput by 10x (see the sketch below).
  • Includes what you’re trying to exclude: Duration covers everything inside your handler, post-processing S3 writes included, and lazy initialization on a cold start can land there too. You’d need complex math to subtract that overhead, which varies per cold start.
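
A minimal back-of-envelope sketch of that undercount, using assumed figures rather than anything from the scenario:

# Assumed figures for illustration only
invocations_per_minute = 1_000           # what the native Invocations metric reports
messages_per_invocation = 10             # SQS batch size

native_estimate = invocations_per_minute                                # 1,000/min if you trust Invocations alone
actual_throughput = invocations_per_minute * messages_per_invocation   # 10,000/min actually processed

print(f"Invocations-based estimate: {native_estimate}/min, actual: {actual_throughput}/min")
# A custom MessagesProcessed metric reports the actual figure directly.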

The Technical Blueprint

Python (Boto3) Implementation

import boto3
import json
from datetime import datetime

cloudwatch = boto3.client('cloudwatch')

def lambda_handler(event, context):
    # Initialization (cold start - NOT measured)
    # load_ml_model() would run here on cold start
    
    processed_count = 0
    
    for record in event['Records']:
        try:
            # Parse IoT message
            payload = json.loads(record['body'])
            
            # Core business logic
            if validate_telemetry(payload):
                processed_count += 1
                
        except Exception as e:
            print(f"Failed to process message: {e}")
            continue
    
    # ✅ Emit custom metric after processing batch
    if processed_count > 0:
        cloudwatch.put_metric_data(
            Namespace='StreamFlow/IoT',
            MetricData=[
                {
                    'MetricName': 'MessagesProcessed',
                    'Value': processed_count,
                    'Unit': 'Count',
                    'Timestamp': datetime.utcnow(),
                    'Dimensions': [
                        {'Name': 'FunctionName', 'Value': context.function_name},
                        {'Name': 'Environment', 'Value': 'Production'}
                    ],
                    'StorageResolution': 1  # High-resolution (1-second granularity)
                }
            ]
        )
    
    # Post-processing (NOT measured)
    # write_audit_to_s3() would run here
    
    return {'statusCode': 200, 'processedCount': processed_count}

def validate_telemetry(payload):
    # Business validation logic
    required_fields = ['device_id', 'gps_lat', 'gps_lon', 'battery_level']
    return all(field in payload for field in required_fields)

JavaScript SDK v3 Implementation

import { CloudWatchClient, PutMetricDataCommand } from "@aws-sdk/client-cloudwatch";

const cloudwatch = new CloudWatchClient({ region: "us-east-1" });

export const handler = async (event, context) => {
    let processedCount = 0;
    
    for (const record of event.Records) {
        try {
            const payload = JSON.parse(record.body);
            
            if (validateTelemetry(payload)) {
                processedCount++;
            }
        } catch (error) {
            console.error('Processing error:', error);
        }
    }
    
    if (processedCount > 0) {
        const command = new PutMetricDataCommand({
            Namespace: 'StreamFlow/IoT',
            MetricData: [
                {
                    MetricName: 'MessagesProcessed',
                    Value: processedCount,
                    Unit: 'Count',
                    Timestamp: new Date(),
                    Dimensions: [
                        { Name: 'FunctionName', Value: context.functionName },
                        { Name: 'Environment', Value: 'Production' }
                    ],
                    StorageResolution: 1
                }
            ]
        });
        
        await cloudwatch.send(command);
    }
    
    return { statusCode: 200, processedCount };
};

function validateTelemetry(payload) {
    const requiredFields = ['device_id', 'gps_lat', 'gps_lon', 'battery_level'];
    return requiredFields.every(field => field in payload);
}

CloudWatch Dashboard Query

{
    "metrics": [
        [ { "expression": "SUM(m1) / PERIOD(m1) * 60", "label": "Messages/Minute", "id": "e1" } ],
        [ "StreamFlow/IoT", "MessagesProcessed", { "id": "m1", "stat": "Sum", "visible": false } ]
    ],
    "period": 60,
    "stat": "Average",
    "region": "us-east-1",
    "title": "IoT Message Processing Throughput"
}
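
If the team pushes this widget from code rather than the console, one hedged sketch (the dashboard name and widget placement values are assumptions) wraps the properties above in a dashboard body and calls the CloudWatch PutDashboard API via Boto3:

import json
import boto3

cloudwatch = boto3.client('cloudwatch')

# The "properties" dict mirrors the widget JSON shown above
dashboard_body = {
    "widgets": [
        {
            "type": "metric",
            "x": 0, "y": 0, "width": 12, "height": 6,
            "properties": {
                "metrics": [
                    [{"expression": "m1 / PERIOD(m1) * 60", "label": "Messages/Minute", "id": "e1"}],
                    ["StreamFlow/IoT", "MessagesProcessed", {"id": "m1", "stat": "Sum", "visible": False}]
                ],
                "period": 60,
                "region": "us-east-1",
                "title": "IoT Message Processing Throughput"
            }
        }
    ]
}

cloudwatch.put_dashboard(
    DashboardName='StreamFlow-SLA-Throughput',   # assumed name
    DashboardBody=json.dumps(dashboard_body)
)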

The Comparative Analysis

| Option | API Complexity | Latency | Precision | Cost | Use Case |
|---|---|---|---|---|---|
| A) ConcurrentExecutions | None (native metric) | Real-time | ❌ Measures parallelism, not messages | Free | Scaling/throttling analysis |
| B) Logs + EventBridge | High (log parsing + Lambda) | 1-5 minutes | ⚠️ Depends on log parsing accuracy | High (3 services) | Legacy systems without SDK access |
| C) Custom Metrics (PutMetricData) | Low (single API call) | 1-60 seconds | ✅ Exact business logic boundaries | $0.01/1K PutMetricData calls | Precise SLA/business metric tracking |
| D) Invocations + Duration | None (native metrics) | Real-time | ❌ No message count correlation | Free | Infrastructure-level monitoring |

Key Differentiators:

  • Precision: Only Option C measures exactly what your SLA defines (processed messages)
  • Latency: Options A/C/D are real-time, but only C is meaningful for throughput
  • Cost: A single custom metric costs about $0.30/month to store, plus $0.01 per 1,000 PutMetricData calls (roughly $0.30 for 30K calls), negligible next to SLA breach penalties (see the back-of-envelope sketch below)
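
A hedged back-of-envelope using the list prices quoted above and assumed volumes (one PutMetricData call per processed batch, a single custom metric); check current CloudWatch pricing before relying on it:

# Assumed volumes; prices are the figures cited in this article
calls_per_month = 30_000                  # one PutMetricData request per processed batch
price_per_1k_calls = 0.01                 # USD per 1,000 PutMetricData requests
metric_storage_per_month = 0.30           # USD per custom metric per month (first pricing tier)

api_cost = calls_per_month / 1_000 * price_per_1k_calls      # ~$0.30
total = api_cost + metric_storage_per_month                   # ~$0.60/month for this SLA metric
print(f"API: ${api_cost:.2f}, storage: ${metric_storage_per_month:.2f}, total: ~${total:.2f}/month")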

Real-World Application (Practitioner Insight)

Exam Rule

“For DVA-C02, when you see requirements to measure business-specific metrics (orders processed, messages validated, API calls succeeded) while excluding infrastructure overhead (cold starts, retries, post-processing), always choose custom CloudWatch metrics with PutMetricData. Native Lambda metrics measure Lambda as a service, not your application logic.”

Real World

“In production, we combine both approaches:

  • Native metrics (Invocations, Errors, Duration) for infrastructure alarms (throttling, memory issues)
  • Custom metrics (MessagesProcessed, ValidationFailures) for business SLA dashboards

Pro tip: Use EMF (Embedded Metric Format) to emit custom metrics through structured CloudWatch Logs entries. It is more cost-effective at scale and sidesteps PutMetricData API request throttling limits:

import json
from datetime import datetime

def lambda_handler(event, context):
    processed = process_messages(event)
    
    # EMF format - automatically creates metrics from logs
    print(json.dumps({
        "_aws": {
            "Timestamp": int(datetime.now().timestamp() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": "StreamFlow/IoT",
                "Dimensions": [["FunctionName"]],
                "Metrics": [{"Name": "MessagesProcessed", "Unit": "Count"}]
            }]
        },
        "FunctionName": context.function_name,
        "MessagesProcessed": processed
    }))

This hybrid approach costs $0.50/GB logs vs. $0.01/1K PUT requests, making it cheaper at >50K metrics/hour.”
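
If hand-writing the EMF envelope feels brittle, one common option (an assumption here, not something the note above prescribes) is the AWS Lambda Powertools for Python Metrics utility, which emits the same EMF structure automatically:

from aws_lambda_powertools import Metrics
from aws_lambda_powertools.metrics import MetricUnit

metrics = Metrics(namespace="StreamFlow/IoT", service="telemetry-processor")

@metrics.log_metrics  # flushes buffered metrics as an EMF log record when the handler returns
def lambda_handler(event, context):
    processed = process_messages(event)   # the application's own batch handler (illustrative)
    metrics.add_metric(name="MessagesProcessed", unit=MetricUnit.Count, value=processed)
    return {"statusCode": 200, "processedCount": processed}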




Disclaimer

This is a study note based on simulated scenarios for the DVA-C02 exam. The company “StreamFlow Analytics” and specific implementation details are fictional examples created for educational purposes. Always refer to official AWS documentation and your organization’s compliance requirements for production implementations.

The DevPro Network: Mission and Founder

A 21-Year Tech Leadership Journey

Jeff Taakey has driven complex systems for over two decades, serving in pivotal roles as an Architect, Technical Director, and startup Co-founder/CTO.

He holds both an MBA degree and a Computer Science Master's degree from an English-speaking university in Hong Kong. His expertise is further backed by multiple international certifications including TOGAF, PMP, ITIL, and AWS SAA.

His experience spans diverse sectors and includes leading large, multidisciplinary teams of up to 86 people, serving as a Development Team Lead collaborating with global teams across North America, Europe, and Asia-Pacific, and spearheading the design of an industry cloud platform, often within global Fortune 500 environments such as IBM, Citi, and Panasonic.

He launched this platform to share advanced, practical technical knowledge with the global developer community.


About This Site: AWS.CertDevPro.com


AWS.CertDevPro.com focuses exclusively on mastering the Amazon Web Services ecosystem. We transform raw practice questions into strategic Decision Matrices. Led by Jeff Taakey (MBA & 21-year veteran of IBM/Citi), we provide the exclusive SAA and SAP Master Packs designed to move your cloud expertise from certification-ready to project-ready.