AWS DVA-C02 Drill: DynamoDB GSI - Flexible Query Patterns Beyond the Partition Key

Jeff Taakey
Author
21+ Year Enterprise Architect | AWS SAA/SAP & Multi-Cloud Expert.

Jeff’s Note
#

“Unlike generic exam dumps, ADH analyzes this scenario through the lens of a Real-World Lead Developer.”

“For DVA-C02 candidates, the confusion often lies in choosing between LSI and GSI when extending query capabilities. In production, this is about knowing exactly when partition key changes require GSI vs. when sort key variations allow LSI. The critical distinction? LSIs share the base table’s partition key; GSIs create entirely new key schemas. Let’s drill down.”

The Certification Drill (Simulated Question)
#

Scenario
#

TechCart Solutions operates a cloud-native marketplace platform. A developer is architecting the orders database using Amazon DynamoDB. The current table design uses OrderID as the partition key to ensure fast lookups for individual order tracking. However, the customer support team needs a new capability: retrieve all orders placed by a specific customer using their email address in a single, efficient query operation. Additionally, the product roadmap indicates future requirements to query orders by other attributes such as fulfillment status, warehouse location, or delivery date.

The Requirement
#

Implement a solution that enables querying all order IDs associated with a customer’s email address while maintaining the flexibility to add query patterns based on other item attributes without restructuring the base table.

The Options
#

  • A) Configure the partition key to use the customer email address as the sort key
  • B) Update the table to use the customer email address as the partition key
  • C) Create a local secondary index (LSI) with the customer email address as the sort key
  • D) Create a global secondary index (GSI) with the customer email address as the partition key

Correct Answer
#

Option D.

Quick Insight: The Query Flexibility Imperative
#

For Developers: DynamoDB query operations are restricted to the partition key (and optionally, sort key) of either the base table or an index. When you need to query by an attribute that isn’t part of your base table’s key schema, you must create an index. The choice between LSI and GSI depends on whether you’re keeping the same partition key (LSI) or introducing a completely different one (GSI). Here, querying by CustomerEmail requires a new partition key, making GSI the only viable option.
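To make that distinction concrete, here is a toy in-memory model. It is not DynamoDB itself: the sample items and the `build_index` helper are made up for illustration. It only shows why a lookup by a non-key attribute needs a second, re-keyed copy of the data, which is conceptually what a GSI provides.

```python
from collections import defaultdict

# Toy stand-in for the Orders table: items are reachable only by OrderID,
# the base table's partition key. The values are invented sample data.
orders = {
    "ORD-1": {"OrderID": "ORD-1", "CustomerEmail": "a@example.com"},
    "ORD-2": {"OrderID": "ORD-2", "CustomerEmail": "b@example.com"},
    "ORD-3": {"OrderID": "ORD-3", "CustomerEmail": "a@example.com"},
}


def build_index(items, key_attribute):
    """Re-key the items by another attribute, as a GSI conceptually does."""
    index = defaultdict(list)
    for item in items.values():
        if key_attribute in item:  # items lacking the attribute are not indexed
            index[item[key_attribute]].append(item)
    return index


email_index = build_index(orders, "CustomerEmail")

# "Query" by email: a direct lookup in the index instead of scanning every order.
order_ids = sorted(item["OrderID"] for item in email_index["a@example.com"])
print(order_ids)
```

Without `email_index`, the only way to answer the same question would be to walk every entry in `orders`, which is the in-memory equivalent of a Scan.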



The Expert’s Analysis
#

Correct Answer
#

Option D: Create a global secondary index (GSI) with the customer email address as the partition key

The Winning Logic
#

This solution correctly addresses both requirements through DynamoDB’s GSI architecture:

Primary Requirement Fulfillment:

  • GSIs allow you to define an entirely different partition key schema from the base table
  • By creating a GSI with CustomerEmail as the partition key, you enable direct queries: a Query operation against the GSI where CustomerEmail = 'customer@example.com'
  • This returns all orders for that customer in a single, efficient query (avoiding expensive Scan operations)

Developer-Specific Implementation Details:

# AWS SDK for Python (Boto3) - Creating the GSI
import boto3

dynamodb = boto3.client('dynamodb')

response = dynamodb.update_table(
    TableName='Orders',
    AttributeDefinitions=[
        {'AttributeName': 'CustomerEmail', 'AttributeType': 'S'}
    ],
    GlobalSecondaryIndexUpdates=[
        {
            'Create': {
                'IndexName': 'CustomerEmailIndex',
                'KeySchema': [
                    {'AttributeName': 'CustomerEmail', 'KeyType': 'HASH'}
                ],
                'Projection': {'ProjectionType': 'ALL'},
                'ProvisionedThroughput': {
                    'ReadCapacityUnits': 5,
                    'WriteCapacityUnits': 5
                }
            }
        }
    ]
)

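# Querying the GSI (run this only after the index status is ACTIVE;
# creating and backfilling a new GSI takes time)
response = dynamodb.query(
    TableName='Orders',
    IndexName='CustomerEmailIndex',
    KeyConditionExpression='CustomerEmail = :email',
    ExpressionAttributeValues={
        ':email': {'S': 'customer@example.com'}  # placeholder address
    }
)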
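
A newly created GSI is backfilled in the background, and it cannot serve queries until its status is ACTIVE. The following is a minimal polling sketch built on `describe_table`. The helper name `wait_for_gsi_active`, the default delay, and the attempt limit are assumptions for illustration; the client is passed in so the function can be exercised without AWS credentials.

```python
import time


def wait_for_gsi_active(client, table_name, index_name,
                        delay=5, max_attempts=60, sleep=time.sleep):
    """Poll DescribeTable until the named GSI reports IndexStatus 'ACTIVE'.

    Returns True once the index is ACTIVE, or False if it is still not
    ACTIVE (or not listed at all) after max_attempts polls.
    """
    for _ in range(max_attempts):
        table = client.describe_table(TableName=table_name)["Table"]
        for gsi in table.get("GlobalSecondaryIndexes", []):
            if gsi["IndexName"] == index_name and gsi.get("IndexStatus") == "ACTIVE":
                return True
        sleep(delay)  # wait before asking again
    return False


# Usage (requires AWS credentials and the boto3 package):
#   import boto3
#   client = boto3.client("dynamodb")
#   if wait_for_gsi_active(client, "Orders", "CustomerEmailIndex"):
#       ...  # safe to query the index now
```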

Future Extensibility:

  • GSIs can be added or removed without impacting the base table structure
  • You can create up to 20 GSIs per table (default quota)
  • Each GSI can have its own partition key and optional sort key, enabling diverse access patterns
  • Example: Add a StatusDateIndex (partition: FulfillmentStatus, sort: OrderDate) later without table migration
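
Because every GSI create request has the same shape, adding a new access pattern is mostly a matter of filling in names. Here is a hedged sketch of a small builder for one `GlobalSecondaryIndexUpdates` entry. The helper name `build_gsi_create`, the index name `WarehouseIndex`, and the attribute names `WarehouseLocation` and `DeliveryDate` are hypothetical, chosen only to mirror the roadmap items in the scenario.

```python
def build_gsi_create(index_name, partition_key, sort_key=None,
                     projection_type="ALL", non_key_attributes=None,
                     provisioned=None):
    """Build one 'Create' entry for UpdateTable's GlobalSecondaryIndexUpdates."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})

    projection = {"ProjectionType": projection_type}
    if projection_type == "INCLUDE":
        projection["NonKeyAttributes"] = list(non_key_attributes or [])

    create = {
        "IndexName": index_name,
        "KeySchema": key_schema,
        "Projection": projection,
    }
    if provisioned:  # (read units, write units); left out when not supplied
        create["ProvisionedThroughput"] = {
            "ReadCapacityUnits": provisioned[0],
            "WriteCapacityUnits": provisioned[1],
        }
    return {"Create": create}


# Hypothetical future index: orders per warehouse, sorted by delivery date.
warehouse_index = build_gsi_create(
    "WarehouseIndex", "WarehouseLocation",
    sort_key="DeliveryDate", projection_type="KEYS_ONLY",
)
print(warehouse_index)
```

The returned dict would be passed as an element of `GlobalSecondaryIndexUpdates`, alongside matching `AttributeDefinitions` for the key attributes.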

The Trap (Distractor Analysis)
#

Why not Option A (Configure partition key to use email as sort key)?

  • Syntactic Impossibility: This option contains a logical contradiction. The partition key and sort key are distinct attributes in DynamoDB’s key schema. You cannot configure the partition key to simultaneously “use” another attribute as the sort key.
  • API Reality: The KeySchema parameter in CreateTable (and in a GSI Create action within UpdateTable) accepts an array where each element specifies AttributeName and KeyType (HASH for partition, RANGE for sort). You cannot nest one key type within another, and UpdateTable cannot change the base table's KeySchema at all.
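
The shape rule for a key schema can be sketched as a local check. The helper `validate_key_schema` is hypothetical and only mirrors the structure described above (exactly one HASH element, at most one RANGE element, distinct attributes); the real validation happens server-side in DynamoDB.

```python
def validate_key_schema(key_schema):
    """Check the basic shape of a DynamoDB KeySchema list.

    Raises ValueError when the shape is invalid, returns True otherwise.
    """
    hash_keys = [k for k in key_schema if k.get("KeyType") == "HASH"]
    range_keys = [k for k in key_schema if k.get("KeyType") == "RANGE"]

    if len(hash_keys) != 1:
        raise ValueError("exactly one HASH (partition) key is required")
    if len(range_keys) > 1:
        raise ValueError("at most one RANGE (sort) key is allowed")
    if len(hash_keys) + len(range_keys) != len(key_schema):
        raise ValueError("KeyType must be HASH or RANGE")

    names = [k["AttributeName"] for k in key_schema]
    if len(set(names)) != len(names):
        raise ValueError("partition and sort key must be different attributes")
    return True


# A valid composite key: one partition key, one sort key.
print(validate_key_schema([
    {"AttributeName": "CustomerEmail", "KeyType": "HASH"},
    {"AttributeName": "OrderDate", "KeyType": "RANGE"},
]))
```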

Why not Option B (Update table to use email as partition key)?

  • Destructive Migration Required: DynamoDB does not allow in-place modification of the base table’s partition key. You would need to:
    1. Create a new table with the new key schema
    2. Migrate all existing data (potentially billions of items)
    3. Update all application code referencing the old table
    4. Delete the old table
  • Loss of Access Pattern: The original OrderID-based queries would break. You’d lose fast lookups by order ID, which is essential for order tracking, updates, and customer service operations.
  • Violates Single Responsibility: The base table should optimize for the primary access pattern (order tracking). Email-based queries are a secondary pattern.

Why not Option C (Create LSI with email as sort key)?

  • LSI Constraint Violation: Local Secondary Indexes must share the same partition key as the base table. An LSI allows you to define an alternate sort key while keeping the base table’s partition key.
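  • Technical Reality: In this scenario, the base table uses OrderID as the partition key, so any LSI query would still have to specify an OrderID, for example: Query where OrderID = 'ORD-12345' AND CustomerEmail = 'customer@example.com'. This doesn’t solve the requirement because you need to query by email alone, not email + specific order ID. In addition, an LSI can only be defined on a table with a composite primary key (partition key plus sort key), and this table has only a partition key.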
  • Creation Timing: LSIs must be created at table creation time; they cannot be added to existing tables (unlike GSIs).
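
The LSI-versus-GSI rules discussed here can be condensed into a small decision function. This is a simplified sketch, not an AWS API: the function name and parameters are invented for illustration, and it ignores other factors such as consistency needs and item-collection size limits.

```python
def choose_index_type(base_partition_key, query_partition_key,
                      table_exists, base_has_sort_key=True):
    """Pick the secondary index type that can serve a new query pattern.

    Simplified rules:
      - a different partition key always needs a GSI
      - an LSI reuses the base partition key, must be defined when the
        table is created, and needs a base table with a sort key
    """
    if query_partition_key != base_partition_key:
        return "GSI"
    if table_exists or not base_has_sort_key:
        return "GSI"  # an LSI is no longer (or never was) an option
    return "LSI"


# The scenario: existing table keyed on OrderID, new query keyed on CustomerEmail.
print(choose_index_type("OrderID", "CustomerEmail", table_exists=True))
```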

The Technical Blueprint
#

# Complete Implementation: DynamoDB Table with GSI for Email-Based Queries

import boto3
from boto3.dynamodb.conditions import Attr, Key  # Attr is used by the optional filter below

dynamodb = boto3.resource('dynamodb')

# Step 1: Get a handle to the existing table.
# The CustomerEmailIndex GSI is assumed to exist already
# (created with the update_table call shown earlier).
table = dynamodb.Table('Orders')

def get_customer_orders(customer_email):
    """
    Query all orders for a given customer email using GSI
    """
    try:
        response = table.query(
            IndexName='CustomerEmailIndex',
            KeyConditionExpression=Key('CustomerEmail').eq(customer_email),
            # Optional: Add FilterExpression for additional filtering
            # FilterExpression=Attr('OrderStatus').eq('PENDING')
        )
        
        orders = response['Items']
        
        # Handle pagination for large result sets
        while 'LastEvaluatedKey' in response:
            response = table.query(
                IndexName='CustomerEmailIndex',
                KeyConditionExpression=Key('CustomerEmail').eq(customer_email),
                ExclusiveStartKey=response['LastEvaluatedKey']
            )
            orders.extend(response['Items'])
        
        return orders
    
    except Exception as e:
        print(f"Error querying customer orders: {str(e)}")
        raise

# Step 2: Future extensibility - Add another GSI for fulfillment status
def add_status_index():
    """
    Example of adding another GSI for future query patterns
    """
    client = boto3.client('dynamodb')
    
    response = client.update_table(
        TableName='Orders',
        AttributeDefinitions=[
            {'AttributeName': 'FulfillmentStatus', 'AttributeType': 'S'},
            {'AttributeName': 'OrderDate', 'AttributeType': 'S'}
        ],
        GlobalSecondaryIndexUpdates=[
            {
                'Create': {
                    'IndexName': 'StatusDateIndex',
                    'KeySchema': [
                        {'AttributeName': 'FulfillmentStatus', 'KeyType': 'HASH'},
                        {'AttributeName': 'OrderDate', 'KeyType': 'RANGE'}
                    ],
                    'Projection': {
                        'ProjectionType': 'INCLUDE',
                        'NonKeyAttributes': ['OrderID', 'CustomerEmail', 'TotalAmount']
                    },
                    'ProvisionedThroughput': {
                        'ReadCapacityUnits': 5,
                        'WriteCapacityUnits': 5
                    }
                }
            }
        ]
    )
    return response

# Usage Example
if __name__ == '__main__':
    customer_orders = get_customer_orders('customer@example.com')  # placeholder address
    print(f"Found {len(customer_orders)} orders for customer")

The Comparative Analysis
#

| Option | API Complexity | Performance Impact | Future Flexibility | Correct Usage Scenario |
|---|---|---|---|---|
| A) Partition key with email as sort key | Invalid (logical error) | N/A | N/A | None: not a valid key configuration in DynamoDB |
| B) Change partition key to email | High (requires table migration) | Breaks existing queries | Low (loses OrderID access pattern) | When email is genuinely the primary access pattern and OrderID lookups are rare |
| C) LSI with email as sort key | Medium (must be defined at table creation) | Good (co-located with base table) | Limited (cannot query by email alone) | When you need to query OrderID = X AND CustomerEmail = Y (same partition key, alternate sort key) |
| D) GSI with email as partition key | Low (can be added anytime) | Excellent (direct email-based queries) | High (up to 20 GSIs per table by default) | When you need independent query patterns by attributes other than the base table’s partition key |

Real-World Application (Developer Insight)
#

Exam Rule
#

“For the DVA-C02 exam, when you need to query DynamoDB by an attribute that isn’t your partition key, and that attribute will serve as the primary lookup for the new access pattern, always choose Global Secondary Index (GSI). If the question mentions ‘future flexibility’ or ‘other attributes,’ GSI is the definitive answer.”

Real World
#

“In production systems, GSI design becomes a capacity planning exercise. Each GSI consumes its own read/write capacity (if using provisioned mode) or contributes to on-demand costs. We’ve seen cases where developers create 15+ GSIs for a single table, leading to write amplification issues, because every item write must also update all applicable GSIs.

Best Practice Approach:

  1. Sparse Indexes: Only set a GSI’s key attribute on items that should be indexed. If an item lacks that attribute, it won’t appear in the GSI, saving storage and WCUs.
  2. Projection Strategy: Use KEYS_ONLY or INCLUDE projections instead of ALL to minimize storage costs.
  3. Monitoring: Track the ReadThrottleEvents and WriteThrottleEvents CloudWatch metrics with the GlobalSecondaryIndexName dimension to see GSI throttling separately from the base table. The UserErrors metric does not count ProvisionedThroughputExceededException.
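
As a sketch of per-index monitoring, the helper below builds the arguments for CloudWatch's `get_metric_statistics` so that throttle events are read for one GSI rather than for the whole table. The helper name and its defaults (a 60-minute window, 5-minute periods) are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def gsi_throttle_metric_params(table_name, index_name,
                               metric="WriteThrottleEvents",
                               minutes=60, now=None):
    """Build get_metric_statistics arguments scoped to a single GSI."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": metric,
        # Both dimensions together select the index, not the base table.
        "Dimensions": [
            {"Name": "TableName", "Value": table_name},
            {"Name": "GlobalSecondaryIndexName", "Value": index_name},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 300,          # seconds per datapoint
        "Statistics": ["Sum"],  # total throttle events per period
    }


params = gsi_throttle_metric_params("Orders", "CustomerEmailIndex")
# Usage (requires AWS credentials and the boto3 package):
#   import boto3
#   stats = boto3.client("cloudwatch").get_metric_statistics(**params)
```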

Example - Sparse Index Pattern:

# Item without CustomerEmail won't appear in CustomerEmailIndex
table.put_item(
    Item={
        'OrderID': 'ORD-99999',
        'WarehouseOrder': True,
        'Status': 'PENDING'
        # No CustomerEmail attribute = Not indexed in CustomerEmailIndex
    }
)

For the email lookup scenario, we typically add a GSI with KEYS_ONLY projection initially, then use BatchGetItem on the base table if we need full order details. This balances query flexibility with storage costs.”
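
The KEYS_ONLY-then-BatchGetItem pattern can be sketched as follows, assuming the OrderIDs have already been collected from a query against the GSI. BatchGetItem accepts at most 100 keys per request, so the IDs are chunked, and any UnprocessedKeys are retried. The function names are hypothetical, and the resource object is passed in so the logic can be tested without AWS access.

```python
def chunk_keys(order_ids, size=100):
    """Yield lists of base-table keys, at most `size` (max 100) per batch."""
    for start in range(0, len(order_ids), size):
        yield [{"OrderID": oid} for oid in order_ids[start:start + size]]


def fetch_full_orders(dynamodb_resource, order_ids, table_name="Orders"):
    """Fetch full items from the base table for OrderIDs found via the GSI."""
    items = []
    for keys in chunk_keys(order_ids):
        pending = {table_name: {"Keys": keys}}
        while pending:
            response = dynamodb_resource.batch_get_item(RequestItems=pending)
            items.extend(response["Responses"].get(table_name, []))
            # Retry whatever DynamoDB could not process in this call.
            pending = response.get("UnprocessedKeys") or None
    return items


# Usage (requires AWS credentials and the boto3 package):
#   import boto3
#   orders = fetch_full_orders(boto3.resource("dynamodb"), ["ORD-1", "ORD-2"])
```

In production the retry loop would normally include a backoff between attempts; it is omitted here to keep the sketch short.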


Disclaimer

This is a study note based on simulated scenarios for the AWS DVA-C02 exam. All company names, scenarios, and technical implementations are fictional and created for educational purposes. Always refer to official AWS documentation and the AWS Certified Developer - Associate exam guide for authoritative information.

The DevPro Network: Mission and Founder

A 21-Year Tech Leadership Journey

Jeff Taakey has driven complex systems for over two decades, serving in pivotal roles as an Architect, Technical Director, and startup Co-founder/CTO.

He holds both an MBA degree and a Computer Science Master's degree from an English-speaking university in Hong Kong. His expertise is further backed by multiple international certifications including TOGAF, PMP, ITIL, and AWS SAA.

His experience spans diverse sectors and includes leading large, multidisciplinary teams (up to 86 people). He has also served as a Development Team Lead while cooperating with global teams spanning North America, Europe, and Asia-Pacific. He has spearheaded the design of an industry cloud platform. This work was often conducted within global Fortune 500 environments like IBM, Citi and Panasonic.

Following a recent Master’s degree from an English-speaking university in Hong Kong, he launched this platform to share advanced, practical technical knowledge with the global developer community.

