Jeff’s Insights #
“Unlike generic exam dumps, Jeff’s Insights is designed to make you think like a Real-World Production Architect. We dissect this scenario by analyzing the strategic trade-offs required to balance operational reliability, security, and long-term cost across multi-service deployments.”
While preparing for the AWS SAA-C03, many candidates get confused by data migration strategies. In the real world, this is fundamentally a decision about Network Bandwidth vs. Time vs. Cost. The exam loves testing whether you understand when physical data transfer beats internet-based solutions. Let’s drill into a simulated scenario.
The Architecture Drill (Simulated Question) #
Scenario #
MediaStream Productions, a video production company, currently stores their entire video archive on an on-premises NFS-based Network Attached Storage (NAS) system. The archive contains approximately 12,000 video files ranging from 1 MB promotional clips to 500 GB raw 4K footage, totaling 70 TB of data. The archive is complete and no longer growing.
The company has decided to migrate this entire archive to Amazon S3 for long-term storage and improved accessibility for remote editing teams. However, their business internet connection is limited to 100 Mbps, which is also used for daily operations. The IT director has two critical requirements: complete the migration as quickly as possible, and minimize impact on the company’s operational bandwidth.
The Requirement: #
Migrate 70 TB of NFS data to Amazon S3 with minimal network bandwidth consumption and fastest possible completion time.
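Before weighing the options, it helps to quantify the constraint. A minimal back-of-envelope sketch (assuming the 100 Mbps link were fully dedicated to the migration, which it cannot be):

```python
# Best-case estimate: how long does 70 TB take over a 100 Mbps link?
# Assumes zero protocol overhead and no competing business traffic.
ARCHIVE_TB = 70
LINK_MBPS = 100

archive_bits = ARCHIVE_TB * 1e12 * 8          # decimal TB -> bits
seconds = archive_bits / (LINK_MBPS * 1e6)    # ideal sustained transfer
days = seconds / 86_400

print(f"{days:.1f} days")                     # ~64.8 days at full line rate
```

Roughly 65 days even in the impossible best case, which is the number that should frame every option below.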
The Options #
- A) Create an S3 bucket, create an IAM role with write permissions to the bucket, and use the AWS CLI to copy all files from the on-premises NFS share directly to the S3 bucket over the internet.
- B) Create an AWS Snowball Edge job, receive the Snowball Edge device on-premises, use the Snowball Edge client to transfer data to the device, ship the device back to AWS, and have AWS import the data into Amazon S3.
- C) Deploy an S3 File Gateway on-premises, create a public service endpoint connection to the File Gateway, create an S3 bucket, create a new NFS file share on the File Gateway pointing to the S3 bucket, and transfer data from the existing NFS share to the File Gateway.
- D) Establish an AWS Direct Connect connection between the on-premises network and AWS, deploy an S3 File Gateway on-premises, create a public Virtual Interface (VIF) to the File Gateway, create an S3 bucket, create a new NFS file share on the File Gateway pointing to the S3 bucket, and transfer data from the existing NFS share to the File Gateway.
Correct Answer #
Option B.
The Architect’s Analysis #
The Winning Logic #
AWS Snowball Edge is purpose-built for large-scale, one-time data migrations where network bandwidth is the limiting factor. Here’s why it wins:
- Bandwidth Constraint Solved: Physical data transfer completely bypasses the internet connection. Zero impact on the 100 Mbps operational bandwidth.
- Time Efficiency:
  - Device shipping: 2-3 days
  - On-premises data transfer to the device: roughly 1-2 days at 10 Gbps local network speeds (closer to a week at a sustained 1 Gbps)
  - Return shipping: 2-3 days
  - AWS import process: 1-2 days
  - Total: roughly 6-10 days vs. 65+ days for internet transfer
- Cost Effectiveness:
  - Snowball Edge 80 TB device: ~$250-300 per job (including 10 days of on-site usage)
  - No data egress charges from on-premises
  - No bandwidth upgrade costs
  - No productivity loss from a saturated internet connection
- Simplicity: Create a single job, receive the device, copy the data, ship it back. No complex network configuration.
- Reliability: Physical transfer eliminates network-interruption risks, connection timeouts, and retry-logic complexity.
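The on-site copy estimate deserves a sanity check, since it is the one step that depends on the customer's own LAN. A quick sketch (the helper name is illustrative, and it assumes sustained throughput at the nominal link rate, which is optimistic):

```python
# How long does copying 70 TB to the device take at common LAN speeds?
def copy_days(tb: float, gbps: float) -> float:
    """Ideal copy time in days at a sustained link rate."""
    return tb * 1e12 * 8 / (gbps * 1e9) / 86_400

for gbps in (10, 1):
    print(f"{gbps} Gbps: {copy_days(70, gbps):.1f} days")
```

About 0.6 days at 10 Gbps but roughly 6.5 days at 1 Gbps, so the "1-2 days" figure quietly assumes a fast local network.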
The Trap (Distractor Analysis) #
Why not Option A (AWS CLI Direct Copy)?
- Bandwidth Saturation: Would consume the entire 100 Mbps connection for 2-3 months
- Business Impact: Would cripple daily operations (video conferencing, cloud services, email)
- Hidden Costs: Potential need to upgrade internet connection (~$500-2000/month for enterprise fiber)
- Risk: Any network interruption requires resume logic; 70 TB is massive for retry management
- When it WOULD work: Small datasets (<1 TB), abundant unused bandwidth, no time pressure
Why not Option C (S3 File Gateway with Public Endpoint)?
- Same Bandwidth Problem: File Gateway still uses the internet connection to sync data to S3
- No Bandwidth Advantage: You’re still pushing 70 TB over the same 100 Mbps connection
- Added Complexity: Requires deploying and managing File Gateway appliance (VM or hardware)
- Ongoing Costs: File Gateway is designed for hybrid storage, not one-time migration
- Latency: Two-hop process (NFS → File Gateway → S3) adds overhead
- When it WOULD work: Hybrid cloud scenarios where you need ongoing on-premises NFS access to S3 data
Why not Option D (Direct Connect + S3 File Gateway)?
- Cost Overkill: Direct Connect setup includes:
  - Port fees: $0.30/hour (~$216/month) for a 1 Gbps dedicated port
  - Cross-connect fees from the colocation provider
  - Lead time: 2-4 weeks for provisioning
  - Total first-month cost: ~$2,000-3,000+
  - Note: data transfer into AWS over Direct Connect is free; the $0.02-0.09/GB rate applies to transfer out of AWS, so it adds essentially nothing to a one-way inbound migration
- Time Penalty: Provisioning delay negates the “as quickly as possible” requirement
- Engineering Overhead: Requires network engineering for BGP, VIF configuration
- Temporary Need: Direct Connect makes sense for ongoing hybrid workloads, not one-time migration
- When it WOULD work: If the company already has Direct Connect, or plans ongoing high-bandwidth AWS integration
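The cost math above can be sketched concretely. The cross-connect and engineering figures below are assumptions for illustration (both vary widely by facility and team), not published AWS rates:

```python
# Illustrative first-month Direct Connect cost for a 1 Gbps dedicated port.
port_monthly = 0.30 * 720          # port-hours: ~$216 for a month
cross_connect = 500                # ASSUMED colo cross-connect fee (varies widely)
network_engineering = 1500         # ASSUMED one-off BGP/VIF setup effort

# Data transfer INTO AWS over Direct Connect is free, so a one-way inbound
# migration adds no per-GB charge; the per-GB rate applies to egress only.
first_month = port_monthly + cross_connect + network_engineering
print(f"${first_month:,.0f}")
```

Under those assumptions the total lands in the ~$2,000-3,000 range cited above, all spent on infrastructure that is only needed for a week or two.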
The Architect Blueprint #
Diagram Note: The Snowball Edge workflow completely isolates the 70 TB data transfer from the internet connection, using physical transport and AWS’s internal high-speed network for the S3 import process.
Associate-Level Service Selection Guide #
Exam Rule #
For AWS SAA-C03, apply the “Snowball Threshold” decision tree:
- Data volume < 10 TB + good bandwidth: Use AWS DataSync or Direct Copy (CLI/SDK)
- Data volume > 10 TB + limited bandwidth: Use AWS Snowball family
- Data volume > 10 PB: Use AWS Snowmobile (literal shipping container)
- Keyword triggers: “minimal bandwidth,” “limited internet,” “large-scale one-time migration” → Snowball
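The decision tree above can be sketched as a small function. The thresholds are exam heuristics rather than hard AWS limits, and the function name is illustrative:

```python
# A minimal sketch of the "Snowball Threshold" decision tree.
def pick_transfer_service(data_tb: float, bandwidth_limited: bool) -> str:
    if data_tb > 10_000:                       # > 10 PB
        return "AWS Snowmobile"
    if data_tb > 10 and bandwidth_limited:
        return "AWS Snowball family"
    return "AWS DataSync / direct copy (CLI/SDK)"

# This scenario: 70 TB on a constrained 100 Mbps link.
print(pick_transfer_service(70, bandwidth_limited=True))
```

For the MediaStream scenario (70 TB, limited bandwidth) it lands on the Snowball family, matching Option B.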
Real World #
In production environments, the decision gets more nuanced:
- Hybrid Approach: Companies often use Snowball for the initial bulk load (70 TB), then set up AWS DataSync or S3 File Gateway for incremental updates if the archive starts growing again.
- Snowball Family Selection:
  - Snowball Edge Storage Optimized: 80 TB usable (the fit for this scenario)
  - Snowball Edge Compute Optimized: 42 TB plus GPU options for edge processing
  - Multiple devices: one device suffices for 70 TB, but larger datasets can be parallelized across several devices
- Pre-Migration Optimization: Real architects would:
  - Run a deduplication analysis (video files often have duplicates)
  - Consider compression (though video is already compressed)
  - Evaluate S3 Intelligent-Tiering or Glacier for immediate cost savings post-migration
- Network Reality Check: The 100 Mbps assumption is generous. Real-world throughput loses:
  - VPN overhead: 15-20%
  - Concurrent business usage: 30-50%
  - Protocol overhead (TCP, SSL): 10-15%
  - Effective transfer rate: 30-50 Mbps, extending an internet-based migration to 4-6 months
- Post-Migration Architecture: After migration, MediaStream would likely implement:
  - S3 Lifecycle Policies: move infrequently accessed footage to S3 Glacier
  - CloudFront: global distribution for remote editing teams
  - S3 Batch Operations: metadata tagging and organization
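The network reality check above can be quantified by stacking the overhead factors. The retained fractions below are the midpoints of the ranges cited, so this is a rough illustration rather than a measurement:

```python
# Stack the overhead factors on the nominal 100 Mbps link.
link_mbps = 100.0
retained = {
    "VPN overhead (15-20% loss)": 0.825,
    "concurrent business usage (30-50% loss)": 0.60,
    "protocol overhead (10-15% loss)": 0.875,
}

effective_mbps = link_mbps
for fraction in retained.values():
    effective_mbps *= fraction

# Time to push 70 TB at the effective rate, in months (~30-day months).
months = 70e12 * 8 / (effective_mbps * 1e6) / 86_400 / 30
print(f"~{effective_mbps:.0f} Mbps effective, ~{months:.1f} months for 70 TB")
```

The midpoint assumptions land at roughly 43 Mbps effective and about five months, squarely inside the 30-50 Mbps and 4-6 month ranges quoted above.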
Real-World Application (Practitioner Insight) #
Exam Rule #
For the AWS SAA-C03 exam: “When you see ‘large dataset’ (>10 TB) + ‘minimal bandwidth’ or ‘limited network’ → immediately eliminate internet-based solutions and select Snowball/Snowmobile.”
Real World #
In actual customer engagements, I’ve seen a few recurring patterns:

- The “We Have Fiber” Trap: A client insisted their “1 Gbps fiber” could handle 50 TB. Reality: sustained throughput was 300 Mbps due to ISP throttling and firewall limitations. After 3 weeks of failed transfers, they requested a Snowball. Lesson: always test actual sustained throughput before committing to an internet-based migration.
- The Snowball + DataSync Combo: A media company used Snowball for a 100 TB initial load, then deployed AWS DataSync for daily incremental transfers of new footage (50-200 GB/day). This hybrid approach is the production standard.
- The Direct Connect Justification: Direct Connect made sense for a broadcasting company that needed ongoing 10 Gbps connectivity for live streaming workflows; they amortized the setup cost across 3 years of usage. For a one-time migration? Never worth it.
- Multi-Region Consideration: Global media companies sometimes ship Snowball devices to multiple regional offices simultaneously, consolidating data into a single S3 bucket with cross-region replication for disaster recovery.
Disclaimer #
This is a study note based on simulated scenarios for the AWS SAA-C03 exam. It is not an official question from AWS or the certification body. The scenario, company name, and specific details have been rewritten for educational purposes while preserving core technical concepts.