The Hidden Costs of Cloud Storage Nobody Warns You About
Cloud storage looks cheap at first. Then the egress fees hit. A breakdown of the real costs behind S3, Azure Blob, and Google Cloud Storage, and when repatriation makes financial sense.
TL;DR
Cloud storage pricing is designed to make ingress free and egress expensive. Once your data is in, moving it out costs real money. For datasets over 100 TB that are accessed frequently, on-premises NAS is often 3-5x cheaper over three years. syncopio handles the migration back: S3 to NFS or SMB, with checksum verification and zero vendor lock-in.
“Move to the cloud” sounded simple. Upload your data, pay per gigabyte, scale infinitely. The pitch was compelling, and for many workloads, cloud storage genuinely makes sense.
But for large unstructured datasets (file shares, media archives, research data, backups), the economics tell a different story once you look past the per-GB sticker price.
The Pricing Model Nobody Reads Carefully
Cloud storage pricing has three components, and most teams only budget for the first one:
| Component | What you expect | What you get |
|---|---|---|
| Storage | $0.023/GB/month (S3 Standard) | Predictable, scales linearly |
| API requests | Negligible | $0.005 per 1,000 PUT, $0.0004 per 1,000 GET |
| Egress | Included? | $0.09/GB (AWS), $0.08/GB (Azure), $0.12/GB (Google) |
That egress line is where it gets expensive. Uploading data is free. Downloading it costs $90 per terabyte on AWS. Accessing your own data costs money every time.
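Using the list prices in the table, a rough monthly bill is simple arithmetic. A minimal sketch (the function name and defaults are illustrative; real bills also involve tiered pricing, regions, and free-tier allowances):

```python
def monthly_cost(stored_gb, egress_gb, get_requests,
                 storage_rate=0.023,         # $/GB/month, S3 Standard list price
                 egress_rate=0.09,           # $/GB out to the internet
                 get_rate_per_1000=0.0004):  # $ per 1,000 GET requests
    """Rough monthly bill from the three pricing components above."""
    return (stored_gb * storage_rate
            + egress_gb * egress_rate
            + get_requests / 1_000 * get_rate_per_1000)

# 1 TB stored, 100 GB downloaded, 100k GET requests:
print(round(monthly_cost(1_000, 100, 100_000), 2))  # 32.04
```

Note how the egress term dominates as soon as a meaningful fraction of the data is read back each month.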
A Real Example: 200 TB Media Archive
A media production company stores 200 TB on S3 Standard. Their editors access about 20 TB per month for active projects.
Monthly costs:
| Item | Cost |
|---|---|
| Storage (200 TB x $0.023/GB) | $4,600 |
| Egress (20 TB x $0.09/GB) | $1,800 |
| API requests (~5M GETs) | $2 |
| Monthly total | $6,402 |
| Annual total | $76,824 |
| 3-year total | $230,472 |
On-premises alternative:
| Item | Cost |
|---|---|
| NAS hardware (200 TB usable, HA) | $40,000 - $60,000 |
| 3 years power + cooling | $5,000 - $8,000 |
| Admin time (0.1 FTE) | $15,000 - $25,000 |
| 3-year total | $60,000 - $93,000 |
The on-premises option costs roughly one-third of the cloud option over three years. And the hardware is yours at the end of it.
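The table arithmetic above is easy to reproduce and adapt to your own volumes (prices are the list rates quoted earlier; `TB = 1_000` converts to the per-GB units those rates use):

```python
TB = 1_000  # GB per TB, matching the per-GB list prices

# Cloud: storage + egress + API requests, from the tables above
cloud_monthly = round(200 * TB * 0.023 + 20 * TB * 0.09
                      + 5_000_000 / 1_000 * 0.0004, 2)
cloud_3yr = round(cloud_monthly * 36, 2)

# On-premises: hardware + power/cooling + admin, worst-case estimates
onprem_3yr_high = 60_000 + 8_000 + 25_000

print(cloud_monthly)                # 6402.0
print(cloud_3yr)                    # 230472.0
print(cloud_3yr / onprem_3yr_high)  # cloud is ~2.5x even the high on-prem estimate
```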
When Cloud Storage Does Make Sense

Cloud is the right choice for bursty workloads, globally distributed access, or datasets under 10 TB where admin overhead would exceed storage costs. The calculation flips for large, frequently accessed datasets in a single location.
The Seven Hidden Costs
1. Egress Fees
The big one. Every byte you download from cloud storage costs money. This includes:
- Users accessing files
- Backups from cloud to on-premises
- Analytics or processing pipelines reading data
- Migration to another provider
- Disaster recovery testing
At $90/TB (AWS S3 Standard), downloading your entire 200 TB dataset once costs $18,000. That’s the exit tax.
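One way to frame the exit tax: it pays for itself quickly once the monthly on-premises cost is lower. A back-of-the-envelope payback calculation using the 200 TB example above (the on-prem figure is the worst-case 3-year estimate spread over 36 months):

```python
exit_tax = round(200 * 1_000 * 0.09, 2)  # $18,000 to egress 200 TB once
cloud_monthly = 6_402                    # from the example above
onprem_monthly = 93_000 / 36             # worst-case 3-year on-prem estimate

payback_months = exit_tax / (cloud_monthly - onprem_monthly)
print(round(payback_months, 1))  # ~4.7 months to recoup the egress bill
```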
2. API Request Costs
Every file operation is an API call. Listing a directory is a LIST request. Reading a file is a GET. Checking if a file exists is a HEAD.
For datasets with millions of small files, API costs add up fast. A migration tool scanning 50 million files to check for changes makes 50 million HEAD requests. At $0.0004 per 1,000 requests, that’s $20 per scan. Run it daily for incremental syncs and you’re spending $600/month on API calls alone.
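The per-scan arithmetic, with the rates quoted above (`files` is the 50-million-file example):

```python
head_rate_per_1000 = 0.0004  # $ per 1,000 HEAD/GET requests, S3 Standard
files = 50_000_000           # files checked per incremental scan

cost_per_scan = round(files / 1_000 * head_rate_per_1000, 2)
cost_per_month = cost_per_scan * 30  # one full scan per day

print(cost_per_scan)   # 20.0
print(cost_per_month)  # 600.0
```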
3. Storage Class Complexity
S3 alone has seven storage classes: Standard, Intelligent-Tiering, Standard-IA, One Zone-IA, Glacier Instant Retrieval, Glacier Flexible Retrieval, and Glacier Deep Archive.
Each has different pricing for storage, retrieval, and minimum duration. Move data to Glacier to save on storage and you pay retrieval fees plus a minimum 90-day charge. Delete data before the minimum duration and you still pay for the full period.
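The minimum-duration charge can be sketched as a pro-rated fee for the days you didn't keep the data. The defaults below are illustrative Glacier Flexible Retrieval list values at the time of writing; verify against current pricing for your region and class:

```python
def early_delete_charge(gb, days_stored, min_days=90, rate_per_gb_month=0.0036):
    """Pro-rated charge for deleting before the minimum storage duration.

    Defaults are illustrative Glacier Flexible Retrieval list values;
    check current pricing before relying on them.
    """
    remaining_days = max(0, min_days - days_stored)
    return gb * rate_per_gb_month * remaining_days / 30  # rate is per GB-month

# Delete 10 TB after 30 days: billed for the remaining 60 days anyway
print(round(early_delete_charge(10_000, 30), 2))  # 72.0
```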
4. Cross-Region and Cross-AZ Transfer
Data transfer between availability zones costs $0.01/GB in each direction on AWS. Between regions, it’s $0.02/GB. If your compute is in a different AZ than your storage, you’re paying transfer fees for every read.
5. Vendor Lock-in
Each cloud provider has proprietary features: S3 Select, Azure Data Lake Storage, Google Cloud Storage's Autoclass. Build your workflow around these and switching providers means rewriting your stack, not just moving files.
6. Compliance and Data Sovereignty
GDPR (DSGVO in German-speaking countries) and industry-specific regulations may require data to stay within specific jurisdictions. Cloud providers offer regional storage, but verifying compliance requires auditing where every copy of your data actually lives, including caches, CDN edges, and replication targets.
7. Unpredictable Bills
Cloud storage bills fluctuate monthly. A spike in user activity, an automated process scanning files, or a developer accidentally downloading a large dataset can cause bill surprises. On-premises costs are fixed and predictable after the initial investment.
When to Consider Cloud Repatriation
Repatriation makes financial sense when you have:
- Large datasets (>50 TB) that are primarily accessed from one location
- High egress volumes where monthly download exceeds 10% of stored data
- Predictable workloads without significant burst requirements
- Long retention periods where 3-5 year TCO favors on-premises
- Compliance requirements that are simpler to satisfy on-premises
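The criteria above can be turned into a rough screening function. The thresholds (50 TB size, 10% monthly egress) come straight from the lists; this is a first-pass filter, not a TCO model, and the function name is illustrative:

```python
def worth_evaluating_repatriation(stored_tb, monthly_egress_tb,
                                  locations=1, bursty=False,
                                  have_ops_staff=True):
    """First-pass screen mirroring the criteria above; not a TCO model."""
    # Disqualifiers from the "does NOT make sense" list
    if bursty or locations > 1 or stored_tb < 10 or not have_ops_staff:
        return False
    # Positive signals: large dataset, or egress above 10% of stored data
    return stored_tb > 50 or monthly_egress_tb > 0.10 * stored_tb

print(worth_evaluating_repatriation(200, 20))               # True
print(worth_evaluating_repatriation(5, 1))                  # False: too small
print(worth_evaluating_repatriation(200, 20, locations=3))  # False: global access
```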
Repatriation does NOT make sense when:
- Data is accessed globally from multiple locations
- Workloads are bursty and unpredictable
- Datasets are small (under 10 TB)
- You lack the staff to manage on-premises infrastructure
- Disaster recovery requires geographic distribution
The Migration Path: S3 to NAS
If you’ve decided to bring data back, the technical migration is straightforward but requires planning:
1. Inventory your data to understand volume, file types, and access patterns
2. Provision your destination NAS with sufficient capacity and the right protocol (NFS for Linux, SMB for Windows)
3. Plan your bandwidth to avoid saturating your internet connection during business hours
4. Transfer with verification to ensure every file arrives intact
5. Validate checksums before decommissioning the cloud storage
6. Keep cloud storage active until verification is complete
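For the bandwidth-planning step, a transfer-time estimate sets expectations early. The `utilization` knob is a hypothetical stand-in for off-hours throttling and protocol overhead:

```python
def transfer_days(dataset_tb, link_gbps, utilization=0.5):
    """Days to move a dataset at a sustained fraction of link capacity."""
    bits = dataset_tb * 1e12 * 8                      # decimal TB -> bits
    seconds = bits / (link_gbps * 1e9 * utilization)  # sustained throughput
    return seconds / 86_400

# 200 TB over a 1 Gbps link at 50% sustained utilization:
print(round(transfer_days(200, 1), 1))   # ~37 days
# The same job on a 10 Gbps link:
print(round(transfer_days(200, 10), 1))  # ~3.7 days
```

At 1 Gbps, a 200 TB repatriation is a weeks-long project, which is why the cloud copy stays live until verification finishes.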
syncopio handles cloud repatriation
syncopio migrates from S3-compatible object storage (AWS S3, MinIO, Wasabi, Backblaze B2) to NFS or SMB shares. Distributed workers maximize throughput, and built-in checksum verification confirms every file arrived intact before you delete the cloud copy.
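Whatever tool performs the transfer, the verification step deserves independent checking. A minimal sketch using SHA-256 digests recorded at download time (note that S3 ETags are not reliable content hashes for multipart uploads, so record your own digests; function names here are illustrative):

```python
import hashlib

def sha256_of(path, chunk_size=1024 * 1024):
    """Stream a file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            digest.update(block)
    return digest.hexdigest()

def verified(path, expected_hex):
    """True if the migrated file matches the digest recorded
    before the cloud copy is deleted."""
    return sha256_of(path) == expected_hex
```

Run this (or your migration tool's equivalent) across the full file inventory, and keep the cloud copies until every file passes.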
The Hybrid Approach
Full repatriation isn’t always the answer. Many organizations land on a hybrid model:
- Hot data on local NAS for fast access and zero egress fees
- Cold data on S3 Glacier or similar for archival at low storage cost
- Burst capacity in the cloud for temporary workloads
The key is making deliberate placement decisions based on access patterns and cost, not defaulting to “everything in the cloud” because that was the original migration plan.
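A placement decision can be as simple as a rule over observed access frequency. This toy function is purely illustrative; the thresholds and tier names are assumptions, not recommendations:

```python
def place(reads_per_month, restore_latency_ok_hours):
    """Toy tiering rule for the hybrid model sketched above."""
    if reads_per_month >= 1:
        return "local-nas"        # hot: every read would otherwise pay egress
    if restore_latency_ok_hours >= 12:
        return "glacier-archive"  # cold and patient: cheapest at rest
    return "s3-standard-ia"       # cold but latency-sensitive

print(place(30, 0))   # local-nas
print(place(0, 48))   # glacier-archive
```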
Bottom Line
Cloud storage is a tool, not a strategy. For the right workloads, it’s the best option available. For large, frequently accessed datasets in a single location, the math often doesn’t work out.
Run the numbers for your specific situation. Compare the 3-year TCO including egress, API calls, and storage class complexity against on-premises alternatives. The answer might surprise you.