An interesting post asked why Amazon S3 is considered cheaper than the alternatives. An excerpt is below:
With a price tag of $0.150/GB/month, storing 1TB of data costs around $150/month on Amazon S3. But this is a recurring amount. So, for the same amount of data it would cost $1800/year and $3600/2-years. And this doesn’t even include the data transfer costs.
Consider the alternative, with colocation the hardware cost of storing 1TB of data on two machines (for redundancy) would be around $1500/year. But this is fixed. And increasing the storage capacity on each machine can be done at the price of $0.1/GB. Which means that a RAID-1+redundant copies of data on multiple servers for 4TB of data could be achieved at $3000/year and $6000/2-years in a colocation facility. Whereas on S3 the same would cost $7200/year and $14,400/2-years.
Also, adding bandwidth+power+h/w replacement costs at a colocation facility would still keep the costs significantly lower than Amazon S3.
Given this math, what is the rationale behind going with Amazon S3? The Smugmug case study of 600TB of data stored on S3 seems misleading.
I do see several services that offer unlimited storage which is actually hosted on S3. For example, Smugmug, Carbonite etc. all offer unlimited storage for a fixed annual fee. Wouldn’t this send the costs out of the roof on Amazon S3?
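The S3 side of that arithmetic is easy to check. Below is a minimal sketch, assuming 1 TB = 1000 GB (as the excerpt does), the flat $0.150/GB/month price it quotes, and no transfer or request charges. The function and constant names are made up for illustration.

```python
# Reproduces the S3 storage arithmetic from the excerpt above.
# Assumptions: 1 TB = 1000 GB, flat $0.15/GB/month, storage only
# (no data transfer or request fees).
S3_CENTS_PER_GB_MONTH = 15  # $0.150/GB/month, kept in cents to avoid float rounding


def s3_storage_cost(gigabytes: int, months: int) -> float:
    """Return the storage-only S3 cost in dollars for a given size and duration."""
    cents = gigabytes * S3_CENTS_PER_GB_MONTH * months
    return cents / 100


if __name__ == "__main__":
    for terabytes in (1, 4):
        for months in (12, 24):
            cost = s3_storage_cost(terabytes * 1000, months)
            print(f"{terabytes} TB for {months} months: ${cost:,.0f}")
```

Running it gives $1,800 and $3,600 for 1 TB over one and two years, and $7,200 and $14,400 for 4 TB, which matches the figures in the excerpt.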
The CEO of SmugMug responded:
Hey there, I’m the CEO & Chief Geek at SmugMug. You’re overlooking a few things:
- Amazon keeps at least 3 copies of your data (which is what you need for high reliability) in at least 2 different geographical locations. That’s what we’d do ourselves, too, if we continued to use our own storage internally. So your math is off both on the storage costs and then the costs of maintaining two or more datacenters and the networks between them.
- When Amazon reduces their prices, you instantly get all your storage cheaper. This isn’t something you get with your capital expenditure of disks – your costs are always fixed. This has upsides and downsides, but you certainly don’t get instant price breaks to your OpEx costs. When they added cheaper, tiered storage, our bill with Amazon dropped hugely.
- There’s built-in price pressure with Amazon, too. The cost of one month’s rent is roughly the same as the cost of leaving. So if it gets too expensive (or unreliable or slow or whatever your metrics are), you can easily leave. And Amazon has incentive to keep lowering prices and improving speed & reliability to ensure you don’t leave.
- CapEx sucks. It’s hard on your cashflow, it’s hard on your debt position if you need to lease or finance (we don’t, but that just means it’s even harder on our cashflow), it’s hard on taxes (amortization sucks), etc etc. I vastly prefer reasonable OpEx costs, with no debt load, which is what Amazon gets us.
- Free data transfer in/out of EC2 can be a big win, too. It is for us, anyway.
- Our biggest win is simply that it’s easy. We have a simpler architecture, a lot less people, and a lot less worry. We get to focus on our product (sharing photos) rather than the necessary evils of doing so (managing storage). We have two ops guys for a Top 500 website with over a petabyte of storage. That’s pretty awesome.
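The first point in that reply can be put in numbers. This is a rough sketch, assuming the $0.150/GB/month price from the original post and exactly three copies (the reply says "at least 3"). The names are illustrative, and it ignores the datacenter and network costs the reply also mentions.

```python
from fractions import Fraction

# Assumptions: price taken from the original post, copy count from the reply.
S3_PRICE_PER_GB_MONTH = Fraction(15, 100)  # $0.150/GB/month
S3_MIN_COPIES = 3                          # "at least 3 copies"


def price_per_raw_copy(price: Fraction, copies: int) -> Fraction:
    """Spread the per-GB price over every physical copy that is kept."""
    return price / copies


def raw_gb_needed(logical_gb: int, copies: int) -> int:
    """Raw disk needed to hold the same number of copies yourself."""
    return logical_gb * copies


if __name__ == "__main__":
    per_copy = price_per_raw_copy(S3_PRICE_PER_GB_MONTH, S3_MIN_COPIES)
    print(f"S3 price per raw GB-month: ${float(per_copy):.2f}")
    print(f"Raw disk for 4 TB of data: {raw_gb_needed(4000, S3_MIN_COPIES)} GB")
```

At three copies the S3 price works out to $0.05 per raw GB per month, and 4 TB of data means 12 TB of raw disk spread over at least two sites, rather than the two machines in one facility that the original post priced.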
So what does this tell us?
1. OpEx is better than CapEx, especially for something that is intrinsic to the running of your business.
2. The utility compute model reduces risk, i.e. the cost of turning it off is roughly one month of running the service.
3. The “ilities” that come included, such as high availability, redundant copies and geographical distribution, would have to be paid for in the alternative model and are expensive to build in.
4. The flexibility is greater, i.e. if you need to double capacity on demand, this is easily achievable with S3 but has to be planned, built and executed in the alternative model.