Is Amazon S3 really cheaper than the alternative?

An interesting post asked why Amazon S3 is considered cheaper than the alternative; an excerpt is below:

With a price tag of $0.150/GB/month, storing 1 TB of data costs around $150/month on Amazon S3. But this is a recurring amount, so the same data would cost $1,800/year, or $3,600 over two years. And this doesn’t even include the data transfer costs.

Consider the alternative: with colocation, the hardware cost of storing 1 TB of data on two machines (for redundancy) would be around $1,500/year, but this is fixed, and increasing the storage capacity on each machine can be done at a price of $0.10/GB. This means that RAID-1 plus redundant copies of 4 TB of data across multiple servers could be achieved for $3,000/year ($6,000 over two years) in a colocation facility, whereas on S3 the same would cost $7,200/year ($14,400 over two years).

Also, even after adding bandwidth, power, and hardware-replacement costs at a colocation facility, the total would still be significantly lower than Amazon S3.
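The arithmetic above can be sketched as a small cost model. The rates are the post’s own circa-2008 assumptions, not current AWS pricing, and the colocation figure is the post’s lump estimate rather than a derived number:

```python
# Cost sketch for the comparison above. All rates are the post's
# assumptions (circa 2008), not current AWS pricing.

S3_PER_GB_MONTH = 0.15  # quoted S3 storage rate, $/GB/month
GB_PER_TB = 1000        # the post uses decimal terabytes

def s3_storage_cost(tb, years):
    """Recurring S3 storage cost in dollars for `tb` TB over `years` years."""
    return tb * GB_PER_TB * S3_PER_GB_MONTH * 12 * years

def colo_storage_cost(years, base_per_year=3000):
    """The post's lump estimate: 4 TB, RAID-1 plus redundant copies,
    at roughly $3,000/year in a colocation facility."""
    return base_per_year * years

print(s3_storage_cost(1, 1))   # the $1,800/year figure
print(s3_storage_cost(4, 2))   # the $14,400/two-years figure
print(colo_storage_cost(2))    # the $6,000/two-years colo figure
```

Note that this model reproduces the post’s numbers but, as the SmugMug response below points out, the colocation side omits replication, multiple datacenters, and the networks between them.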

Given this math, what is the rationale behind going with Amazon S3? The SmugMug case study of 600 TB of data stored on S3 seems misleading.

I do see several services that offer unlimited storage which is actually hosted on S3. For example, SmugMug, Carbonite, etc. all offer unlimited storage for a fixed annual fee. Wouldn’t this send the costs through the roof on Amazon S3?

The CEO of SmugMug responded:

Hey there, I’m the CEO & Chief Geek at SmugMug. You’re overlooking a few things:

– Amazon keeps at least 3 copies of your data (which is what you need for high reliability) in at least 2 different geographical locations. That’s what we’d do ourselves, too, if we continued to use our own storage internally. So your math is off both on the storage costs and on the costs of maintaining two or more datacenters and the networks between them.

– When Amazon reduces their prices, you instantly get all your storage cheaper. This isn’t something you get with your capital expenditure of disks – your costs are always fixed. This has upsides and downsides, but you certainly don’t get instant price breaks to your OpEx costs. When they added cheaper, tiered storage, our bill with Amazon dropped hugely.

– There’s built-in price pressure with Amazon, too. The cost of one month’s rent is roughly the same as the cost of leaving. So if it gets too expensive (or unreliable or slow or whatever your metrics are), you can easily leave. And Amazon has incentive to keep lowering prices and improving speed & reliability to ensure you don’t leave.

– CapEx sucks. It’s hard on your cashflow, it’s hard on your debt position if you need to lease or finance (we don’t, but that just means it’s even harder on our cashflow), it’s hard on taxes (amortization sucks), etc etc. I vastly prefer reasonable OpEx costs, with no debt load, which is what Amazon gets us.

– Free data transfer in/out of EC2 can be a big win, too. It is for us, anyway.

– Our biggest win is simply that it’s easy. We have a simpler architecture, a lot less people, and a lot less worry. We get to focus on our product (sharing photos) rather than the necessary evils of doing so (managing storage). We have two ops guys for a Top 500 website with over a petabyte of storage. That’s pretty awesome.

So what does this tell us?

1. OpEx is better than CapEx, especially for anything intrinsic to the running of your business.

2. The utility compute model reduces risk: the cost of turning it off is the equivalent of one month of running the service.

3. The “ilities” you get for free (high availability, redundant copies, geographical distribution, and so on) would have to be paid for in an alternative model and are expensive to build in.

4. Flexibility is greater: if you need to double capacity on demand, this is easily achievable with S3, but it must be planned, built, and executed in the alternative model.


5 thoughts on “Is Amazon S3 really cheaper than the alternative?”

  1. Having lived the CapEx problem for the past decade and studied alternatives to it for the last five years, I think the SmugMug CEO is spot on regarding the broad scope of one’s technology acquisition model. There are two aspects that neither explicitly speaks to, though:

    1) Labor to support the data center has to be accounted for. It represents a significant and growing percentage of the total cost of providing a TB/yr. Along with energy costs, it does not obey Moore’s Law.

    2) The sociological change to computing is significant when our collective (yours and mine) data and processing can coexist and be integrated in a LAN computing environment while we independently control our own costs. We can share our data and processing at LAN latencies without either of us having to take on the risk of capitalizing the infrastructure.

    The second aspect is not discussed much yet, but it is our impression that it will become the dominant driver of cloud computing adoption, ahead of the capitalization/risk-reduction aspect of acquiring and maintaining technology touted today.

    To a great extent computing advances have been about realizing new business efficiencies and capabilities. The Cloud Strategy is just the next step on both fronts in this great evolutionary process.

  2. Hi,

    Carbonite doesn’t use S3 — we have our own data centers using our own proprietary systems based on RAID6. RAID6, the way we’re using it, is about 36 million times more reliable than an individual hard drive because of the redundancy. We certainly could not afford to offer unlimited backup for $50 per year if we were using S3! Storing large amounts of data cheaply and reliably is one of our most potent competitive weapons. We back up over 100 million new files every day, so we need to be very good at this.

    David Friend, CEO
    Carbonite, Inc.

  3. Interesting, but how do the data centre and staff costs factor into this? In every prior study I’ve seen on renting racks, once you build in power, staff, servers, bandwidth, and so on, the per-gigabyte storage costs of Amazon S3 look very viable.

    Even home backup storage seems to work out cheaper once you factor in all the costs.

    I think the sticking point is it being “unlimited” backup storage. Your business model would break down if you had to stick to a $50 price and your users pushed it to the limit. I’m guessing that you know very well the average space your users actually use, which allows you to carve a tidy profit out of the $50 price tag.

    Also, of course, there is the bandwidth as well as the actual storage. One terabyte of bandwidth usage on S3 comes in at around $170. This is expensive and can blow up any cost savings the storage may provide; other providers offer the same bandwidth for around $60.

    Ultimately, I think the larger the storage requirement (such as Carbonite’s usage, which again I’m guessing runs to hundreds of terabytes), the easier it becomes to argue a case against Amazon S3. For the many organisations that do not have such extreme requirements, though, S3 is still likely to be cheaper and easier to set up than the alternative.
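    The bandwidth figures in the comment above can be checked with a quick sketch. The per-GB rates are the commenter’s circa-2008 assumptions (the ~$60/TB rate comes from an unnamed alternative provider), not current pricing:

    ```python
    # Transfer-cost check for the figures quoted in the comment above.
    # Rates are the commenter's assumptions, not current pricing.

    GB_PER_TB = 1000

    def transfer_cost(tb, per_gb):
        """Cost in dollars to move `tb` terabytes at `per_gb` $/GB."""
        return tb * GB_PER_TB * per_gb

    print(transfer_cost(1, 0.17))  # S3: about $170 per TB at $0.17/GB
    print(transfer_cost(1, 0.06))  # alternative: about $60 per TB
    ```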

  4. Interesting topic. We have worked with quite a few clients (we are an outsourced PD firm) and have seen both cloud storage and home-grown backends used to store vast amounts of data.

    Eventually you realize a few things:

    1. Both have their pros and cons
    2. S3 will work for firms that do not offer storage as their core business proposition
    3. Home-grown solutions for large file storage are extremely expensive to design, build, and operate; unless that’s your core competency, you wouldn’t want to get into it
