Using Amazon Elastic Block Store (EBS) to secure data

Amazon Elastic Block Store (EBS) provides raw block-level storage that can be attached to Amazon EC2 instances. These volumes can then be used like any other raw block device.

There are two ways that data can be protected when using EBS:

  1. Use AWS Identity and Access Management (IAM) to control access to Elastic Block Store volumes. This can be complemented with policy conditions that enforce requirements such as multi-factor authentication or SSL connections, in addition to locking down originating IP addresses.
  2. Create encrypted EBS volumes, which encrypt data at rest. Note that once created, an encrypted volume cannot later be unencrypted (just as an unencrypted volume cannot later be encrypted). A sketch of both approaches follows this list.
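
As an illustration, here is a minimal sketch of both approaches using boto3, the current AWS SDK for Python (which postdates this post). The user name, IP range, region and volume size are hypothetical placeholders, and the policy shown is just one way of combining the MFA and source-IP conditions mentioned above:

```python
import json

import boto3

# Hypothetical region and names throughout; adjust to taste.
ec2 = boto3.client("ec2", region_name="eu-west-1")
iam = boto3.client("iam")

# 1) An IAM policy that only allows volume attach/detach when the caller
#    authenticated with MFA and comes from a known IP range.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ec2:AttachVolume", "ec2:DetachVolume"],
        "Resource": "*",
        "Condition": {
            "Bool": {"aws:MultiFactorAuthPresent": "true"},
            "IpAddress": {"aws:SourceIp": "203.0.113.0/24"},
        },
    }],
}
iam.put_user_policy(
    UserName="example-user",
    PolicyName="ebs-mfa-and-ip-only",
    PolicyDocument=json.dumps(policy),
)

# 2) A volume that encrypts data at rest; encryption cannot be
#    removed from the volume later.
volume = ec2.create_volume(
    AvailabilityZone="eu-west-1a",
    Size=100,  # GiB
    Encrypted=True,
)
print(volume["VolumeId"])
```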

DropBox is just a frontend to Amazon S3 with a killer sync feature

Musing about iCloud, the forthcoming SkyDrive integration into Windows 8, and Google Drive got me thinking about DropBox, the company whose business model is built on charging for storage when everyone else is starting to give large amounts of it away for free. DropBox's killer feature is its sync replication. It just works, and consumers have shown they love the simplicity of it. However, Apple have replicated the simplicity of the sync, albeit only for iOS users, and Microsoft are now close to the same with Live Mesh.

DropBox store the files you give them on Amazon S3. This surprises many people, who had assumed the files were stored on DropBox's own servers. It means the entire DropBox business model is beholden to Amazon Web Services. Amazing when you think about it, and highly illustrative that what DropBox really brings to the table is great software with a killer feature. But what is going to happen when everyone else has that killer feature, with 10x to 20x more storage for free?

A recent article had DropBox valued at 4 billion dollars. This is a valuation of a company doing revenues of between 100 and 200 million dollars per year, into which investors have poured 257 million dollars in funding. Perhaps it’s me, but I just don’t see it. Yes, they have a gazillion subscribers, but so what? In a commoditised industry that struggles to convert more than 2% of the user base, why should that get me excited? But there is DropBox Teams for businesses, right? Ever used it? Then try it and you won’t need me to draw a conclusion.

So what next for DropBox if there is no mega IPO coming along? They turned down Mr Jobs (a mistake), so who else would be interested? What about Amazon? After all, DropBox really is the ultimate sync client for Amazon S3. With Amazon now looking towards private cloud, it would seem a match made in heaven. As with all good things, time will tell…

Amazon Cloud is now FISMA certified: Joins Google and Microsoft

The Amazon Cloud is now classed as FISMA certified. FISMA, the Federal Information Security Management Act, sets security requirements for federal IT systems and is a required certification for US federal government projects.

This is the third set of certifications Amazon has recently announced, coming on top of its ISO 27001 certification (covering S3, EC2 and VPC) and its SAS 70 Type II certification.

The accreditation covers EC2 (Amazon Elastic Compute Cloud), S3 (Simple Storage Service), VPC (Virtual Private Cloud), and includes Amazon’s underlying infrastructure.

AWS’ accreditation covers FISMA’s low and moderate levels. This level of accreditation requires a set of security configurations and controls that includes documenting the management, operational and technical processes used in securing physical and virtual infrastructure, as well as third-party audits.

Other vendors who recently announced FISMA certification were Google, with Google Apps for Government, and Microsoft, with the Business Productivity Online Suite (although there was a spat between Microsoft and Google regarding these claims).

Expect to see further certifications: they are a pre-requisite for expansion into lucrative government and private sector contracts, and customers will feel more comfortable choosing cloud resources as commoditisation marches on.

Amazon enables easy website hosting with S3 – competes with RackSpace

In a move that puts it into direct competition with the likes of RackSpace, Amazon has announced that you can now host your website on an Amazon S3 account. With these new features, Amazon S3 provides a simple and inexpensive way to host a static website in one place.

To get started, open the Amazon S3 Management Console, and follow these steps:

1) Right-click on your Amazon S3 bucket and open the Properties pane

2) Configure your root and error documents in the Website tab

3) Click Save

Amazon provide more information on hosting a static website on Amazon S3 here.
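
For those who would rather script this than click through the console, here is a minimal sketch of the same configuration using boto3, the current AWS SDK for Python (which postdates this post); the bucket and document names below are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# Equivalent of the console's Website tab: set the root (index)
# and error documents for the bucket.
s3.put_bucket_website(
    Bucket="my-website-bucket",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)
```

Once enabled, the site is served from the bucket's website endpoint, which takes the form http://&lt;bucket&gt;.s3-website-&lt;region&gt;.amazonaws.com.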

This is part of a trend that Amazon obviously want to encourage. They recently started an ad placement from JumpBox on their free Web Services developers page to offer one-click WordPress deployments, amongst other JumpBox offerings.

Is Amazon S3 becoming a de facto standard interface?

I don’t think anyone would argue that Amazon is the big bear of the Cloud market, on both the virtual cloud infrastructure and the cloud storage sides. Amazon S3 alone stored more than 102 billion objects as of March 2010.

As befits a dominant player, the interface that Amazon exposes for Amazon S3 is now so widely used that it is almost becoming a standard for connecting to cloud storage. Many new and existing players in this space already support the interface as an entry point into their storage infrastructure. For example, Google Storage supports the S3 interface, as does the private cloud vendor Eucalyptus with its Walrus offering. The on-premise cloud appliance vendor Mezeo also recently announced support for accessing their cloud using the Amazon S3 API, as did TierraCloud. There are open source implementations as well, such as ParkPlace, an Amazon S3 clone and BitTorrent service written in Ruby.

In addition, the multi-cloud vendor Storage Made Easy has implemented an S3 entry point into its gateway, so that you can use the S3 API even with clouds that do not natively support it, such as RackSpace, Google Docs, DropBox etc.
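
To make that concrete, here is a rough sketch of what "speaking S3" to a non-Amazon back-end looks like with boto3 (a modern SDK used purely for illustration here): the client code is unchanged, only the endpoint differs. The endpoint URL and credentials are placeholders, not a real service:

```python
import boto3

# The same S3 client code, pointed at an S3-compatible gateway
# instead of Amazon's own endpoint.
client = boto3.client(
    "s3",
    endpoint_url="https://s3-compatible.example.com",  # hypothetical gateway
    aws_access_key_id="EXAMPLE_KEY",
    aws_secret_access_key="EXAMPLE_SECRET",
)

# List buckets exactly as you would against Amazon S3 itself.
for bucket in client.list_buckets()["Buckets"]:
    print(bucket["Name"])
```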

So as far as S3 goes, it seems you can access a multitude of storage back-ends using this API, which is not surprising: vendors want to make it easy for you to move from S3 to their proposition, or they want their proposition to work with existing toolsets and program code. So is it good for cloud in general? I guess the answer to that is both ‘yes’ and ‘no’.

‘Yes’ from the point of view that standardisation can be a good thing for customers as it gives stability and promotes interoperability. ‘No’ from the point of view that standardisation can easily stifle innovation. I’m happy to say that this is not what is occurring in the cloud storage space as the work around OpenStack and Swift demonstrates.

I think right now, S3 is as close as you will get to a de facto standard for cloud storage API interactions. It probably suits Amazon that this is the case, and it certainly suits consumers and developers. Time will tell how long this situation lasts.

Amazon S3, EC2 and VPC ISO 27001 certified

As well as being SAS 70 Type II certified, Amazon is now ISO 27001 certified. ISO/IEC 27001 formally specifies a management system that brings information security under management control, and mandates requirements that have to be met. Organisations that have adopted ISO/IEC 27001 may be formally audited for compliance with the standard.

As stated on Wikipedia:

ISO/IEC 27001 requires that management:

Systematically examine the organization’s information security risks, taking account of the threats, vulnerabilities and impacts;

Design and implement a coherent and comprehensive suite of information security controls and/or other forms of risk treatment (such as risk avoidance or risk transfer) to address those risks that are deemed unacceptable; and

Adopt an overarching management process to ensure that the information security controls continue to meet the organization’s information security needs on an ongoing basis.

“Amazon Web Services is continuing its commitment to provide further assurance of AWS security controls and practices through third-party audits and certifications such as SAS 70 Type II and ISO 27001,” said Stephen Schmidt, Chief Information Security Officer for Amazon Web Services.

“Via ISO 27001 and other certifications, we continue to provide our customers with confidence that our security controls and practices follow internationally-recognized security standards.”

You can learn more about Amazon and its compliance and security provisions here.

Amazon S3 showing elevated error rates

In a recent post, CenterNetworks noted that the Amazon S3 service is showing elevated error rates. They noticed that several images were not loading correctly, and they heard from multiple CN readers seeing the same issue on their sites.

They note the issues seem only to be hitting the US Standard region; other S3 regions, including Northern California, Europe and Asia, are functioning correctly.

Amazon S3 adds RRS – Reduced Redundancy Storage

Amazon have introduced a new storage option for Amazon S3 called Reduced Redundancy Storage (RRS) that enables customers to reduce their costs by storing non-critical, reproducible data at lower levels of redundancy than the standard storage of Amazon S3.

It provides a cost-effective solution for distributing or sharing content that is durably stored elsewhere, or for storing thumbnails, transcoded media, or other processed data that can be easily reproduced. The RRS option stores objects on multiple devices across multiple facilities, providing 400 times the durability of a typical disk drive, but does not replicate objects as many times as standard Amazon S3 storage does, and thus is even more cost effective.

Both storage options are designed to be highly available, and both are backed by Amazon S3’s Service Level Agreement.

Once customer data is stored using either Amazon S3’s standard or reduced redundancy storage options, Amazon S3 maintains durability by quickly detecting failed, corrupted, or unresponsive devices and restoring redundancy by re-replicating the data. Amazon S3 standard storage is designed to provide 99.999999999% durability and to sustain the concurrent loss of data in two facilities, while RRS is designed to provide 99.99% durability and to sustain the loss of data in a single facility.
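
To put those figures in context (my arithmetic, not Amazon’s): 99.99% durability means that if you store 10,000 objects on RRS, you can expect to lose, on average, about one of them per year, whereas the eleven nines of standard storage work out at roughly one lost object per 100 billion object-years.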

Pricing for Amazon S3 Reduced Redundancy Storage starts at only $0.10 per gigabyte per month and decreases as you store more data.

From a programming viewpoint, to take advantage of RRS you need to set the storage class of an object when you upload it. To do this, you set the x-amz-storage-class header to REDUCED_REDUNDANCY in a PUT request.
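
For example, here is a minimal sketch using boto3, the current AWS SDK for Python (which postdates this post); it exposes the x-amz-storage-class header as a StorageClass parameter, and the bucket, key and file names are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# Upload a reproducible asset (a thumbnail, say) at reduced redundancy;
# standard storage is the default if StorageClass is omitted.
with open("photo-small.jpg", "rb") as body:
    s3.put_object(
        Bucket="my-bucket",
        Key="thumbnails/photo-small.jpg",
        Body=body,
        StorageClass="REDUCED_REDUNDANCY",
    )
```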

Amazon – what is coming soon, and what is not!

We had a meeting with Amazon in the UK recently, covered off some of the pressing issues we wanted to discuss, and also learnt a little of what Amazon have lined up.

First, what is not going to happen anytime soon:

– From what we heard, Amazon are not going to resolve the issue of billing in local currency with locally issued invoices any time soon. See our prior post on this topic. We did learn, however, that large organisations can request an invoice.

– Right now, if you want to use your own AMI image to sell on a SaaS basis using Amazon infrastructure, you have to be a US organisation. Again, Amazon don’t seem to have plans to change this in the immediate timeframe, so that leaves out any organisation outside the US that wants to sell its product offering as SaaS on Amazon’s web services infrastructure, unless they integrate their own commerce infrastructure rather than using DevPay. This can be both a blessing (charge margin on Amazon’s infrastructure pieces such as SQS) and a curse (it can leave you exposed, as you will be a month behind in billing your clients). Even though Amazon are entrenched right now as the public cloud infrastructure of choice, it wouldn’t be the first time we have seen an 800-pound gorilla displaced from its prime market position. If I were Amazon, I’d fix this, and soon. Microsoft and RackSpace are looking more attractive all the time.

– Amazon’s ingestion services again require you to be a US organisation with a US return address. Are you detecting a common theme here…

And what we can expect to see soon:

– VPC (Virtual Private Cloud) access is in private beta now. This is a mechanism for securely connecting public and private clouds within the EC2 infrastructure.

– High memory instances analogous to High CPU instances are in the pipeline

– Shared EBS is in the pipeline

– Functionality for multiple users associated with a single account is in the pipeline, and will provide simple privileges too. This has long been a bone of contention for organisations using AWS, so it will be welcomed.

– Amazon is planning a lot more EC2 workshops through local partners.

Other things of note that we learnt were:

– We learned that large instances currently have their own dedicated physical blade/box.

– As AWS has grown, a large number of machines has become available, and organisations can request hundreds of machines easily. Even extreme cases are catered for, e.g. requests for 50,000 machines.

– As a matter of policy, new functionality will be rolled out simultaneously in the EU and US unless there is a good reason not to.

All in all some exciting stuff, and there were other things in the pipeline they could not share, but the public cloud market is starting to get more players and I think Amazon need to get some of their infrastructure pieces in place sooner rather than later.

Amazon Elastic MapReduce now available in Europe

From the Amazon Web Services Blog:

Earlier this year I wrote about Amazon Elastic MapReduce and the ways in which it can be used to process large data sets on a cluster of processors. Since the announcement, our customers have wholeheartedly embraced the service and have been doing some very impressive work with it (more on this in a moment).

Today I am pleased to announce Amazon Elastic MapReduce job flows can now be run in our European region. You can launch jobs in Europe by simply choosing the new region from the menu. The jobs will run on EC2 instances in Europe and usage will be billed at those rates.

Because the input and output locations for Elastic MapReduce jobs are specified in terms of URLs to S3 buckets, you can process data from US-hosted buckets in Europe, storing the results in Europe or in the US. Since this is an internet data transfer, the usual EC2 and S3 bandwidth charges will apply.

Our customers are doing some interesting things with Elastic MapReduce.

At the recent Hadoop Summit, online shopping site ExtraBux described their multi-stage processing pipeline. The pipeline is fed with data supplied by their merchant partners. This data is preprocessed on some EC2 instances and then stored on a collection of Elastic Block Store volumes. The first MapReduce step processes this data into a common format and stores it in HDFS form for further processing. Additional processing steps transform the data and product images into final form for presentation to online shoppers. You can learn more about this work in Jinesh Varia’s Hadoop Summit Presentation.

Online dating site eHarmony is also making good use of Elastic MapReduce, processing tens of gigabytes of data representing hundreds of millions of users, each with several hundred attributes to be matched. According to an article on SearchCloudComputing.com, they are doing this work for $1,200 per month, a considerable savings from the $5,000 per month that they estimated it would cost them to do it internally.

We’ve added some articles to our Resource Center to help you to use Elastic MapReduce in your own applications.

You should also check out AWS Evangelist Jinesh Varia in this video from the Hadoop Summit.

— Jeff;

PS – If you have a lot of data that you would like to process on Elastic MapReduce, don’t forget to check out the new AWS Import/Export service. You can send your physical media to us and we’ll take care of loading it into Amazon S3 for you.
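
As a footnote to the quoted post: here is a rough sketch of launching a job flow in the European region with S3 URLs for input and output, written against boto3 (a much later SDK than anything available when the post appeared). The bucket names, instance types, roles and release label are all hypothetical placeholders, not values from the post:

```python
import boto3

# Create the EMR client in the European region so the job's EC2
# instances run there.
emr = boto3.client("emr", region_name="eu-west-1")

response = emr.run_job_flow(
    Name="example-wordcount",
    LogUri="s3://my-eu-logs/emr/",
    ReleaseLabel="emr-5.36.0",
    Instances={
        "MasterInstanceType": "m4.large",
        "SlaveInstanceType": "m4.large",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    Steps=[{
        "Name": "wordcount",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "hadoop-streaming",
                # Input can be a US-hosted bucket; output can land in Europe.
                "-input", "s3://my-us-input-bucket/data/",
                "-output", "s3://my-eu-output-bucket/results/",
                "-mapper", "mapper.py",
                "-reducer", "reducer.py",
            ],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])
```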