Using Amazon Elastic Block Store (EBS) to secure data

Amazon Elastic Block Store (EBS) provides raw block-level storage that can be attached to Amazon EC2 instances. These block devices can then be used like any raw block device.

There are two ways that data can be protected when using EBS:

  1. Use AWS Identity and Access Management (IAM) to control access to Elastic Block Store volumes. This can be complemented with policy conditions that enforce requirements such as multi-factor authentication and SSL connections, in addition to restricting the originating IP addresses.
  2. Create encrypted EBS volumes, which encrypt data at rest. Note that an encrypted volume cannot later be made unencrypted, just as an unencrypted volume cannot later be encrypted.
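
To make the first option a little more concrete, here is a minimal sketch of the kind of IAM policy document it describes, built as a plain Python dictionary. The function name, the statement ID, the choice of volume actions and the example IP range are all assumptions made for illustration; the condition keys (aws:MultiFactorAuthPresent, aws:SecureTransport, aws:SourceIp) are standard IAM global condition keys. Treat it as a starting point to adapt, not a recommended production policy.

```python
import json


def build_ebs_access_policy(allowed_cidrs):
    """Return an IAM policy document (as a dict) for a few EBS volume actions.

    The actions are allowed only when the request is MFA-authenticated,
    sent over SSL/TLS, and originates from one of the allowed IP ranges.
    The action list below is an illustrative assumption, not a complete set.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RestrictedEbsVolumeAccess",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": [
                    "ec2:AttachVolume",
                    "ec2:DetachVolume",
                    "ec2:CreateSnapshot",
                ],
                "Resource": "*",
                "Condition": {
                    # All conditions in a statement must hold together.
                    "Bool": {
                        "aws:MultiFactorAuthPresent": "true",
                        "aws:SecureTransport": "true",
                    },
                    "IpAddress": {"aws:SourceIp": list(allowed_cidrs)},
                },
            }
        ],
    }


def policy_as_json(allowed_cidrs):
    """Serialise the policy so it can be pasted into the IAM console."""
    return json.dumps(build_ebs_access_policy(allowed_cidrs), indent=2)
```

Calling policy_as_json(["203.0.113.0/24"]) produces the JSON text; 203.0.113.0/24 is a documentation-only address range used here as a placeholder for your own office or VPN range.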

Amazon EBS Provisioned IOPS volumes can now store up to 16 TB

From today, users of Amazon Web Services can create Amazon EBS Provisioned IOPS volumes that can store up to 16 TB and process up to 20,000 input/output operations per second (IOPS).

Amazon Elastic Block Store (Amazon EBS) provides persistent block-level storage volumes for use with Amazon EC2 (Elastic Compute Cloud) instances in the AWS Cloud.

Users can also create Amazon EBS General Purpose (SSD) volumes that can store up to 16 TB and process up to 10,000 IOPS. These volumes are designed for 99.999% ("five nines") availability and up to 320 megabytes per second of throughput when attached to EBS-optimized instances.

These performance improvements make it even easier to run applications requiring high performance or large amounts of storage, such as large transactional databases, big data analytics, and log processing systems. Users can now run large-scale, high-performance workloads on a single volume, without needing to stripe together several smaller volumes.
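
To see why the larger limits remove the need for striping, a rough back-of-the-envelope calculation helps. The sketch below works out how many volumes a workload needs to satisfy both its capacity and its IOPS requirement. The function name and the example workload (12 TB, 18,000 IOPS) are made up for illustration, and the "older" per-volume limits of 1 TB and 4,000 IOPS are an assumption, not figures taken from this announcement.

```python
import math


def volumes_needed(capacity_tb, iops, per_volume_tb, per_volume_iops):
    """Smallest number of striped volumes that meets both requirements.

    Striping adds up both capacity and IOPS across volumes, so the count
    is driven by whichever requirement needs more volumes.
    """
    by_capacity = math.ceil(capacity_tb / per_volume_tb)
    by_iops = math.ceil(iops / per_volume_iops)
    return max(1, by_capacity, by_iops)


# Hypothetical workload: 12 TB of data that needs 18,000 IOPS.
single_new_volume = volumes_needed(12, 18_000, per_volume_tb=16, per_volume_iops=20_000)

# The same workload against assumed older limits (1 TB, 4,000 IOPS per volume).
striped_old_volumes = volumes_needed(12, 18_000, per_volume_tb=1, per_volume_iops=4_000)
```

With the new Provisioned IOPS limits the workload fits on one volume, whereas under the assumed older limits it would have needed a stripe of twelve, with capacity rather than IOPS being the constraint.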

Larger and faster volumes are available now in all commercial AWS regions and in AWS GovCloud (US). To learn more please check out the Amazon EBS details page.

Amazon – what is coming soon, and what is not!

We had a meeting with Amazon in the UK recently and covered some of the pressing issues that we wanted to raise, and also learnt something of what Amazon have lined up.

First, what is not going to happen anytime soon:

– From what we heard, Amazon are not going to resolve the issue of billing in local currency with locally issued invoices any time soon. See our prior post on this topic. We did learn, however, that large organisations can request an invoice.

– Right now, if you want to use your own AMI image to sell on a SaaS basis using Amazon infrastructure, you have to be a US organisation. Again, Amazon don’t seem to have plans to change this in the immediate timeframe, so that leaves out any organisation outside of the US that wants to sell its product offering as SaaS on Amazon’s web services infrastructure, unless it integrates its own commerce infrastructure and does not use DevPay. This can be both a blessing (you can charge margin on Amazon’s infrastructure pieces like AMQS) and a curse (it can leave you exposed, as you will be a month behind in billing your clients). Even though Amazon are entrenched right now as the Public Cloud infrastructure of choice, it wouldn’t be the first time we have seen a 100 pound gorilla displaced from its prime market position. If I were Amazon, I’d fix this, and soon. Microsoft and RackSpace are looking more attractive all the time.

– Amazon’s ingestion services again require you to be a US organisation with a US return address. Are you detecting a common theme here?

And what we can expect to see soon:

– VPC (Virtual Private Cloud) access is in private beta now. This is a mechanism for securely connecting public and private clouds within the EC2 infrastructure.

– High memory instances, analogous to High CPU instances, are in the pipeline.

– Shared EBS is in the pipeline.

– Functionality for multiple users associated with a single account is in the pipeline and will provide simple privileges too. This has long been a bone of contention for organisations using AWS, so it will be welcomed.

– Amazon is planning to have a lot more EC2 workshops through local partners.

Other things of note that we learnt were:

– We learned that large physical instances currently have their own dedicated blade / box.

– As AWS has grown, a large number of machines have become available, and organizations can request hundreds of machines easily. Even extreme cases are catered for, e.g. requests for 50,000 machines.

– As a matter of policy, new functionality will be rolled out simultaneously in the EU and US unless there is a good reason not to.

All in all, some exciting stuff, and there were other things in the pipeline that they could not share. However, the public cloud market is starting to get more players, and I think Amazon need to get some of their infrastructure pieces in place sooner rather than later.

Differences between S3 and EBS

Amazon Elastic Block Store (Amazon EBS) is a new type of storage designed specifically for Amazon EC2 instances. Amazon EBS allows you to create volumes that can be mounted as devices by EC2 instances. Amazon EBS volumes behave as if they were raw unformatted external hard drives and can be formatted using a file system such as ext3 (Linux) or NTFS (Windows) and mounted on an EC2 instance; files are then accessed through the file system. Volumes have user-supplied device names and provide a block device interface.

For a 20 GB volume, Amazon estimates an annual failure rate for EBS volumes of between 1-in-1000 and 1-in-200 (0.1% to 0.5%). The failure rate increases as the size of the volume increases. Therefore you either need to keep an up-to-date snapshot on S3, or have a backup of the contents somewhere else that you can restore quickly enough to meet your needs in the event of a failure.
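
To get a feel for what those failure rates mean once you run more than one volume, here is a small sketch that estimates the chance of at least one failure in a year. It assumes that volume failures are independent and that the rate is constant over time, which is a simplification on my part and not something Amazon states; the function and constant names are also just for illustration.

```python
def prob_any_failure(annual_failure_rate, volumes, years=1):
    """Probability that at least one of `volumes` fails within `years`.

    Assumes independent failures and a constant annual failure rate.
    """
    survive_one_volume = (1.0 - annual_failure_rate) ** years
    survive_all_volumes = survive_one_volume ** volumes
    return 1.0 - survive_all_volumes


LOW_AFR = 1 / 1000   # 0.1% per year, the optimistic end of the quoted range
HIGH_AFR = 1 / 200   # 0.5% per year, the pessimistic end of the quoted range

# Chance of losing at least one volume in a year across 20 volumes.
best_case = prob_any_failure(LOW_AFR, volumes=20)
worst_case = prob_any_failure(HIGH_AFR, volumes=20)
```

Under these assumptions, 20 volumes give roughly a 2% chance of at least one failure per year at the optimistic rate and roughly 9.5% at the pessimistic rate, which is why regular snapshots are worth the effort.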

EBS accounts can have a maximum of 20 volumes unless a higher limit is requested from Amazon. The maximum size of a volume is 1 TB, and the storage on a volume is limited to the provisioned size and cannot be changed. EBS volumes can only be accessed from an EC2 instance in the same availability zone, whereas snapshots on S3 can be accessed from any availability zone.

Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. The service aims to maximize benefits of scale and to pass those benefits on to developers. S3 needs software to be able to read and write files, but it is hugely scalable, stores 6 copies of data for HA and redundancy, and is rumoured to be written in Erlang.

S3 accounts can have a maximum of 100 buckets, each with unlimited storage and an unlimited number of files. The maximum size of a single file is 5 GB.

S3 is subject to “eventual consistency”, which means that there may be a delay in writes appearing in the system, whereas EBS has no consistency delays. Also, EBS can only be accessed by one machine at a time, whereas snapshots on S3 can be shared.

In terms of performance, S3 has the higher latency and also has higher variation in latency. S3 write latency can also be higher than read latency. EBS, on the other hand, has lower latency with less variation. It also has writeback caching for very low write latency. However, be aware that writeback caching and out-of-order flushing could result in an unpredictable file system state or database corruption.

In terms of throughput, S3 has a maximum throughput of approximately 20 MB/s single-threaded, or 25 MB/s multithreaded. This is on a small instance. This rises to 50 MB/s on the large and extra large instances. EBS has a maximum throughput limited by the network. This is approximately 25 MB/s on a small instance, 50 MB/s on large instances, and 100 MB/s on extra large instances. As both S3 and EBS are shared resources, they are subject to slowdown under heavy load.
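
To put those throughput figures into perspective, the sketch below converts them into best-case transfer times. It assumes the quoted maximum rate is sustained for the whole transfer and that 1 GB is 1000 MB; real transfers on shared infrastructure will be slower. The function and dictionary names are my own.

```python
def transfer_seconds(size_gb, throughput_mb_per_s):
    """Best-case seconds to move `size_gb` at a sustained rate (1 GB = 1000 MB)."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb * 1000 / throughput_mb_per_s


# Approximate maximum EBS throughput in MB/s per instance size, as quoted above.
EBS_MAX_MB_PER_S = {"small": 25, "large": 50, "extra large": 100}

# Best-case time, in minutes, to read a 100 GB volume end to end.
minutes_for_100_gb = {
    size: transfer_seconds(100, rate) / 60
    for size, rate in EBS_MAX_MB_PER_S.items()
}
```

So a full pass over 100 GB takes about 67 minutes at best on a small instance and about 17 minutes on an extra large one, which matters when you are estimating restore times.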

For file listing, S3 is slow and search is by prefix only, whereas EBS has fast directory listing and searching. S3 performance is optimized by using multiple buckets, and write performance is optimized by writing keys in sorted order. EBS single volume performance is similar to a disk drive with writeback caching.
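
To illustrate what "search by prefix only" means in practice, here is a small local model of a key listing. It is not the S3 API and says nothing about how S3 stores keys internally; it simply treats a bucket as a sorted list of key names (an assumption for illustration) and shows that a prefix lookup is cheap on such a list, while anything else, such as matching on a file extension, means scanning every key.

```python
import bisect


def list_by_prefix(sorted_keys, prefix):
    """Return all keys starting with `prefix` from a sorted list of keys.

    In a sorted list the matching keys form one contiguous run, so a
    binary search finds where the run starts and the loop stops as soon
    as a key no longer matches.
    """
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches


# Hypothetical bucket contents, kept in sorted order.
keys = sorted([
    "logs/2009/01.txt",
    "logs/2009/02.txt",
    "logs/2010/01.txt",
    "photos/a.jpg",
])

logs_2009 = list_by_prefix(keys, "logs/2009/")

# A non-prefix search has no shortcut: every key must be examined.
jpegs = [key for key in keys if key.endswith(".jpg")]
```

The practical upshot is that key names on S3 are worth designing around the lookups you expect to make, since the prefix is the only handle you get.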

There is an alternative to EBS for EC2 and that is PersistentFS. With PersistentFS you mount a drive and use it like any other, but, and here is the crunch, the storage for the device is actually realized in many little chunks in an S3 storage bucket. PersistentFS is a closed-source product based on the FUSE approach.

S3 costs 15 cents per GB for storage actually used, 1 cent per 10,000 GETs, and 1 cent per 1,000 PUTs. EBS costs 10 cents per GB provisioned and 1 cent per 100,000 I/Os. For the pricing of PersistentFS and how it compares to both S3 and EBS, I suggest you read this post on the Amazon forums, which was posted by the PersistentFS team.
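
The S3 and EBS prices above can be turned into a quick comparison. The sketch below is only a rough calculator: it assumes the per-GB prices are monthly charges, ignores data transfer and snapshot costs, and uses made-up usage figures, so treat the results as illustrative rather than as a quote.

```python
def s3_monthly_cost(gb_stored, gets, puts):
    """S3 cost in dollars: storage actually used plus request charges."""
    storage = gb_stored * 0.15        # 15 cents per GB stored
    get_requests = gets / 10_000 * 0.01   # 1 cent per 10,000 GETs
    put_requests = puts / 1_000 * 0.01    # 1 cent per 1,000 PUTs
    return storage + get_requests + put_requests


def ebs_monthly_cost(gb_provisioned, io_requests):
    """EBS cost in dollars: provisioned size plus I/O request charges."""
    storage = gb_provisioned * 0.10           # 10 cents per GB provisioned
    io = io_requests / 100_000 * 0.01         # 1 cent per 100,000 I/Os
    return storage + io


# Hypothetical month: 100 GB of data, stored either as S3 objects or on a
# 200 GB EBS volume that is only half full.
s3_total = s3_monthly_cost(gb_stored=100, gets=1_000_000, puts=100_000)
ebs_total = ebs_monthly_cost(gb_provisioned=200, io_requests=10_000_000)
```

In this made-up case S3 comes to about $17 and EBS to about $21, mostly because EBS charges for the full provisioned size whether or not you fill it. A tightly sized volume would tip the comparison the other way.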