System hardening guidelines for Amazon EC2

One of the most common questions we get from clients is “Is Amazon EC2 secure?” That is like asking “Is my vanilla network secure?” As with any environment, there are steps you can take to make it as secure as you can, such as:

– First read the Amazon Security Whitepaper and the Amazon discussion of Security processes

– Ensure the system key is encrypted at start-up

– Ensure you plan for load balancing in case an instance goes down. Ensure you understand all the security implications of this and harden any other instances.

– Test or emulate the performance of applications deployed to the cloud in all geographies where you plan to deploy them. The latency could vary greatly for each.

– Never allow password-based authentication for shell access.

– Always encrypt all network traffic.

– Always encrypt everything stored on S3

– Encrypt file systems on block devices

– Open only the minimum required ports

– Do not include authentication information in any AMI

– Think about how your system can be hardened and what tools are available, such as SELinux, PaX and Exec Shield

– Don’t allow any decryption keys into the cloud; understand the perils of keys and security

– Install a host-based intrusion detection system such as OSSEC

– Regularly back up Amazon instances and store the backups securely.

– Use Security Groups. With EC2 security groups, you can completely isolate every tier, even internally to the EC2 cloud.

– Design your system so that you can issue security patches to instances launched from your AMIs
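As an illustration of the password authentication point above, here is a minimal sketch of a check you could run against an instance's sshd configuration before baking it into an AMI. The function name is made up for this example, and it makes simplifying assumptions: it only looks at the global section of the file, ignores Match blocks and Include files, and applies sshd's rule that the first value found for a keyword wins.

```python
def password_auth_enabled(sshd_config_text):
    """Return True if password logins appear to be allowed.

    Simplifying assumptions: only the global section is inspected
    (parsing stops at the first Match block) and Include files are
    not followed.
    """
    for raw_line in sshd_config_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        # sshd accepts "Keyword value" as well as "Keyword=value".
        parts = line.replace("=", " ").split()
        keyword = parts[0].lower()  # keywords are case-insensitive
        if keyword == "match":
            break
        if keyword == "passwordauthentication" and len(parts) > 1:
            # The first value found for a keyword is the one sshd uses.
            return parts[1].lower() != "no"
    # With no explicit setting, sshd allows password logins by default.
    return True
```

You would typically feed it the contents of /etc/ssh/sshd_config and refuse to build the image if it returns True.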

The nightmare scenario that you cannot cater for is that Xen has unforeseen security issues that would allow inter-VM communication and, in essence, enable instance spying. That is Amazon’s doomsday scenario.

Open Source Cloud and some options

There are a number of options if you want to explore open source cloud. Below we touch on just a few:

Eucalyptus framework: Eucalyptus is an open-source software infrastructure for implementing “cloud computing” on clusters. The current interface to EUCALYPTUS is compatible with Amazon’s EC2 interface, but the infrastructure is designed to support multiple client-side interfaces. EUCALYPTUS is implemented using commonly available Linux tools and basic Web-service technologies, making it easy to install and maintain.

Project Caroline: Project Caroline is Sun’s open source cloud platform. At the moment it is a research project rather than a full product offering. The source code is fully available. Caroline works with Perl, Python, Ruby, PHP, and of course Java. It does not seem to have progressed as much as it should have since it was announced, but it provides a full cloud computing stack. Check out the Application Idea Incubator forum for ideas on how to use it.

 

Nimbus: We mentioned Nimbus in a prior post. Nimbus is an open source toolkit that allows you to turn your cluster into an Infrastructure-as-a-Service (IaaS) cloud. Nimbus also provides an EC2 frontend. This is an implementation of the EC2 WSDL that allows you to use clients developed for the real EC2 system against Nimbus-based clouds.

Ganglia: Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and grids. Ganglia is an open-source project that grew out of the University of California, Berkeley Millennium Project. Ganglia seems tailor-made for EC2 but can be difficult to set up. Amazon doesn’t support multicast on its network, so the default Ganglia configs don’t work. To use Ganglia on EC2 you need to switch to unicast and set send_metadata_interval to something other than 0.
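To make the unicast change concrete, here is a rough sketch that renders the relevant gmond.conf sections from Python. The function name, the collector hostname and the 30 second interval are assumptions chosen for illustration. The section and directive names (globals, udp_send_channel, udp_recv_channel, send_metadata_interval) are the ones gmond uses, but check them against the documentation for your Ganglia version before relying on this.

```python
def render_gmond_unicast(collector_host, port=8649, metadata_interval=30):
    """Render gmond.conf sections for a unicast setup.

    collector_host is a hypothetical name for the node that gathers
    metrics; 8649 is gmond's usual default port.
    """
    if metadata_interval <= 0:
        # An interval of 0 is the setting the post says to avoid on EC2.
        raise ValueError("metadata_interval must be greater than 0 for unicast")
    return (
        "globals {\n"
        f"  send_metadata_interval = {metadata_interval}\n"
        "}\n"
        "udp_send_channel {\n"
        # A plain host entry takes the place of the default multicast join.
        f"  host = {collector_host}\n"
        f"  port = {port}\n"
        "}\n"
        "udp_recv_channel {\n"
        f"  port = {port}\n"
        "}\n"
    )


if __name__ == "__main__":
    print(render_gmond_unicast("collector.example.internal"))
```

The output is meant to replace the multicast channel sections of an existing gmond.conf, not to serve as a complete configuration file.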

ParkPlace: ParkPlace is an S3 clone that purports to be a complete implementation of the S3 REST API. I’m not sure how Amazon will feel about this, but just like the EC2 clones it lets you run your own private cloud storage, albeit without the ability to scale. Still, it is useful for testing.

And now there be Science Clouds

Nimbus is an EC2-compatible implementation that provides compute cycles in the cloud for scientific communities. Nimbus is an open source toolkit that allows you to turn your cluster into an Infrastructure-as-a-Service (IaaS) cloud. Nimbus provides:

  • Two sets of Web Service interfaces: Amazon EC2 WSDLs and Grid community WSRF
  • Implementation based on the Xen hypervisor (KVM coming soon)
  • Can be configured to use familiar schedulers like PBS or SGE to schedule virtual machines
  • Launches self-configuring virtual clusters with one click



The Nimbus cloud client allows you to provision customized compute nodes (which they call “workspaces”) that you have full control over, using a leasing model based on Amazon’s EC2 service.

Amazon EC2 Network and S3 performance

When a distributed application runs in-house, the IT team has a lot of control over the environment. They know exactly what hardware and resources are available, including:

  • network bandwidth available
  • network latency

Now move to EC2 and, other than some vague figures, there is very little documented.

The guys at RightScale ran some tests of ‘EC2-EC2’ and ‘EC2-S3’ bandwidth, and the results are very informative. To test ‘EC2-EC2’ bandwidth between large instances they used Apache (non-SSL) and curl, with one instance acting as the server and the other as the client.

“Using 1 single curl file retrieval, we were able to get around 75MB/s consistently. And adding additional curls uncovered even more network bandwidth, reaching close to 100MB/s”

They also got quite good numbers between an EC2 instance and S3, using curl to download from and upload to S3.

Test               1 curl (MB/s)   Max (MB/s)
Download SSL       12.6            49.8 (8 curls)
Download non-SSL   10.2            51.5 (8 curls)
Upload SSL         6.9             53.8 (12 curls)

So in both the EC2-EC2 and EC2-S3 cases the bandwidth is quite reasonable for a general-purpose application, and you can get close to 1 Gigabit per second between two EC2 instances.
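For reference, the quoted EC2-EC2 figures are in megabytes per second, while link speeds are usually quoted in bits. A small worked conversion, assuming decimal units (1 MB = 1,000,000 bytes, 1 Gbit = 1,000 Mbit) and a helper name made up for this example:

```python
def mbytes_to_mbits(mb_per_s):
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return mb_per_s * 8


# Figures quoted from the RightScale EC2-EC2 test.
for label, rate in [("single curl", 75), ("multiple curls", 100)]:
    mbit = mbytes_to_mbits(rate)
    print(f"{label}: {rate} MB/s = {mbit} Mbit/s ({mbit / 1000:.0%} of 1 Gbit/s)")
```

By this arithmetic, 75 MB/s is 600 Mbit/s and 100 MB/s is 800 Mbit/s, which is approaching what a 1 Gigabit link can carry.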

Now what about network latency? A friend of mine sent me a link to an interesting paper that contains some interesting latency information. The paper is “Benchmarking Amazon EC2 for high-performance scientific computing”. The authors used mpptest to measure both network performance and latency. Compared to InfiniBand, the results are an order of magnitude worse!
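If you want a rough feel for round-trip latency yourself, a ping-pong measurement is easy to sketch. The example below is not mpptest and does not use MPI. It is a plain TCP echo over the loopback interface, with function names invented for illustration, so the numbers it prints only show the method. To measure between two instances you would run the echo side on one instance and the client loop on the other.

```python
import socket
import statistics
import threading
import time


def _echo(server_sock):
    """Accept one connection and echo back whatever arrives."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def measure_round_trips(rounds=100, host="127.0.0.1"):
    """Return a list of round-trip times in seconds for 1-byte messages."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=_echo, args=(server,), daemon=True)
    worker.start()

    samples = []
    with socket.create_connection((host, port), timeout=5) as client:
        # Disable Nagle's algorithm so small messages are sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(rounds):
            start = time.perf_counter()
            client.sendall(b"x")
            client.recv(1)  # wait for the echoed byte
            samples.append(time.perf_counter() - start)
    worker.join(timeout=5)
    server.close()
    return samples


if __name__ == "__main__":
    rtts = measure_round_trips()
    print(f"median round trip: {statistics.median(rtts) * 1e6:.1f} microseconds")
```

Reporting the median rather than the mean keeps the occasional scheduling hiccup from dominating the result.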
