GigaSpaces releases 7.1 of XAP Cloud-enabled Middleware – certified for use on Cisco UCS

The upcoming release of GigaSpaces XAP includes the ‘Elastic Data Grid’, which enables deploying a full clustered application with a single API call. Users specify their business requirements, and XAP automatically performs sizing, hardware provisioning, configuration and deployment. The aim is to simplify the process, reducing effort and cost for enterprise applications that require dynamic scalability.
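
To give a feel for what deploying with ‘a single API call’ might look like, here is a minimal sketch in Java. The Admin and AdminFactory classes are part of the GigaSpaces Admin API, but the elastic deployment class and methods shown (ElasticDataGridDeployment, capacity, highlyAvailable) are illustrative assumptions rather than a verified XAP 7.1 listing.

```java
import org.openspaces.admin.Admin;
import org.openspaces.admin.AdminFactory;

// Hypothetical sketch of deploying an elastic data grid in one call:
// the caller states business requirements (capacity range, high
// availability) and XAP handles sizing, provisioning and deployment.
// ElasticDataGridDeployment and its methods are assumed names.
public class ElasticGridDemo {
    public static void main(String[] args) {
        Admin admin = new AdminFactory().addGroup("demo-group").createAdmin();

        admin.getElasticServiceManagers().waitForAtLeastOne().deploy(
                new ElasticDataGridDeployment("order-cache") // assumed class name
                        .capacity("2G", "20G")               // grow between 2 GB and 20 GB
                        .highlyAvailable(true));             // keep a backup per partition

        admin.close();
    }
}
```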

Other features of the XAP 7.1 release include:

  • Certified for use with Cisco UCS, providing enhanced performance
  • Built-in multi-tenancy
  • Extended in-memory querying capabilities
  • Real-time distributed troubleshooting
  • Multi-core utilization

More detail can be found on the GigaSpaces website.

Sun’s Grid Engine now features cloud bursting and Apache Hadoop integration

Sun (or is that Oracle…) has released a new version of its Grid Engine which brings it into the cloud.

There are two main additions in this release. The first is integration with Apache Hadoop: Hadoop jobs can now be submitted to Grid Engine as if they were any other computation job. Grid Engine also understands Hadoop’s distributed file system, which means it can send work to the part of the cluster that holds the relevant data (data affinity).

The second is dynamic resource reallocation, which includes the ability to use on-demand resources from Amazon EC2. Grid Engine can now manage resources across logical clusters, whether in the cloud or not, which means it can be configured to “cloud burst” depending on load – a great feature. The integration is specifically set up for EC2 and supports scaling down as well as scaling up.

This release of Grid Engine also implements a usage accounting and billing feature called ARCo, making it truly SaaS ready as it is able to cost and bill jobs.

Impressive and useful stuff, and if you are interested in finding out more you can do so here.

GigaSpaces Version 7 and Intel Nehalem deliver impressive benchmark results

GigaSpaces, in conjunction with MPI Europe, Globant and Intel, recently conducted benchmarks of the in-memory data caching / data grid element of its version 7 XAP platform on Intel’s Nehalem chipset. XAP version 7 reached 1 million data updates per second and 2.6 million data retrievals per second with four client threads on the Nehalem chip.

Previously, XAP version 6 had benchmarked at 276,000 updates per second and 453,000 retrievals per second on the best previous Intel processor.

In summary, the tests showed:

– GigaSpaces write and take operations from the in-memory data cache are about 3-4 times faster (300-400%) with the Nehalem chipset (in absolute numbers).

– Read operations perform much better with the Nehalem (3-6 times better with 1-4 threads), and the difference increases as more concurrent threads are added. According to Shay Hassidim, one of the reasons for this is the non-locking read capability added to XAP 7.0.

– Nehalem + XAP 7.0.1 shows better scalability than Dunnington + XAP 6.2.2: about 30% better with write and take operations, with the gap growing for read operations (90% better with 10 threads).

GigaSpaces continues to push the speed and performance envelope with its product, and I’m informed that the 7.0.2 version of GigaSpaces has again been heavily performance tuned and is even faster than the 7.0 platform that was used for this benchmark.

It will be interesting to see if other vendors in this space publish results of their product on Nehalem, which looks set to deliver a huge performance jump.

GigaSpaces finds a place in the Cloud

A new report from analysts The 451 Group outlines the success to date that GigaSpaces has had in the cloud sector. The report notes that GigaSpaces now has 76 customers using its software on cloud-computing platforms, up from 25 on Amazon’s EC2 in February. GigaSpaces has moved its cloud strategy forward in recent weeks, announcing support for deployments on GoGrid and, more recently, tighter integration with VMware that enables GigaSpaces to dynamically manage and scale VMware instances so that they can participate in the scaling of GigaSpaces-hosted applications.

GigaSpaces also has a number of hybrid deployments, in which the application stack is hosted in the cloud while the data or services remain on premise, and these have had some notable successes.

The GigaSpaces product provides a strong cloud middleware stack that keeps logic, data, services and messaging in memory, underpinned by real-time Service Level Agreement enforcement at the application level, enabling the stack to scale up and out in real time based on SLAs set by the business. Because everything is held in memory, it is faster than alternative ways of building enterprise-scale applications in the cloud, and it has sophisticated synchronisation services that persist data asynchronously (or synchronously) to a database or other persistent store.

Supporting SLAs on the Cloud

What does it take to make a cloud computing infrastructure enterprise ready? Well, as always, this probably depends on the use case, but support for real-time scaling and SLAs must figure highly.

Software that purports to scale applications on the cloud is not new; have a look at our prior blog post on this topic and you will see some of the usual suspects, such as RightScale and Scalr. A new offering in this space is Tibco Silver. Tibco Silver is trying to solve the problem not of whether cloud services can scale, but of whether the applications themselves can scale with them. Silver addresses this through ‘self-aware elasticity’. Hmmm… sounds good, but what exactly does that mean? It means the system can automatically provision new cloud capacity (be that storage or compute) in response to fluctuations in application usage.

According to Tibco, unlike services in a service-oriented architecture, cloud services are not aware of the SLAs to which they are required to adhere, and Tibco Silver is aimed at providing this missing functionality. Tibco claims that “self-aware elasticity” is something no other vendor has developed. I would dispute this. GigaSpaces XAP, with its ability to deploy to the cloud as well as on premise using the same technology, has very fine-grained application-level SLA control: when an SLA is breached, the application can react accordingly, whether by increasing the number of threads, provisioning new instances or distributing workloads in a different way. GigaSpaces Service Grid technology, which originated from Sun’s Rio project, enables this real-time elasticity. (Interestingly, it seems GigaSpaces is working on enabling its cloud tools to deploy to and manage VMware images on private clouds as they do with AMIs on Amazon’s public cloud.)
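
As a rough illustration of this kind of reactive, application-level control, the sketch below uses the GigaSpaces Admin API to add a processing unit instance when an application metric breaches a threshold. The processing unit name, metric source and the 5,000 requests-per-second threshold are placeholders; a real deployment would rely on the platform’s built-in SLA and monitoring facilities rather than a hand-rolled loop.

```java
import org.openspaces.admin.Admin;
import org.openspaces.admin.AdminFactory;
import org.openspaces.admin.pu.ProcessingUnit;

// Illustrative sketch: react to an application-level SLA breach by
// provisioning another instance of a deployed processing unit.
// Names and the 5,000 req/s threshold are placeholders, not a real SLA.
public class SlaWatcher {
    public static void main(String[] args) throws InterruptedException {
        Admin admin = new AdminFactory().addGroup("demo-group").createAdmin();
        ProcessingUnit pu = admin.getProcessingUnits().waitFor("order-service");

        while (true) {
            double reqPerSec = currentThroughput(); // placeholder metric source
            int instances = Math.max(1, pu.getInstances().length);
            if (reqPerSec / instances > 5000) {
                // SLA breached: scale out by one instance.
                pu.incrementInstance();
            }
            Thread.sleep(10000); // re-check every 10 seconds
        }
    }

    private static double currentThroughput() {
        return 0; // stand-in for a real monitoring feed
    }
}
```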

Without a doubt, the ability to react in real time to application-level SLAs, rather than just to breaches of an SLA at the infrastructure level, is something that will find a welcome home in both private and public clouds.

Practical Guide for Developing Enterprise Applications for the Cloud

This session was presented at Cloud Slam 09 by Nati Shalom, CTO of GigaSpaces. It provides practical guidelines addressing the common challenges of developing and deploying an existing enterprise application on the cloud. Additionally, you will get the opportunity for hands-on experience running and deploying production-ready applications in a matter of minutes on Amazon EC2.

Cloud Best Practice – what to be aware of!

Some of the key things to think about when putting your application on the cloud are discussed below. Cloud computing is relatively new, and best practice is still being established. However, we can learn from earlier technologies and concepts such as utility computing, SaaS, outsourcing and even internal enterprise data centre management, as well as from experience with vendors such as Amazon and FlexiScale.

Licensing: If you are using the cloud for spikes or overspill, make sure that the products you want to use in the cloud can be licensed for that model. Certain products restrict how their licences may be used in a cloud environment; this is especially true of commercial Grid, HPC and data grid vendors.

Data transfer costs:  When using a provider like Amazon with a detailed cost model, make sure that any data transfers are internal to the provider network rather than external. In the case of Amazon, internal traffic is free but you will be charged for any traffic over the external IP addresses.

Latency: If you have low-latency requirements then the cloud may not be the best environment in which to achieve them. If you are trying to run an ERP or similar system in the cloud then the latency may be good enough, but if you are trying to run a binary or FX exchange then the latency requirements are of course far more stringent. It is essential to understand the performance requirements of your application and have a clear view of what is deemed business critical.

One grid vendor that has focused on attacking low latency in the cloud is GigaSpaces, so if you require low latency in the cloud they are one of the companies you should evaluate. For processing distributed data loads there is also the map/reduce pattern and Hadoop; these kinds of architectures eliminate the boundaries created by scale-out, database-based approaches, as the minimal sketch below illustrates.
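
To illustrate the map/reduce pattern itself, here is a standard, minimal word-count job written against the Hadoop Java API: the map step emits (word, 1) pairs close to where the data lives, and the reduce step sums the counts per word. The input and output paths are placeholders.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Minimal word-count job: map emits (word, 1) pairs, reduce sums them.
public class WordCount {

    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    ctx.write(word, ONE);
                }
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/data/input"));    // placeholder path
        FileOutputFormat.setOutputPath(job, new Path("/data/output")); // placeholder path
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```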

State: Check whether your cloud infrastructure provider offers persistence. When an instance is brought down and then back up, all local changes are wiped and you start with a blank slate, which obviously has ramifications for instances that need to store user or application state. To combat this on its platform, Amazon delivered EC2 persistent storage (Elastic Block Store), in which data can remain linked to a specific computing instance. You should ensure you understand the state limitations of any cloud computing platform you work with.

Data Regulations: If you are storing data in the cloud you may be breaching data laws depending on where your data is stored, i.e. in which country or continent. To combat this, Amazon S3 now supports location constraints, which allow you to specify where in the world the data for a bucket is stored, and provides a new API to retrieve the location constraint for an existing bucket. If you are using another cloud provider you should check where your data is stored.
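
As an illustration of the location-constraint feature (shown here with the later AWS SDK for Java rather than the raw S3 REST API of the time), creating a bucket pinned to the EU (Ireland) region and reading its location back looks roughly like this; the credentials and bucket name are placeholders.

```java
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.Region;

// Sketch: create an S3 bucket pinned to the EU (Ireland) region and then
// read back its location constraint. Credentials and bucket name are placeholders.
public class BucketLocationExample {
    public static void main(String[] args) {
        AmazonS3 s3 = new AmazonS3Client(
                new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY"));

        // The location constraint fixes where the bucket's data is stored.
        s3.createBucket("my-eu-data-bucket", Region.EU_Ireland);

        // Retrieve the location constraint for an existing bucket.
        String location = s3.getBucketLocation("my-eu-data-bucket");
        System.out.println("Bucket is stored in: " + location);
    }
}
```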

Dependencies: Be aware of dependencies between service providers. If service ‘y’ is dependent on service ‘x’, then when you subscribe to service ‘y’ and service ‘x’ goes down, you lose your service. Always check any dependencies when you are using a cloud service.

Standardisation: A major issue with current cloud computing platforms is that there is no standardisation of the APIs and platform technologies that underpin the services provided. Although this partly reflects a lack of maturity, you need to consider how locked in you will be when choosing a cloud platform, as migrating between cloud computing platforms can be very difficult if not impossible. This may not be an issue if your supplier is IBM and always likely to be IBM, but it will be an issue if you are just dipping your toe in the water and discover that other platforms are better suited to your needs.

Security: A lack of security, or an apparent lack of security, is one of the perceived major drawbacks of working with cloud platforms and cloud technology. When moving sensitive data around or storing it in a public cloud, it should be encrypted, and it is important to consider a secure identity mechanism for authentication and authorisation of services. As with normal enterprise infrastructures, only open the ports you need and consider installing a host-based intrusion detection system such as OSSEC. The advantage of working with an enterprise cloud provider such as IBM or Sun is that many of these security measures are already taken care of. See our prior blog entry on securing n-tier and distributed applications on the cloud.
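
As a minimal sketch of encrypting sensitive data on the client before it leaves for a public cloud store, the example below uses the standard Java crypto API with AES in CBC mode; key management and more modern authenticated modes (e.g. AES-GCM) are deliberately out of scope.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Minimal sketch: encrypt data client-side before uploading it to cloud storage.
// AES/CBC with a random IV; key management is deliberately out of scope here.
public class EncryptBeforeUpload {
    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey key = keyGen.generateKey();

        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);

        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));

        byte[] ciphertext = cipher.doFinal(
                "sensitive customer record".getBytes(StandardCharsets.UTF_8));

        // The ciphertext (plus the IV) is what gets written to cloud storage;
        // the key stays on premise or in a separate key-management service.
        System.out.println("Encrypted " + ciphertext.length + " bytes");
    }
}
```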

Compliance: Regulatory controls mean that certain applications may not be able to be deployed in the cloud. For example, the US Patriot Act could have very serious consequences for non-US firms considering US-hosted cloud providers. Be aware that cloud computing platforms are often made up of components from a variety of vendors, who may themselves provide computing in a variety of legal jurisdictions. Be very aware of these dependencies and ensure you factor them into any operational risk management assessment. See also our prior blog entry on this topic.

Quality of service: You will need to ensure that the behaviour and effectiveness of the cloud application you implement can be measured and tracked against existing or new service level agreements. We have previously discussed some of the tools that come with this built in (GigaSpaces) and others that provide functionality you can use with your cloud architecture (RightScale, Scalr, etc.). Achieving quality of service encompasses scaling, reliability, service fluidity, monitoring, management and system performance.

System hardening: Like all enterprise application infrastructures, you need to harden the system so that it is secure, robust and meets your functional requirements. See our prior blog entry on system hardening for Amazon EC2.

Content adapted from “The Savvy Guide to HPC, Grid, DataGrid, Virtualisation and Cloud Computing”, available on Amazon.

Sun’s Cloud Computing Vision

The presentation below is a very good overview of Sun’s cloud computing vision. It covers all areas, including public and private cloud, as well as open source initiatives. It’s a very good intro to get the skinny on where Sun is right now with cloud and where it intends to focus and drive forward.

Is average utilisation of servers in Data Centers really between 10 and 15%?

There has been an interesting discussion on the Cloud Computing forum hosted on Google Groups (and if you are at all interested in cloud I recommend you join it, as it really does have some excellent discussions). What has been interesting about it from my viewpoint is that there is a general consensus that average CPU utilisation in organisational data centres runs between 10 and 15%. Some snippets of the discussion are below:


Initial Statement on the Group discussion

The Wall Street Journal article “Internet Industry is on a Cloud” does not do cloud computing any justice at all.

First: the value proposition of cloud computing is crystal clear. Averaged over 24 hours a day, 7 days a week, 52 weeks a year, most servers have a CPU utilization of 1% or less. The same is also true of network bandwidth. The storage capacity of hard disks that can be accessed only from specific servers is also underutilized. For example, the capacity of hard disks attached to a database server is used only by queries that require intermediate results to be stored to disk; at all other times the capacity is not used at all.

First response to the statement above on the group

Utilization of *** 1 % or less *** ???

Who fed them this? I have seen actual collected data from 1000s of customers showing server utilization, and it’s consistently 10-15%. (Except mainframes.) (But including big proprietary UNIX systems.)

2nd Response:

Mea Culpa. My 1% figure is not authoritative.  It is based on my experience with a specific set of servers: 

J2EE application servers: only one application is allowed per cluster of servers, so if you had 15% utilization when you designed the application eight years ago, on current servers it could be 5% or less. With applications that are used only a few hours per week, 1% is certainly possible. The other set of servers for which utilization is really low is departmental web servers and mail servers.

3rd Response 

Actually, it was across a very large set of companies that hired IBM Global Services to manage their systems. Once a month, along with a bill, each company got a report on outages, costs, … and utilization.

A friend of mine heard of this, and asked “are you, by any chance, archiving those utilization numbers anywhere?” When the answer came back “Yes” — you can guess the rest. He drew graphs of # of servers at a given utilization level. He was astonished that for every category of server he had data on, the graphs all peaked between 10% and 15%. In fact, the mean, the median, and the mode of the distributions were all in that range. Which also indicates that it’s a range. Some were nearer zero, and some were out past 90%. That yours was 1% is no shock. 

4th Response:

This is no surprise to me, as HPC packages like Sun Grid Engine working on batch jobs can increase utilization to close to 90%. We had data showing that, without a workload manager of some sort, average utilization is 10% to 15%, confirming what you discovered.

This means that, worldwide, 85% to 90% of installed computing capacity is sitting idle. Grids improved this utilization rate dramatically, but grid adoption was limited.

If this is not an argument for virtualisation in private data centres/clouds then I don’t know what is. It should also be a big kicker for those who are considering moving applications to public clouds, out of the data centre and away from racks of machines spinning their wheels. It is also a good example of companies planning for peak capacity (see our previous blog on this). What is really needed is scale on demand and hybrid cloud/grid technologies, such as GigaSpaces, which can react to peak loading in real time. Consider not only the wasted cost but also the “green computing” cost of running hordes of machines at 15% capacity…


Is it Grid or is it Cloud?

A recent post by the cloud vendor CohesiveFT talks about the potential changes in technical sales cycles when evaluating grid-based products. I’m not sure I agree totally with the article, but the ethos behind it, i.e. making it easier to trial products, try out solutions and build apps/services more quickly in order to build internal business cases, is solid.

Cloud is a game changer, which is the point of the article, but you cannot apply a broad brush to “Grid on the Cloud” as a unilateral game changer in the sense of cloud replacing grid (which, to be fair, is not the intent of the article). For many companies, replacing internal grids, or even planning for new grids, cannot be done on the cloud. There are challenges around integration, moving data, securing data (and this is where CohesiveFT’s VPN-Cubed product offering can help), physical location, legislation, SLAs and availability (see this article for a good synopsis as applied to EC2). Many of these will be resolved in time, and some can of course be resolved right now, but with the move by many vendors to enable existing IT infrastructure and data centres as private clouds, the point is likely to be moot in the future, I think. Right now, an internal grid is not elastic: it does not add more servers or resources to a service as required, but this will change as such internal fabric enablers become more normal. In fact, one can imagine a future where such companies sell excess capacity of their “Grid Clouds” to ensure more economical running of their infrastructures.