Entries in 'amazon'


EUCALYPTUS 1.0

EUCALYPTUS - Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems - is an open-source software infrastructure for implementing “cloud computing” on clusters. The current interface to EUCALYPTUS is compatible with Amazon’s EC2 interface, but the infrastructure is designed to support multiple client-side interfaces.

May 14th, 2008: EUCALYPTUS is publicly demonstrated at the Open Source Grid and Cluster conference.

May 29th, 2008: Version 1.0 is released as a feature-limited binary-only beta.

http://eucalyptus.cs.ucsb.edu/

Cloud lock-in is not such a big deal

There’s been a lot of talk about the dangers of getting locked in to cloud platforms, developing an application that is only suited to one platform.

Here’s a, let’s say… “embellished” example: Gangsta cloud wars could pivot on the traffic-driving power of Google and Microsoft/Yahoo.

When you’re using VMs like Xen (e.g. on EC2) and you design for it, you “should be able to” move without a ton of hassle (research, plan). The workspace project has been working on portability and usability (see The first one-click STAR production cluster), and one of the things we can do now is use the same VM image both on a regular cluster (such as on the Teraport cloud) and on EC2. The contextualization software can be configured to sense whether it is on EC2 and will bootstrap accordingly. It “would be nice” if such things were standardized, but this is not a real problem right now (IMHO).
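As a rough illustration of the “sense if it is on EC2” step, here is a minimal Python sketch. It is not the workspace contextualization code. It assumes the instance can reach EC2’s metadata service at 169.254.169.254 with a plain GET (newer instance configurations may require a session token first), and the `fetch_instance_id` and `choose_bootstrap` names and the "ec2"/"cluster" labels are made up for this example.

```python
import urllib.error
import urllib.request

# EC2 instance metadata endpoint: a link-local address that should only
# answer from inside an EC2 instance (assumption: unauthenticated GET works).
METADATA_URL = "http://169.254.169.254/latest/meta-data/instance-id"


def fetch_instance_id(url=METADATA_URL, timeout=1.0):
    """Return the instance id from the metadata service, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("ascii").strip() or None
    except (urllib.error.URLError, OSError, ValueError):
        # No route, timeout, refused connection, or an undecodable reply:
        # treat all of these as "not on EC2".
        return None


def choose_bootstrap(fetch=fetch_instance_id):
    """Pick a bootstrap path; the 'ec2' and 'cluster' labels are hypothetical."""
    instance_id = fetch()
    if instance_id:
        return ("ec2", instance_id)
    return ("cluster", None)
```

Passing the fetch function in as a parameter makes it easy to exercise both branches away from EC2, e.g. `choose_bootstrap(lambda: None)`.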

What about something more “strongly typed” like Google’s App Engine? Application migration might be a bit harder, but not if the APIs are well known and repeatable. Google’s SDK is even Apache 2 licensed.

To that point, have a look at Announcing AppDrop.com (host Google App Engine projects on EC2). It’s not there yet (database is a flat file) but, hey, it was developed in a few days. Cool. Read more at http://appdrop.com.

The long-term idea is not that this would magically solve all your problems but that such things are possible, and if there’s a real market for choices, it seems like more work of this nature is also inevitable.

I’m no datacenter business expert, but the biggest problem right now seems to be that few people will be able to compete on costs/efficiencies of scale with Google/Amazon/Microsoft/eBay. (<predictions…>) It feels like it would naturally approach the straight web hosting business, though. Let’s say a standard, open source cloud computing infrastructure emerges (such as Apache httpd in the analogy). There will be various levels of players as far as the capital they have and certainly better and worse companies to choose from (including those that differentiate on service etc). But if you’re really sweating the savings an enormous company could provide with such efficiencies vs. a normal size company/datacenter, you’re probably at the point where you could save a whole lot more by buying your own computers.(</predictions…>)

Miscellaneous point about lock-in: something user-facing that ties you to a provider does not seem like a wise idea (e.g. Google’s Users API).

Amazon EC2 persistent storage

EC2 announced future support for adding raw, persistent block devices to VMs, and a few non-Amazon people are even testing it already.

See Werner Vogels and this RightScale post.

One dollar for a million SQS operations

Amazon SQS is a distributed message queue system with a simple, robust API and real infrastructure to back it. Their prices just dropped significantly, from a penny per 100 messages sent to a penny per 10,000 requests:

Dear Amazon SQS Developers,

We wanted to let you know about some changes we are making to Amazon SQS, based on customer feedback and watching the way customers are using the service. One thing we’ve heard consistently is that customers want to be able to use SQS along with our other services (e.g. Amazon EC2, Amazon S3), but need SQS to be less expensive for this to be more feasible. We looked at our architecture and feature set, and found a way to make a few, targeted changes, by deprecating a few infrequently used requests, which allow us to operate the service much more efficiently. Simultaneously, we are introducing a new pricing structure that replaces the previous per-messages-sent charge ($0.10/1,000 messages) with a new per-request fee ($0.01/10,000 requests, including all Amazon SQS operations). The net result is that the new pricing will result in significantly lower charges for most developers being billed for SQS.

I’m hoping we’ll look back in five years and reminisce about how they charged so much for EC2 as well :-) (I do think it’s a good price now unless you are looking to continually use many, many computers).
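To make the SQS numbers in the letter concrete, here is a small Python sketch of the arithmetic. The two rates come straight from the letter; the function names and the three-requests-per-message figure (send, receive, delete) are my own assumptions, for illustration only.

```python
from decimal import Decimal

# Rates quoted in the letter above.
OLD_USD_PER_MESSAGE = Decimal("0.10") / 1000    # $0.10 per 1,000 messages sent
NEW_USD_PER_REQUEST = Decimal("0.01") / 10000   # $0.01 per 10,000 requests


def old_cost(messages_sent):
    """Cost in USD under the old per-message-sent pricing."""
    return OLD_USD_PER_MESSAGE * messages_sent


def new_cost(requests):
    """Cost in USD under the new per-request pricing."""
    return NEW_USD_PER_REQUEST * requests


if __name__ == "__main__":
    million = 1_000_000
    print("old, 1M messages sent:", old_cost(million))
    print("new, 1M requests:", new_cost(million))
    # Assumption: each message costs three requests (send, receive, delete).
    print("new, 1M messages at 3 requests each:", new_cost(3 * million))
```

So a million requests come to one dollar, and even at three requests per message the new scheme stays far below the old $100 per million messages sent.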

EC2 has more instance types now

EC2 announced that, instead of a single instance size, you can now run several different kinds of instances.

See the EC2 home page for details:

$0.10 - Small Instance (Default)

1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit), 160 GB of instance storage, 32-bit platform

$0.40 - Large Instance

7.5 GB of memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB of instance storage, 64-bit platform

$0.80 - Extra Large Instance

15 GB of memory, 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform

In many cases it may still be more cost effective to get a lot of small instances. This will be interesting for our workspace EC2 adapter and contextualization users (and us!). Once we make the small alterations to accommodate requesting these types, it will be just as easy to get 100 x small instance as 25 x large instance, or whatever combination, because deployment configurations can be coordinated on the fly. Which is best for which situation would have to be examined closely. An extra large instance for the virtual cluster head node(s) or storage/transfer node(s) could be extremely useful for the typical grid-cluster bottlenecks.
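For a first rough look at the “100 x small vs. 25 x large” question, here is a Python sketch that totals up a fleet from the figures listed above. It assumes the listed prices are per instance-hour; the `InstanceType` and `fleet_totals` names are invented for this example and have nothing to do with the workspace EC2 adapter.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class InstanceType:
    name: str
    hourly_usd: Decimal   # assumption: listed price is per instance-hour
    memory_gb: Decimal
    compute_units: int
    storage_gb: int


# Figures copied from the instance list above.
SMALL = InstanceType("small", Decimal("0.10"), Decimal("1.7"), 1, 160)
LARGE = InstanceType("large", Decimal("0.40"), Decimal("7.5"), 4, 850)
XLARGE = InstanceType("extra large", Decimal("0.80"), Decimal("15"), 8, 1690)


def fleet_totals(itype, count):
    """Aggregate hourly cost and resources for `count` instances of one type."""
    return {
        "hourly_usd": itype.hourly_usd * count,
        "compute_units": itype.compute_units * count,
        "memory_gb": itype.memory_gb * count,
        "storage_gb": itype.storage_gb * count,
    }


if __name__ == "__main__":
    for itype, count in [(SMALL, 100), (LARGE, 25)]:
        print(count, "x", itype.name, fleet_totals(itype, count))
```

At these prices 100 small and 25 large instances cost the same per hour and add up to the same number of compute units; the large fleet has somewhat more total memory and instance storage, so the choice comes down to how the workload divides.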

The first one-click STAR production cluster

Quoting from workspace news:

The STAR community successfully completed its first production-size deployment of a VM-based virtual cluster managed by the workspace service and backed by EC2 resources.

The 100 node cluster was composed of a headnode and workernodes based on the OSG 0.6.0 grid middleware stack and Torque. Its deployment-time configuration was securely coordinated by the new workspace contextualization technology.

[UPDATE, related: http://www.gridvm.org/virtual-cluster-appliances.html]

Workspace EC2 integration; Contextualization

It’s been busy lately, including attending the first dev.Globus All Hands Meeting and TeraGrid ‘07 right here in Madison.

At TG07, Kate gave a talk which is online. The paper she presented discusses among other things contextualization, the structure and mechanisms by which an appliance/workspace is “told” what it needs in order to adapt to its deployed environment. This is not just adaptation to site specific services but also to other appliances that may be deployed with it such as in a virtual cluster deployment.

Amidst the bustle we implemented a new backend for the Workspace Service that targets Amazon’s Elastic Compute Cloud (EC2). We’ve deployed it to the University of Chicago’s Teraport cluster and, for now, will pay for usage by selected collaborators.

Besides being somewhat fun to implement (including getting the Globus and Amazon Secure Message stacks on the same wavelength), I think it’s going to be interesting.

Because grid resources are cautiously approaching the pioneering switch to virtualizing resources [1], even in part, it is going to be interesting and educational to see what people will be able to accomplish with workspaces when a large pool of resources is actually available on tap — today.

Because the same deployment protocols can be used for both native and EC2 resources, there are of course capacity overflow use cases. In the right situations, VMs are a good mechanism for providers to dynamically reach more consumers as the need arises.
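The capacity overflow case can be pictured with a tiny Python sketch. This is not how the Workspace Service decides placement; the fill-local-first policy and the `place_nodes` name are assumptions made up for illustration.

```python
def place_nodes(requested, local_free):
    """Split a node request between a local cluster and EC2.

    Hypothetical policy: use free local nodes first, then overflow
    whatever is left to EC2.
    """
    if requested < 0 or local_free < 0:
        raise ValueError("node counts must be non-negative")
    local = min(requested, local_free)
    return {"local": local, "ec2": requested - local}
```

For example, a request for 100 nodes against 64 free local nodes would be split as 64 local and 36 on EC2 under this policy.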

For a feature list and description, see What is the EC2 backend?

-----

[1] and some would say inevitable switch, even with the performance costs. Consider also that ‘virtualizing resources’ may mean physical node re-imaging, cf. Virtual Workspaces: Achieving Quality of Service and Quality of Life in the Grid.

Utility computing without VMs “considered harmful”?

Previously, in S3 re-pricing commentary, I wrote about the good news that Amazon’s EC2 service was hitting capacity limits.

Sun has built it, but will they come? talks about Sun’s lackluster sales with its utility computing effort.

I’m wondering why there is this disparity. In my opinion, there are two major differences between Sun’s and Amazon’s offerings:

  1. With Sun’s offering you need to port your program to Solaris.
  2. Sun’s costs a dollar an hour, Amazon’s costs 10 cents an hour.

I think the porting problem is the much bigger limitation, and this bodes well for the workspace concept in grid computing. The big grids have a similar problem: they usually expect scientists to port their code to a homogeneous platform, which is sometimes a near-impossible proposition.

