I am happy to announce the TP 1.3.2 release, the “cloudkit release” of the Workspace Service. You can download the new release from: http://workspace.globus.org/downloads/index.html
As many of you have probably noticed, we have recently been sending announcements about the availability of compute clouds for scientific communities: http://workspace.globus.org/clouds/
In a nutshell, TP 1.3.2 allows you to build your own cloud. The main addition is a new “cloud client” for the workspace service which simplifies (and also hides) much of the workspace functionality to provide an EC2-like set of features. The new client also provides a limited form of “contextualization” (more coming in the next release!). We also provide a step-by-step “cloud guide” that allows you to configure your own cloud.
For the complete list of new features (there are many more, though less significant), see:
http://workspace.globus.org/vm/TP1.3.2/index.html#changelog
We look forward to hearing from you, and if you do decide to configure a cloud and would like help finding users, please do let us know.
Have fun!
The Workspace Team
–Kate Keahey,
Mathematics & CS Division, Argonne National Laboratory
Computation Institute, University of Chicago
Entries in 'workspaces'
Workspace Service TP1.3.2
Stratus Cloud at the University of Florida
From workspace-announce:
I am happy to announce the availability of a science cloud (codenamed “Stratus” ;-) at the University of Florida. This cloud introduces a new feature: the use of virtual networks with virtual machines for cloud computing.
The cloud is available for members of the scientific community: to obtain access you will need to provide a justification (a few sentences explaining your science project) to cloud administrators at UFL. To find out more go to:
http://workspace.globus.org/clouds/
The cloud is currently deployed on a modest allocation of resources as a beta project. We welcome comments, feedback, and bug reports.
Workspace Service TP1.3.2 release candidate 0
Tutorial: Virtualization and Cloud Computing with Globus
Virtual Workspaces Tutorial at Open Source Grid Cluster (May 12-16, 2008)
There will be a Virtual Workspaces tutorial at the Open Source Grid Cluster conference in Oakland, CA. The conference is May 12-16, 2008. The Virtualization and Cloud Computing with Globus session is on Wednesday, May 14th, from 4:30-6:00 pm. We hope to see you there!
Quoting from the summary:
One of the primary obstacles users face in grid computing is that while Grids provide access to many diverse resources, their applications often require a very specific, customized environment. This disconnect can lead to resource underutilization, user frustration, and much wasted effort spent on bridging the gap between applications and resources. Virtual Workspaces describe the environment required for the execution of an application that can be dynamically deployed across a variety of resources, creating a working and consistent platform for grid applications.
This tutorial will introduce the Globus Toolkit workspace service that implements workspaces as Xen virtual machines and enables authorized grid clients to dynamically deploy them and manage their resources. Further, we will describe and demonstrate the workspace “cloudkit” that provides a user-friendly interface on top of the workspace service allowing authorized users to easily provision and run VMs on the available community clouds. Finally, we will describe how the process of contextualization can be used to provide on-demand functioning clusters and give examples of its use by applications.
Cloud lock-in is not such a big deal
There’s been a lot of talk about the dangers of getting locked in to cloud platforms, developing an application that is only suited to one platform.
Here’s a, let’s say… “embellished” example: Gangsta cloud wars could pivot on the traffic-driving power of Google and Microsoft/Yahoo.
When you’re using VMs like Xen (e.g. on EC2), if you design things for it you “should be able to” move without a ton of hassle (research. plan.). The workspace project has been working on portability and usability (see The first one-click STAR production cluster) and one of the things we can do now is use the same VM image on a regular cluster (such as on the Teraport cloud) and EC2. The contextualization software can be configured to sense if it is on EC2 or not (and will bootstrap accordingly). It “would be nice” if such things were standardized but this is not a real problem right now (IMHO).
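To make the “sense if it is on EC2 or not” idea concrete, here is a minimal Python sketch. It is not the contextualization software's actual logic, which this post does not describe. It assumes the check is done by probing the EC2 instance metadata address (169.254.169.254), and the function names and the "ec2"/"cluster" labels are made up for illustration. Newer EC2 configurations may require a session token for that endpoint, so treat the probe as a starting point only.

```python
import urllib.error
import urllib.request

# Link-local address where EC2 exposes instance metadata.
EC2_METADATA_URL = "http://169.254.169.254/latest/meta-data/"


def on_ec2(probe=None, timeout=1.0):
    """Return True if the metadata endpoint answers (assumed to mean EC2).

    `probe` is a hypothetical hook: pass any zero-argument callable that
    returns a bool to replace the real network check (handy for testing).
    """
    if probe is None:
        def probe():
            try:
                with urllib.request.urlopen(EC2_METADATA_URL, timeout=timeout) as resp:
                    return resp.status == 200
            except (urllib.error.URLError, OSError):
                # No route, timeout, refused, HTTP error: assume not on EC2.
                return False
    return bool(probe())


def choose_bootstrap(is_ec2):
    """Pick a bootstrap path label; both labels are illustrative only."""
    return "ec2" if is_ec2 else "cluster"


if __name__ == "__main__":
    print("bootstrapping for:", choose_bootstrap(on_ec2()))
```

The point of the sketch is that the same image can carry both code paths and decide at boot time, which is what makes moving between a regular cluster and EC2 cheap.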
What about something more “strongly typed” like Google’s App Engine? Application migration might be a bit harder, but not if the APIs are well known and repeatable. Google’s SDK is even Apache 2 licensed.
To that point, have a look at Announcing AppDrop.com (host Google App Engine projects on EC2). It’s not there yet (database is a flat file) but, hey, it was developed in a few days. Cool. Read more at http://appdrop.com.
The long term idea is not that this would solve all your problems magically but that such things are possible, and if there’s a real market for choices, it seems like more work on things of this nature is also inevitable.
I’m no datacenter business expert, but the biggest problem right now seems to be that few people will be able to compete on costs/efficiencies of scale with Google/Amazon/Microsoft/eBay. (<predictions…>) It feels like it would naturally approach the straight web hosting business, though. Let’s say a standard, open source cloud computing infrastructure emerges (such as Apache httpd in the analogy). There will be various levels of players as far as the capital they have and certainly better and worse companies to choose from (including those that differentiate on service etc). But if you’re really sweating the savings an enormous company could provide with such efficiencies vs. a normal size company/datacenter, you’re probably at the point where you could save a whole lot more by buying your own computers. (</predictions…>)
Miscellaneous point about lock-in: something user-facing that ties you to a provider does not seem like a wise idea (e.g. Google’s Users API).
Nimbus: The University of Chicago Science Cloud
If you’re on the workspace-announce list, you will have already seen the “Science Cloud Available at the University of Chicago” email.
Built with the workspace service, we’ve made some nice client enhancements to get to “cloud simplicity” and it’s up and running on 16 nodes and already serving guests. See the documentation for command samples; the idea is to make it as simple as possible. On the service side, Nimbus uses TP1.3.1 with some very small additions (the main difference is a new authorization plugin). Building cloud computing solutions is the main business of the workspace service.
Have a look!
[UPDATE: using TP1.3.3.1 now which enables one-click clusters]
Workspace Service TP1.3.1
Some cool new features:
On behalf of the workspace team, I am happy to announce the TP 1.3.1 release of the Workspace Service. You can download the new release from: http://workspace.globus.org/downloads/index.html
The main new feature in this release is the implementation of the workspace pilot which provides non-invasive adaptations to batch schedulers (such as PBS) enabling sites to run virtual machines alongside jobs. The details of this approach are described in: http://workspace.globus.org/papers/workspace-pilot-paper-submitted.pdf
In addition, the release also contains the ensemble service that allows clients to create ensembles of heterogeneous virtual machines to be deployed and managed together, improvements to the client, and several bug fixes. The complete changelog can be found at: http://workspace.globus.org/vm/TP1.3.1/index.html#changelog
We welcome comments, feedback, and bug reports. Information about the project, software downloads, documentation and instructions on how to join the workspace-user mailing list for support questions can be found at: http://workspace.globus.org
Happy Valentine’s Day!
As you can read there, the main new feature is the pilot infrastructure. The paper Kate refers to in the announcement is a relatively short read and lays out the ideas (and a practical evaluation) in an organized way. But briefly: the pilot is a program the service submits to a local site resource manager in order to obtain time on the VMM nodes. When not allocated to the workspace service, these nodes will be used for jobs as normal. Those jobs run in normal system accounts in Xen domain 0 with no guest VMs running.
Importantly, the approach leaves the site resource manager in full control of the nodes and requires no modifications to it, save perhaps for configuration changes you might like to make. For example, you can mark particular nodes as able to accommodate guest VMs: the workspace service supports sending pilot requests to particular LRM queues, requesting a particular node property, etc. This lets you organize not just when but where VMs can run.
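As a rough sketch of what targeting a queue or a node property can look like at the LRM level, here is a small Python helper that assembles a PBS/Torque-style qsub command line. This is not the workspace service's actual code: the function name, the script name (pilot.sh), the queue (vmq) and the property (xen) are all made up for illustration, and the `-l nodes=N:property` form is Torque-style syntax that other PBS variants may spell differently.

```python
def pilot_qsub_args(script, queue=None, node_property=None,
                    nodes=1, walltime="01:00:00"):
    """Build a qsub argument list for a hypothetical pilot submission.

    queue:         LRM queue to submit to (optional)
    node_property: node property marking VM-capable nodes (optional)
    """
    # Resource request, e.g. "nodes=2:xen,walltime=00:30:00"
    resources = f"nodes={nodes}"
    if node_property:
        resources += f":{node_property}"
    resources += f",walltime={walltime}"

    args = ["qsub"]
    if queue:
        args += ["-q", queue]
    args += ["-l", resources, script]
    return args


if __name__ == "__main__":
    # Only prints the command; nothing is submitted.
    print(" ".join(pilot_qsub_args("pilot.sh", queue="vmq",
                                   node_property="xen", nodes=2,
                                   walltime="00:30:00")))
```

Either knob (queue or property) keeps the decision about where VMs may run in the site's existing LRM configuration rather than in the workspace service.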
Several extra safeguards have been added to make sure the node is returned from VM hosting mode at the proper time, including support for:
- the workspace service being down or malfunctioning
- LRM preemption (including deliberate LRM job cancellation)
- node reboot/shutdown
Also included is a one-command “kill 9” facility for administrators as a “worst case scenario” contingency.
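One way to picture these safeguards is as a tiny state machine: a node is either running normal jobs or hosting VMs, and every release event sends it back to job mode regardless of how it was triggered. The Python toy model below is only an illustration of that idea under my own assumptions; the class, mode and event names are invented and do not come from the pilot implementation.

```python
from enum import Enum


class Mode(Enum):
    JOBS = "jobs"              # normal LRM jobs in domain 0, no guests
    VM_HOSTING = "vm-hosting"  # pilot holds the node for guest VMs


# Hypothetical event names, one per safeguard listed above, plus the
# normal end of the pilot's time slot and the administrator's kill command.
RELEASE_EVENTS = {
    "slot_ended",
    "service_down",
    "lrm_preempt",
    "node_reboot",
    "admin_kill",
}


class Node:
    def __init__(self):
        self.mode = Mode.JOBS
        self.history = []

    def pilot_started(self):
        """The LRM started the pilot, so the node enters VM hosting mode."""
        self.mode = Mode.VM_HOSTING
        self.history.append("pilot_started")

    def handle(self, event):
        """Any release event returns the node to job mode."""
        if event not in RELEASE_EVENTS:
            raise ValueError(f"unknown event: {event}")
        self.history.append(event)
        self.mode = Mode.JOBS


if __name__ == "__main__":
    node = Node()
    node.pilot_started()
    node.handle("lrm_preempt")
    print(node.mode, node.history)
```

The property worth noticing is that there is no event that leaves the node stuck in VM hosting mode, which is the guarantee the LRM needs.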
So as a buzzword experiment, I want to put in a particular keyword here and see how the search engine hits work out :-). I think you know what it may be…
Cloud computing
Go make a cloud!
And with the workspace pilot, you won’t have to switch over all at once. Take it for a test run and tell us about it on workspace-user.
We’ve got some exciting stuff in the pipeline for the next few months, too (see the last release announcement and the self-configuring 100 node VM cluster news). I am really happy with where the project is going and has been recently.
- Tim
Virtual Cluster Appliances
This Better Know a VM entry, Virtual Cluster Appliances, gives an overview of VM contextualization technology, which is scheduled to be part of the next workspace service release. This is relevant not just to classic grid computing, but to any situation where you’d like to automatically launch many virtual machines that work together and want them to securely organize themselves and adapt to the deployment environment. It can even be used for one VM; we’ll look at such cases later.

