Entries in 'vm general'


Cloud computing discussion group

http://groups.google.com/group/cloud-computing

Virtualization Girl

Oh dear…

http://www.virtualizationgirl.com

The Gradual Shift

More about cloud computing…

Here’s a short quote from Billy Marshall (rPath), from his post “A Big Switch or a Gradual Shift?”:

The historical metaphor that Carr effectively uses to demonstrate the likelihood of this pending change is the switch from locally produced electrical power to regionally produced electrical power delivered via a high performing electrical grid infrastructure. In Carr’s metaphor electricity is analogous to applications and the electrical grid is analogous to the Internet. There are clearly some parallels, but I believe the metaphor is flawed because information applications are more analogous to hair dryers, drill presses, and die stamping machines (i.e. applications that consume electricity) as opposed to the electricity itself.

Billy goes on to point out that companies are always going to need specific things from these electricity-consuming objects. In his version of the metaphor, hypervisors are more like the power transformers that convert and reliably step down electricity (into standardized, repeatable delivery units), and virtual appliances are more like the hair dryers, drill presses, and die stamping machines:

When applications can reliably plug into a grid to receive “power” in a standardized and repeatable manner, it will be increasingly popular to let someone else deliver the power of the grid while the individual companies focus on the “design of the application” (i.e. the drill press, the chip digester, the ore smelter).

I think it’s a good way to frame things. An expansion I’d offer is that it is not just hypervisors that are this transformation/delivery mechanism, but also all of the other cluster infrastructure needed to make a leasable datacenter: the security, scheduling, efficiency, and enforcement mechanisms/policies that must be in effect. The hypervisor is in all likelihood going to be the most popular core technology, but there’s a lot more to making a safe, solvent, and usefully leasable cluster.

At the edge of the cluster and beyond, there’s also all the technology and lessons of grid computing to draw from, a field where virtualization is a mechanism being incorporated into a larger pre-established context (cf. papers from our group and many others). In the analogy, facets of grid computing perhaps get us into “buying clubs”, “electricity markets”, “consumer protection”, etc. (and how about rolling blackouts?).

Grid Gurus

Rich Wellner started a new multi-author grid blog called Grid Gurus. There are already some pretty interesting things online there. And there is a just fascinating article posted today: Better Know a VM: Part 1 of 435.

:-)

Hyperjacking

The Blue Pill/Subvirt approach (I addressed it earlier) has a new name apparently: hyperjacking [google query].

HVM laptop list

I stumbled upon a Xen wiki page, HVM_Compatible_Notebooks, which is a good complement to HVM_Compatible_Processors.

These pages are going to be important to me next year when I look at getting a new laptop :-)

Are those all that are out there so far?

DRBD, LVM, GNBD, and Xen for free and reliable SAN

At home, I wanted a reliable disk solution for backups and also wanted a big, blank and resizable storage system for virtual machines. I knew I wanted to be able to get at the shared disk remotely from other nodes and wanted to be able to replace broken hardware quickly if something failed. I also didn’t want to spend a lot of time reconfiguring OSs and software in the case of a total system failure.

I have two cheap computers and so I put some big disks in them and mirrored the disks over the network. Instead of using one file server node and RAID1, this is something like a “whole system RAID”. If anything at all breaks in either computer, hosted services can keep running and data is unharmed except for whatever was unsynced in RAM.

To accomplish the disk mirror I used DRBD. DRBD is a special block device designed for highly available clusters: it mirrors activity directly at the block device level across the network to another disk, much like a RAID1 configuration over the network. It lets you build something like the shared storage devices on a SAN, but without any special hardware. This provides the basic reliability layer.

diagram: two hosts mirrored with DRBD over crossover cable

Linux Logical Volume Management (LVM) is a popular tool that lets you flexibly manage disk space. Instead of just partitioning the disk, using LVM lets us do on-the-fly logical partition resizing, snapshots (including hosting snapshot+diffs), and adding more physical disks into the volume group as needs grow (you can even resize a logical partition across multiple underlying disks). Each logical partition is formatted with a filesystem of its own. Using LVM avoided some future headaches I think.
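To make the LVM operations above a bit more concrete, here is a small dry-run sketch that only builds the command lines for a create, grow, and snapshot cycle without running them. The volume group name `vg0`, the sizes, and the choice of ext3 are assumptions for illustration, not details of my actual setup; `vmimage001` just echoes the device name used in the Xen example.

```python
def lvm_commands(vg, lv, size_gb, grow_gb, snap_gb):
    """Build (but do not run) the LVM command lines for one logical volume.

    All names and sizes are hypothetical; nothing is executed here.
    """
    dev = f"/dev/{vg}/{lv}"
    return [
        # Carve a new logical volume out of the volume group.
        ["lvcreate", "-L", f"{size_gb}G", "-n", lv, vg],
        # Each logical volume gets a filesystem of its own (ext3 is an assumption).
        ["mkfs.ext3", dev],
        # Later on: grow the logical volume by grow_gb gigabytes...
        ["lvextend", "-L", f"+{grow_gb}G", dev],
        # ...and grow the filesystem to fill the enlarged volume.
        ["resize2fs", dev],
        # Take a snapshot, reserving snap_gb gigabytes for changed blocks.
        ["lvcreate", "-s", "-L", f"{snap_gb}G", "-n", f"{lv}-snap", dev],
    ]


commands = lvm_commands("vg0", "vmimage001", size_gb=20, grow_gb=10, snap_gb=2)
for argv in commands:
    print(" ".join(argv))
```

Keeping this as a dry run means the list can be reviewed before anything touches a disk; to actually apply it you would run each line as root on the file server node.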

That is how the disk is set up. Now, how to access it remotely? You could run a shared filesystem of course, exporting via an NFS server on host A (or B). Instead, having heard good things about Global Network Block Device (GNBD) on the Xen mailing lists, I chose to export the logical block devices (from LVM) directly over the network with GNBD. Another node makes a GNBD import and the block device appears to be a local block device there, ready to mount into the file hierarchy. This is like iSCSI but it is a snap to set up and use.

And if that other node is a Xen domain 0, that block device is very handily ready to be used as a VM image, just as if it was a raw partition on that node.

diagram: one of the nodes of the disk array exporting an LVM partition over GNBD to a Xen dom0

Here’s an example Xen configuration using the imported block device:

disk = [ 'phy:/dev/gnbd/vmimage001,sda1,w' ]

The guest VM needs no awareness of all these tools. It just sees its sda1 and mounts it like anything else Xen presents to it as one of its “hardware” partitions.
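Since Xen `xm` configuration files use Python syntax, the disk entry can be pulled apart with a few lines of Python to show what each field means. This is only an illustrative sketch: `parse_disk_spec` is a hypothetical helper of mine, not part of Xen, and it handles only the simple `backend:path,frontend,mode` form used here.

```python
def parse_disk_spec(spec):
    """Split a simple Xen disk spec into its parts (hypothetical helper)."""
    # 'phy' says the backend is a physical block device in dom0.
    backend, _, rest = spec.partition(":")
    # The rest is: device path in dom0, device name in the guest, access mode.
    path, frontend, mode = rest.split(",")
    return {"backend": backend, "path": path, "frontend": frontend, "mode": mode}


# Same entry as in the configuration above.
disk = ['phy:/dev/gnbd/vmimage001,sda1,w']

parsed = parse_disk_spec(disk[0])
print(parsed)
```

Here the dom0 side is the GNBD-imported device, the guest side is `sda1`, and `w` makes it writable.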

Instead of just using the file store for backups and VMs that are used intermittently, I’m also running persistent services like websites, the incremental-backup server and a media server in VMs stored there.

First, this allows for basic backups of the LAN services without any backup software. That’s nice to have, although I really prefer a combination of incremental backups and RAID1. (Here we also avoid a Russell’s paradox situation with the backup server.)

Second, keeping time-consuming-to-configure services in a VM allows me to replace hardware quickly, including whole computers in the event of a total failure: the only software I’ll ever need to reinstall is {Linux, DRBD, LVM, GNBD} for a file server node and {Linux, Xen} for a VMM node.

As long as net latency is really low (here it is sub-ms) it doesn’t really matter that the disk is remote for any of my uses. The VMs always respond very well.

(I should mention: you could of course take GNBD out of the picture and run the VMs on host A and B if Xen were installed there)

Another bonus: using GNBD, you can live-migrate the VM to any node that can do a GNBD import. This is nice to have. I only live migrate manually, though. Both DRBD and GNBD have some features that allow for seamless failover but I don’t really need any of this at home.

To learn more about that, check out this paper on the new DRBD (it is interesting): http://www.drbd.org/fileadmin/drbd/publications/drbd8_wpnr.pdf

Thinking about high availability in this kind of setup for a minute, a possible and simple-to-execute arrangement for services that need to be up at all times would be to take two DRBD-mirrored nodes, run VMs on one or both of them, and have the physical nodes heartbeat each other. This is a simpler approach than a centralized file server with block device export: here we just have two peer VMMs that are “watching out” for each other.

You’d have two master/slave arrangements, so in the normal operating case: one VMM with partition A as DRBD master and partition B as secondary, then on the other VMM you have partition A as secondary and partition B as master. VMs run from a partition that is the DRBD master.

Let’s say you split four services into four VMs and put two VMs on each physical node. One of the physical nodes’ disks fails entirely and a monitor process notices. The heartbeat script makes sure the OK node is now the DRBD master for both partition A and B. Then it boots the two VMs previously hosted on the failed node on the OK node, re-allocating RAM for the time being to accommodate all four VMs.

diagram: 2 VMs migrate to the OK node

The applications in those VMs recover just as if they went through a normal hard system reset (their network addresses can stay the same since both physical nodes are on the same LAN). Once the administrator gets the alert email and puts a new disk in, another script is ready to resync DRBD and then migrate two of the VMs back to their normal place.
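The decision logic of such a heartbeat script can be sketched in a few lines. This is a planning function only, under my own assumptions: the node names, VM names, and action tuples are hypothetical, and nothing here talks to DRBD or Xen. In a real script, a "promote" step would presumably map to something like `drbdadm primary` for the resource, and a "boot" step to `xm create` with the VM's config file.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True
    # DRBD partitions for which this node is currently the master.
    master_for: set = field(default_factory=set)
    # VMs normally hosted on this node.
    vms: list = field(default_factory=list)


def plan_failover(a, b):
    """Return the ordered actions to take when exactly one node has failed."""
    if a.healthy == b.healthy:
        # Both fine (nothing to do) or both down (nothing this script can do).
        return []
    ok, failed = (a, b) if a.healthy else (b, a)
    actions = []
    # 1. The surviving node becomes DRBD master for the failed node's partitions.
    for partition in sorted(failed.master_for):
        actions.append(("promote", ok.name, partition))
    # 2. Boot the failed node's VMs on the surviving node.
    for vm in failed.vms:
        actions.append(("boot", ok.name, vm))
    # 3. Tell the administrator which node needs a new disk.
    actions.append(("alert", "admin", failed.name))
    return actions


# Hypothetical layout: two VMs per node, crossed master/secondary partitions.
vmm1 = Node("vmm1", healthy=True, master_for={"A"}, vms=["web", "backup"])
vmm2 = Node("vmm2", healthy=False, master_for={"B"}, vms=["media", "mail"])

plan = plan_failover(vmm1, vmm2)
for step in plan:
    print(step)
```

Separating the plan from its execution keeps the risky part (promoting a DRBD partition while the peer may still be alive) easy to inspect and test. The sketch does not address split-brain, where both nodes believe the other has failed, and a real setup would need to handle that.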

This seems like something to consider for a highly hammered and important head node (like a Globus GRAM node for example). All it takes is another node, commodity hardware and open source software!

Virtualization and hardware sales

I noticed this blog entry on ZDNet, Virtualization begins to haunt hardware makers, that guesses what you might expect: the current wave of companies consolidating via virtualization is what’s been eating into hardware sales.

Yet Jonathan Schwartz (CEO of Sun) recently took the position that:

if you can double server utilization via Solaris Containers or VMWare [sic], people don’t buy fewer computers - they buy more. The value of innovation, at least to our core customers, is growing so fast that if the price declines, the overall return (value/price) goes through the roof - encouraging a feedback loop.

Could it be that Schwartz is still right but we’ll just first see a noticeable dip (“correction”) this year as most people merely consolidate instead of buying more hardware?

Open source VMs (and their accompanying management infrastructure) are eventually going to be just another standard part of the datacenter toolbox. It seems that value/price would stabilize and people would get used to it; this “feedback loop” can’t feed off itself forever.

Another major shift looks possible: the datacenter could see big power savings in the next decade. This 80-core processor that Ian Foster pointed out “uses less energy than a quad-core processor and has teraflop performance capabilities.” The ability to sleep and wake cores as demand changes (and the ability to migrate computations away from cores that are getting too hot) results in huge power savings. Sounds like a mini virtual cluster on a chip.

