This is a working paper identifying various technologies and providing a set of requirements for a research problem. One problem with Beowulf clusters and other scientific processing clusters is that applications must be extensively programmed against them using the Message Passing Interface (MPI) or Parallel Virtual Machine (PVM) libraries. What is needed is a utility processing infrastructure in which resource nodes running minimal operating systems can fill the role of commodity nodes. What is also needed is a hypervisor or administrator layer that manages load distribution across that commodity computing cluster and allows applications to be added without much difficulty. The processing power of the cluster is then available to that one application as needed.
This unfortunately leads to the issue of identifying the processing infrastructure. A few varieties of cluster computing exist. The first common concept is the Beowulf cluster, usually running on the Linux operating system. These, though, require applications to be built specifically for that environment. High-performance Beowulf clusters are not flexible in implementation, or at least not flexible enough for our needs. In other words, unlocking the power of the cluster requires a new application development effort each time, or the use of programs that are flexible enough but usually domain specific.
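To make that development burden concrete, the sketch below shows what even a trivial job looks like when written against MPI. This is an illustrative example only (using the mpi4py Python bindings for brevity; a Beowulf application would more typically be written in C): the point is that the work decomposition is baked into each application by hand.

```python
# Minimal MPI sketch (mpi4py): even a trivial cluster-wide sum must hard-code
# its own decomposition, which is why each new application needs new development.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this node's index within the job
size = comm.Get_size()   # total number of participating processes

# Each rank computes a slice of the problem chosen by the programmer...
local_total = sum(range(rank, 1_000_000, size))

# ...and the results are explicitly gathered back to rank 0.
grand_total = comm.reduce(local_total, op=MPI.SUM, root=0)
if rank == 0:
    print("cluster-wide total:", grand_total)
```

Launched with something like `mpirun -n 8 python sum.py`, the program only scales because its author wrote the partitioning and the reduction by hand; a different application needs that work done all over again.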
The next common idea in clustering is high availability, where once again messages are passed back and forth through a memory structure and a message interface. This is fairly common, from applications like Google's to multi-homing and other techniques. What usually happens is that memory pointers reference the actual information, and a rendering agent such as a web server pulls that information together. We can find this type of clustering in a variety of hard-coded applications; many database engines, including MySQL, can be clustered in this manner.
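As a rough illustration of that pointer-and-renderer pattern, the sketch below shows a shared index mapping keys to the nodes that actually hold the data, with a rendering agent dereferencing the pointers and assembling the result. The names and structures here are hypothetical, not taken from any particular product.

```python
# Hypothetical sketch of pointer-style HA clustering: a shared index maps keys
# to the node and object that hold the real data, and a rendering agent
# (for example a web front end) dereferences the pointers and assembles the page.

# key -> (node address, object id): the "memory pointers" of the cluster
pointer_table = {
    "profile:42": ("node-a.example", "obj-9001"),
    "orders:42":  ("node-b.example", "obj-3141"),
}

def fetch(node, obj_id):
    # Stand-in for a network fetch from the node holding the real data.
    return f"<data {obj_id} from {node}>"

def render_page(keys):
    # The rendering agent pulls the pieces together from wherever they live.
    parts = [fetch(*pointer_table[key]) for key in keys]
    return "\n".join(parts)

print(render_page(["profile:42", "orders:42"]))
```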
The last cluster technique we will discuss is slightly different: it aggregates processor resources so that they appear as one large machine, a larger leviathan. Microsoft Cluster Services is one example of this technology. The virtual server technology aggregates processing and disk resources. Unfortunately, the heavy payload of the host operating system radically reduces any performance enhancement.
Unfortunately, none of the previously discussed clustering solutions will work for the utility processing infrastructure needed to answer this specific need. The closest solution is grid computing as found in the Sun Grid Engine, but the Sun implementation is extremely limited and looks much like the Beowulf implementation, with all of the associated software development issues. Parallel computing also answers some of the questions but not the actual specific need. VMware has two solutions: the lightweight consumer virtual server product and the enterprise ESX Server, and both show promise for answering the required need. The issue is that VMware can slice up a physical server but cannot cluster with others to create a substantive solution. The VMware high availability solution utilizing ESX does not appear to truly allow for full resource sharing, and it stovepipes resources within one server. The VMware Infrastructure solution is closer, but still ties the operating systems of virtual machines to a physical server.
To define it further, what is needed is a solution that allows a host with a minimal operating system to be added to a cluster or group of computers. Some kind of hypervisor is needed to control the virtual hosts, along with a container in which to run the applications and virtual machines (see Figure 1).
Figure 1: The Utility Processing Infrastructure
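As a conceptual rendering of Figure 1, the sketch below models the three layers in code. All of the names and fields are illustrative assumptions, not a proposed API.

```python
# Illustrative model of the layers in Figure 1; names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class CommodityNode:
    hostname: str   # minimal-OS host contributing cycles, RAM, and disk
    cpus: int
    ram_gb: int

@dataclass
class Container:
    name: str       # RAM-disk-backed container holding a VM or application
    workload: str

@dataclass
class Hypervisor:
    nodes: list = field(default_factory=list)       # commodity layer below
    containers: list = field(default_factory=list)  # workloads running above

    def add_node(self, node: CommodityNode) -> None:
        # New hosts join the pool without disturbing running workloads.
        self.nodes.append(node)

    def deploy(self, container: Container) -> None:
        # The hypervisor layer decides where the workload's pieces run.
        self.containers.append(container)

infra = Hypervisor()
infra.add_node(CommodityNode("node-01", cpus=8, ram_gb=32))
infra.deploy(Container("analytics-vm", workload="telemetry processing"))
```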
Those who have not thought the issue through will be crying "Cloud computing is the answer!", but there are still a few problems. Though the APIs exposed by Google and other cloud computing entities could be used to provide processing power, the need to run many operating systems or large applications will rapidly expose some of the issues with cloud computing. Control of the systems running in the cloud, throughput from the network, security of data in transit, and the good-citizen requirements of cloud providers are all going to be problems.
Cloud computing and grid computing have substantial similarities, and it may be that the differences are the enabling technologies for the full system. What is proposed is nearly the holy grail of aggregate processing techniques: many techniques come close, but none of the vendor solutions appear to answer all the questions.
To reiterate the requirements in a little more depth:
- A set of Linux computers, of "any" number, each with a minimal operating system, to provide commodity processing power. The machines may be any heterogeneous hardware able to load the clustering software and operating system.
- A management application interface, such as a hypervisor, to give access to the commodity processing power below it. "Hypervisor" is a common way of describing the application interface for virtual machines and applications, and this design document continues that usage for the sake of brevity. Simply put, in most cluster/grid/cloud computing scenarios this is where most of the work gets done.
- An installable container into which virtual machines and applications can be installed. That container should be backed as much as possible by a RAM disk, with the work happening at the speed of RAM rather than waiting for disk swaps to occur; this will make the system perform much faster (see the sketch after this list).
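The following is a minimal sketch of the RAM-disk idea behind the container requirement: workloads are staged into a RAM-backed mount before launch so subsequent I/O runs at memory speed. The mount path and fallback behavior are assumptions for illustration, not part of any existing product.

```python
# Sketch of staging a workload into a RAM-backed mount point (e.g. a tmpfs
# mounted at /mnt/ramdisk by the minimal OS -- the path is an assumption).
import os
import shutil
import tempfile

RAMDISK = "/mnt/ramdisk"          # hypothetical tmpfs mount provided by the node
FALLBACK = tempfile.gettempdir()  # fall back to ordinary disk if it is absent

def stage(container_image: str) -> str:
    """Copy a container/VM image into RAM-backed storage before launching it."""
    target_dir = RAMDISK if os.path.ismount(RAMDISK) else FALLBACK
    dest = os.path.join(target_dir, os.path.basename(container_image))
    shutil.copy(container_image, dest)  # subsequent reads and writes hit RAM
    return dest
```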
The system should allow, for example, a single virtual machine to access all of the resources of the commodity computing cluster below it. The standard method is to do this through a shared memory space that may be written to a RAM disk or hard disk mount point; telemetry and processing jobs are then split among the different cluster machines.
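The sketch below illustrates one way such a split could work through a shared mount point: a coordinating machine drops job files into the shared space and each cluster machine claims work from it. The directory layout, file format, and path are assumptions made for illustration, not a defined protocol.

```python
# Sketch of splitting telemetry jobs through a shared mount point visible to
# every node; the layout and job format here are illustrative assumptions.
import json
import os

SHARED = "/mnt/cluster-shared"   # hypothetical RAM-disk or disk mount shared by all nodes

def submit_jobs(telemetry_batches):
    """The coordinating VM drops one job file per batch into the shared space."""
    os.makedirs(os.path.join(SHARED, "pending"), exist_ok=True)
    for i, batch in enumerate(telemetry_batches):
        path = os.path.join(SHARED, "pending", f"job-{i:06d}.json")
        with open(path, "w") as fh:
            json.dump(batch, fh)

def claim_job(node_name):
    """Each cluster machine claims a pending job by renaming it into place."""
    pending = os.path.join(SHARED, "pending")
    running = os.path.join(SHARED, "running")
    os.makedirs(running, exist_ok=True)
    for name in sorted(os.listdir(pending)):
        src = os.path.join(pending, name)
        dst = os.path.join(running, f"{node_name}-{name}")
        try:
            os.rename(src, dst)   # rename is atomic within the shared filesystem
            return dst
        except OSError:
            continue              # another node claimed this job first
    return None
```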
There are of course a huge number of questions left unanswered.