Cloud computing: An open source networked operating system

Cloud computing is a developing area of high-performance computing. High-performance computing has many facets, including availability, processing power, and reliability. This project examines reliability. One way to increase reliability is to increase the number of machines. Adding a load balancer that splits requests between the computers and monitors their health status is a straightforward way to put those extra machines to use.
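The load-balancing idea described above can be illustrated with a minimal sketch. It is not the load balancer used in this project; it only assumes a round-robin policy that skips machines marked unhealthy, and the class names and machine labels (`web1`, `web2`, `web3`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str             # hypothetical machine label
    healthy: bool = True  # updated by some health check (not modeled here)


class RoundRobinBalancer:
    """Hands out backends in rotation, skipping unhealthy ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._next = 0  # index where the next search starts

    def mark(self, name, healthy):
        """Record the result of a health check for one backend."""
        for backend in self.backends:
            if backend.name == name:
                backend.healthy = healthy

    def pick(self):
        """Return the next healthy backend, or raise if none is left."""
        total = len(self.backends)
        for offset in range(total):
            candidate = self.backends[(self._next + offset) % total]
            if candidate.healthy:
                self._next = (self._next + offset + 1) % total
                return candidate
        raise RuntimeError("no healthy backends available")
```

With three backends, requests rotate through all of them; once one is marked unhealthy, the remaining two keep serving requests, which is the reliability gain the extra machines provide.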

Open source products were chosen for two main reasons: price and customizability. The results of this project are meant to be reproducible by anyone, and using free software ensures equal access to the necessary tools. Using customizable software ensured that any problems encountered could be readily fixed with the knowledge and skills the team possesses.

EyeOS was chosen over other solutions because of its lightweight nature and ease of setup. Other solutions inspected included Citrix Xen and Eucalyptus. Citrix Xen was not open source, so it was immediately excluded from the list. Eucalyptus, on the other hand, was free and easy to set up since it is now included in an Ubuntu distribution. Unfortunately, it also had considerably more overhead than EyeOS. In the end, EyeOS met the requirements and came with community documentation that is immensely helpful should anything go wrong.

A network architecture using two separate networks was created. One network was designed to carry storage traffic and the other to distribute incoming traffic throughout the cloud. Separating the networks reduced the amount of traffic on each one so that the system would scale better. Keeping the networks separate also has security advantages. The iptables rules could be made considerably more restrictive, and because the networks are physically separate, the number of possible attacks is reduced. If the load balancer were compromised, the data on the storage machine would not be accessible.

To address security concerns, existing technologies such as iptables and Bastille were used. iptables is an established solution with a very positive track record. Creating iptables entries ensured that the individual computers could not communicate with or pass traffic to anything they did not need to access. Limiting the number of hosts a computer can communicate with helps prevent possible attacks and, at the very least, makes the machine less useful to an attacker if it is compromised. Bastille was used to harden the Debian Lenny operating system that the Apache servers for EyeOS run on. Bastille shuts off unnecessary processes and removes unnecessary files. This increases security and also works as a performance-tuning step by reducing overhead.
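The restrict-by-default approach described above can be modeled in a few lines. This sketch is not iptables syntax and does not reproduce the project's actual rules; the addresses, ports, and the `is_allowed` helper are hypothetical and only illustrate the idea that traffic is dropped unless a rule explicitly permits it.

```python
from ipaddress import ip_address, ip_network

# Hypothetical allow-list: (source network, destination port).
# Anything that matches no entry is treated as denied.
ALLOWED = [
    (ip_network("10.0.1.0/24"), 80),  # assumed web traffic from one subnet
    (ip_network("10.0.2.5/32"), 22),  # assumed admin access from one host
]


def is_allowed(src, port, rules=ALLOWED):
    """Return True only if some rule permits this source and port."""
    addr = ip_address(src)
    return any(addr in net and port == allowed_port
               for net, allowed_port in rules)
```

Because the default answer is "deny", a compromised machine can reach only the few destinations listed for it, which is the property the paragraph above relies on.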

The project ran into several issues. The first was that the quality of the hardware was poor. Many of the computers had hardware malfunctions, including dead hard drives, bad RAM, and motherboard issues. Many computers had to be removed from the project because the parts needed to fix them could not be found or salvaged from other non-working machines. The second issue was the lack of physical access to the hardware. The hardware was located in a tight closet, sandwiched on a rack between the walls and the door. This made the hardware difficult to reach when several people needed access at once. Another issue was that a certificate could not be obtained. Purdue denied our request because of “security reasons”. Additional issues hindered the performance and capabilities of the cluster. Although there is a load balancer to handle potentially high amounts of traffic, only one is used to balance all traffic. This was slightly inefficient, and if that load balancer were to experience downtime, traffic to the cluster would stop entirely.
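The effect of a single load balancer can be made concrete with a standard availability calculation. This is a sketch under a strong assumption, namely that machines fail independently; the availability figures in the usage note are made up for illustration and were not measured in this project.

```python
def parallel_availability(machine_availability, machines):
    """Availability of a pool that works as long as any one machine is up.

    Assumes independent failures: the pool is down only when every
    machine is down at the same time.
    """
    return 1 - (1 - machine_availability) ** machines


def cluster_availability(balancer_availability, machine_availability, machines):
    """Availability when all traffic must pass through one load balancer.

    The balancer is in series with the pool, so the two availabilities
    multiply.
    """
    return balancer_availability * parallel_availability(
        machine_availability, machines
    )
```

For example, three machines that are each up 90% of the time give a pool availability of about 0.999, but placing a single load balancer with 95% availability in front of them caps the whole cluster at roughly 0.949. Adding more machines cannot raise the result above the availability of the balancer itself.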

The project has several real-world applications. A primary use would be in office environments. First, each user would have a terminal rather than a personal computer. This allows lower-spec machines, rather than the latest and greatest, to be used to access the cloud. Not only does this help cut corporate costs, it also extends the life of some machines past the three-year mark. Next, each office would have only one large system to manage, rather than hundreds of machines spread across several buildings. This setup would allow faster IT response and shorter system downtime. It would also bring the benefit of centralized on-site storage, allowing stronger physical and electronic security measures than are practical on several scattered machines.

This project would benefit greatly from several improvements that time constraints or budget did not allow. One major improvement would be to include more and higher-quality machines in the cluster. These machines would dramatically increase system performance and reliability. Another possible improvement would be more storage on the storage server. As it stands now, each user is limited to a total of one megabyte of storage on EyeOS. It is believed this is enough to let the user experience EyeOS without taxing the system. The final suggested improvement would be more security. Encryption was not implemented inside the network because it would slow down response time and further tax the computers. The general consensus was that, since EyeOS is segregated, the need was not great enough to include encryption on the network.

The system is free to use until it breaks or until February 1, 2010, whichever comes first. There is no encryption or expectation of privacy on this system (hint: it’s managed by students).
