The Beowulf Project
In the summer of 1994 Thomas Sterling and Don Becker, working at CESDIS under the sponsorship of ESS project, built a cluster computer consisting of 16 DX4 processors connected by channel bonded Ethernet. They called their machine Beowulf. The machine was an instant success and their idea of providing COTS base systems to satisfy specific computational requirements quickly spread through NASA and into the academic and research communities. The development effort for this first machine quickly grew into a what we now call the Beowulf Project. Some of the major accomplishment of the Beowulf project will be chronicled below, but a non-technical measure of success is the observation that researcher within the High Performance Computer community are now referring to such machines as "Beowulf Class Cluster Computers." That is, Beowulf clusters are now recognised as a architecture genre.
In the next few paragraphs we will provide some of the history of the Beowulf project and indicate aspects or characteristics at we believe are key to its success.
The Center of Excellence in Space Data and Information Sciences (CESDIS) is a division of the University Space Research Association (USRA) located at the Goddard Space Flight Center in Greenbelt Maryland. CESDIS is a NASA contractor, supported in part by the the Earth and space sciences (ESS) project. The ESS project is a research project within the High Performance Computing and Communications (HPCC) program. One of the goals of the ESS project is to determine the applicability of the massively parallel computer to the problems faced by the Earth and space sciences community. The first Beowulf was built to address problems associated with the large data sets that are often required by these scientist.
It may well be that this is simply the right time in history for the development Beowulf class computers. In the last ten years we have experienced a number of events that have brought together all the pieces for the genesis of the Beowulf project. The creation of the causal computing market (office automation, home computing, games and entertainment) now provides system designer with a new type of cost effective components. The COTS (Commodity off the shelf) industry now provides fully assembled subsystems (microprocessors, motherboards, disks and network interface cards). The pressure of the mass market place has driven the prices down and reliability up for these subsystems. The development of publicly available software; in particular, the Linux operating system, the GNU compilers and programming tools and the MPI and PVM message passing libraries, provide hardware independent software. Thanks to programs like the HPCC program we now have many years of experience working with parallel algorithms. Part of that experience has taught us that obtaining high performance from vendor provided parallel platforms is hard work and has required researcher to adopt a do-it-yourself attitude. A second aspect to this experience is increase reliance on computational science and therefore an increase need for high performance computing. The combination of the these conditions, hardware, software, experience and expectation, may make the development of Beowulf clusters a natural evolutionary event.
We are constantly reminded of the performance improvements in microprocessors, but more important to the Beowulf project is the recent cost performance gains in network technology. Over the last ten years there have been many academic and a significant number of vendors that have built multiprocessor machines based on the then state-of-art microprocessor, but they always required special "glue" chips or one-of-a-kind interconnection schemes. For the academic community this lead to dead ends; the life cycle of such machines strongly correlates to the life cycle of the graduate careers of those working on them. For vendors, these choices were usually made to enhance certain characteristics of their machine, but exploiting these characteristics required programmers to adopt a vendor specific programming model. The Linux support for Ethernet has enabled the construction of a balanced system built entirely of COTS technology; this guarantees forward compatibility of hardware and software.
The first Beowulf was built with DX4 processors and 10Mbit/s Ethernet. The processors were too fast for a single Ethernet and Ethernet switches were too expense. To balance the system Don Becker rewrote his Ethernet drivers for Linux and built a "channel bonded" Ethernet where the network traffic was striped across two or more Ethernets. As 100Mbit/s Ethernet and 100 Mbit/s Ethernet switches have become cost effective, the need for channel bonding has gone away (at least for now). In 1997, a good choice for a balance system was 16, 200MHz P6 processors connected by Fast Ethernet and a Fast Ethernet switch. As the balance between processor speed and network technology changes and as the need for larger and larger clusters increases, the exact network configuration of optimal cluster will also change. However, it has changed in the past and will continue change without effecting the programming model.
Another key component to forward compatibility is the software used on Beowulf. With the maturity and robustness of Linux, GUN software and the "standardisation" of message passing via PVM and MPI, programmers now have a guarantee that the programs that they write will run on future Beowulf clusters---regardless of who makes the processors.
The first Beowulf was built to address a particular computational requirement of the ESS community. It was built by and for researcher with parallel programming experience. Many of these researchers have spent years fighting with vendors and system administrators over the detailed performance information on and control of the particular machine they were using. Often, they were given accounts on large machines but these machines were shared among a large number of user. For these users, building a cluster that they completely understand and are able to completely utilise results in a more effective, higher performance computing platform. An attitude that reveals the hidden costs of running on a vendor supplied computer is "Why buy what you can afford to make?" The view here is that learning to built and run a Beowulf cluster is an investment; learning the peculiarities of a specific vendor only enslaves you to that vendor.
These hard core parallel programmers are first and foremost interested in high performance computing applied to difficult problems. At Supercomputing '96 both NASA and DOE demonstrated clusters costing less than $50,000 that achieved greater than a Gflops sustained performance. A year later, NASA researchers at Goddard Space Flight Center combined two clusters for a total of 199, P6 processors and ran a PVM version of a PPM (Piece-wise Parabolic Method) code at a sustain rate of 10.1 Gflops. In the same week (in fact, on the floor of Supercomputing '97) Caltech's 140 node cluster ran an N-body problem at a rate of 10.9 Gflops.
Beyond the seasoned parallel programmer, Beowulf clusters have been built and used by programmer with little or no parallel programming experience. In fact, Beowulf clusters provide universities, often with limited resources, an excellent platform to teach parallel programming courses and provide cost effective computing to their computational scientists as well. In this case, the start up cost is probably minimal for the usual reasons. Most students interested in such a project are likely to be running Linux on the own computers and learning to do parallel computing is all part of the course.
In the taxonomy of parallel computers, Beowulf clusters fall somewhere between MPP (Massively Parallel Processors, like the Convex SPP, Cray T3D, Cray T3E, CM5, etc.) and NOWs (Networks of Workstations). The Beowulf project benefits from developments in both these classes of architecture. MPP's are typically larger and have a lower latency interconnect network than an Beowulf clusters. In spite of vendor claims, programmers are still required to worry about locality, load balancing, granularity, and communication overheads. Most programmers develop their programs in message passing style. Such programs can be readily ported to Beowulf clusters. Programming a NOW is usually an attempt to harvest unused cycles on an already installed base of workstations in a lab or on a campus. Programming in this environment requires algorithms that are extremely tolerant of load balancing problems and large communication latency. These programs will directly run on a Beowulf.
A Beowulf class cluster computer is distinguished from a Network of Workstations by several subtle but significant characteristics. First, the nodes in the cluster are dedicated to the cluster. This helps ease the load balancing problem, because the performance of individual nodes are not subject to external factors. Also, since the interconnection network is isolated from the external network, the network load is determined only by the application being run on the cluster. This eases the problems associated with unpredictable latency in NOWs. All the nodes in the cluster are within the administrative jurisdiction of the cluster. For example, the Beowulf software provide a global process ID which enables a mechanism for a process to send signals to a process on another node of the system. This is not allowed on a NOWs. Finally, operating system parameters can be tuned to improve performance. For example, a workstation should be tuned to provide the interactive feel (instantaneous responses, short buffers, etc), but in cluster the node can be tuned to provide better throughput for coarser grain jobs because they are not interacting with users.
The Beowulf Project grew from the first Beowulf machine and likewise the Beowulf community has grown from the NASA project. Like the Linux community, the Beowulf community is a loosely organised confederation of researcher and developer. Each organisation has its own agenda and its own set of reason for developing a particular components of the Beowulf system. As a result, Beowulf class cluster computers range from several node clusters to several hundred node cluster and are being applied in all areas of computational sciences. The future of the Beowulf project will be determined collectively by these organisations and by the future mass market COTS. As microprocessor technology continues to evolve and higher speed networks become cost effective and as more application developers move to parallel programs, the Beowulf project will evolve to fill its niche.