Cluster Computing
Thomas Sterling
California Institute of Technology
and NASA Jet Propulsion Laboratory
I. Introduction
II. A Taxonomy of Cluster Computing
III. A Brief History of Cluster Computing
IV. Cluster Hardware Components
V. Cluster Software Components
VI. Summary and Conclusions
GLOSSARY
Cluster  A computing system comprising an ensemble of separate computers (e.g., servers, workstations) integrated by means of an interconnection network cooperating in the coordinated execution of a shared workload.

Commodity cluster  A cluster consisting of computer nodes and network components that are readily available COTS (commercial off-the-shelf) systems and that contain no special-purpose components unique to the system or a given vendor product.

Beowulf-class system  A commodity cluster implemented using mass-market PCs and COTS network technology for low-cost parallel computing.

Constellation  A cluster for which there are fewer SMP nodes than there are processors per node.

Message passing  A model and methodology of parallel processing that organizes a computation in separate concurrent and cooperating tasks coordinated by means of the exchange of data packets.

CLUSTER COMPUTING is a class of parallel computer structure that relies on cooperative ensembles of independent computers integrated by means of interconnection networks to provide a coordinated system capable of processing a single workload. Cluster computing systems achieve high performance through the simultaneous application of multiple computers within the ensemble to a given task, processing the task in a fraction of the time it would ordinarily take a single computer to perform the same work. Cluster computing represents the most rapidly growing field within the domain of parallel computing because of its exceptional price/performance. Unlike other parallel computer system architectures, the core computing elements, referred to as nodes, are not custom designed for high performance and parallel processing but are derived from systems developed for the industrial, commercial, or commodity market sectors and applications. Benefiting from the superior cost effectiveness of the mass production and distribution of their COTS (commercial off-the-shelf) computing nodes, cluster systems exhibit an order-of-magnitude cost advantage with respect to their custom-designed parallel computer counterparts delivering the same sustained performance for a wide range of (but not all) computing tasks.
I. INTRODUCTION

Cluster computing provides a number of advantages with respect to conventional custom-made parallel computers for achieving performance greater than that typical of uniprocessors. As a consequence, the emergence of clusters has greatly extended the availability of high-performance processing to a much broader community and advanced its impact through new opportunities in science, technology, industry, medicine, commerce, finance, defense, and education, among other sectors of computational application. Included among the most significant advantages exhibited by cluster computing are the following:

Performance scalability. Clustering of computer nodes provides the means of assembling larger systems than is practical for custom parallel systems, as these themselves can become nodes of clusters. Many of the entries on the Top 500 list of the world's most powerful computers are clusters, and the most powerful general-purpose computer under construction in the United States (the DOE ASCI system) is a cluster to be completed in 2003.

Performance to cost. Clustering of mass-produced computer systems yields the cost advantage of a market much wider than that limited to the high-performance computing community. An order-of-magnitude price/performance advantage with respect to custom-designed parallel computers is achieved for many applications.

Flexibility of configuration. The organization of cluster systems is determined by the topology of their interconnection networks, which can be determined at time of installation and easily modified. Depending on the requirements of the user applications, various system configurations can be implemented to optimize for data flow bandwidth and latency.

Ease of upgrade. Old components may be replaced or new elements added to an original cluster to incrementally improve system operation while retaining much of the initial investment in hardware and software.

Architecture convergence. Cluster computing offers a single general strategy for the implementation and application of parallel high-performance systems independent of specific hardware vendors and their product decisions. Users of clusters can build software application systems with confidence that such systems will be available to support them in the long term.

Technology tracking. Clusters provide the most rapid path to integrating the latest technology for high-performance computing, because advances in device technology are usually first incorporated in mass-market computers suitable for clustering.

High availability. Clusters provide multiple redundant identical resources that, if managed correctly, can provide continued system operation through graceful degradation even as individual components fail.

Cluster computing systems comprise a hierarchy of hardware and software component subsystems. The cluster hardware is the ensemble of compute nodes responsible for performing the workload processing and the communications network interconnecting the nodes. The support software includes programming tools and system resource management tools. Clusters can be employed in a number of ways. The master–slave methodology employs a number of slaved compute nodes to perform separate tasks or transactions as directed by one or more master nodes. Many workloads in the commercial sector are of this form. But each task is essentially independent, and while the cluster does achieve enhanced throughput over a single-processor system, there is no coordination among the slave nodes, except perhaps in their access to shared secondary storage subsystems. The more interesting aspect of cluster computing is its support of coordinated and interacting tasks, a form of parallel computing in which a single job is partitioned into a number of concurrent tasks that must cooperate among themselves. It is this form of cluster computing, and the necessary hardware and software systems that support it, that is discussed in the remainder of this article.
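As an illustration of this coordinated-task style (a minimal sketch, not an example from the article), the following C program uses MPI, the message-passing standard discussed in Section III, to partition the summation of a large index range across the processes of a cluster and to combine the partial results with a single collective operation:

    /* Illustrative sketch (not from the article): a single job -- summing
       N numbers -- partitioned into cooperating tasks, one per cluster
       process, coordinated with MPI. Build with mpicc, launch with mpirun. */
    #include <mpi.h>
    #include <stdio.h>

    #define N 1000000

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this task's identity  */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of tasks       */

        /* Each task sums its own contiguous share of the index range. */
        long begin = (long)N * rank / size;
        long end   = (long)N * (rank + 1) / size;
        double local = 0.0;
        for (long i = begin; i < end; i++)
            local += (double)i;

        /* Cooperative step: combine the partial sums on task 0. */
        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum = %.0f\n", total);

        MPI_Finalize();
        return 0;
    }

Launched with a command such as mpirun -np 16 ./sum, the same program runs on every node; the only interaction among the tasks is the final reduction.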
II. A TAXONOMY OF CLUSTER COMPUTING

Cluster computing is an important class of the broader domain of parallel computer architecture that employs a combination of technology capability and subsystem replication to achieve high performance. Parallel computer architectures partition the total work to be performed into many smaller coordinated and cooperating tasks and distribute these tasks among the available replicated processing resources. The order in which the tasks are performed and the degree of concurrency among them are determined in part by their interrelationships, precedence constraints, type and granularity of parallelism exploited, and the number of computing resources applied to the combined tasks to be conducted in concert. A major division of parallel computer architecture classes, which includes cluster computing, comprises the following primary (but not exhaustive) types, listed in order of their level of internal communication coupling measured in terms of bandwidth (communication throughput) and latency (delay in transfer of data). This taxonomy is illustrated in Figure 1. Such a delineation is, by necessity, somewhat idealized because many actual parallel computers may incorporate multiple forms of parallel structure in their specific architecture. Also, the terminology below reflects current general usage, but the specific terms have varied in their definition over time (e.g., "MPP" originally was applied to fine-grain SIMD computers but now is used to describe large MIMD computers).

[FIGURE 1  Taxonomy of cluster computing. Parallel computing divides into vector, systolic, SIMD, MPP, cluster, and distributed classes; the cluster class subdivides into commodity clusters (including Beowulf and NOW systems), proprietary clusters, workstation farms, super clusters, and constellations.]
1. Vector processing. The basis of the classical supercomputer (e.g., the Cray 1), this fine-grain architecture pipelines memory accesses and numeric operations through one or more multistage arithmetic units supervised by a single controller.

2. Systolic. Usually employed for special-purpose computing (e.g., digital signal and image processing), systolic systems employ a structure of logic units and physical communication channels that reflects the computational organization of the application algorithm's control and data flow paths.

3. SIMD. This single instruction stream, multiple data stream (SIMD) family employs many fine- to medium-grain arithmetic/logic units (potentially tens of thousands), each associated with a given memory block (e.g., MasPar-2, TMC CM-2). Under the management of a single system-wide controller, all units perform the same operation on their independent data each cycle.

4. MPP. This multiple instruction stream, multiple data stream (MIMD) class of parallel computer integrates many (from a few to several thousand) CPUs (central processing units) with independent instruction streams and flow control, coordinating through a high-bandwidth, low-latency internal communication network. The memory blocks associated with each CPU may be independent of the others (e.g., Intel Paragon, TMC CM-5), shared among all CPUs without cache coherency (e.g., CRI T3E), shared in SMPs (symmetric multiprocessors) with uniform access times and cache coherence (e.g., SGI Oracle), or shared in DSMs (distributed shared memory) with nonuniform memory access times (e.g., HP Exemplar, SGI Origin).

5. Cluster computing. Integrates stand-alone computers devised for mainstream processing tasks through local-area (LAN) or system-area (SAN) interconnection networks and employed as a singly administered computing resource (e.g., Beowulf, NOW, Compaq SC, IBM SP-2).

6. Distributed Internet computing. Employs wide-area networks (WANs), including the Internet, to coordinate multiple separate computing systems (possibly thousands of kilometers apart) under independent administrative control in the execution of a single parallel task or workload. Previously known as metacomputing and including the family of grid management methods, this emergent strategy harnesses existing installed computing resources to achieve very high performance and, when exploiting otherwise unused cycles, superior price/performance.
Cluster computing may be divided into a number of subclasses that are differentiated in terms of the source of their computing nodes, interconnection net-
works, and dominant level of parallelism. A partial clas-
sification of the domain of cluster computing includes
commodity clusters (including Beowulf-class systems),
proprietary clusters, open clusters or workstation farms,
super clusters, and constellations. This terminology is
emergent, subjective, open to debate, and in rapid tran-
sition. Nonetheless, it is representative of current usage
and practice in the cluster community.
A definition of commodity clusters, developed by consensus, is borrowed from the recent literature and reflects their most important attribute: that they comprise components that are entirely off-the-shelf, i.e., already developed and available for mainstream computing:

   A commodity cluster is a local computing system comprising a set of independent computers and a network interconnecting them. A cluster is local in that all of its component subsystems are supervised within a single administrative domain, usually residing in a single room and managed as a single computer system. The constituent computer nodes are commercial off-the-shelf, are capable of full independent operation as is, and are of a type ordinarily employed individually for stand-alone mainstream workloads and applications. The nodes may incorporate a single microprocessor or multiple microprocessors in a symmetric multiprocessor (SMP) configuration. The interconnection network employs COTS LAN or SAN technology that may be a hierarchy of, or multiple separate, network structures. A cluster network is dedicated to the integration of the cluster compute nodes and is separate from the cluster's external (worldly) environment. A cluster may be employed in many modes, including but not limited to high capability or sustained performance on a single problem, high capacity or throughput on a job or process workload, high availability through redundancy of nodes, or high bandwidth through multiplicity of disks and disk access or I/O channels.

Of these subclasses, commodity clusters have emerged as the most prevalent and rapidly growing segment of cluster computing systems and are the primary focus of this article.

Beowulf-class systems are commodity clusters employing personal computers (PCs) or small SMPs of PCs as their nodes and using COTS LANs or SANs to provide node interconnection. A Beowulf-class cluster is hosted by an open-source Unix-like operating system such as Linux. A Windows-Beowulf system runs the mass-market, widely distributed Microsoft Windows operating system instead of Unix.

Proprietary clusters incorporate one or more components that are custom designed to give superior system characteristics for product differentiation while employing COTS components for the rest of the cluster system. Most frequently, proprietary clusters have incorporated custom-designed networks for tighter system coupling (e.g., the IBM SP-2). These networks may not be procured separately (unbundled) by customers or by OEMs for inclusion in clusters comprising other than the specific manufacturer's products.

Workstation farms or open clusters are collections of previously installed personal computing stations and group shared servers, loosely coupled by means of one or more LANs for access to common resources, that, although primarily employed for separate and independent operation, are occasionally used in concert to process single coordinated distributed tasks. Workstation farms provide superior performance/price over even other cluster types in that they exploit previously paid-for but otherwise unused computing cycles. Because their interconnection network is shared for other purposes and not optimized for parallel computation, these open clusters are best employed for weakly interacting distributed workloads. Software tools such as Condor facilitate their use while incurring minimum intrusion on normal service.

Super clusters are clusters of clusters. Principally found within academic, laboratory, or industrial organizations that employ multiple clusters for different departments or groups, super clusters are established by means of WANs integrating the disparate clusters into a single, more loosely coupled computing confederation.

Constellations reflect a different balance of parallelism than conventional commodity clusters. Instead of the primary source of parallelism being derived from the number of nodes in the cluster, it is a product of the number of processors in each SMP node. To be precise, a constellation is a cluster in which there are more processors per SMP node than there are nodes in the cluster. While the nodes of a constellation must be COTS, its global interconnection network can be of a custom design.
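The constellation definition above reduces to a simple numeric test on a system's shape. The following sketch is purely illustrative (the node and processor counts are hypothetical, not taken from the article) and applies that test:

    /* Illustrative sketch: classify a system as a constellation or a
       conventional cluster using the definition in the text -- a
       constellation has more processors per SMP node than nodes. */
    #include <stdio.h>

    const char *classify(int nodes, int procs_per_node)
    {
        return (procs_per_node > nodes) ? "constellation" : "cluster";
    }

    int main(void)
    {
        /* Hypothetical example systems. */
        printf("128 nodes x 2 processors: %s\n", classify(128, 2));
        printf("8 nodes x 32 processors:  %s\n", classify(8, 32));
        return 0;
    }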
III. A BRIEF HISTORY OF
CLUSTER COMPUTING
Cluster computing originated within a few years of the in-
auguration of the modern electronic stored-program digi-
tal computer. SAGE was a cluster system built for NORAD
under an Air Force contract by IBM in the 1950s based on
the MIT Whirlwind computer architecture. Using vacuum
tube and core memory technologies, SAGE consisted of
a number of separate stand-alone systems cooperating to
manage early warning detection of hostile airborne intru-
sion of the North American continent. Early commercial
applications of clusters employed paired loosely coupled
computers with one performing user jobs while the other
managed various input/output devices.
Breakthroughs in enabling technologies occurred in the
late 1970s, both in hardware and software, that were to
have a significant long-term effect on future cluster com-
puting. The first generations of microprocessors were de-
signed with the initial development of VLSI technology
and by the end of the decade the first workstations and
personal computers were being marketed. The advent of
Ethernet provided the first widely used LAN technology,
creating an industry standard for a modest cost multidrop
interconnection medium and data transport layer. Also at
this time, the multitasking Unix operating system was cre-
ated at AT&T Bell Labs and extended with virtual memory
and network interfaces at UC Berkeley. Unix was adopted
in its various commercial and public domain forms by the
scientific and technical computing community as the prin-
cipal environment for a wide range of computing system
classes from scientific workstations to supercomputers.
During the decade of the 1980s, increased interest in
the potential of cluster computing was marked by impor-
tant experiments in research and industry. A collection of
160 interconnected Apollo workstations was employed as
a cluster to perform certain computational tasks by the
NSA. Digital Equipment Corporation developed a system
comprising interconnected VAX 11/750s, coining the term
cluster in the process. In the area of software, task man-
agement tools for employing workstation farms were de-
veloped, most notably the Condor software package from
the University of Wisconsin. The computer science re-
search community explored different strategies for paral-
lel processing during this period. From this early work
came the communicating sequential processes model
more commonly referred to as the message-passing model ,
which has come to dominate much of cluster computing
today.
An important milestone in the practical application of
the message passing model was the development of PVM
(parallel virtual machine), a library of linkable functions
that could allow routines running on separate but net-
worked computers to exchange data and coordinate their
operation. PVM, developed by Oak Ridge National Lab-
oratory, Emory University, and University of Tennessee,
was the first major open distributed software system to
be employed across different platforms. By the beginning
of the 1990s, a number of sites were experimenting with
clusters of workstations. At the NASA Lewis Research
Center, a small cluster of IBM workstations was used to
simulate the steady-state behavior of jet aircraft engines
in 1992. The NOW (Network of Workstations) project at
UC Berkeley began operation of the first of several clusters
there in 1993 that led to the first cluster to be entered on the
Top 500 list of the world’s most powerful computers. Also
in 1993, one of the first commercial SANs, Myrinet, was
introduced for commodity clusters, delivering improve-
ments in bandwidth and latency an order of magnitude
better than the Fast Ethernet LAN most widely used for
the purpose at that time.
The first Beowulf-class PC cluster was developed at
NASA’s Goddard Space Flight Center in 1994 using early
releases of the Linux operating system and PVM running
on 16 Intel 100-MHz 80486-based PCs connected by dual
10-Mbps Ethernet LANs. The Beowulf project developed
the necessary Ethernet driver software for Linux and ad-
ditional low-level cluster management tools and demon-
strated the performance and cost effectiveness of Beowulf
systems for real-world scientific applications. That year,
based on experience with many other message-passing
software systems, the parallel computing community set
out to provide a uniform set of message-passing semantics
and syntax and adopted the first MPI standard. MPI has
become the dominant parallel computing programming
standard and is supported by virtually all MPP and clus-
ter system vendors. Workstation clusters running the Sun Microsystems Solaris operating system and NCSA's PC cluster running the Microsoft Windows NT operating system were being used for real-world applications.
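The article gives no MPI code, but the uniform message-passing semantics and syntax that the standard established can be suggested with a minimal, purely illustrative two-process exchange using the standard point-to-point calls:

    /* Minimal illustrative MPI program: process 0 sends an integer to
       process 1, which receives and prints it.  Run with at least two
       processes; the same source compiles and runs unchanged on any
       MPI-supporting cluster or MPP. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;                       /* payload to transmit */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d from rank 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }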
In 1996, the Los Alamos National Laboratory and the California Institute of Technology (with the NASA Jet Propulsion Laboratory) independently demonstrated sustained performance of more than 1 Gflops for Beowulf systems costing under $50,000 and were awarded the Gordon Bell Prize for price/performance for this accomplishment. By 1997, Beowulf-class systems of more than
100 nodes had demonstrated sustained performance of
greater than 10 Gflops with a Los Alamos system making
the Top 500 list. By the end of the decade, 28 clusters
were on the Top 500 list with a best performance of more
than 500 Gflops. In 2000, both DOE and NSF announced awards to Compaq to implement their largest computing facilities, both of them clusters, of 30 and 6 Tflops, respectively.
IV. CLUSTER HARDWARE COMPONENTS
Cluster computing in general and commodity clusters in
particular are made possible by the existence of cost-
effective hardware components developed for mainstream
computing markets. The capability of a cluster is de-
termined to first order by the performance and stor-
age capacity of its processing nodes and the bandwidth
and latency of its interconnection network. Both cluster
node and cluster network technologies evolved during the
1990s and now exhibit gains of more than two orders of
magnitude in performance, memory capacity, disk stor-
age, and network bandwidth and a reduction of better
than a factor of 10 in network latency. During the same period, the performance-to-cost ratio of node technology has improved by approximately a factor of 1000. In this section, the
basic elements of the cluster node hardware and the alter-
natives available for interconnection networks are briefly
described.
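As a rough illustration of how network latency and bandwidth jointly determine communication cost (a simple first-order model, not a formula from the article), the time to deliver a message can be estimated as the fixed per-message latency plus the message size divided by the bandwidth:

    /* Illustrative first-order model of message transfer time on a
       cluster network: t = latency + bytes / bandwidth.  The figures
       below are hypothetical examples, not measurements from the text. */
    #include <stdio.h>

    double transfer_time(double latency_s, double bandwidth_Bps, double bytes)
    {
        return latency_s + bytes / bandwidth_Bps;
    }

    int main(void)
    {
        /* Hypothetical network: 20 microseconds latency, 100 MB/s bandwidth. */
        double lat = 20e-6, bw = 100e6;

        printf("64 B message: %.2e s\n", transfer_time(lat, bw, 64));
        printf("1 MB message: %.2e s\n", transfer_time(lat, bw, 1e6));
        return 0;
    }

Under such a model, short messages are dominated by latency and long messages by bandwidth, which is why both figures matter when evaluating a cluster network.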
A. Cluster Node Hardware
The processing node of a cluster incorporates all of
the facilities and functionality necessary to perform a
complete computation. Nodes are most often structured
either as uniprocessor systems or as SMPs although
some clusters, especially constellations, have incorpo-
rated nodes that were distributed shared memory (DSM)
systems. Nodes are distinguished by the architecture of