Jane C. Blake,
Managing Editor
Scientists have long motivated the development of powerful
computing environments. Two sections in this issue of the Journal address
the requirements of scientific and technical computing. The first, from
Digital's High Performance Technical Computing Group, looks at compiler and
development tools that accelerate performance in parallel environments. The
second section looks to the future of computing; University of California
and Digital researchers present their work on a large, distributed
computing environment suited to the needs of earth scientists studying
global phenomena such as ocean dynamics, global warming, and ozone depletion.
Digital was an early industry sponsor and participant in this joint
research project, called Sequoia 2000.
To support the writing of parallel programs for computationally intensive
environments, Digital has extended DEC Fortran 90 by implementing most of
High Performance Fortran (HPF) version 1.1. After reviewing the syntactic
features of Fortran 90 and HPF, Jonathan Harris et al. focus on the HPF
compiler design and explain the optimizations it performs to improve
interprocessor communication in a distributed-memory environment,
specifically, in workstation clusters (farms) based on Digital's 64-bit
Alpha microprocessors.
The run-time support for this distributed environment is the Parallel
Software Environment (PSE). Ed Benson, David LaFrance-Linden, Rich Warren,
and Santa Wiryaman describe the PSE product, which is layered on the UNIX
operating system and includes tools for developing parallel applications on
clusters of up to 256 machines. They also examine design decisions
concerning message-passing support in distributed-memory and
shared-memory systems; PSE supports both network message passing, using
the TCP/IP or UDP/IP protocols, and shared memory.
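To make the message-passing model concrete, the following is a minimal
sketch, in C, of length-prefixed message transmission over a TCP socket.
It is illustrative only: this overview does not describe PSE's actual
programming interface, and write_all() and send_msg() are hypothetical
helpers.

    /* Send one length-prefixed message over a connected TCP socket.
       A hypothetical sketch, not the PSE interface. */
    #include <stdint.h>
    #include <unistd.h>
    #include <arpa/inet.h>

    /* Write exactly len bytes, retrying on short writes. */
    static int write_all(int fd, const void *buf, size_t len)
    {
        const char *p = buf;
        while (len > 0) {
            ssize_t n = write(fd, p, len);
            if (n <= 0)
                return -1;
            p += n;
            len -= (size_t)n;
        }
        return 0;
    }

    /* Send a 4-byte network-order length header, then the payload. */
    int send_msg(int sock, const void *payload, uint32_t len)
    {
        uint32_t hdr = htonl(len);
        if (write_all(sock, &hdr, sizeof hdr) < 0)
            return -1;
        return write_all(sock, payload, len);
    }

The receiver reads the 4-byte header first, so message boundaries survive
TCP's byte-stream semantics; a shared-memory transport can skip this
framing and copying entirely, which is one of the trade-offs the authors
weigh.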
Michael Stonebraker's paper opens the section featuring Sequoia 2000
research and is an overview of the project's objectives and status. The
objectives encompassed high-performance I/O on terabyte data sets, the
placement of all data in a DBMS, and the provision of new visualization
tools and high-speed networking. After a discussion of the architectural
layers, he reviews lessons learned by participants, chief among them the
need to view the system as an end-to-end solution, and concludes with a
look at future work.
An efficient means for locating and retrieving data from the vast stores in
the Sequoia DBMS was the task addressed by the Sequoia 2000 Electronic
Repository project team. Ray Larson, Chris Plaunt, Allison Woodruff, and
Marti Hearst describe the Lassen text indexing and retrieval methods
developed for the POSTGRES database system; the GIPSY system, which
automatically indexes texts by the geographic coordinates of the places
they discuss; and
the TextTiling method for automatic partitioning of text documents to
enhance retrieval.
The need for tools to browse through and to visualize Sequoia 2000 data was
the impetus behind Tecate, a software platform on which browsing and
visualization applications can be built. Peter Kochevar and Len Wanger
present the features and functions of this research prototype, and offer
details of the object model and the role of the interpreted Abstract
Visualization Language (AVL) for programming. They conclude with example
applications that browse data spaces.
The challenge of high-speed networking for Sequoia 2000 is the subject of
the paper by Joseph Pasquale, Eric Anderson, Kevin Fall, and Jon Kay. In
designing a distributed system that efficiently retrieves, stores, and
transfers very large objects (tens to hundreds of megabytes and larger),
they focused on operating system I/O and network software. They describe
two I/O system software solutions--container shipping and peer-to-peer
I/O--that avoid data copying. Their TCP/IP network software solutions
center on avoiding or reducing checksum computation.
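For context, the following is a sketch of the standard one's-complement
Internet checksum (RFC 1071) in C. Computing it touches every byte of a
message, which is why avoiding or reducing this work matters for very
large transfers; the function name here is mine, not the authors'.

    /* One's-complement Internet checksum (RFC 1071): the per-byte
       computation that the authors' TCP/IP solutions seek to avoid
       or reduce. */
    #include <stddef.h>
    #include <stdint.h>

    uint16_t internet_checksum(const void *data, size_t len)
    {
        const uint16_t *p = data;
        uint32_t sum = 0;

        while (len > 1) {               /* sum 16-bit words */
            sum += *p++;
            len -= 2;
        }
        if (len == 1)                   /* pad a trailing odd byte */
            sum += *(const uint8_t *)p;

        while (sum >> 16)               /* fold carries back in */
            sum = (sum & 0xffff) + (sum >> 16);

        return (uint16_t)~sum;
    }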
The editors thank Jean Bonney, Digital's Director of External Research, for
her help in obtaining the papers on Sequoia 2000 research and for writing
the Foreword to this issue.
Our next issue will feature papers on multimedia and UNIX clusters.