Mahendra R. Patel,
Corporate Consulting Engineer,
Vice President, Systems Engineering
Systems engineering is the engineering of complete systems as
opposed to parts of systems. Exactly what this means depends on
one's point of view. One person's system is another person's
component. From chips to boards to boxes to clusters to networks,
subsystems are combined into ever larger and more complex
aggregates. At Digital, systems engineering means the engineering
of systems at a level of aggregation above individual hardware or
software products. Individual processors, storage subsystems,
network hubs, operating systems, database systems, and
applications are viewed as components of the system. For example,
a nationwide network for interactive securities trading, built
from hundreds of nodes at dozens of sites, is one system.
A number of trends in the computer industry make it more
challenging for a computer company to practice systems
engineering:
- Commoditization: Component products, from microprocessors
to applications, are increasingly becoming low-cost,
high-volume commodities. Ironically, as the cost of the
components drops, the cost of integrating them into
complete systems becomes a larger fraction of total
system cost.
- Distributed systems: While they provide new opportunities
for better performance, scaling, and fault-tolerance,
distributed systems also present new engineering
challenges for ensuring these same attributes.
- Heterogeneous systems: Increasingly, computers from a
variety of vendors, running a variety of operating
systems, are being connected together and are expected
to work together correctly.
- Complexity: Distributed systems are becoming more complex
for a number of reasons. The number of components is
growing. The number of types of components that must work
together is growing. And the variety of unique
configurations is growing.
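The commoditization point in the list above is simple arithmetic, and a small sketch makes it concrete. The dollar figures and the function name integration_share are hypothetical, chosen only for illustration: integration cost is held constant while component cost falls.

```python
def integration_share(component_cost: float, integration_cost: float) -> float:
    """Fraction of total system cost that is spent on integration."""
    return integration_cost / (component_cost + integration_cost)

# Hypothetical figures: integration effort stays the same,
# while the components become cheaper commodities.
before = integration_share(component_cost=800_000, integration_cost=200_000)
after = integration_share(component_cost=200_000, integration_cost=200_000)

print(f"integration share before: {before:.0%}")
print(f"integration share after:  {after:.0%}")
```

With these assumed numbers the integration share rises from one fifth to one half of total cost, even though no more is spent on integration than before.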
Over the last decade, the computer industry has changed from one
that offered vertically integrated systems built from proprietary
CPUs, disks, networks, operating systems, and layered products to
one that produces commodity products conforming to de jure or de
facto standards. Unlike the manufacture of automobiles or
aircraft, a single computer manufacturer seldom produces all the
components of a complete working system. The hardware, system
software, and applications often come from three different
vendors. Systems engineering, as now practiced in the computer
industry, places less emphasis on top-down design of hardware and
software components and their interfaces to meet system-level
goals. Rather, it is based on anticipating a broad spectrum of
system designs.
From the point of view of a computer company, systems engineering
must now be concerned with assemblies of commodity hardware and
software products. Thus, four areas are of special interest to
systems engineering in the computer industry: interoperability,
performance, scalability, and availability.
Interoperability of components, including components from
different vendors, is difficult to verify because the number of
possible combinations of components is too large to test exhaustively.
For example, the introduction of a new component often can expose
bugs in system components previously thought to be working.
Systems engineering work in this area includes the development of
tools for effective testing and the development of industry
standards for interoperability.
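To give a feel for how quickly the combinations grow, the following sketch counts configurations for a small catalog. The component types are the ones named earlier in this foreword; the product counts and function names are assumptions made up for the example.

```python
from math import prod

# Hypothetical catalog: number of interchangeable products per component type.
choices = {
    "processor": 6,
    "storage subsystem": 8,
    "network hub": 5,
    "operating system": 4,
    "database system": 5,
    "application": 20,
}

def configurations(options: dict[str, int]) -> int:
    """Distinct systems possible if one product is picked per component type."""
    return prod(options.values())

def new_configurations(options: dict[str, int], component_type: str) -> int:
    """Configurations that would include one newly added product of the given type."""
    return prod(n for kind, n in options.items() if kind != component_type)

total = configurations(choices)
added = new_configurations(choices, "operating system")

print(total)  # configurations in the existing catalog
print(added)  # extra configurations created by one new operating system
```

Even this modest catalog yields tens of thousands of configurations, and a single new product adds thousands more, each of which may exercise existing components in a way they were never tested.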
The performance of a system can depend in a complicated way on
the performance of its components. Sophisticated tools are needed
to predict the performance of a complex system from the
performance of its parts, or to diagnose subtle interactions
between components that affect performance. Today, performance
tools for distributed systems are not as sophisticated as those
for individual computers.
Scalability refers to the ability of a system to start small and
grow big. Size may be measured in terms of numbers of users,
computers, disks, applications, or a combination of parameters.
The ability to scale up distributed systems over two orders of
magnitude by adding components is one of their most attractive
attributes. However, scaling effectively requires careful
analysis and design of the system. For example, a system design
based on cost-effective packaging of functionality at a small
scale can exhibit bottlenecks as computers are added to the
system to handle increased workloads.
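A minimal sketch of such a bottleneck, assuming a deliberately simple model in which every request also passes through one shared resource (for instance, a single database server). The rates and the function name system_throughput are hypothetical.

```python
def system_throughput(nodes: int, per_node: float, shared_capacity: float) -> float:
    """Requests per second completed when every request also uses one shared resource."""
    return min(nodes * per_node, shared_capacity)

# Hypothetical rates: each added computer handles 50 requests/s,
# and the shared resource handles at most 400 requests/s.
PER_NODE = 50.0
SHARED = 400.0

for nodes in (1, 2, 4, 8, 16, 100):
    print(nodes, system_throughput(nodes, PER_NODE, SHARED))
```

Under these assumptions throughput grows linearly up to eight computers and then stays flat; adding further computers buys nothing until the shared resource itself is replicated or partitioned.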
A distributed system is inherently less reliable than a single
computer unless care is taken to improve availability by adding
redundant components. Simply partitioning functionality between a
client computer and a server computer requires that both the
client and the server be working for the functionality to be available. Given
technology with the same failure and repair characteristics,
distributing functionality between two computers results in a
system that is less available than one with the complete
functionality on one computer. Often this is an academic point in
simple systems, given the levels of component reliability.
However, distributed systems with critical availability
requirements (e.g., a nationwide network for interactive
securities trading) demand careful analysis and design to add
appropriate redundancy.
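The comparison can be worked through numerically. This sketch assumes independent failures and uses the standard steady-state availability formula; the failure and repair times, and the function names, are hypothetical values chosen for illustration.

```python
from math import prod

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of one component: uptime / (uptime + downtime)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def series(*parts: float) -> float:
    """Availability when all parts must be working (e.g. client AND server)."""
    return prod(parts)

def redundant(part: float, copies: int) -> float:
    """Availability when at least one of several independent, identical parts must work."""
    return 1.0 - (1.0 - part) ** copies

# Hypothetical figures: each computer fails on average every 2,000 hours
# and takes 4 hours to repair.
node = availability(mtbf_hours=2000.0, mttr_hours=4.0)

single_computer = node
client_server = series(node, node)
client_with_two_servers = series(node, redundant(node, 2))

print(f"single computer:         {single_computer:.6f}")
print(f"client + server:         {client_server:.6f}")
print(f"client + 2 servers:      {client_with_two_servers:.6f}")
```

With identical components, the client-server split is less available than a single computer, as stated above. Adding a redundant server recovers most of the loss, and what remains is limited by the client, which is still a single point of failure.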
Systems engineering is important to Digital because even the best
component products are of no value to customers until they are
integrated into complete working systems that meet business
needs. Ideally, one would like to be able to build large, complex
systems by simply snapping together small, simple components, as
if they were Lego blocks. It is tempting to assume that this
should be easy because many of the components are available as
inexpensive, mass-produced, reliable commodities. However,
building complex systems from simple parts is still difficult and
requires engineering work, especially when the overall system
stretches the limits of the technology.
Systems engineers play a vital role in major systems integration
projects that push the edge of the technology envelope in some
way. The system may combine components that have never before
been used together. The trend toward more heterogeneous systems
makes this more likely. The system may stretch scaling limits by
having more nodes or network connections or users or data than
ever before. The trend toward large distributed systems makes
this scaling possible. The system may need to meet very demanding
requirements for overall system performance or dependability.
Increasingly, heterogeneous, distributed systems are being used
for mission-critical business applications.
Engineering analysis and design are needed at all phases of a
complex integration project, from the definition of the technical
requirements to the design of the system to final testing and
verification. Custom software or hardware may need to be
developed, either to glue together components that were not built
to work together or to substitute for standard components in
order to meet demanding requirements for performance or scaling.
Systems engineers also develop tools and methods to simplify the task of
integrating complete systems. Digital's systems engineers are active in the
development of industry standards for ensuring the interoperability of
components from different vendors. In this issue of the Journal, Eric
Newcomer's paper describes the development of standards for use in the
telecommunications industry. Because a system often includes legacy
components, Digital's systems engineers are also active in the development of
frameworks that apply object-oriented programming technologies to
encapsulate legacy applications and data, simplifying the incorporation of
legacy components into new systems. A framework for the integration of
manufacturing applications is described in the paper by James Kirkley and
William Nichols. The Systems Engineering group has developed a variety of
test tools and methods, and operates an extensive laboratory for testing,
verification, and performance characterization of combinations of products
from Digital and other vendors. Technical data from testing and
characterization is the basis for configuration guidelines for systems
intended to run a number of popular commercial applications.
Computers, disks, network switches, database systems, desktop
applications, and many other components are now available as
inexpensive, reliable commodities. Hardware and software
components from various manufacturers can be put together to
build a wide variety of systems, from one as simple as a PC to
one as complex as a worldwide distributed system.
While the cost of the components has dropped dramatically in
recent years, the cost of integrating these simple components
into complex distributed systems remains high and therefore
represents a larger fraction of the total cost of the system.
Today, Digital's ability to successfully build complex
distributed systems provides great value for our customers, often
greater than the value of the commodity components from which the
systems are built. For the future, improvements in tools and
methods for building complex systems will lower the cost of these
systems significantly, making new types of applications feasible
and affordable.
Lego is a registered trademark of Interlego AG.