Welcome to the home page of the System Architecture
group.
We are a small team of researchers with expertise in
several areas of computer system software and hardware
including multiprocessor architecture, interconnect
architecture, I/O system
architecture and operating systems. To explore our research
and openings, send email to G. John Janakiraman (john.janakiraman "at" hp.com).
Objective
Internet services are typically served from data
centers, which are large clusters of interconnected compute and
storage nodes. Our research addresses challenges in
designing a data center infrastructure that can support
a variety of services and cope with wide fluctuations
in demand for services as well as with intermittent
congestion and outages of resources.
Description
Internet services must be hosted on a system infrastructure
with performance and availability capabilities sufficient
to sustain concurrent demand from a large client population
and deliver service continuously around the clock. Large
clusters of relatively small servers, with the necessary
network and storage devices, are the infrastructure architecture
of choice for hosting these services since they offer
a straightforward approach to achieving the performance
and availability objectives. The size of the cluster
can be scaled to increase parallelism and improve performance;
the small server granularity isolates failures and the
component redundancy allows the aggregate system to
tolerate failures and degrade gracefully. The overall
performance and availability of this collective server
farm depend fundamentally on the architecture
of the server farm, which includes its networking and
storage subsystems and their mechanisms and protocols,
as well as on its "operating system" functions,
which include distributed resource management mechanisms
and policies for purposes such as load balancing and
failure recovery.
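As a rough illustration of how component redundancy lets the aggregate system tolerate failures, the sketch below computes the probability that at least k of n nodes are up. It assumes that node failures are independent and that every node has the same availability; neither assumption comes from our work, and the function name and the numbers used are purely hypothetical.

```python
from math import comb


def cluster_availability(n: int, k: int, node_availability: float) -> float:
    """Probability that at least k of n nodes are up.

    Assumes independent failures and identical per-node availability
    (illustrative assumptions only).
    """
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be in [0, 1]")
    p = node_availability
    # Sum the binomial probabilities of exactly i nodes being up, for i = k..n.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Hypothetical service that needs 4 working nodes, each 99% available.
    for n in (4, 5, 6):
        print(n, cluster_availability(n, 4, 0.99))
```

Under these assumptions, adding spare nodes beyond the k required raises the computed availability, which is the effect redundancy is meant to provide; correlated failures in a real data center would reduce the benefit.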
Our research addresses three challenges in designing
the architecture and "operating system" of
this data center infrastructure:
- How can the infrastructure support a diverse
range of services (e.g., financial services, e-commerce
portals, media streaming, search engines)? These
services will differ in many ways: their components,
their communication, consistency and availability
models and their performance and availability requirements.
While this diversity demands a flexible infrastructure,
the design should not compromise cost-effectiveness.
- What are the right building blocks and how should
they be composed? The infrastructure can be
designed using hardware and software building blocks
that vary widely in their properties and capabilities
(e.g., thin nodes vs. heavier nodes that support
virtual partitioning). The manner in which hardware
and software building blocks are composed (e.g.,
the partitioning of the infrastructure control state
and function, the interconnect topology) is another
critical design element. These design decisions
will determine whether the infrastructure is scalable,
modular and offers the right measure of control.
- How can performance and availability requirements
of services be met in the face of changes in demand
and resource availability? The set of services
deployed in a data center infrastructure as well
as the demand for individual services will change
over time. While it is not cost-effective to provision
enough resources to meet peak demand all the time,
inadequate resource allocation can also seriously
impact quality of service. Resource congestion and
resource outages in systems of the size necessary
to host these services are also inevitable. Provisioning
adequate redundancy and appropriate recovery mechanisms
is essential to meet service requirements.
We are currently examining the software and hardware
architecture of servers, particularly their I/O subsystems,
to improve their performance, flexibility, and packaging
density in data center environments.
We are also examining the design and management of
the data center infrastructure for availability.
Papers and Reports
- G. (John) Janakiraman, Jose Renato Santos, Yoshio
Turner, "Automated Multi-Tier System Design for Service
Availability," Tech. Rep. HPL-2003-109, HP Laboratories,
May 2003.
- G. (John) Janakiraman, Jose Renato Santos, Yoshio
Turner, "Automated Multi-Tier System Design for Service
Availability," Workshop on the Design of Self-Managing Systems
held with the 2003 International Conference on Dependable
Systems and Networks, June 2003, San Francisco, CA.
Presentations
Related Investigations
If you are inside HP, you can get more information from our
internal page.