Monitoring of Distributed and Component-Based Systems
Large, complex, and distributed applications, from
enterprise service-centric applications to embedded
device applications, are increasingly built with
middleware component technologies such as CORBA, RMI/J2EE,
and COM/.NET. Each middleware framework offers a
distributed, multithreaded computing platform and
controls component interactions through component
interfaces. These interfaces encapsulate the details of a
component's implementation, including its programming
language (e.g., C/C++, Java) and underlying
operating system (e.g., Unix, Windows, VxWorks). An IDL
compiler is often used to generate code automatically from
the component interfaces, simplifying application
development.
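
To make the code-generation step concrete, here is a minimal sketch in
C++ of the kind of client-side stub an IDL compiler might emit for a
small interface. The interface Camera, its capture method, and the
send_request transport call are all hypothetical stand-ins for the
marshaling code a real IDL compiler would generate.

    #include <cstdio>
    #include <string>

    // Hypothetical IDL the compiler would consume:
    //   interface Camera { long capture(in long exposureMs); };

    // Abstract interface shared by client stubs and server skeletons.
    struct Camera {
        virtual ~Camera() = default;
        virtual long capture(long exposure_ms) = 0;
    };

    // Stand-in for the middleware transport layer (assumption).
    static long send_request(const std::string& op, long arg) {
        std::printf("marshal %s(%ld) -> remote skeleton\n", op.c_str(), arg);
        return 0;  // a real ORB would unmarshal the remote reply here
    }

    // Generated client-side stub: makes a remote call look local.
    struct CameraStub : Camera {
        long capture(long exposure_ms) override {
            return send_request("Camera::capture", exposure_ms);
        }
    };

    int main() {
        CameraStub camera;
        return static_cast<int>(camera.capture(40));
    }

Because every cross-component call funnels through such generated stubs
and skeletons, they are a natural place to attach monitoring probes,
which is the approach coSpy takes below.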
During the development and evolution of such a
large-scale, component-based distributed application,
common questions are:
- What caused the system to crash?
- Where was time spent?
- How should different components be packaged onto
processors? Does the second processor really provide a
significant performance improvement?
- Have all the specified component interactions
and paths been exercised in integration testing?
- If a component is changed, which units and
subsystems require regression testing?
Our research effort at the Imaging Systems Lab
is to develop new techniques and tools that help answer
these questions, focusing on applications that run on
embedded devices and have complex interactions with
enterprise services.
We have developed a monitoring toolset called coSpy
for distributed and component-based systems. So far, the
toolset has been applied to applications built on
ORBlite (a CORBA-based runtime developed by HPL) or a
particular COM-like infrastructure (developed by HP).
The uniqueness of our approach is that it achieves
system-wide causality propagation in the monitored
application with no modification to application
components. One of the most important pieces of causality
information is the caller/callee relationship induced by
component-level method invocations. These coarse-grained
invocations are the ones with user-defined IDL interface
definitions, and they may cross threads, processes, and
even processors. The coSpy toolset consists of three parts:
- an augmented IDL compiler that automatically deploys
instrumentation probes into the stubs and skeletons of
application components;
- a monitoring library that collects per-thread runtime
execution information across the entire distributed
application;
- a postmortem analysis framework that analyzes the
collected monitoring data, determines inter-component
interactions, and annotates each component-level method
invocation with its timing latency and CPU consumption.
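
The sketch below illustrates, under our own simplifying assumptions,
how a probe woven into a generated stub or skeleton could record
caller/callee causality and latency without touching application code:
each thread remembers the identifier of the invocation it is currently
executing, and every probe logs its own fresh identifier together with
that parent. The names (InvocationProbe, g_current_invocation, the
printf-style event log) are illustrative, not coSpy's actual API.

    #include <atomic>
    #include <chrono>
    #include <cstdint>
    #include <cstdio>

    // Per-thread id of the invocation currently executing (the caller).
    thread_local std::uint64_t g_current_invocation = 0;
    std::atomic<std::uint64_t> g_next_id{1};

    // RAII probe that an augmented IDL compiler could weave into each
    // generated stub/skeleton method (names are illustrative).
    struct InvocationProbe {
        std::uint64_t id, parent;
        std::chrono::steady_clock::time_point start;

        explicit InvocationProbe(const char* op)
            : id(g_next_id++), parent(g_current_invocation),
              start(std::chrono::steady_clock::now()) {
            g_current_invocation = id;  // nested calls will see us as caller
            std::printf("enter id=%llu parent=%llu op=%s\n",
                        (unsigned long long)id, (unsigned long long)parent, op);
        }
        ~InvocationProbe() {
            auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                          std::chrono::steady_clock::now() - start).count();
            std::printf("exit  id=%llu latency_us=%lld\n",
                        (unsigned long long)id, (long long)us);
            g_current_invocation = parent;  // restore the caller's context
        }
    };

    // Example instrumented skeleton method.
    long capture(long exposure_ms) {
        InvocationProbe probe("Camera::capture");
        return exposure_ms * 2;  // stand-in for the real component body
    }

    int main() { capture(40); }

For invocations that cross process or processor boundaries, the parent
identifier would additionally have to ride inside the request message
itself (for example, in a CORBA service context), which is exactly the
kind of propagation the middleware layer can supply without any change
to the application components.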
In particular, we are able to construct a system-wide
dynamic call graph that organizes all component-level
method invocations that occurred in a monitored
application run according to their caller/callee
relationships. An example of such a dynamic call graph,
taken from a large-scale application configured with 4
processes and 32 threads, is shown below. This particular
graph contains over 100,000 component-level method
invocations, each represented by a graph node. The
hyperbolic tree SDK from Inxight is used to visualize
the call graph.
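
As a rough sketch of the postmortem analysis step, the snippet below
rebuilds a dynamic call graph from logged (id, parent) pairs and prints
it as an indented tree; the Event record layout and the sample trace
are invented for illustration and do not reflect coSpy's actual data
format.

    #include <cstdint>
    #include <cstdio>
    #include <map>
    #include <string>
    #include <vector>

    // One record per component-level method invocation, as the
    // monitoring library might log it (fields are illustrative).
    struct Event {
        std::uint64_t id, parent;
        std::string op;
        long latency_us;
    };

    // Print the call tree rooted at the given parent id, indenting
    // each invocation under its caller.
    void print_tree(const std::map<std::uint64_t, Event>& events,
                    const std::multimap<std::uint64_t, std::uint64_t>& children,
                    std::uint64_t parent, int depth) {
        auto range = children.equal_range(parent);
        for (auto it = range.first; it != range.second; ++it) {
            const Event& e = events.at(it->second);
            std::printf("%*s%s (%ld us)\n", depth * 2, "", e.op.c_str(),
                        e.latency_us);
            print_tree(events, children, e.id, depth + 1);
        }
    }

    int main() {
        std::vector<Event> log = {  // stand-in for a collected trace
            {1, 0, "Scanner::scan", 5200},
            {2, 1, "Camera::capture", 3100},
            {3, 1, "Store::save", 1400},
        };
        std::map<std::uint64_t, Event> events;
        std::multimap<std::uint64_t, std::uint64_t> children;
        for (const Event& e : log) {
            events[e.id] = e;
            children.emplace(e.parent, e.id);
        }
        print_tree(events, children, 0, 0);  // id 0 denotes "no caller"
    }

Indexing invocations by their parent identifier in this way is what
lets the analysis stitch the per-thread, per-process event streams from
the whole distributed run back into a single system-wide graph.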