Application Programming on a Shared Memory Multicomputer
Poynor, Todd; Wylegala, Tom
HPL-2000-114
Keyword(s): shared memory; fault containment; recovery; memory failures; multicomputers
Abstract: We present our experience investigating issues involved in writing applications for a multicomputer comprised of computing nodes coupled through global memory. Of primary interest is fault containment, preventing faults in one component from spreading to other components, and the means by which applications can increase their availability using shared state to aid in recovery from failures. Our architecture pursues application-level fault containment as strong as that of shared-nothing clusters within a shared-almost-everything environment. Applications freely employ a variety of global resources and explicitly recover from failures. We examine the challenges of programming in such an environment and investigate support our platform could provide to aid developers. We demonstrate these issues using a Web server file cache and present some performance scalability analysis. We also examine business issues related to the acceptance of such a platform in the commercial application marketplace.
30 Pages