FastReplica: Large File Distribution within CDNs

Content Delivery Networks (CDNs) are built on a large-scale distributed network of servers located close to the edges of the Internet, for efficient delivery of digital content, including various forms of multimedia. The main goal of the CDN architecture is to minimize the network impact in the critical path of content delivery and to overcome the server overload problem, which is a serious threat for busy sites serving popular content.

For typical web documents (e.g. HTML pages and images) served via a CDN, there is no need for active replication of the original content at the edge servers. The CDN's edge servers are caching servers: if the requested document is not yet in the cache, it is retrieved from the original server, using the so-called pull model. The performance penalty associated with the initial retrieval from the original server, such as the higher latency observed by the client and the additional load experienced by the original server, is not significant for small to medium size web documents.
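The pull model described above can be sketched as a small cache that contacts the original server only on a miss. This is a minimal illustration, not the CDN's actual implementation: the `EdgeCache` class, the `fake_origin` function, and the URL are hypothetical names introduced for the example.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy pull-model edge cache: fetch from the origin only on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # Hypothetical callback that stands in for a request to the origin.
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0  # how often the origin server was contacted

    def get(self, url: str) -> bytes:
        if url not in self._store:
            # Cache miss: this client sees the extra latency and the
            # origin server takes the extra load, once per document.
            self.origin_requests += 1
            self._store[url] = self._fetch_from_origin(url)
        return self._store[url]


def fake_origin(url: str) -> bytes:
    # Stand-in for the original server; a real one would be reached over HTTP.
    return ("content of " + url).encode()


cache = EdgeCache(fake_origin)
first = cache.get("/index.html")   # miss: pulled from the origin
second = cache.get("/index.html")  # hit: served from the edge cache
```

Only the first request reaches the origin, which is why the penalty stays small as long as the document itself is small.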

For large documents, software download packages and media files, a different operational mode is preferred: it is desirable to replicate these files at edge servers in advance, using the so-called push model. For large files this is a challenging, resource-intensive problem. Media files, for example, can require significant bandwidth and download time because of their size: a 20-minute media file encoded at 1 Mbit/s results in a file of 150 MBytes.
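The 150 MByte figure follows directly from the duration and the encoding rate. The short sketch below reproduces the arithmetic; the function name is an assumption made for the example, and a MByte is taken as 10^6 bytes, consistent with the figure in the text.

```python
def media_file_size_mbytes(duration_min: float, bitrate_mbit_s: float) -> float:
    """Size in MBytes of a stream with the given duration and encoding rate."""
    seconds = duration_min * 60
    megabits = seconds * bitrate_mbit_s
    return megabits / 8  # 8 bits per byte


# 20 minutes at 1 Mbit/s: 1200 s * 1 Mbit/s = 1200 Mbit = 150 MBytes
size = media_file_size_mbytes(20, 1)
```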

While transferring a large file over individual point-to-point connections from the original server can be a viable solution when the number of mirror servers is limited (tens of servers), this method does not scale when the content needs to be replicated across a CDN with thousands of geographically distributed machines.
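A rough back-of-the-envelope sketch shows why point-to-point transfers stop scaling. It assumes, purely for illustration, that the original server's outgoing link is the only bottleneck and that the link is shared by all transfers; the 100 Mbit/s uplink is a hypothetical value, not a figure from this work.

```python
def origin_transfer_time_s(
    num_servers: int, file_mbytes: float, uplink_mbit_s: float
) -> float:
    """Time for the origin to send one full copy of the file to every server.

    Assumes the origin's uplink is the only bottleneck, so the origin must
    push num_servers complete copies through it.
    """
    total_megabits = num_servers * file_mbytes * 8
    return total_megabits / uplink_mbit_s


# Hypothetical 100 Mbit/s uplink and the 150 MByte file from the example above.
tens = origin_transfer_time_s(10, 150, 100)         # tens of mirror servers
thousands = origin_transfer_time_s(1000, 150, 100)  # a CDN-scale deployment
```

Under these assumptions the transfer time grows linearly with the number of servers, from minutes for ten mirrors to hours for a thousand.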

In our work, we consider a geographically distributed network of servers and the problem of distributing content across it. Our focus is on distributing large files such as software packages or stored streaming media files (also called on-demand streaming media). We propose a novel algorithm, called FastReplica, for efficient and reliable replication of large files. FastReplica exploits a few basic ideas.

Related Papers and Reports


© 1994-2001 Hewlett-Packard Company