Content Delivery Networks (CDNs) are based on a large-scale distributed
network of servers located close to the edges of the Internet for
efficient delivery of digital content, including various forms of
multimedia content. The main goal of a CDN architecture is to
minimize the network impact in the critical path of content delivery
and to overcome the server overload problem, which is a serious
threat for busy sites serving popular content.
For typical web documents (e.g., HTML pages and images) served via a CDN,
there is no need for active replication of the original content at the
edge servers. The CDN's edge servers act as caching servers: if
the requested content is not yet in the cache, the document is
retrieved from the original server, using the so-called pull
model. The performance penalty associated with the initial document
retrieval from the original server, namely the higher latency observed by
the client and the additional load experienced by the original server,
is not significant for small to medium size web documents.
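The pull model described above can be sketched in a few lines: content is fetched from the original server only on a cache miss, and subsequent requests are served from the edge. This is a minimal illustrative sketch; the names (EdgeCache, fetch_from_origin) are ours, not from the paper.

```python
# Minimal sketch of the pull model at a CDN edge server (illustrative
# names, not from the paper): fetch from the original server only on a
# cache miss; serve later requests from the local cache.

class EdgeCache:
    def __init__(self, fetch_from_origin):
        self._cache = {}                   # url -> cached document body
        self._fetch = fetch_from_origin    # callback to the original server

    def get(self, url):
        if url not in self._cache:
            # Cache miss: pull the document, paying the one-time penalty
            # (higher client latency, extra load on the original server).
            self._cache[url] = self._fetch(url)
        return self._cache[url]


origin_requests = []

def fetch_from_origin(url):
    origin_requests.append(url)            # record load on the original server
    return f"<body of {url}>"

edge = EdgeCache(fetch_from_origin)
first = edge.get("/index.html")            # miss: retrieved from the origin
second = edge.get("/index.html")           # hit: served from the edge cache
```

After the second request, the original server has still been contacted only once, which is why the pull model's penalty is acceptable for small documents.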
For large documents, software download packages, and media files, a
different operational mode is preferred: it is desirable to replicate
these files at the edge servers in advance, using the so-called push
model. For large files this is a challenging, resource-intensive
problem; media files in particular can require significant bandwidth and
download time due to their large sizes: a 20-minute media file encoded at
1 Mbit/s results in a file of 150 MBytes.
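The figure above follows directly from the encoding parameters; the arithmetic can be checked as:

```python
# Back-of-the-envelope check of the file size quoted in the text:
# a 20-minute media file encoded at 1 Mbit/s.
duration_s = 20 * 60                     # 1200 seconds of media
bitrate_bps = 1_000_000                  # 1 Mbit/s encoding rate
size_bits = duration_s * bitrate_bps     # total encoded bits
size_mbytes = size_bits / 8 / 1_000_000  # bits -> bytes -> MBytes
print(size_mbytes)                       # 150.0 MBytes
```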
While transferring a large file with individual point-to-point
connections from the original server can be a viable solution in the case
of a limited number of mirror servers (tens of servers), this method
does not scale when the content needs to be replicated across a CDN
with thousands of geographically distributed machines.
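To see why point-to-point transfers do not scale, note that the original server must push one full copy of the file per replica, so the total data (and hence the transfer time over a shared uplink) grows linearly with the number of servers. The sketch below illustrates this with hypothetical numbers (150-MByte file, 100 Mbit/s origin uplink); it is a lower bound that ignores per-connection overheads.

```python
# Illustrative scaling argument (hypothetical numbers, not from the
# paper): with individual point-to-point transfers, the original server
# must push n full copies of the file over its uplink.

def origin_transfer_time_s(file_mbytes, n_servers, uplink_mbit_s):
    """Lower bound on time for the origin to push one copy of the file
    to each of n_servers over a shared uplink."""
    total_mbit = file_mbytes * 8 * n_servers   # total data to send, in Mbit
    return total_mbit / uplink_mbit_s

# A 150-MByte file over a 100 Mbit/s origin uplink:
t_tens = origin_transfer_time_s(150, 30, 100)        # tens of mirrors
t_thousands = origin_transfer_time_s(150, 3000, 100)  # CDN scale
```

Going from 30 to 3000 servers multiplies the origin's transfer time by 100 (here, from minutes to about ten hours), which is the scaling wall that motivates a different replication strategy.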
In our work, we consider a geographically distributed network of
servers and the problem of content distribution across it. Our focus is
on distributing large files such as software packages or stored
streaming media files (also called on-demand streaming media). We
propose a novel algorithm, called FastReplica, for efficient and
reliable replication of large files. There are a few basic ideas
exploited in FastReplica.
Related Papers and Reports