TITLE: A Quadratic-time, Sequential, Adaptive Algorithm for Lossy Compression
SPEAKER: Dharmendra Modha (IBM Almaden Research Labs)
DATE: 2:00-3:00 P.M., Tuesday, May 27, 2003
LOCATION: Half Dome, 3L (PA)
HOST: Vinay Deolalikar
ABSTRACT:
Lossy compression is a fundamental problem in
information theory dating
back to Shannon (1959). Inspite of intense research, no polynomial-time,
sequential, adaptive algorithm is
currently known for universal lossy compression.
I propose a new algorithm, termed, Codelet Parsing, for lossy compression
that is quadratic-time, sequential, and adaptive. It is not currently known if
Codelet Parsing is
universal.
The algorithm sequentially parses a given source sequence into phrases,
say, sourcelets, and maps each
sourcelet to a distorted phrase, say, a codelet, such that the per-letter
distortion between the two
phrases does not exceed the desired distortion. The algorithm adaptively
maintains a codebook (a set of
codewords), and does not require any a priori knowledge of the source
statistics. The algorithm uses
approximate string matching and, as key new idea, at each epoch, carefully
selects one of the many
approximately matching codewords to balance between the code rate in the
current epoch versus the code rate
from resulting codebooks in future epochs. The algorithm outputs a
distorted sequence that can be naturally
losslessly compressed using the Lempel-Ziv (LZ78) algorithm.
This talk will be self-contained and will be accessible to a general EE/CS
audience.
Home page: http://www.almaden.ibm.com/cs/people/dmodha
|