Recent two-way collaboration prototypes attempt to improve natural
interactivity, correct eye contact and gaze direction, and media
sharing using novel configurations of projectors, screens, and video
cameras. These systems are often afflicted by video cross-talk,
where the content displayed for viewing by the local participant is
unintentionally captured by the camera and delivered to the remote
participant. Prior attempts to reduce this cross-talk purely in
hardware, through various forms of multiplexing (e.g., temporal,
wavelength (color), or polarization), have performance and cost
limitations. In this work, careful system characterization and
subsequent signal processing algorithms allow us to reduce video
cross-talk. The signals themselves are used to detect temporal
synchronization offsets, which then enable reduction of the
cross-talk signal. Our software-based approach enables the effective
use of simpler hardware and optics than prior methods. Results show
substantial cross-talk reduction in a system with unsynchronized
projector and camera.
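
As a rough illustration of the signal-processing idea, the sketch
below assumes the displayed frames are known and that cross-talk
appears in the captured video as a delayed, scaled copy of them; the
linear model, function names, and parameters are illustrative
assumptions, not the algorithm used in the actual system.

import numpy as np

def estimate_offset(displayed_means, captured_means, max_lag=30):
    """Find the frame lag that best aligns the per-frame mean intensity
    of the displayed content with the captured video (cross-correlation)."""
    d = displayed_means - displayed_means.mean()
    c = captured_means - captured_means.mean()
    best_lag, best_score = 0, -np.inf
    for lag in range(max_lag + 1):
        score = np.dot(d, c) if lag == 0 else np.dot(d[:-lag], c[lag:])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def reduce_crosstalk(captured, displayed, lag, gain):
    """Subtract a delayed, scaled copy of the displayed frames from the
    captured frames, clipping to the valid intensity range [0, 1]."""
    cleaned = np.empty_like(captured)
    for t in range(captured.shape[0]):
        ref = displayed[max(t - lag, 0)]
        cleaned[t] = np.clip(captured[t] - gain * ref, 0.0, 1.0)
    return cleaned

Once the lag has been estimated from the signals themselves, the same
delayed reference can be reused for each subsequent frame until the
offset drifts.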
Improving inexpensive webcams for stationary background scenes
Video conferencing without controlled lighting suffers from the
spurious automatic exposure (AE) errors commonly seen in
webcams. These errors cause problems for the subsequent processing and
compression. For example, since video encoders do not model intensity
changes, these AE errors in turn cause severe blocking artifacts. We
develop a pixel-domain AE conditioning algorithm for stationary
cameras that: 1) effectively reduces spurious AE changes, resulting in
natural and artifact-free video; 2) allows maximum compatibility with
third-party components (it may be inserted transparently between any
camera driver and encoder/video processing engine); and 3) is fast
and requires little memory. This algorithm allows inexpensive cameras
to provide higher-quality video conferencing. We describe the
algorithm, analyze its performance exactly for a specific video source
model, and validate it experimentally using captured video.
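
A minimal sketch of this kind of pixel-domain conditioning, assuming
frames are floats in [0, 1] and that a spurious AE event acts as a
purely global multiplicative change; the names, the running-average
reference, and the update rate are illustrative assumptions rather
than the algorithm analyzed in the paper.

import numpy as np

def condition_frame(frame, reference, eps=1e-6):
    """Estimate the global AE gain of a frame relative to a reference
    and undo it, so brightness stays stable for the encoder."""
    gain = (frame.mean() + eps) / (reference.mean() + eps)
    return np.clip(frame / gain, 0.0, 1.0), gain

def run_conditioning(frames, update_rate=0.02):
    """Condition a sequence of frames from a stationary camera; the
    reference is a slow running average so genuine scene changes are
    eventually absorbed instead of being suppressed forever."""
    reference = frames[0].astype(np.float64)
    out = []
    for frame in frames:
        corrected, _ = condition_frame(frame.astype(np.float64), reference)
        out.append(corrected)
        reference = (1 - update_rate) * reference + update_rate * corrected
    return out

Because the correction touches only pixel values, it can sit between
the camera driver and any downstream encoder without either side
needing to be aware of it.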
Image matting from a physical perspective
Image and video matting is used to extract objects
from their original backgrounds in order to place them on
a different background. The traditional matting model is a
combination of foreground and background colors, i.e., I =
aF + (1 - a)B. Even with good cameras, limited depth-of-focus
means that often both the object and background are
blurred at the boundaries. Does the matting model still apply?
To understand this and other cases better, we investigate image
matting from a physical perspective. We start with 3D objects
and examine the mechanism of image matting due to geometrical
defocus. We then derive a general matting model using radiometry
and geometric optics. The model accounts for arbitrary
surface shapes, defocus, transparency, and directional radiance.
Under certain conditions, the physical framework subsumes the
traditional matting equation. The new formulation reveals a
fundamental link between parameter a and object depth, and this
establishes a framework for designing new matting algorithms.
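
To make the traditional matting equation concrete, the toy example
below applies I = aF + (1 - a)B per pixel: fractional values of a at a
defocused boundary blend the object color with whatever background it
is composited over. The arrays and colors are illustrative only.

import numpy as np

def composite(a, foreground, background):
    """Blend a foreground over a background using per-pixel a in [0, 1]."""
    a = a[..., None]  # broadcast a over the color channels
    return a * foreground + (1.0 - a) * background

# A soft (defocused) object edge shows up as fractional a values.
a = np.array([[1.0, 0.6, 0.0]])            # 1 = object, 0 = background
F = np.ones((1, 3, 3)) * [1.0, 0.2, 0.2]   # reddish object color
B = np.ones((1, 3, 3)) * [0.1, 0.1, 0.8]   # bluish new background
print(composite(a, F, B))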
Video relighting using IR illumination
Casual, ad-hoc video conferences may suffer from poor illumination.
In studio settings, lighting is improved by aiming bright lights at
the subjects. In casual settings we feel it is more appropriate to
light the subjects with invisible IR illumination. We then improve
the lighting of the images captured with cameras that are sensitive
to visible light. We address the difficult problem of mapping the IR
information to help enhance the visible information.
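
The sketch below shows one plausible mapping, stated purely as an
assumption for illustration rather than the approach taken in this
project: keep the color (chrominance) of the dim visible-light frame
and pull its brightness toward the well-exposed but colorless IR
frame.

import numpy as np

def relight_with_ir(visible_rgb, ir_gray, blend=0.7):
    """visible_rgb: HxWx3 floats in [0, 1]; ir_gray: HxW floats in [0, 1].
    Returns an RGB frame whose luminance is pulled toward the IR image
    while the original color is preserved."""
    # Luma of the visible frame (Rec. 601 weights).
    luma = (0.299 * visible_rgb[..., 0]
            + 0.587 * visible_rgb[..., 1]
            + 0.114 * visible_rgb[..., 2])
    target = (1 - blend) * luma + blend * ir_gray
    # Rescale each pixel's RGB so its luma moves toward the target.
    scale = (target + 1e-6) / (luma + 1e-6)
    return np.clip(visible_rgb * scale[..., None], 0.0, 1.0)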
For further information, contact:
ramin (dot) samadani (at) hp (dot) com