A symposium in anticipation of ACCV'10
CURRENT TRENDS IN COMPUTER VISION
8 December 2009
The University of Auckland, Tamaki campus, Building 731
Free attendance
RSVP by 1 Dec 09 to Mrs. May Dijkgraaf (m DOT dijkgraaf AT auckland DOT ac DOT nz)
10 speakers from Japan, New Zealand, China, and The Netherlands
09.30 Opening in 731.201 by Winston Byblow,
Associate Dean Science Tamaki campus
Session 1: 3D Modeling at Large Scale (9.35 - 11.55)
Session chair: Sathiamoorthy Manoharan (Auckland)
Takeshi Oishi (Tokyo, Japan)
e-Heritage Projects in Italy, Japan, and Cambodia
This talk introduces the Digital Bayon Project, conducted by The University of Tokyo
team, with the cooperation of the Japanese Government team for Safeguarding Angkor,
to scan the Bayon temple and to obtain 3D digital data of the temple. Our scanning
missions have totaled over 1500 person-days. During these missions, we obtained range data
from more than 14,600 different directions using commercially available sensors, such as
Cyrax and Z+F, as well as sensors newly developed for this scanning mission, such as the
UTokyo Balloon and UTokyo climbing sensors. The data total approximately a
quarter of a terabyte.
We have developed parallel alignment and merging software that runs on a
PC cluster a hundred times faster than previously available software. This cluster
processes our massive range data into unified 3D digital data of the Bayon temple.
As a result of this effort, we have obtained the following 3D data:
1) The entire Bayon 3D structure: By using Cyrax, Z+F, balloon and climbing sensors,
we have obtained a 3D model of the entire Bayon temple. From this model, we
have created floor plans of the temple, and have confirmed that the Bayon
temple is rotated 0.94 degrees counter-clockwise from the exact east-west lines.
2) 173 deity faces. We have scanned all the 173 faces of the deities on the
exterior of the temple using Cyrax and Balloon sensors, analyzed these data,
and verified that we can classify these faces into three categories: Deva,
Devata, and Asura. It was also confirmed that there is sufficient
resemblance among groups of faces to support the assumption that more
than one worker group conducted the construction project in a parallel manner.
3) 16 hidden pediments. By using a newly created mirror range sensor, we obtained
pictures of 16 hidden pediments, whose existence had not been previously known.
4) 8 wall reliefs. We obtained 3D digital data of all eight wall reliefs along the
inner and outer corridors using a VIVID sensor.
We plan to continue our efforts to create finer models of the structure,
to fill the holes that remain in parts of the structure, and to complete our
models by adding texture to the 3D digital data.
For some illustration, see these videos.
Tomokazu Sato (Nara, Japan)
Vision-Based Augmented Reality and Applications
In this talk we introduce our recent activities concerning vision-based
geometric registration between real and virtual worlds for real-time augmented
reality (AR) as well as some AR applications. We have two approaches to
geometric registration: One is based on marker tracking and the other is based
on markerless natural feature tracking. In the former approach we use invisible
visual markers made from retro-reflective materials which do not create
undesirable visual effects in the environment. The latter is based on using a
landmark database which is automatically constructed from omnidirectional image
sequences in advance using a structure-from-motion technique. We have developed
some prototype AR systems for such applications as indoor and outdoor
navigation, augmented sightseeing of historical sites, and pre-visualization
for filmmaking. These will be demonstrated using videos in this talk.
Michael Cree (Hamilton, New Zealand)
Range Imaging Camera Technology
Time-of-flight range imaging is a relatively new technology for the
simultaneous acquisition of 3D point data over a full field of view. Current
technology provides for low resolution (fewer than 200x200 pixels) and suffers
from limitations such as a fixed focal length and erroneous measurements due to
multi-path reflections and mixed pixels. I discuss currently available commercial
cameras, their limitations, and the impact those limitations have on applications.
I also outline the advances in camera technology and the developments in signal
and image processing techniques that are intended to move this technology to a
stage where it will be useful in a wide variety of applications.
John Morris (Auckland, New Zealand)
Real-Time Stereo Analysis
We have implemented a real-time stereo vision system capable of
processing high-resolution (1 Mpixel or more) images at 30 fps with disparity
ranges of 100 pixels or more. This system has a fast rectification module
associated with each camera, which uses a lookup-table approach to remove lens
distortion and correct camera misalignment in a single step. The corrected,
aligned images are then passed to a correspondence circuit, which generates
disparity and occlusion maps with a latency of two camera scan lines.
The matching algorithm is a version of the Symmetric Dynamic Programming Stereo (SDPS)
algorithm which has a small, compact hardware realization, permitting many
copies to be instantiated to accommodate large disparity ranges. Snapshots from
videos taken in our laboratory demonstrate that the system can produce precise
depth maps in real time. The occlusion maps that the SDPS algorithm produces
clearly outline distinct objects in scenes and present a powerful tool for
segmenting scenes rapidly into objects of interest. A fast contouring procedure
running in the host has been developed which produces contour maps from the
disparity and occlusion maps in real time.
11.55 LUNCH BREAK: food and non-alcoholic beverages, demos and posters
Session 2: Shape and Matching (13.00 - 14.45)
Session chair: Enrico Haemmerle (Auckland, New Zealand)
Hongbin Zha (Beijing, China)
3D Shape Representation, Matching and Recognition
Development of new methods for describing 3D shapes is an important
topic in object recognition, model-based manipulation, and digital geometry
processing. In the early days of computer vision, an object was usually modeled
with global representations such as constructive solid geometry, generalized
cylinders, or deformed superquadrics. Recently, more sophisticated
representations such as shape distributions have been developed, which allow for
matching of objects under general similarity metrics. However, one drawback of
such global schemes is that they are not suitable for matching with scenes
where the target objects are only partially visible due to occlusion or limited
view fields. At the same time, the representations are usually not compact,
making it difficult to embed 3D objects in a shape space for efficient creation
of shape deformation and animation.
In the talk, I will report on our efforts to develop new kinds of shape
representations that yield efficient methods for partial object matching and
human face animation. The topics include a new shape representation scheme
which uses a probabilistic bag-of-words model; a shape matching algorithm based
on a dimension amnesic pyramid match kernel; and a shape space approach to
animation of 3D human faces.
Brendan McCane (Dunedin, New Zealand)
Curve Matching and Morphometrics
If we have a sample of several or many curves, how can we describe that
sample in an efficient and meaningful way? This is a central question in the
statistics of shape (morphometrics). Even more fundamentally, given only two
such curves, how can those curves be aligned or matched? Morphometrics has been
studied extensively for discrete structures where shape is defined by a finite
set of well defined landmarks and finds application in many areas, although
predominantly in the life sciences. However, well defined landmarks are usually
very sparse and are therefore a rather poor descriptor of shape. In this talk I
will give a brief introduction to morphometrics and show how it can be extended
to smooth curves. I will also present results from the application area of
describing variation in the shape of bone structures in the facial skeleton.
Akihiro Sugimoto (Tokyo, Japan)
3D Shape Registration using Graph Kernel
This talk presents range image registration for 3D shape modeling where
we formulate registration as a graph-based optimization problem. In this
method, we independently evaluate each feature and consider only the order of
point-to-point matching quality to generate a directed graph representing the
matching problem. Then the maximum kernel of the graph gives the unique largest
consistent matching of points. Our method thus does not require a good initial
estimate and, at the same time, guarantees global optimality.
14.45 COFFEE BREAK: tea, coffee and cake, demos and posters
Session 3: Three Areas of Applied Computer Vision (15.15 - 17.00)
Session chair: Yanxin Zhang (Auckland)
Luc Florack (Eindhoven, The Netherlands)
Analyzing Magnetic Resonance Images
Progress in MRI technology holds great promise for healthcare.
However, successful exploitation of information contained in the resulting
highly complex images is severely hampered by our limited conceptual
understanding. Whereas traditional image analysis paradigms rely heavily on
visual analogy (e.g. edge detection is basically an operationalization of
visually (!) salient structures), new, "blind" paradigms are needed in order to
quantify or visualize relevant information in "non-visual" images. That is, the
analysis necessarily precedes any form of visual inspection. I will illustrate
this shift of paradigm in the context of diffusion MRI, and argue in favor of a
theoretical approach.
Yasushi Yagi (Osaka, Japan)
Gait Analysis and its Applications
In this talk, I address applications of gait analysis techniques in the fields
of visual surveillance and digital entertainment.
First, gait identification has recently gained attention as a method of identifying
individuals at a distance from a camera. However, appearance changes due to changes
in view or walking direction cause difficulties for gait identification systems.
We have developed a multi-view synchronized gait capturing system to construct a
large-scale gait database, and proposed a method of gait identification from
various view directions using frequency-domain features and a View Transformation Model (VTM).
Second, "Dive into the Movie" (DIM) is a project that aims to realize an
innovative entertainment system that immerses the audience in the story: audience
members can share impressions with family or friends by watching a movie in which
all of them take part in the story as cast members.
To realize this system, we are trying to model and capture personal
characteristics of face, body, gait, hair, and voice instantly and precisely.
I present an online method for measuring gait features from silhouette images.
Reinhard Klette (Auckland, New Zealand)
Vision-Based Driver Assistance for Safer Roads
Vision-based driver assistance systems (DAS) are currently starting to
be active safety components of cars (e.g., lane departure warning, blind spot
supervision). The talk reviews a few current developments in this area (in
particular within the .enpeda.. project at Tamaki campus, The University of
Auckland) which aim at advanced
solutions, using stereo or motion data as basic input for providing accurate
lane or corridor data, for estimating ego-motion, for pedestrian detection, or
for traffic sign recognition.
17.00 Closing: finger food and drinks, demos and posters
Sponsors of the event
The Faculty of Science of The University of Auckland
Department of Computer Science, The University of Auckland
Related event