RESEARCH INTERESTS
|
|
| Object and activity recognition; Texture; Image and spatiotemporal video segmentation
|
|
RECENT RESEARCH TOPICS
|
|
Activities as Time Series of Human Postures
 |
We show that certain human actions can be represented by short time series of codewords. The codewords represent still snapshots of human-body parts in their discriminative postures, and objects that people interact with while performing the activity. This carries many advantages for developing a robust, efficient, and scalable activity recognition system. Four alternative, weakly supervised methods for learning a sparse dictionary of the codewords are formulated within the large-margin framework.
|
|
|
From a Set of Shapes to Object Discovery
 |
While shape is widely recognized to play an important role in human perception, most approaches to recognition resort instead to appearance features. We show that shape on its own, without photometric features, is expressive and discriminative enough to provide robust object discovery in the midst of background clutter. We build a graph that captures spatial layouts of edges extracted from a set of images, and conduct its multicoloring by a new coordinate-ascent Swendsen-Wang cut. The resulting clusters of edges delineate the boundaries of distinct objects discovered in the image set.
|
|
|
Monocular Extraction of 2.1D Sketch
 |
Given a segmentation and T-junctions of an image, we estimate the depth layers of the scene. The estimation is formalized as a quadratic optimization so that the resulting 2.1D sketch is smooth in all image areas except at region boundaries.
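As a rough illustration of how depth ordering can be posed as a quadratic problem, the sketch below assigns one depth value per region and penalizes squared deviations from a unit depth step across each occlusion cue. The cost function, the `(front, back)` cue format, the unit step, and the coordinate-descent solver are all assumptions made for this toy; they are not the formulation used in the work described above.

```python
def estimate_depths(num_regions, occlusions, iterations=100):
    """Minimize sum over (front, back) of (d[back] - d[front] - 1)^2.

    `occlusions` is a hypothetical list of (front, back) region pairs,
    e.g. as might be suggested by T-junctions. Region 0 is pinned to
    depth 0 so the solution is unique.
    """
    depth = [0.0] * num_regions
    for _ in range(iterations):
        for i in range(1, num_regions):  # region 0 stays fixed
            targets = []
            for front, back in occlusions:
                if i == back:
                    targets.append(depth[front] + 1.0)
                elif i == front:
                    targets.append(depth[back] - 1.0)
            if targets:
                # Exact minimizer of the quadratic cost in depth[i] alone.
                depth[i] = sum(targets) / len(targets)
    return depth


# Region 0 occludes region 1, which occludes region 2.
depths = estimate_depths(3, [(0, 1), (1, 2)])
```

Each update solves the cost exactly for one region while holding the others fixed, so the toy converges to one depth layer per region, ordered from front to back.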
|
|
|
Video Painting with Space-Time Varying Style Parameters
|
An input video is rendered by applying a distinct painting style to each spatiotemporal tube, corresponding to a moving object in the video. Spatiotemporal segmentation gives the user control to vary painting styles in 2D space and time, and thus to convey rich semantic content, e.g., emotions, illusion, or chaos.
|
|
|
Toward Optimal Feature Selection through Local Learning
|
Given data with a huge number of irrelevant features (> 10^6), select features relevant to data classification. We decompose a nonlinear problem into a set of locally linear ones, and then globally learn feature relevance within the large-margin framework.
|
|
|
Video Object Segmentation by Tracking Regions
|
Given an arbitrary video, segment all moving and static objects present. We transitively match contours of image regions across the frames such that the resulting tracks are locally smooth.
|
|
|
Texel-based Texture Segmentation
|
Given an arbitrary image, discover and segment all distinct texture subimages. We use mean shift to simultaneously estimate the pdf of texel appearance and the pdf of texel placement.
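For readers unfamiliar with mean shift, the following is a minimal one-dimensional sketch of its mode-seeking step with a Gaussian kernel. It only illustrates the basic iteration; the joint estimation of appearance and placement pdfs described above is not reproduced, and the function name, sample values, and bandwidth are hypothetical.

```python
import math


def mean_shift_mode(points, start, bandwidth=1.0, tol=1e-6, max_iter=500):
    """Climb from `start` to a nearby mode of the kernel density estimate."""
    x = start
    for _ in range(max_iter):
        # Gaussian kernel weight of every sample relative to the current estimate.
        weights = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
        # The mean-shift update moves x to the weighted mean of the samples.
        new_x = sum(w * p for w, p in zip(weights, points)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Two well-separated clusters of hypothetical 1-D feature values.
samples = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
modes = sorted({round(mean_shift_mode(samples, s, bandwidth=0.5), 1) for s in samples})
```

Starting the iteration from every sample and merging the end points yields one mode per cluster, which is how mean shift discovers the number of clusters without being told it in advance.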
|
|
|
Matching Hierarchies of Deformable Shapes
|
Shapes are represented by graphs whose nodes correspond to shape parts, and edges capture their neighbor and part-of interactions. Shape matching is formulated as finding the subgraph isomorphism that minimizes a quadratic cost.
|
|
|
Dictionary-Free Categorization Using Evidence Trees
|
How can images showing very similar object categories be categorized? We mathematically prove that it is better to use class evidence accumulated from all image features than to use majority voting over class decisions made on each individual feature.
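The difference between the two decision rules can be seen on a toy example. The sketch below assumes each feature provides a class-probability estimate and accumulates evidence as a sum of log-probabilities; this simple accumulation rule, the class names, and the numbers are illustrative assumptions, not the evidence-tree method itself.

```python
import math


def majority_vote(feature_posteriors):
    """Each feature makes a hard class decision; the most frequent class wins."""
    votes = {}
    for posterior in feature_posteriors:
        winner = max(posterior, key=posterior.get)
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)


def accumulate_evidence(feature_posteriors):
    """Sum log-probabilities over all features, then decide once."""
    totals = {}
    for posterior in feature_posteriors:
        for cls, prob in posterior.items():
            totals[cls] = totals.get(cls, 0.0) + math.log(prob)
    return max(totals, key=totals.get)


# Hypothetical per-feature class probabilities for one image:
# two weakly informative features and one highly discriminative feature.
features = [
    {"horse": 0.55, "zebra": 0.45},
    {"horse": 0.55, "zebra": 0.45},
    {"horse": 0.05, "zebra": 0.95},
]

vote_label = majority_vote(features)
evidence_label = accumulate_evidence(features)
```

Here majority voting picks "horse" because two features lean that way slightly, while accumulated evidence picks "zebra" because the single confident feature outweighs the two nearly uninformative ones. Hard per-feature decisions discard exactly this confidence information.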
|
|
|
Scale-invariant Region-based Hierarchical Image Matching
|
Find correspondences between similar objects in images captured under large variations in scale. Scale invariance is achieved by decoupling the scales of objects from those of scenes, and by down-weighting the contributions of fine-resolution details to matching.
|
|
|
Learning Subcategory Relevances for Category Recognition
|
Detections of distinct object categories provide different degrees of evidence for recognition of more complex, parent categories. These degrees of evidence, i.e., the subcategory relevances, are estimated using local learning.
|
|
|
Connected Segmentation Tree - A Joint Representation of Region Layout and Hierarchy -
|
CST is a hierarchy of region adjacency graphs. The CST model of an object category is learned by simultaneously searching for both the most salient regions, and the most salient containment and neighbor relationships of regions across training images.
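One way to picture a hierarchy of region adjacency graphs is a tree of regions in which each node also records which regions at its level it touches. The data structure below is a minimal sketch under that reading; the class, field, and region names are hypothetical, and nothing here reflects how the CST is actually built or learned.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    name: str
    children: list = field(default_factory=list)  # containment (part-of) links
    neighbors: set = field(default_factory=set)   # names of adjacent regions


def add_child(parent, child):
    """Record that `child` is contained in `parent`."""
    parent.children.append(child)
    return child


def connect(a, b):
    """Record that regions `a` and `b` share a boundary."""
    a.neighbors.add(b.name)
    b.neighbors.add(a.name)


def adjacency_by_level(root):
    """Return one region adjacency graph (name -> neighbors) per tree level."""
    levels, frontier = [], [root]
    while frontier:
        levels.append({r.name: sorted(r.neighbors) for r in frontier})
        frontier = [c for r in frontier for c in r.children]
    return levels


image = Region("image")
horse = add_child(image, Region("horse"))
grass = add_child(image, Region("grass"))
connect(horse, grass)
head = add_child(horse, Region("head"))
body = add_child(horse, Region("body"))
connect(head, body)

levels = adjacency_by_level(image)
```

Reading the structure level by level gives the adjacency graphs, while following `children` gives the containment hierarchy, so both kinds of relationship live in a single representation.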
|
|
|
Extracting Texels in 2.1D Natural Textures
|
Given an image of 2.1D texture, learn, without any supervision, a generative model of the entire (unoccluded) texel. Learning involves concurrent estimation of the texel-subtexel structure and the pdfs of each texel part from only partially visible texels in the image.
|
|
|
Taxonomy of Categories Present in Arbitrary Images
|
Given an arbitrary (unlabeled) image set, learn the models of all visual categories present, and their inter-category relationships, i.e., their taxonomy. The taxonomy recursively defines categories as spatial configurations of (simpler) subcategories, each of which may be shared by many categories.
|
|
|
The hoofed animals dataset contains very similar categories that share a number of similar parts. Each image may contain multiple instances of multiple categories. Animals are articulated, non-rigid objects, appearing at different scales amidst clutter, and may be partially occluded.
|
|
|
The images show homogeneous, frontally viewed, natural, 2.1D textures, where: (1) Texels are only statistically similar to each other; (2) Texel placement is random; (3) Repetition of subtexels defines a finer-grain texture coexisting with the main texture; (4) Due to texel overlap, texel contours form complex patterns (e.g., several edges meet at one point), and overlapping texels have low contrast, all of which makes texel segmentation difficult.
|
|
|
Unsupervised Category Modeling, Recognition and Segmentation

|
Given a set of images containing frequent occurrences of an unknown visual category, learn geometric, photometric and topological properties of regions defining the category. Learning is unsupervised because the target category is not defined by the user, and it is not known whether or where any instances of the category appear in a specific image.
|
|