Learning a Semantically Relevant Multiple Sub-space Visual Dictionary for Object Recognition


This paper presents a novel approach to learning a visual dictionary from sub-manifolds using co-clustering, where each sub-manifold is associated with a semantically relevant part of a visual category. The standard dictionary learning technique, ‘Bag-of-Features’, is limited by the high dimensionality, sparsity, and noise associated with affine-invariant feature descriptors. To resolve these issues in learning a dictionary, our approach draws inspiration from the relations among object part-based models, semantic topic models, non-negative matrix factorization of multivariate data, and sub-spaces in feature space. We use co-clustering, which performs clustering and dimensionality reduction simultaneously and in an optimal way, to discover multiple semantically relevant sub-spaces. Specifically, we use co-clustering based on information-theoretic and Euclidean divergences. Our approach is comprehensively evaluated on several popular datasets. This work constitutes a principled first step towards a semantically meaningful dictionary, with regard to the correspondence between object parts and multiple sub-manifolds, and is not intended to compete with state-of-the-art methods such as sparse coding. It is especially pertinent to future dictionary learning as the complexity of visual categories increases.

In ICML Workshop Sparsity, Dictionaries and Projections in Machine Learning and Signal Processing, Edinburgh, Scotland, June, 2012.