Open-Vocabulary Object Retrieval

Title: Open-Vocabulary Object Retrieval
Publication Type: Conference Paper
Year of Publication: 2014
Authors: Guadarrama, S., Rodner, E., Saenko, K., Zhang, N., Farrell, R., Donahue, J., & Darrell, T.
Other Numbers: 3694
Abstract

In this paper, we address the problem of retrieving objects based on open-vocabulary natural language queries: Given a phrase describing a specific object, e.g., “the corn flakes box”, the task is to find the best match in a set of images containing candidate objects. When naming objects, humans tend to use natural language with rich semantics, including basic-level categories, fine-grained categories, and instance-level concepts such as brand names. Existing approaches to large-scale object recognition fail in this scenario, as they expect queries that map directly to a fixed set of pre-trained visual categories, e.g., ImageNet synset tags. We address this limitation by introducing a novel object retrieval method. Given a candidate object image, we first map it to a set of words that are likely to describe it, using several learned image-to-text projections. We also propose a method for handling open-vocabularies, i.e., words not contained in the training data. We then compare the natural language query to the sets of words predicted for each candidate and select the best match. Our method can combine category- and instance-level semantics in a common representation. We present extensive experimental results on several datasets using both instance-level and category-level matching and show that our approach can accurately retrieve objects based on extremely varied open-vocabulary queries. The source code of our approach will be publicly available together with pre-trained models at http://openvoc.berkeleyvision.org and could be directly used for robotics applications.

Bibliographic Notes

Proceedings of the 10th Annual Conference on Robotics: Science and Systems (RSS X), Berkeley, California

Abbreviated Authors

S. Guadarrama, E. Rodner, K. Saenko, N. Zhang, R. Farrell, J. Donahue, and T. Darrell

ICSI Research Group

Vision

ICSI Publication Type

Article in conference proceedings