Multimodal Location Estimation of Consumer Media Dealing with Sparse Training Data
Title | Multimodal Location Estimation of Consumer Media Dealing with Sparse Training Data |
Publication Type | Conference Paper |
Year of Publication | 2012 |
Authors | Choi, J., Friedland, G., Ekambaram, V., & Ramchandran, K. |
Page(s) | 43-48 |
Other Numbers | 3274 |
Abstract | This article describes a novel approach to the problem of associating geo-locations with consumer-produced multimedia data, such as videos and photos, that are publicly available on social networking websites such as Flickr. We specifically focus on the case where the available training data is sparse, both in absolute numbers and in geographic coverage, compared to the amount of untagged query data. We develop a novel graphical-model-based framework for this problem and pose geotagging as inference over the graph. The novelty of our algorithm lies in the fact that we jointly estimate the geo-locations of all the query videos, which yields performance improvements over existing algorithms in the literature that process each query video independently. Our system enables the query videos to act as virtual training data that effectively bootstrap the geotagging process. The quality of the database improves with each additional query video in the system. Further, our modeling provides a generic theoretical framework that can incorporate any other available textual, visual, or audio features. We evaluate our algorithm on the MediaEval 2011 Placing Task data set and show that, for fixed training data, system performance improves as the amount of unlabeled test data increases. The performance gains are shown to be over 10% compared to existing algorithms in the literature. |
URL | http://www.icsi.berkeley.edu/pubs/speech/multimodallocation12.pdf |
Bibliographic Notes | Proceedings of the IEEE International Conference on Multimedia and Expo (ICME 2012), Melbourne, Australia, pp. 43-48 |
Abbreviated Authors | J. Choi, G. Friedland, V. Ekambaram, and K. Ramchandran |
ICSI Research Group | Audio and Multimedia |
ICSI Publication Type | Article in conference proceedings |