Multimodal Location Estimation of Videos and Images
Title | Multimodal Location Estimation of Videos and Images |
Publication Type | Book |
Year of Publication | 2015 |
Series Editor | Choi, J., & Friedland, G. |
Other Numbers | 3744 |
Abstract | This book presents an overview of the field of multimodal location estimation, i.e., using acoustic, visual, and/or textual cues to estimate the location shown in a video recording. The authors present sample research results in this field in a unified way, integrating research on this topic that focuses on different modalities, viewpoints, and applications. The book describes fundamental methods of acoustic, visual, textual, social graph, and metadata processing, as well as the multimodal integration methods used for location estimation. In addition, the text covers benchmark metrics and explores the limits of the technology based on a human baseline. |
Acknowledgment | Initial experiments of this work were supported by an NGA NURI grant #HM11582-10-1-0008 and the Korean Foundation for Advanced Studies. Later work was supported by the National Science Foundation through grants CNS: 1065240 ("Understanding and Managing the Impact of Global Inference on Online Privacy") and IIS: 1251276 ("SMASH -- Scalable Multimedia content AnalysiS in a High-level language"). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors or originators and do not necessarily reflect the views of the NGA or the NSF. |
Bibliographic Notes | Springer |
Abbreviated Authors | J. Choi and G. Friedland, eds. |
ICSI Research Group | Audio and Multimedia |
ICSI Publication Type | Book |