Kickstarting the Commons: The YFCC100M and the YLI Corpora

Title: Kickstarting the Commons: The YFCC100M and the YLI Corpora
Publication Type: Conference Paper
Year of Publication: 2015
Authors: Bernd, J., Borth D., Carrano C., Choi J., Elizalde B. Martinez, Friedland G., Gottlieb L., Ni K., Pearce R., Poland D., Ashraf K., Shamma D. A., & Thomee B.
Other Numbers: 3821

The publication of the Yahoo Flickr Creative Commons 100 Million dataset (YFCC100M), to date the largest open-access collection of photos and videos, has provided a unique opportunity to stimulate new research in multimedia analysis and retrieval. To make the YFCC100M even more valuable, we have started working towards supplementing it with a comprehensive set of precomputed features and high-quality ground truth annotations. As part of our efforts, we are releasing the YLI feature corpus, as well as the YLI-GEO and YLI-MED annotation subsets. Under the Multimedia Commons Project (MMCP), we are currently laying the groundwork for a common platform and framework around the YFCC100M that (i) makes it easier for researchers to contribute additional features and annotations, (ii) supports experimentation on the dataset, and (iii) enables sharing of obtained results. This paper describes the YLI features and annotations released thus far, and sketches our vision for the MMCP.


Work on the YLI corpus and the Multimedia Commons Project is supported by several funders, including: a collaborative Laboratory Directed Research and Development project led by Lawrence Livermore National Laboratory, under the auspices of the U.S. Dept. of Energy contract DE-AC52-07NA27344 (LLNL-CONF-676635); a grant from Cisco Systems, Inc. for Event Detection for Improved Speaker Diarization and Meeting Analysis; and a National Science Foundation grant for the SMASH project: Scalable Multimedia content AnalysiS in a High-level language (award IIS-1251276). Any opinions, findings, and conclusions expressed here are those of the individual researchers, and do not necessarily reflect the views of the funders.

Bibliographic Notes

Proceedings of the 2015 Workshop on Community-Organized Multimodal Mining: Opportunities for Novel Solutions (MMCommons '15), Brisbane, Australia, pp. 1-6

Abbreviated Authors

J. Bernd, D. Borth, C. Carrano, J. Choi, B. Elizalde, G. Friedland, L. Gottlieb, K. Ni, R. Pearce, D. Poland, K. Ashraf, D. A. Shamma, and B. Thomee

ICSI Research Group

Audio and Multimedia

ICSI Publication Type

Article in conference proceedings