FAN-Large

A Large-Scale Database of Images and Captions for Automatic Face Naming

(This dataset is temporarily unavailable.)

Mert Ozcan, Luo Jie, Vittorio Ferrari, Barbara Caputo

Overview

FAN-Large contains over 125,000 images with accompanying text captions. It is designed as a resource for testing algorithms that learn visual models from weakly supervised data. We collected the images by querying Google Images with combinations of celebrity names and verbs corresponding to distinct upper-body poses. In [1] we give detailed statistics of the dataset and present an evaluation of several name-face association algorithms on it.

In addition to the image-caption pairs, this release also includes:

ground-truth association of faces to names
ground-truth association of body poses to verbs
names extracted automatically from the captions using [2]
face bounding boxes detected automatically from the images using [3,4], which facilitate a direct comparison to our results
auxiliary programs and scripts that we used to collect and evaluate FAN-Large

Important Notice

These images were downloaded from the internet and may be subject to copyright. We do not own the copyright of the images and provide them only for non-commercial research purposes.

Downloads

FAN-Large requires a lot of disk space (the images alone take 18 GB). We partitioned the dataset and the accompanying files into smaller chunks for a more convenient download. Extract all tarballs into the same folder to obtain the proper directory structure.

Filename	Description	Release Date	Size
images.tar.gz	Images (alternatively you can download them in parts and concatenate them as explained below: P1 P2 P3 P4 P5)	19 September 2011	18 GB
captions.tar.gz	Captions in text format	19 September 2011	42 MB
bbx.tar.gz	Detected face bounding-boxes in text format.	19 September 2011	16 MB
face_features.tar.gz	Face features extracted using [3,4] in text format	19 September 2011	2.0 GB
mturk_annotations.tar.gz	Ground-truth annotations in text format.	19 September 2011	9.0 MB
persons.tar.gz	Names detected in the caption using [2]	19 September 2011	17 MB
DatasetXML.tar.gz	XML files holding the information of the dataset folder structure. They can be used for conveniently parsing the dataset.	19 September 2011	23 MB
README	README files for the dataset folder structure. Additionally, each tarball in the list above has its own README describing its specific contents.	19 September 2011	3.4 KB
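
The download steps above (joining the image parts and extracting every tarball into one folder) can be sketched in Python as follows. This is only a minimal sketch using the standard library: the part filenames and the destination folder in the usage comment are hypothetical placeholders, since the actual names of the parts P1 to P5 are not listed on this page, and the parts are assumed to be plain byte-wise splits of images.tar.gz.

```python
import shutil
import tarfile
from pathlib import Path


def concatenate_parts(part_paths, output_path):
    """Join split archive parts, in the given order, into a single file."""
    with open(output_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(output_path)


def extract_all(archive_paths, dest_dir):
    """Extract every gzip-compressed tarball into the same destination folder."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for archive in archive_paths:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(dest)
    return dest


# Example usage (the part names and folder below are assumptions):
#   parts = [f"images.part{i}" for i in range(1, 6)]
#   concatenate_parts(parts, "images.tar.gz")
#   extract_all(["images.tar.gz", "captions.tar.gz", "bbx.tar.gz"], "FAN-Large")
```

The order of the parts matters when concatenating, so pass them in the order P1 to P5. Only extract archives obtained from a source you trust, because tarfile writes whatever paths the archive contains.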

For convenience, we also provide the dataset (excluding the images) in MATLAB format (.mat). This way the dataset can be used in MATLAB without the need to parse the text files.

Filename	Description	Release Date	Size
dataset_matlab.tar.gz	The dataset and the extracted information (e.g. face and name detections) in MATLAB file format.	19 September 2011	2.7 GB

We also provide auxiliary files that can be used with FAN-Large: the web crawler we used to download the data, a GUI to display the data, scripts to extract captions from the HTML files, face and name detectors, name-face association algorithms, and scripts used to process the Amazon Mechanical Turk annotations.

Filename	Description	Release Date	Size
auxilliary.tar.gz	Auxiliary files for the FAN-Large dataset	19 September 2011	146 MB

Related Publications and Software

[1] M. Ozcan, L. Jie, V. Ferrari and B. Caputo.
A Large-Scale Database of Images and Captions for Automatic Face Naming
British Machine Vision Conference (BMVC), 2011.

[2] http://opennlp.sourceforge.net

[3] http://torch3vision.idiap.ch/

[4] http://torch3vision.idiap.ch/

Acknowledgements

This work was done while Mert Ozcan was an intern at the Idiap Research Institute. Luo Jie was supported by the PASCAL Pump Priming SS2-Rob Project. Vittorio Ferrari was supported by an SNSF Professorship. Barbara Caputo was supported by the SNSF project NINAPRO.

Idiap Research Institute ETH Zurich, CALVIN group