ETHZ PASCAL Stickmen V 1.11
===========================
M. Eichner and V. Ferrari

Introduction
~~~~~~~~~~~~
2D human pose estimation is a very interesting problem for which only a few
common datasets with ground-truth annotation exist. We release here
annotations for 549 images from the PASCAL VOC 2008 dataset [5]. The dataset
consists mainly of amateur photographs with difficult illumination and low
image quality. In each image one roughly upright, approximately frontal
person is annotated by a 6-part stickman (i.e. one line segment for each
part: head, torso, upper and lower arms). The annotated person is visible
at least from the waist up.

Results on this dataset were first published in [4]. The current release
contains the results published in [6], which are based on a better
upper-body detector [7] than the one used in [4]. This dataset has no
overlap with the VOC08/09/10 test sets.

Contents
~~~~~~~~
This package contains:
- raw images from the PASCAL VOC 2008 dataset [5]
- corresponding ground-truth stickmen annotations (referred to as
  'GT stickmen' from now on)
- Matlab code to read and visualize GT stickmen
- Matlab code to evaluate stickmen estimated by an algorithm against
  GT stickmen
- human pose estimation results from [6] (i.e. all stickmen estimated by
  [6], along with the detection windows from [7])
- the PCP performance curve from [6]

Let «dir_root» be the directory where this package was uncompressed.
The resulting sub-directories contain:

«dir_root»/data     - an annotation text file for a total of 549 images,
                      with one GT stickman each
«dir_root»/code     - Matlab code to read, display and evaluate annotations
«dir_root»/images   - the 549 images from the PASCAL VOC 2008 dataset [5]
                      which are annotated in this dataset
«dir_root»/overlays - the same images with stickmen overlaid; these are
                      useful for rapidly surfing the dataset, and for double
                      checking whether you have read the annotation text
                      files correctly.
Quick start
~~~~~~~~~~~
You can follow these steps to check that everything is properly set up:

1) start Matlab
2) navigate to «dir_root»/code (e.g. using the cd command)
3) execute the command: startup
   This will add the necessary paths to your Matlab environment.
4) if this is the first time you run the code, execute: installmex
   This will compile the mex-files for your system.
5) execute the following to display the GT stickman from the first
   annotated image:

   img = imread('../images/2007_000423.jpg');
   lF = ReadStickmenAnnotationTxt('../data/pascal_sticks.txt');
   hdl = DrawStickman(lF(1).stickmen.coor, img);

   Check that a new figure is now open and that it shows the same as the
   file '«dir_root»/code/2007_000423_stickman.jpg'.
6) execute the following to recompute our best result from [6]:

   % load ground-truth annotations
   GTPascal = ReadStickmenAnnotationTxt('../data/pascal_sticks.txt');
   % load stickmen for the evaluation
   load('../techrep2010_ethzpascal_results.mat');
   % evaluate the best system from [6]
   [detRate PCP] = BatchEval(@detBBFromStickmanPascal,@EvalStickmen,techrep2010_ethzpascal,GTPascal)

   You should obtain the following results: detRate = 0.7505, PCP = 0.6857
7) if all the points above went well, this package is working properly.
8) reproduce the PCP curve:

   calcPCPcurve(@detBBFromStickmanPascal,@EvalStickmen,techrep2010_ethzpascal,GTPascal,[],true);

Description of the annotation files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each text file in «dir_root»/data contains annotations in the format:

«file_name_i»
«x11» «y11» «x12» «y12»
«x21» «y21» «x22» «y22»
«x31» «y31» «x32» «y32»
«x41» «y41» «x42» «y42»
«x51» «y51» «x52» «y52»
«x61» «y61» «x62» «y62»
«file_name_i+1»
. . .

where:
- «file_name_i» is the file name of the i-th annotated image (you can check
  the corresponding image in «dir_root»/overlays).
- «xsp» is the x coordinate for segment s (from 1 to 6) and end point p
  (1 or 2). The order of the segments is: torso, left upper arm, right upper
  arm, left lower arm, right lower arm and head, respectively ('left' and
  'right' as in the image).

Matlab code
~~~~~~~~~~~
The following Matlab functions are provided:

- ReadStickmenAnnotationTxt: reads an annotation file
- DrawStickman: draws the annotation for a single image
- DirectEvalStickman: directly evaluates one estimated stickman against one
  GT stickman
- EvalStickmen: evaluates all estimated stickmen for an image against the
  one GT stickman for that image
- BatchEval: evaluates over an image set for a fixed pose estimation
  accuracy threshold
- calcPCPcurve: evaluates over an image set for a range of pose estimation
  accuracy thresholds (produces a PCP performance curve)
- DummyPascalPoseEstimationPipeline: dummy pose estimation routine that
  outputs data in the format required by BatchEval

For the exact input/output argument formats please type: help «function_name»

Evaluation criterion [4,6]
~~~~~~~~~~~~~~~~~~~~~~~~~~
For each image, BatchEval expects your system to provide a set of detected
persons. Each detected person consists of an estimated window around the
head and shoulders, as well as an estimated stickman. If you only provide
the stickman, BatchEval will estimate such a window for you. Given this
information, BatchEval will compute two numbers:

a) Detection rate: indicates how many of the GT stickmen have been detected.
   A GT stickman is counted as detected if a window E estimated by your
   system overlaps more than 50% with the GT window G automatically derived
   from the GT stickman. The overlap measure is the area of intersection
   divided by the area of union between E and G. This is the PASCAL VOC
   criterion [5].
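As an aside, the annotation format above is simple enough to read without the provided Matlab code. The following is a minimal Python sketch of a reader (illustrative only; ReadStickmenAnnotationTxt is the official reader). It assumes each record is a file-name token followed by 24 numbers, i.e. the x/y pairs of both endpoints of the 6 segments in the order given above; the segment names used here are labels introduced for this sketch.

```python
# Illustrative reader for the stickmen annotation format: each record is
# assumed to be one file name followed by 24 coordinates (6 segments x
# 2 endpoints x 2 values), in torso/arms/head order as described above.

SEGMENT_NAMES = ["torso", "left_upper_arm", "right_upper_arm",
                 "left_lower_arm", "right_lower_arm", "head"]

def parse_stickmen(text):
    """Return {filename: {segment_name: ((x1, y1), (x2, y2))}}."""
    tokens = text.split()          # whitespace-separated; layout-agnostic
    annotations = {}
    i = 0
    while i < len(tokens):
        filename = tokens[i]       # a record starts with the file name
        coords = [float(t) for t in tokens[i + 1:i + 25]]
        segments = {}
        for s, name in enumerate(SEGMENT_NAMES):
            x1, y1, x2, y2 = coords[4 * s:4 * s + 4]
            segments[name] = ((x1, y1), (x2, y2))
        annotations[filename] = segments
        i += 25                    # file name + 24 numbers per record
    return annotations
```

For example, `parse_stickmen(open('pascal_sticks.txt').read())` would yield one entry per annotated image, each mapping segment names to endpoint pairs.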
b) PCP (Percentage of Correctly estimated body Parts): an estimated body
   part is counted as correct if its segment endpoints lie within t% of the
   length of the ground-truth segment from their annotated locations. PCP is
   evaluated only for stickmen that have been detected (i.e. there is a
   correct detection window in the sense of point a).

Overall performance is evaluated by a PCP curve, obtained by varying the
accuracy threshold t (calcPCPcurve.m). If you want to report a single
number, we recommend taking PCP at t=20% (strict) or t=50% (tolerant; this
is the default setting of BatchEval.m).

The ground-truth images contain exactly one ground-truth stickman each. Your
system may detect multiple people in an image and therefore produce multiple
estimated stickmen. BatchEval will automatically select the one matching the
GT stickman (i.e. the one whose detection window is correct), if there is
one. If your system outputs multiple detections on the same person, BatchEval
will throw an error (this is invalid behavior in our protocol).

The evaluation protocol used in [6] is a stricter version of the one used in
[4]. It is designed to prevent users from artificially achieving higher PCP
scores by outputting multiple estimated stickmen per person. The protocol in
[4] allowed multiple detections of the same person and counted the PCP of
the highest scoring one. This is now banned from the official protocol. If
you want to publish a comparison to our work, please cite our latest results
from [6].

To obtain the total PCP over the whole test set, not only over persons with
a correct detection window, multiply PCP by the detection rate (i.e.
multiply the two numbers output by BatchEval).
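The detection criterion (a) and the total-PCP accounting can be sketched in a few lines. This is an illustrative Python sketch, not the official implementation (which is the provided Matlab BatchEval/EvalStickmen code); the function names are introduced here for clarity.

```python
def iou(E, G):
    """PASCAL VOC overlap between two windows [minx, miny, maxx, maxy]:
    area of intersection divided by area of union."""
    ix = max(0.0, min(E[2], G[2]) - max(E[0], G[0]))
    iy = max(0.0, min(E[3], G[3]) - max(E[1], G[1]))
    inter = ix * iy
    area_e = (E[2] - E[0]) * (E[3] - E[1])
    area_g = (G[2] - G[0]) * (G[3] - G[1])
    return inter / (area_e + area_g - inter)

def is_detected(E, G):
    """Criterion (a): estimated window E counts as a correct detection of
    GT window G if their overlap exceeds 50%."""
    return iou(E, G) > 0.5

def total_pcp(det_rate, pcp):
    """PCP over the whole test set: PCP over detected persons, scaled by
    the detection rate (the two numbers output by BatchEval)."""
    return det_rate * pcp
```

For instance, applying `total_pcp` to the quick-start numbers (detRate = 0.7505, PCP = 0.6857) gives the total PCP over all 549 persons.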
Clarification of the PCP evaluation criterion
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The Matlab code to evaluate PCP provided with this dataset represents the
official evaluation protocol for the following datasets: Buffy Stickmen,
ETHZ PASCAL Stickmen, We Are Family Stickmen.

In our PCP implementation, a body part produced by an algorithm is
considered correctly localized if its endpoints are closer to their
ground-truth locations than a threshold (on average over the two endpoints).
Using it ensures results comparable to the vast majority of results
previously reported on these datasets.

Recently an alternative implementation of the PCP criterion, based on a
stricter interpretation of its description in Ferrari et al. CVPR 2008, has
been used in some works, including Johnson et al. BMVC 2010 and Pishchulin
et al. CVPR 2012. In this implementation, a body part is considered correct
only if both of its endpoints are closer to their ground-truth locations
than a threshold. These two different PCP measures are the consequence of
the ambiguous wording in the original verbal description of PCP in Ferrari
et al. CVPR 2008 (which did not mention averaging over endpoints).
Importantly, the stricter PCP version has essentially been used only on
datasets other than the ones mentioned above, in particular on IIP
(Iterative Image Parsing dataset, Ramanan NIPS 2006) and LSP (Leeds Sports
Pose dataset, Johnson et al. BMVC 2010).

In order to keep a healthy research environment and guarantee the
comparability of results across different research groups and different
years, we recommend the following policy:

- use our evaluation code, which computes the original, looser PCP measure,
  on Buffy Stickmen, ETHZ PASCAL Stickmen and We Are Family Stickmen, i.e.
  essentially on all datasets released by us
- some other datasets unfortunately have no official evaluation code
  released with them, and therefore it is harder to establish an exact and
  fully official protocol.
Nonetheless, based on the protocols followed by most papers that have
appeared so far, we recommend using the strict PCP measure on IIP and LSP.
A precise definition of the strict PCP measure can be found in Pishchulin
et al. CVPR 2012.

D. Ramanan. "Learning to Parse Images of Articulated Objects", In NIPS, 2006.
S. Johnson and M. Everingham. "Clustered Pose and Nonlinear Appearance
Models for Human Pose Estimation", In BMVC, 2010.
L. Pishchulin, A. Jain, M. Andriluka, T. Thormaehlen and B. Schiele.
"Articulated People Detection and Pose Estimation: Reshaping the Future",
In CVPR, 2012.

Performance of [6]
~~~~~~~~~~~~~~~~~~
For convenience we provide a figure containing PCP performance curves for
our method in [6], in PNG format («dir_root»/PCP_techrep2010_Pascal.png) and
as a Matlab figure («dir_root»/PCP_techrep2010_Pascal.fig). The curve was
obtained by varying the accuracy threshold t (as discussed above). The PCP
curve shows how well a system does as the threshold t gets tighter, i.e. as
only more and more accurate pose estimates are accepted. You can reproduce
this curve with the calcPCPcurve routine (which loops over BatchEval for a
range of thresholds t).

Results of [6]
~~~~~~~~~~~~~~
We also provide a data structure containing our results from [6]. It is
stored in this mat-file:

«dir_root»/techrep2010_ethzpascal_results.mat

Load it in Matlab (version 7 or later) by executing:

load('«dir_root»/techrep2010_ethzpascal_results.mat')

Results are provided for our best system from [6]:

techrep2010_ethzpascal: (1x549 struct)
  .filename: image filename
  .stickmen: (1xD struct) of results for .filename (D = number of detection
             windows in this image)
    .coor: 4x6 array of stick coordinates in the same order as GT (see above)
    .det:  [minx miny maxx maxy] coordinates of the detection window

This data can be used to reproduce our performance curve (using the
calcPCPcurve routine).
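To make the loose-vs-strict distinction from the clarification section above concrete, here is a minimal Python sketch of the per-part test under both interpretations (illustrative only; the provided Matlab code is the official implementation of the loose measure):

```python
import math

def pcp_part_correct(est, gt, t=0.5, strict=False):
    """Loose PCP (official for the Stickmen datasets): the AVERAGE of the
    two endpoint errors must be below t * GT segment length.
    Strict PCP (as used on IIP/LSP): BOTH endpoint errors must be below
    t * GT segment length.
    est, gt: ((x1, y1), (x2, y2)) endpoint pairs of one body part."""
    (gx1, gy1), (gx2, gy2) = gt
    length = math.hypot(gx2 - gx1, gy2 - gy1)
    d1 = math.hypot(est[0][0] - gx1, est[0][1] - gy1)
    d2 = math.hypot(est[1][0] - gx2, est[1][1] - gy2)
    if strict:
        return d1 <= t * length and d2 <= t * length
    return (d1 + d2) / 2.0 <= t * length

# A part with one accurate and one sloppy endpoint can pass the loose
# measure while failing the strict one:
gt = ((0.0, 0.0), (10.0, 0.0))    # GT segment of length 10
est = ((0.0, 0.0), (10.0, 8.0))   # endpoint errors: 0 and 8
```

At t=0.5 the threshold is 5 pixels: the average error (4) passes the loose test, while the second endpoint's error (8) fails the strict one, so the two measures disagree on this part.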
The detection windows provided along with the estimated stickmen were
obtained using the Calvin upper-body detector [7]. You might want to use
these detection windows as input to your own human pose estimator, to ensure
an exact comparison to [6] in terms of PCP.

Pose estimation prototype routine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Additionally, we provide a dummy pose estimation routine
(DummyPascalPoseEstimationPipeline) that outputs data in the format required
by BatchEval. It is included to demonstrate how to produce data formatted
for BatchEval. To fully understand the structure of this input data, we
recommend you look into the variables in
techrep2010_ethzpascal_results.mat.

% example use:
Dummy = DummyPascalPoseEstimationPipeline('../images');
[detRate PCP] = BatchEval(@detBBFromStickmanPascal,@EvalStickmen,Dummy,GTPascal)

This will produce some low random values (around 5-10%) for both detRate
and PCP (at t=50%).

Support
~~~~~~~
For any query, suggestion or complaint, or simply to say you like/use the
annotations and software, just drop us an email:

eichner@vision.ee.ethz.ch
vferrari@staffmail.ed.ac.uk

References
~~~~~~~~~~
[1] Progressive search space reduction for pose estimation
    V. Ferrari, M. J. Marin-Jimenez and A. Zisserman
    Proceedings of the IEEE Conference on Computer Vision and Pattern
    Recognition, June 2008.
[2] 2D Human Pose Estimation in TV Shows
    V. Ferrari, M. J. Marin-Jimenez and A. Zisserman
    International Dagstuhl Seminar, Dagstuhl, Germany, July 2008.
[3] Pose search: retrieving people using their pose
    V. Ferrari, M. J. Marin-Jimenez and A. Zisserman
    Proceedings of the IEEE Conference on Computer Vision and Pattern
    Recognition, June 2009.
[4] Better appearance models for pictorial structures
    M. Eichner and V. Ferrari
    British Machine Vision Conference, September 2009.
[5] The PASCAL Visual Object Classes Challenge 2008 (VOC2008) Results
    M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn and A. Zisserman
    http://www.pascal-network.org/challenges/VOC/voc2008/workshop/index.html, 2008.
[6] 2D Articulated Human Pose Estimation and Retrieval in (Almost)
    Unconstrained Still Images
    M. Eichner, M. Marin-Jimenez, A. Zisserman and V. Ferrari
    International Journal of Computer Vision, 2012.
[7] Calvin upper-body detector
    http://www.vision.ee.ethz.ch/~calvin/calvin_upperbody_detector/

Version History
~~~~~~~~~~~~~~~
Version 1.11
------------
- PCP clarification incorporated into this readme file

Version 1.1
-----------
- results updated to [6]

Version 1.0
-----------
- initial release (containing results from [4])