ETHZ PASCAL Stickmen V1.11
Annotated data and evaluation routines for 2D human pose estimation
Marcin Eichner, Vittorio Ferrari
Overview
[Figure: dataset sticks distribution, a scatter plot of the annotated stickmen after neck-centering and scale normalization]
We release here annotations for 549 images from the PASCAL VOC 2008 trainval release [2]. The dataset consists mainly of amateur photographs with difficult illumination and low image quality. In each image, one roughly upright and approximately frontal person is annotated with a 6-part stickman, i.e. one line segment per body part indicating its location, size and orientation; the parts are head, torso, and upper and lower arms. The annotated person is visible at least from the waist up. Results on this dataset were first published in [1].
NOTE: this dataset has no overlap with VOC08/09/10 test sets.
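For illustration, the following sketch shows one way to represent a single 6-part stickman in memory: six line segments, each given by its two endpoints in image coordinates. The part ordering, array layout and coordinate values below are assumptions made for this sketch only; the actual annotation file format and the official Matlab reader are described in README.txt inside the archive.

```python
# Illustration only: one 6-part stickman as a 4 x 6 array, one column per part,
# each column holding the two endpoints (x1, y1, x2, y2) of that part's segment.
# Part order and the example coordinates are assumptions, not the official format.
import numpy as np

PART_NAMES = ["torso", "left upper arm", "right upper arm",
              "left lower arm", "right lower arm", "head"]

sticks = np.array([
    [310., 250., 370., 210., 400., 300.],   # x1 of each part
    [120., 130., 130., 200., 190.,  60.],   # y1 of each part
    [315., 210., 400., 180., 430., 320.],   # x2 of each part
    [230., 200., 190., 260., 250., 115.],   # y2 of each part
])

for name, (x1, y1, x2, y2) in zip(PART_NAMES, sticks.T):
    length = np.hypot(x2 - x1, y2 - y1)
    print(f"{name:16s} ({x1:.0f},{y1:.0f})-({x2:.0f},{y2:.0f}), length {length:.1f} px")
```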
In addition, the package includes official Matlab routines to evaluate the performance of your pose estimation system on this dataset and compare to our results from [3].
The scatter plot in the figure above, inspired by [5], depicts the pose variability over this dataset. Stickmen are centered on the neck and scale normalized, so the plot captures only pose variability and does not show scale or location variability.
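As a rough sketch of this normalization (not the code used to produce the plot), and assuming the 4 x 6 stick representation from the example above, with the neck taken as the torso's upper endpoint and scale given by the torso length:

```python
# Sketch of neck-centering and scale normalization for a single stickman.
# Assumptions: torso is column 0, its (x1, y1) endpoint is the neck, and the
# torso length is used as the scale reference; the actual plot may differ.
import numpy as np

def normalize_stickman(sticks):
    """sticks: 4 x 6 array with rows (x1, y1, x2, y2), one column per part."""
    torso = sticks[:, 0]
    neck_x, neck_y = torso[0], torso[1]
    torso_len = np.hypot(torso[2] - torso[0], torso[3] - torso[1])
    out = sticks.copy()
    out[[0, 2], :] -= neck_x      # shift x coordinates so the neck is at the origin
    out[[1, 3], :] -= neck_y      # shift y coordinates likewise
    return out / torso_len        # remove scale by dividing by the torso length
```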
Clarification of the PCP evaluation criterion
The Matlab code to evaluate PCP provided with this dataset is the official evaluation protocol for the following datasets: Buffy Stickmen, ETHZ PASCAL Stickmen, and We Are Family Stickmen. In our PCP implementation, a body part produced by an algorithm is considered correctly localized if its endpoints are closer to their ground-truth locations than a threshold, on average over the two endpoints. Using it ensures that your results are comparable to the vast majority of results previously reported on these datasets.
Recently, an alternative implementation of the PCP criterion, based on a stricter interpretation of its description in Ferrari et al., CVPR 2008, has been used in some works, including Johnson et al., BMVC 2010 and Pishchulin et al., CVPR 2012. In this implementation, a body part is considered correct only if both of its endpoints are closer to their ground-truth locations than the threshold. These two different PCP measures are a consequence of the ambiguous wording in the original verbal description of PCP in Ferrari et al., CVPR 2008, which did not mention averaging over endpoints. Importantly, the stricter PCP version has essentially been used only on datasets other than the ones mentioned above, in particular on IIP (Iterative Image Parsing dataset, Ramanan, NIPS 2006) and LSP (Leeds Sports Pose dataset, Johnson et al., BMVC 2010).
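The difference between the two variants can be sketched as follows for a single body part. This is only an illustration, not the official Matlab code shipped with the dataset: it assumes a fixed endpoint correspondence and a threshold expressed as a fraction of the ground-truth part length (e.g. 0.5 for the commonly reported PCP at 0.5), and it ignores details such as matching estimates to detection windows.

```python
# Loose (used by our official code) vs strict PCP check for one body part.
import numpy as np

def part_correct(est, gt, thresh_frac=0.5, strict=False):
    """est, gt: (x1, y1, x2, y2) endpoints of the estimated / ground-truth part."""
    est, gt = np.asarray(est, float), np.asarray(gt, float)
    gt_len = np.hypot(gt[2] - gt[0], gt[3] - gt[1])       # ground-truth part length
    err1 = np.hypot(est[0] - gt[0], est[1] - gt[1])       # error at first endpoint
    err2 = np.hypot(est[2] - gt[2], est[3] - gt[3])       # error at second endpoint
    if strict:
        # strict PCP: both endpoints must individually be within the threshold
        return err1 <= thresh_frac * gt_len and err2 <= thresh_frac * gt_len
    # loose PCP: the average of the two endpoint errors must be within the threshold
    return 0.5 * (err1 + err2) <= thresh_frac * gt_len
```

With the same estimate, the loose variant can accept a part whose one endpoint lies slightly outside the threshold as long as the other lies well inside, whereas the strict variant rejects it; this is why results obtained with the two measures are not directly comparable.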
In order to keep a healthy research environment and guarantee the comparability of results across different research groups and different years, we recommend the following policy:
- use our evaluation code, which computes the original, looser PCP measure, on Buffy Stickmen, ETHZ PASCAL Stickmen and We Are Family Stickmen, i.e. on all datasets released by us
- some other datasets unfortunately have no official evaluation code released with them, so it is harder to establish an exact, fully official protocol. Nonetheless, based on the protocols followed by most papers that have appeared so far, we recommend using the strict PCP measure on IIP and LSP. A precise definition of the strict PCP measure can be found in Pishchulin et al., CVPR 2012.
D. Ramanan, "Learning to Parse Images of Articulated Bodies", In NIPS, 2006.
S. Johnson and M. Everingham, "Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation", In BMVC, 2010.
L. Pishchulin, A. Jain, M. Andriluka, T. Thormaehlen and B. Schiele, "Articulated People Detection and Pose Estimation: Reshaping the Future", In CVPR, 2012.
History
new in v1.11:
- PCP clarification incorporated into README.txt
new in v1.1:
- results updated to those of [3], based on the detection windows obtained using [4]
Downloads
Filename | Description | Size
---|---|---
ETHZ_PASCAL_Stickmen_v1.11.tgz | Annotations for the included frames and Matlab code to read and display the annotations and to evaluate pose estimation performance. | 79 MB
README.txt | Description of the contents. | 14 kB
PCP_techrep2010_Pascal.png | Plot showing the pose estimation performance of [3] on this dataset. | 13 kB
PCP_techrep2010_Pascal.fig | Matlab figure. You can overlay your performance curve on this plot to compare against our results from [3]. | 18 kB
Related Publications
[1] M. Eichner and V. Ferrari
Better Appearance Models for Pictorial Structures
In Proceedings of the British Machine Vision Conference (BMVC), 2009.
[2] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman.
The PASCAL Visual Object Classes Challenge 2008 (VOC2008) Results.
Webpage, 2008.
[3] M. Eichner, M. Marin-Jimenez, A. Zisserman and V. Ferrari
2D Articulated Human Pose Estimation and Retrieval in (Almost) Unconstrained Still Images
International Journal of Computer Vision, 2012.
[4] Calvin upper-body detector
Webpage, 2010.
[5] D. Tran and D. Forsyth
Improved Human Parsing with a Full Relational Model
In Proceedings of the European Conference on Computer Vision (ECCV), 2010.
Acknowledgements
This work is funded by the Swiss National Science Foundation (SNSF).