Approaches to image-based articulated pose estimation and tracking typically deal only with pose, and model the human body as a set of simple, and often known, geometric primitives (e.g., cylinders, ellipsoids, superquadrics). We argue that body shape is equally important. First, a realistic body shape allows for richer generative models; second, shape information is useful for estimating biometric measurements such as the weight and height of the person.
In this talk we describe a fully automatic method for recovering detailed body shape and pose from monocular and multi-view imagery. We use a low-dimensional articulated human shape model, called SCAPE, which accounts for shape variations across people as well as non-rigid, pose-specific shape variations. We show how the parameters of this model can be tractably estimated from either monocular or multi-view imagery. To this end, we use a combination of generative and discriminative methods. A discriminative method is used to estimate the coarse pose and shape of the person observed in the image. A generative method is then used to locally refine these estimates to achieve a better fit to the image observations. This two-stage process is effective in managing the complexity of the inference task. As a result, we are able to automatically estimate the parameters of the SCAPE model from images and, from this model, compute a number of key biometric measurements.
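The two-stage idea can be illustrated with a minimal toy sketch. It is not the SCAPE model or the method presented in the talk: the forward model `render`, the two parameters (one "pose" angle, one "shape" scale), the nearest-neighbour lookup standing in for the discriminative stage, and the pattern search standing in for the generative refinement are all assumptions chosen only to show a coarse estimate followed by a local fit to the observations.

```python
import math

def render(angle, scale, n=8):
    """Hypothetical forward model: maps (pose angle, shape scale)
    to n toy 'silhouette width' features."""
    return [scale * (1.0 + 0.5 * math.sin(angle + i)) for i in range(n)]

def error(params, observed):
    """Sum of squared differences between rendered and observed features."""
    predicted = render(*params)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def coarse_estimate(observed, training_set):
    """Stage 1 (discriminative stand-in): return the parameters of the
    stored example whose features are closest to the observation."""
    def dist(pair):
        feats, _ = pair
        return sum((f - o) ** 2 for f, o in zip(feats, observed))
    return min(training_set, key=dist)[1]

def refine(params, observed, step=0.25, tol=1e-6):
    """Stage 2 (generative stand-in): local pattern search that only
    accepts moves lowering the fit error, halving the step when stuck."""
    best = list(params)
    best_err = error(best, observed)
    while step > tol:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                cand = list(best)
                cand[i] += delta
                e = error(cand, observed)
                if e < best_err:
                    best, best_err, improved = cand, e, True
        if not improved:
            step /= 2
    return tuple(best), best_err

# Toy "training data": features rendered on a coarse parameter grid.
training_set = [
    (render(a, s), (a, s))
    for a in [k * 0.5 for k in range(-3, 4)]
    for s in [0.8, 1.0, 1.2, 1.4]
]

true_params = (0.37, 1.13)        # unknown to the estimator
observed = render(*true_params)   # noise-free toy observation

coarse = coarse_estimate(observed, training_set)
refined, refined_err = refine(coarse, observed)
print("coarse:", coarse, "refined:", refined)
```

In this sketch the coarse stage only needs to land near the correct solution; the refinement then searches locally, which is why it can get away with a simple descent procedure rather than a global search over all parameters.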
Joint work with Alexandru Balan (Brown University) and Michael J. Black (Brown University).
