Thursday, May 4, 2017

Making Faces

Computer scientists have created the most accurate digital model of a human face. Here’s what it can do
Several faces randomly produced by the new 3D morphable model. James Booth, Imperial College London 

Science, by Matthew Hutson, May 1, 2017, 2:00 PM

If you’ve used the smartphone application Snapchat, you may have turned a photo of yourself into a disco bear or melded your face with someone else’s. Now, a group of researchers has created the most advanced technique yet for building 3D facial models on the computer. The system could improve personalized avatars in video games, facial recognition for security, and—of course—Snapchat filters.

When computers process faces, they sometimes rely on a so-called 3D morphable model (3DMM). The model represents an average face, but also contains information on common patterns of deviation from that average. For example, if you have a long nose, you’re also likely to have a long chin. Given such correlations, a computer can then characterize your unique face not by storing every point in a 3D scan, but by listing just a couple hundred numbers describing your deviation from an average face, including parameters that roughly correspond to age, gender, and length of face.
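The idea behind a 3DMM can be sketched with principal component analysis: stack many face scans as vectors, take their mean, and keep only the leading directions of variation, so each face is summarized by a few hundred coefficients. A minimal sketch with synthetic data (the array sizes and the choice of 200 components are illustrative, not from the paper):

```python
import numpy as np

# Hypothetical data: 500 training scans, each flattened to
# 1,000 3D vertices (x, y, z) -> 3,000 coordinates per face.
rng = np.random.default_rng(0)
faces = rng.normal(size=(500, 3000))

# The model's "average face" is the mean over all scans.
mean_face = faces.mean(axis=0)

# PCA (via SVD) captures common patterns of deviation from the
# average, e.g. a long nose co-occurring with a long chin.
centered = faces - mean_face
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)

# Keep the first 200 components: a face is now described by
# 200 numbers instead of 3,000 raw coordinates.
k = 200
coeffs = centered @ components[:k].T                  # encode a face
approx = mean_face + coeffs @ components[:k]          # decode it again

# Keeping every component recovers the scans exactly (up to
# floating-point error); truncation trades accuracy for compactness.
exact = mean_face + (centered @ components.T) @ components
```

Real 3DMMs work the same way, except the training scans are aligned meshes of real faces, so the leading components end up corresponding to interpretable traits such as age, gender, and face length.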

There’s a catch, however. To account for all the ways faces can vary, a 3DMM needs to integrate information on many faces. Until now that has required scanning lots of people and then painstakingly labeling all of their features. Consequently, the current best models are based on only a couple hundred people—mostly white adults—and have limited ability to model people of different ages and races.

Now, James Booth, a computer scientist at Imperial College London (ICL), and colleagues have developed a new method that automates the construction of 3DMMs and enables them to incorporate a wider spectrum of humanity. The method has three main steps. First, an algorithm automatically landmarks facial scans—labeling the tip of the nose and other points. Second, another algorithm lines up all the scans according to their landmarks and combines them into a model. Third, an algorithm detects and removes bad scans.
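The alignment in step two is typically a rigid Procrustes fit: find the rotation and translation that best map one scan's landmarks onto another's. A minimal sketch using the Kabsch method on synthetic landmarks (not the authors' code; the 68-landmark count is just a common convention):

```python
import numpy as np

def procrustes_align(source, target):
    """Rigidly align one set of 3D landmarks to another
    (rotation + translation, least-squares Kabsch method)."""
    src_c, tgt_c = source.mean(axis=0), target.mean(axis=0)
    A, B = source - src_c, target - tgt_c
    U, _, Vt = np.linalg.svd(A.T @ B)
    R = U @ Vt
    if np.linalg.det(R) < 0:        # guard against reflections
        U[:, -1] *= -1
        R = U @ Vt
    return (source - src_c) @ R + tgt_c

# Synthetic check: a rotated, shifted copy aligns back onto the original.
rng = np.random.default_rng(1)
landmarks = rng.normal(size=(68, 3))     # e.g. 68 facial landmarks
theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
moved = landmarks @ Rz.T + np.array([5.0, -2.0, 1.0])
aligned = procrustes_align(moved, landmarks)
```

Once every scan is brought into this common pose, averaging and PCA across corresponding vertices become meaningful, which is what makes the fully automatic pipeline possible.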

“The really big contribution in this work is they show how to fully automate this process,” says William Smith, who studies computer vision at the University of York in the United Kingdom and was not involved in the study. Labeling dozens of facial features on many faces is “pretty tedious,” says Alan Brunton, a computer scientist at the Fraunhofer Institute for Computer Graphics Research in Darmstadt, Germany, who was also uninvolved. “You think it’s relatively easy to click a point, but it’s not always obvious where the corner of the mouth really is, so even when you do this manually you have some error.”

But Booth and colleagues didn’t stop there. They applied their method to a set of nearly 10,000 demographically diverse facial scans. The scans were done at a science museum in London by the plastic surgeons Allan Ponniah and David Dunaway, who hoped to improve reconstructive surgery. They approached Stefanos Zafeiriou, a computer scientist at ICL, for help analyzing the data. Applying the algorithm to those scans created what they call the “large scale facial model,” or LSFM. In tests against existing models, the LSFM represented faces much more accurately, the authors report in a forthcoming issue of the International Journal of Computer Vision. In one comparison, they created a model of a child’s face from a photograph. Using the LSFM, the model looked like the child. Using one of the most popular existing morphable models—which is based entirely on adults—it looked like an unrelated grown-up. Booth and his colleagues even had enough scans to create more specific morphable models for different races and ages. And their model can automatically classify faces into age groups based on shape.

Booth’s team has already put the new model to work. In another paper, the researchers use 100,000 faces synthesized by their LSFM to train an artificial intelligence program to turn casual 2D snapshots into accurate 3D models. The method could be used to view what a criminal suspect caught on camera would look like from a different angle, or 20 years older. One could also flesh out and animate historical figures from portraits.

The LSFM may soon have medical applications, too. If someone has lost a nose, the technology could help plastic surgeons determine how a new one should look, given the rest of the face. Facial scans have also been used to identify possible genetic diseases such as Williams syndrome, a condition associated with heart problems, developmental delays, and facial features such as a short nose and wide mouth. A better model of faces and their variation could enhance the sensitivity of such tests. The new model “opens several more doors,” Ponniah says.

One next step is to include facial expressions in the models, which would allow for recognition of faces in any form of grimace or sneer. Zafeiriou says they are currently back at the museum, scanning more visitors.

Reconstructing Detailed Dynamic Face Geometry from Monocular Video - SIGGRAPH Asia 2013
Christian Theobalt, published on Feb 26, 2016 (video, 5 min. 26 sec.)

Detailed facial performance geometry can be reconstructed using dense camera and light setups in controlled studios. However, a wide range of important applications cannot employ these approaches, including all movie productions shot from a single principal camera. For post-production, these require dynamic monocular face capture for appearance modification. We present a new method for capturing face geometry from monocular video. Our approach captures detailed, dynamic, spatio-temporally coherent 3D face geometry without the need for markers. It works under uncontrolled lighting, and it successfully reconstructs expressive motion including high-frequency face detail such as folds and laugh lines. After simple manual initialization, the capturing process is fully automatic, which makes it versatile, lightweight and easy-to-deploy. Our approach tracks accurate sparse 2D features between automatically selected key frames to animate a parametric blend shape model, which is further refined in pose, expression and shape by temporally coherent optical flow and photometric stereo. We demonstrate performance capture results for long and complex face sequences captured indoors and outdoors, and we exemplify the relevance of our approach as an enabling technology for model-based face editing in movies and video, such as adding new facial textures, as well as a step towards enabling everyone to do facial performance capture with a single affordable camera.
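The parametric blend shape model mentioned in the abstract is a standard linear rig: a neutral mesh plus weighted per-vertex offsets toward a handful of extreme expression targets. A minimal sketch with synthetic meshes (the vertex count and the four expression targets are illustrative assumptions):

```python
import numpy as np

# Hypothetical rig: a neutral face mesh plus a few extreme
# expressions (e.g. "smile", "jaw open"), all as vertex arrays.
rng = np.random.default_rng(2)
n_vertices = 1000
neutral = rng.normal(size=(n_vertices, 3))
shapes = rng.normal(size=(4, n_vertices, 3))   # 4 expression targets

def blend(weights):
    """Linear blend shape model: neutral pose plus weighted
    per-vertex offsets toward each expression target."""
    offsets = shapes - neutral                 # (4, n_vertices, 3)
    return neutral + np.tensordot(weights, offsets, axes=1)

# All-zero weights give the neutral face; weight 1 on one shape
# reproduces that target exactly.
frame = blend(np.array([0.0, 1.0, 0.0, 0.0]))
```

Tracking then reduces to estimating a small weight vector per video frame, which is what lets sparse 2D features and optical flow drive a full 3D face.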

P. Garrido, L. Valgaerts, C. Wu, C. Theobalt, Reconstructing Detailed Dynamic Face Geometry from Monocular Video, In ACM Transactions on Graphics (Proc. of SIGGRAPH Asia) 32, 158:1-158:10 (2013).
