Face Alignment


The experts below are selected from a list of 16,902 experts worldwide, ranked by the ideXlab platform.

Georgios Tzimiropoulos - One of the best experts on this subject based on the ideXlab platform.

  • Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources
    International Conference on Computer Vision, 2017
    Co-Authors: Adrian Bulat, Georgios Tzimiropoulos
    Abstract:

    Our goal is to design architectures that retain the groundbreaking performance of CNNs for landmark localization and at the same time are lightweight, compact and suitable for applications with limited computational resources. To this end, we make the following contributions: (a) we are the first to study the effect of neural network binarization on localization tasks, namely human pose estimation and face alignment. We exhaustively evaluate various design choices, identify performance bottlenecks and, more importantly, propose multiple orthogonal ways to boost performance. (b) Based on our analysis, we propose a novel hierarchical, parallel and multi-scale residual architecture that yields a large performance improvement over the standard bottleneck block while having the same number of parameters, thus bridging the gap between the original network and its binarized counterpart. (c) We perform a large number of ablation studies that shed light on the properties and the performance of the proposed block. (d) We present results for experiments on the most challenging datasets for human pose estimation and face alignment, reporting in many cases state-of-the-art performance. Code can be downloaded from https://www.adrianbulat.com/binary-cnn-landmarks
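
The binarization idea the abstract refers to can be illustrated with a dependency-free toy sketch of XNOR-Net-style weight binarization, which this line of work builds on: a real-valued vector W is approximated as alpha * sign(W) with alpha = mean(|W|), so dot products reduce to cheap sign arithmetic plus one scaling. All function names here are illustrative, not from the released code.

```python
# Illustrative XNOR-Net-style binarization: W ~= alpha * sign(W),
# where alpha = mean(|W|) is a per-filter scaling factor.
def binarize(weights):
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1.0 if w >= 0 else -1.0 for w in weights]
    return alpha, signs

# Dot product of two binarized vectors: only the sign products remain as
# "multiplications" (XNOR/popcount in a real 1-bit implementation).
def binary_dot(alpha_w, signs_w, alpha_x, signs_x):
    return alpha_w * alpha_x * sum(a * b for a, b in zip(signs_w, signs_x))

w = [0.5, -0.25, 0.75, -1.0]
x = [1.0, 2.0, -1.5, 0.5]
aw, sw = binarize(w)
ax, sx = binarize(x)
approx = binary_dot(aw, sw, ax, sx)        # -1.5625
exact = sum(a * b for a, b in zip(w, x))   # -1.625
```

The gap between `approx` and `exact` is the quantization error that the paper's proposed residual block is designed to compensate for.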

  • How Far Are We From Solving the 2D & 3D Face Alignment Problem? (And a Dataset of 230,000 3D Facial Landmarks)
    International Conference on Computer Vision, 2017
    Co-Authors: Adrian Bulat, Georgios Tzimiropoulos
    Abstract:

    This paper investigates how far a very deep neural network is from attaining close-to-saturating performance on existing 2D and 3D face alignment datasets. To this end, we make the following five contributions: (a) we construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large yet synthetically expanded 2D facial landmark dataset, and finally evaluate it on all other 2D facial landmark datasets. (b) We create a network guided by 2D landmarks that converts 2D landmark annotations to 3D and unifies all existing datasets, leading to the creation of LS3D-W, the largest and most challenging 3D facial landmark dataset to date (~230,000 images). (c) Following that, we train a neural network for 3D face alignment and evaluate it on the newly introduced LS3D-W. (d) We further look into the effect of all “traditional” factors affecting face alignment performance, such as large pose, initialization and resolution, and introduce a “new” one, namely the size of the network. (e) We show that both 2D and 3D face alignment networks achieve performance of remarkable accuracy, which is probably close to saturating the datasets used. Training and testing code, as well as the dataset, can be downloaded from https://www.adrianbulat.com/Face-Alignment/
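
Face alignment results on datasets such as LS3D-W are typically reported as a normalized mean error (NME): the mean point-to-point landmark error divided by a face-size normaliser such as the bounding-box diagonal. A minimal sketch, with a hypothetical function name:

```python
import math

# Hypothetical helper: normalized mean error (NME), the standard
# face-alignment metric -- mean Euclidean landmark error divided by
# a face-size normaliser.
def nme(pred, gt, norm):
    """pred/gt: lists of (x, y) landmarks; norm: e.g. bbox diagonal."""
    errs = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(errs) / (len(errs) * norm)

# Two landmarks, one predicted 5 pixels off, normalised by a face size of 10:
score = nme([(0, 0), (3, 4)], [(0, 0), (0, 0)], norm=10.0)  # -> 0.25
```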

  • How Far Are We From Solving the 2D & 3D Face Alignment Problem? (And a Dataset of 230,000 3D Facial Landmarks)
    arXiv: Computer Vision and Pattern Recognition, 2017
    Co-Authors: Adrian Bulat, Georgios Tzimiropoulos
    Abstract:

    This paper investigates how far a very deep neural network is from attaining close-to-saturating performance on existing 2D and 3D face alignment datasets. To this end, we make the following five contributions: (a) we construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large yet synthetically expanded 2D facial landmark dataset, and finally evaluate it on all other 2D facial landmark datasets. (b) We create a network guided by 2D landmarks that converts 2D landmark annotations to 3D and unifies all existing datasets, leading to the creation of LS3D-W, the largest and most challenging 3D facial landmark dataset to date (~230,000 images). (c) Following that, we train a neural network for 3D face alignment and evaluate it on the newly introduced LS3D-W. (d) We further look into the effect of all "traditional" factors affecting face alignment performance, such as large pose, initialization and resolution, and introduce a "new" one, namely the size of the network. (e) We show that both 2D and 3D face alignment networks achieve performance of remarkable accuracy, which is probably close to saturating the datasets used. Training and testing code, as well as the dataset, can be downloaded from this https URL

  • Convolutional Aggregation of Local Evidence for Large Pose Face Alignment
    British Machine Vision Conference, 2016
    Co-Authors: Adrian Bulat, Georgios Tzimiropoulos
    Abstract:

    Methods for unconstrained face alignment must satisfy two requirements: they must not rely on accurate initialisation/face detection, and they should perform equally well across the whole spectrum of facial poses. To the best of our knowledge, there are no methods meeting these requirements to a satisfactory extent, and in this paper we propose Convolutional Aggregation of Local Evidence (CALE), a Convolutional Neural Network (CNN) architecture designed to address both of them. In particular, to remove the requirement for accurate face detection, our system first performs facial part detection, providing confidence scores for the location of each facial landmark (local evidence). Next, these score maps, along with early CNN features, are aggregated by our system through joint regression in order to refine the landmarks’ locations. Besides playing the role of a graphical model, CNN regression is a key feature of our system, guiding the network to rely on context for predicting the location of occluded landmarks, as typically encountered in very large poses. The whole system is trained end-to-end with intermediate supervision. When applied to AFLW-PIFA, the most challenging human face alignment test set to date, our method provides more than a 50% gain in localisation accuracy compared to other recently published methods for large-pose face alignment. Going beyond human faces, we also demonstrate that CALE is effective in dealing with very large changes in shape and appearance, as typically encountered in animal faces.
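
The first stage described above produces one score map per landmark. The simplest way to read a landmark location out of such a map is to take the argmax (CALE itself goes further and refines the maps through joint regression). A toy decoder with illustrative names:

```python
# Toy heatmap decoder (illustrative): return the (row, col) of the maximum
# response in a single landmark's 2-D score map, plus the confidence there.
def decode_heatmap(scores):
    best = max(
        ((r, c, v) for r, row in enumerate(scores) for c, v in enumerate(row)),
        key=lambda t: t[2],
    )
    return (best[0], best[1]), best[2]

# A 2x2 score map whose strongest response is at row 1, col 0:
loc, conf = decode_heatmap([[0.1, 0.2], [0.9, 0.3]])  # -> (1, 0), 0.9
```

The per-landmark confidence returned here is the "local evidence" that a joint model can then weigh against context from the other landmarks.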

  • Gauss-Newton Deformable Part Models for Face Alignment In-the-Wild
    Computer Vision and Pattern Recognition, 2014
    Co-Authors: Georgios Tzimiropoulos, Maja Pantic
    Abstract:

    Arguably, Deformable Part Models (DPMs) are among the most prominent approaches to face alignment, with impressive results recently reported for both controlled lab and unconstrained settings. Fitting in most DPM methods is typically formulated as a two-step process, during which discriminatively trained part templates are first correlated with the image to yield a filter response for each landmark, and shape optimization is then performed over these filter responses. This process, although computationally efficient, is based on fixed part templates which are assumed to be independent, and has been shown to result in imperfect filter responses and detection ambiguities. To address this limitation, in this paper we propose to jointly optimize a part-based, flexible appearance model trained in-the-wild along with a global shape model, which results in a joint translational motion model for the model parts, via Gauss-Newton (GN) optimization. We show how significant computational reductions can be achieved by building a full model during training but then efficiently optimizing the proposed cost function on a sparse grid using weighted least-squares during fitting. We coin the proposed formulation the Gauss-Newton Deformable Part Model (GN-DPM). Finally, we compare its performance against the state-of-the-art and show that the proposed GN-DPM outperforms it, in some cases by a large margin. Code for our method is available from http://ibug.doc.ic.ac.uk/resources
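
The Gauss-Newton optimization at the heart of GN-DPM repeatedly linearises the residual and solves a least-squares system. A one-parameter toy version (fitting y = exp(a*x); the model and names are illustrative, not the paper's actual cost function):

```python
import math

# Toy Gauss-Newton loop for a 1-parameter model y ~= exp(a * x).
# Each iteration linearises the residual r(a) around the current estimate
# and solves the (here scalar) normal equation  J^T J * da = -J^T r.
def gauss_newton(xs, ys, a=0.0, iters=10):
    for _ in range(iters):
        r = [math.exp(a * x) - y for x, y in zip(xs, ys)]   # residuals
        J = [x * math.exp(a * x) for x in xs]               # d(residual)/da
        a -= sum(j * ri for j, ri in zip(J, r)) / sum(j * j for j in J)
    return a

xs = [0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.7 * x) for x in xs]   # data generated with a = 0.7
a_hat = gauss_newton(xs, ys)           # converges to ~0.7
```

In GN-DPM the unknowns are shape and appearance parameters rather than a scalar, and the paper's sparse-grid weighted least-squares makes exactly this inner solve cheap.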

Adrian Bulat - One of the best experts on this subject based on the ideXlab platform.

  • Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources
    International Conference on Computer Vision, 2017
    Co-Authors: Adrian Bulat, Georgios Tzimiropoulos
    Abstract:

    Our goal is to design architectures that retain the groundbreaking performance of CNNs for landmark localization and at the same time are lightweight, compact and suitable for applications with limited computational resources. To this end, we make the following contributions: (a) we are the first to study the effect of neural network binarization on localization tasks, namely human pose estimation and face alignment. We exhaustively evaluate various design choices, identify performance bottlenecks and, more importantly, propose multiple orthogonal ways to boost performance. (b) Based on our analysis, we propose a novel hierarchical, parallel and multi-scale residual architecture that yields a large performance improvement over the standard bottleneck block while having the same number of parameters, thus bridging the gap between the original network and its binarized counterpart. (c) We perform a large number of ablation studies that shed light on the properties and the performance of the proposed block. (d) We present results for experiments on the most challenging datasets for human pose estimation and face alignment, reporting in many cases state-of-the-art performance. Code can be downloaded from https://www.adrianbulat.com/binary-cnn-landmarks

  • How Far Are We From Solving the 2D & 3D Face Alignment Problem? (And a Dataset of 230,000 3D Facial Landmarks)
    International Conference on Computer Vision, 2017
    Co-Authors: Adrian Bulat, Georgios Tzimiropoulos
    Abstract:

    This paper investigates how far a very deep neural network is from attaining close-to-saturating performance on existing 2D and 3D face alignment datasets. To this end, we make the following five contributions: (a) we construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large yet synthetically expanded 2D facial landmark dataset, and finally evaluate it on all other 2D facial landmark datasets. (b) We create a network guided by 2D landmarks that converts 2D landmark annotations to 3D and unifies all existing datasets, leading to the creation of LS3D-W, the largest and most challenging 3D facial landmark dataset to date (~230,000 images). (c) Following that, we train a neural network for 3D face alignment and evaluate it on the newly introduced LS3D-W. (d) We further look into the effect of all “traditional” factors affecting face alignment performance, such as large pose, initialization and resolution, and introduce a “new” one, namely the size of the network. (e) We show that both 2D and 3D face alignment networks achieve performance of remarkable accuracy, which is probably close to saturating the datasets used. Training and testing code, as well as the dataset, can be downloaded from https://www.adrianbulat.com/Face-Alignment/

  • How Far Are We From Solving the 2D & 3D Face Alignment Problem? (And a Dataset of 230,000 3D Facial Landmarks)
    arXiv: Computer Vision and Pattern Recognition, 2017
    Co-Authors: Adrian Bulat, Georgios Tzimiropoulos
    Abstract:

    This paper investigates how far a very deep neural network is from attaining close-to-saturating performance on existing 2D and 3D face alignment datasets. To this end, we make the following five contributions: (a) we construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large yet synthetically expanded 2D facial landmark dataset, and finally evaluate it on all other 2D facial landmark datasets. (b) We create a network guided by 2D landmarks that converts 2D landmark annotations to 3D and unifies all existing datasets, leading to the creation of LS3D-W, the largest and most challenging 3D facial landmark dataset to date (~230,000 images). (c) Following that, we train a neural network for 3D face alignment and evaluate it on the newly introduced LS3D-W. (d) We further look into the effect of all "traditional" factors affecting face alignment performance, such as large pose, initialization and resolution, and introduce a "new" one, namely the size of the network. (e) We show that both 2D and 3D face alignment networks achieve performance of remarkable accuracy, which is probably close to saturating the datasets used. Training and testing code, as well as the dataset, can be downloaded from this https URL

  • Convolutional Aggregation of Local Evidence for Large Pose Face Alignment
    British Machine Vision Conference, 2016
    Co-Authors: Adrian Bulat, Georgios Tzimiropoulos
    Abstract:

    Methods for unconstrained face alignment must satisfy two requirements: they must not rely on accurate initialisation/face detection, and they should perform equally well across the whole spectrum of facial poses. To the best of our knowledge, there are no methods meeting these requirements to a satisfactory extent, and in this paper we propose Convolutional Aggregation of Local Evidence (CALE), a Convolutional Neural Network (CNN) architecture designed to address both of them. In particular, to remove the requirement for accurate face detection, our system first performs facial part detection, providing confidence scores for the location of each facial landmark (local evidence). Next, these score maps, along with early CNN features, are aggregated by our system through joint regression in order to refine the landmarks’ locations. Besides playing the role of a graphical model, CNN regression is a key feature of our system, guiding the network to rely on context for predicting the location of occluded landmarks, as typically encountered in very large poses. The whole system is trained end-to-end with intermediate supervision. When applied to AFLW-PIFA, the most challenging human face alignment test set to date, our method provides more than a 50% gain in localisation accuracy compared to other recently published methods for large-pose face alignment. Going beyond human faces, we also demonstrate that CALE is effective in dealing with very large changes in shape and appearance, as typically encountered in animal faces.

Jian Sun - One of the best experts on this subject based on the ideXlab platform.

  • Face Alignment via Regressing Local Binary Features
    IEEE Transactions on Image Processing, 2016
    Co-Authors: Shaoqing Ren, Xudong Cao, Yichen Wei, Jian Sun
    Abstract:

    This paper presents a highly efficient and accurate regression approach for face alignment. Our approach has two novel components: 1) a set of local binary features and 2) a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. This approach achieves state-of-the-art results when tested on the most challenging benchmarks to date. Furthermore, because extracting and regressing local binary features is computationally very cheap, our system is much faster than previous methods: it achieves over 3,000 frames per second (FPS) on a desktop, or 300 FPS on a mobile phone, for locating a few dozen landmarks. We also study a key issue that is important but has received little attention in previous research: the face detector used to initialize alignment. We investigate several face detectors and quantitatively evaluate how they affect alignment accuracy. We find that an alignment-friendly detector can further greatly boost the accuracy of our alignment method, reducing the error by up to 16% relative. To facilitate practical use of face detection/alignment methods, we also propose a convenient metric to measure how good a detector is for alignment initialization.
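
The two components named in the abstract can be caricatured in a few lines: sparse binary features from pixel-intensity comparison tests (standing in for the learned random-forest features), followed by a linear regressor over the binary vector. All names and numbers here are made up for illustration:

```python
# Stage 1 (illustrative): sparse binary features from pixel-intensity
# comparison tests around a landmark -- a stand-in for the learned
# random-forest features in the paper.
def binary_features(patch, tests):
    return [1 if patch[i] > patch[j] else 0 for (i, j) in tests]

# Stage 2: a linear regressor over the binary vector predicts a landmark
# update; weights and bias here are arbitrary example values.
def linear_update(features, weights, bias):
    return bias + sum(w * f for w, f in zip(weights, features))

feats = binary_features([10, 50, 30, 20], [(0, 1), (2, 3), (1, 2)])  # [0, 1, 1]
dx = linear_update(feats, [0.5, -1.0, 2.0], 0.1)                     # 1.1
```

Because each feature is a single comparison and the regression is one sparse dot product, both stages cost almost nothing per landmark, which is where the reported 3,000 FPS comes from.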

  • Joint Cascade Face Detection and Alignment
    European Conference on Computer Vision, 2014
    Co-Authors: Dong Chen, Shaoqing Ren, Xudong Cao, Yichen Wei, Jian Sun
    Abstract:

    We present a new state-of-the-art approach to face detection. The key idea is to combine face alignment with detection, observing that aligned face shapes provide better features for face classification. To make this combination more effective, our approach learns the two tasks jointly in the same cascade framework, exploiting recent advances in face alignment. Such joint learning greatly enhances the capability of cascade detection while retaining its real-time performance. Extensive experiments show that our approach achieves the best accuracy on challenging datasets, where all existing solutions are either inaccurate or too slow.
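
The cascade structure mentioned above owes its real-time speed to early rejection: each stage adds to a running score, and a candidate window is discarded as soon as the score falls below that stage's threshold. A minimal sketch, not the paper's actual classifier:

```python
# Illustrative cascade with early rejection: each stage adds to a running
# score; a candidate window is discarded the moment the score drops below
# that stage's threshold, so most windows never reach the expensive stages.
def cascade_classify(x, stages, thresholds):
    score = 0.0
    for stage, thr in zip(stages, thresholds):
        score += stage(x)
        if score < thr:
            return False  # rejected early
    return True           # survived every stage

stages = [lambda x: x, lambda x: -x / 2]
keep = cascade_classify(2.0, stages, [1.0, 0.5])   # True
drop = cascade_classify(0.5, stages, [1.0, 0.5])   # False (stage-1 reject)
```

The paper's contribution is to interleave alignment steps between such stages, so later stages score shape-aligned features instead of raw window pixels.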

  • Face Alignment at 3000 FPS via Regressing Local Binary Features
    Computer Vision and Pattern Recognition, 2014
    Co-Authors: Shaoqing Ren, Xudong Cao, Yichen Wei, Jian Sun
    Abstract:

    This paper presents a highly efficient and very accurate regression approach for face alignment. Our approach has two novel components: a set of local binary features, and a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. Our approach achieves state-of-the-art results when tested on the most challenging current benchmarks. Furthermore, because extracting and regressing local binary features is computationally very cheap, our system is much faster than previous methods: it achieves over 3,000 FPS on a desktop, or 300 FPS on a mobile phone, for locating a few dozen landmarks.

Shaoqing Ren - One of the best experts on this subject based on the ideXlab platform.

  • Face Alignment via Regressing Local Binary Features
    IEEE Transactions on Image Processing, 2016
    Co-Authors: Shaoqing Ren, Xudong Cao, Yichen Wei, Jian Sun
    Abstract:

    This paper presents a highly efficient and accurate regression approach for face alignment. Our approach has two novel components: 1) a set of local binary features and 2) a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. This approach achieves state-of-the-art results when tested on the most challenging benchmarks to date. Furthermore, because extracting and regressing local binary features is computationally very cheap, our system is much faster than previous methods: it achieves over 3,000 frames per second (FPS) on a desktop, or 300 FPS on a mobile phone, for locating a few dozen landmarks. We also study a key issue that is important but has received little attention in previous research: the face detector used to initialize alignment. We investigate several face detectors and quantitatively evaluate how they affect alignment accuracy. We find that an alignment-friendly detector can further greatly boost the accuracy of our alignment method, reducing the error by up to 16% relative. To facilitate practical use of face detection/alignment methods, we also propose a convenient metric to measure how good a detector is for alignment initialization.

  • Joint Cascade Face Detection and Alignment
    European Conference on Computer Vision, 2014
    Co-Authors: Dong Chen, Shaoqing Ren, Xudong Cao, Yichen Wei, Jian Sun
    Abstract:

    We present a new state-of-the-art approach to face detection. The key idea is to combine face alignment with detection, observing that aligned face shapes provide better features for face classification. To make this combination more effective, our approach learns the two tasks jointly in the same cascade framework, exploiting recent advances in face alignment. Such joint learning greatly enhances the capability of cascade detection while retaining its real-time performance. Extensive experiments show that our approach achieves the best accuracy on challenging datasets, where all existing solutions are either inaccurate or too slow.

  • Face Alignment at 3000 FPS via Regressing Local Binary Features
    Computer Vision and Pattern Recognition, 2014
    Co-Authors: Shaoqing Ren, Xudong Cao, Yichen Wei, Jian Sun
    Abstract:

    This paper presents a highly efficient and very accurate regression approach for face alignment. Our approach has two novel components: a set of local binary features, and a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. Our approach achieves state-of-the-art results when tested on the most challenging current benchmarks. Furthermore, because extracting and regressing local binary features is computationally very cheap, our system is much faster than previous methods: it achieves over 3,000 FPS on a desktop, or 300 FPS on a mobile phone, for locating a few dozen landmarks.

Zhiwen Shao - One of the best experts on this subject based on the ideXlab platform.

  • Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment
    European Conference on Computer Vision, 2018
    Co-Authors: Zhiwen Shao
    Abstract:

    Facial action unit (AU) detection and face alignment are two highly correlated tasks, since facial landmarks can provide precise AU locations that facilitate the extraction of meaningful local features for AU detection. Most existing AU detection works treat face alignment as a preprocessing step and handle the two tasks independently. In this paper, we propose a novel end-to-end deep learning framework for joint AU detection and face alignment, which has not been explored before. In particular, multi-scale shared features are learned first, and high-level face alignment features are fed into AU detection. Moreover, to extract precise local features, we propose an adaptive attention learning module that refines the attention map of each AU adaptively. Finally, the assembled local features are integrated with face alignment features and global features for AU detection. Experiments on the BP4D and DISFA benchmarks demonstrate that our framework significantly outperforms state-of-the-art methods for AU detection.
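
The attention idea can be sketched as a landmark-centred Gaussian map that reweights a feature map element-wise; the paper's module learns and adaptively refines such maps per AU, but the basic mechanics look like this (illustrative names, plain-Python stand-in):

```python
import math

# Illustrative attention map: a Gaussian centred on a predicted landmark
# (cy, cx); the paper's module *learns* and refines such maps per AU.
def attention_map(h, w, cy, cx, sigma):
    return [[math.exp(-((r - cy) ** 2 + (c - cx) ** 2) / (2 * sigma ** 2))
             for c in range(w)] for r in range(h)]

# Element-wise reweighting of a feature map by the attention map.
def apply_attention(features, att):
    return [[f * a for f, a in zip(frow, arow)]
            for frow, arow in zip(features, att)]

att = attention_map(3, 3, 1, 1, sigma=1.0)   # peak of 1.0 at (1, 1)
out = apply_attention([[2.0] * 3 for _ in range(3)], att)
```

This is why accurate landmarks from the alignment branch matter: the better the centre (cy, cx), the more precisely the AU branch's local features are focused.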