Stereo Matching

The Experts below are selected from a list of 17679 Experts worldwide ranked by ideXlab platform

Li Zhang - One of the best experts on this subject based on the ideXlab platform.

  • PMSC: PatchMatch-Based Superpixel Cut for Accurate Stereo Matching
    'Institute of Electrical and Electronics Engineers (IEEE)', 2019
    Co-Authors: Lincheng Li, Shunli Zhang, Xin Yu, Li Zhang
    Abstract:

    Estimating the disparity and normal direction of one pixel simultaneously, instead of only disparity, also known as 3D label methods, can achieve much higher subpixel accuracy in the Stereo Matching problem. However, it is extremely difficult to assign an appropriate 3D label to each pixel from the continuous label space $\mathbb{R}^{3}$ while maintaining global consistency because of the infinite parameter space. In this paper, we propose a novel algorithm called PatchMatch-based superpixel cut to assign 3D labels of an image more accurately. In order to achieve robust and precise Stereo Matching between local windows, we develop a bilayer Matching cost, where a bottom-up scheme is exploited to design the two layers. The bottom layer is employed to measure the similarity between small square patches locally by exploiting a pretrained convolutional neural network, and then, the top layer is developed to assemble the local Matching costs in large irregular windows induced by the tangent planes of object surfaces. To optimize the spatial smoothness of local assignments, we propose a novel strategy to update 3D labels. In the procedure of optimization, both segmentation information and random refinement of PatchMatch are exploited to update the candidate 3D label set for each pixel with high probability of achieving lower loss. Since pairwise energy of general candidate label sets violates the submodular property of graph cut, we propose a novel multilayer superpixel structure to group candidate label sets into candidate assignments, which thereby can be efficiently fused by $\alpha$-expansion graph cut. Extensive experiments demonstrate that our method can achieve higher subpixel accuracy in different data sets, and currently ranks first on the new challenging Middlebury 3.0 benchmark among all the existing methods. This work was supported by the National Natural Science Foundation of China under Grant 61172125, Grant 61132007, and Grant U153313

  • PMSC: PatchMatch-Based Superpixel Cut for Accurate Stereo Matching
    IEEE Transactions on Circuits and Systems for Video Technology, 2018
    Co-Authors: Shunli Zhang, Li Zhang
    Abstract:

    Estimating the disparity and normal direction of one pixel simultaneously, instead of only disparity, also known as 3D label methods, can achieve much higher subpixel accuracy in the Stereo Matching problem. However, it is extremely difficult to assign an appropriate 3D label to each pixel from the continuous label space $\mathbb{R}^{3}$ while maintaining global consistency because of the infinite parameter space. In this paper, we propose a novel algorithm called PatchMatch-based superpixel cut to assign 3D labels of an image more accurately. In order to achieve robust and precise Stereo Matching between local windows, we develop a bilayer Matching cost, where a bottom-up scheme is exploited to design the two layers. The bottom layer is employed to measure the similarity between small square patches locally by exploiting a pretrained convolutional neural network, and then, the top layer is developed to assemble the local Matching costs in large irregular windows induced by the tangent planes of object surfaces. To optimize the spatial smoothness of local assignments, we propose a novel strategy to update 3D labels. In the procedure of optimization, both segmentation information and random refinement of PatchMatch are exploited to update the candidate 3D label set for each pixel with high probability of achieving lower loss. Since pairwise energy of general candidate label sets violates the submodular property of graph cut, we propose a novel multilayer superpixel structure to group candidate label sets into candidate assignments, which thereby can be efficiently fused by $\alpha$-expansion graph cut. Extensive experiments demonstrate that our method can achieve higher subpixel accuracy in different data sets, and currently ranks first on the new challenging Middlebury 3.0 benchmark among all the existing methods.
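
    To make the "3D label" idea concrete, the sketch below shows one common way such a label can be represented: a disparity plane d(x, y) = a*x + b*y + c derived from a disparity value and a surface normal at a pixel. The abstract does not spell out the parameterization, so the plane form and the helper names `plane_from_normal` and `disparity_at` are assumptions for illustration only.

```python
def plane_from_normal(x0, y0, d0, normal):
    """Convert (disparity d0, normal) at pixel (x0, y0) into plane
    coefficients (a, b, c) with d(x, y) = a*x + b*y + c.

    Assumed parameterization (not taken from the abstract): the plane
    passes through (x0, y0, d0) and is orthogonal to `normal` in
    (x, y, disparity) space.
    """
    nx, ny, nz = normal
    if nz == 0:
        raise ValueError("normal must have a non-zero disparity component")
    a = -nx / nz
    b = -ny / nz
    c = (nx * x0 + ny * y0 + nz * d0) / nz
    return a, b, c


def disparity_at(label, x, y):
    """Evaluate the disparity plane `label` = (a, b, c) at pixel (x, y)."""
    a, b, c = label
    return a * x + b * y + c


# A fronto-parallel surface gives a constant disparity; a slanted one
# gives a disparity that changes linearly across the window.
flat = plane_from_normal(10, 20, 5.0, (0.0, 0.0, 1.0))
slanted = plane_from_normal(10, 20, 5.0, (1.0, 0.0, 2.0))
```

    Because every pixel in a window shares one plane rather than one integer disparity, the evaluated disparities are continuous, which is where the subpixel accuracy mentioned above comes from.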

  • A Locally Linear Regression Model for Boundary Preserving Regularization in Stereo Matching
    European Conference on Computer Vision, 2012
    Co-Authors: Shengqi Zhu, Li Zhang, Hailin Jin
    Abstract:

    We propose a novel regularization model for Stereo Matching that uses large neighborhood windows. The model is based on the observation that in a local neighborhood there exists a linear relationship between pixel values and disparities. Compared to the traditional boundary preserving regularization models that use adjacent pixels, the proposed model is robust to image noise and captures higher level interactions. We develop a globally optimized Stereo Matching algorithm based on this regularization model. The algorithm alternates between finding a quadratic upper bound of the relaxed energy function and solving the upper bound using iterative reweighted least squares. To reduce the chance of being trapped in local minima, we propose a progressive convex-hull filter to tighten the data cost relaxation. Our evaluation on the Middlebury datasets shows the effectiveness of our method in preserving boundary sharpness while keeping regions smooth. We also evaluate our method on a wide range of challenging real-world videos. Experimental results show that our method outperforms existing methods in temporal consistency.
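
    The observation behind this model, that disparity is approximately a linear function of pixel value inside a local neighborhood, can be illustrated with a small least-squares fit. This is a minimal sketch for scalar intensities; the function names, the ridge term `eps`, and the use of a plain closed-form fit are assumptions, not the optimization described in the abstract.

```python
def fit_local_linear(intensities, disparities, eps=1e-8):
    """Fit d ~ a * I + b over one window by regularized least squares.

    `eps` is a small ridge term (an assumption) that keeps the fit
    stable when the window has almost constant intensity.
    """
    n = len(intensities)
    mean_i = sum(intensities) / n
    mean_d = sum(disparities) / n
    var_i = sum((i - mean_i) ** 2 for i in intensities)
    cov_id = sum((i - mean_i) * (d - mean_d)
                 for i, d in zip(intensities, disparities))
    a = cov_id / (var_i + eps)
    b = mean_d - a * mean_i
    return a, b


def window_residual(intensities, disparities, eps=1e-8):
    """Sum of squared deviations from the fitted local linear model.

    A low value means the disparities in the window are well explained
    by intensity, so the window incurs little regularization penalty.
    """
    a, b = fit_local_linear(intensities, disparities, eps)
    return sum((d - (a * i + b)) ** 2
               for i, d in zip(intensities, disparities))
```

    A window that straddles a depth boundary aligned with an intensity edge still has a small residual, which is one way to read the boundary-preserving behavior claimed above.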

  • Stereo Matching with Nonparametric Smoothness Priors in Feature Space
    Computer Vision and Pattern Recognition, 2009
    Co-Authors: Brandon M Smith, Li Zhang, Hailin Jin
    Abstract:

    We propose a novel formulation of Stereo Matching that considers each pixel as a feature vector. Under this view, Matching two or more images can be cast as Matching point clouds in feature space. We build a nonparametric depth smoothness model in this space that correlates the image features and depth values. This model induces a sparse graph that links pixels with similar features, thereby converting each point cloud into a connected network. This network defines a neighborhood system that captures pixel grouping hierarchies without resorting to image segmentation. We formulate global Stereo Matching over this neighborhood system and use graph cuts to match pixels between two or more such networks. We show that our Stereo formulation is able to recover surfaces with different orders of smoothness, such as those with high-curvature details and sharp discontinuities. Furthermore, compared to other single-frame Stereo methods, our method produces more temporally stable results from videos of dynamic scenes, even when applied to each frame independently.
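
    The sparse graph that links pixels with similar features can be sketched as a k-nearest-neighbor graph over per-pixel feature vectors. The abstract does not state how neighbors are chosen, so the brute-force k-NN rule and the names `squared_distance` and `knn_graph` are assumptions for illustration.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def knn_graph(features, k):
    """Return undirected edges linking each pixel to its k nearest
    neighbors in feature space (brute force, O(n^2) comparisons).

    `features[i]` is the feature vector of pixel i, for example
    (x, y, intensity); the choice of features is left to the caller.
    """
    edges = set()
    for i, fi in enumerate(features):
        candidates = sorted(
            (squared_distance(fi, fj), j)
            for j, fj in enumerate(features) if j != i
        )
        for _, j in candidates[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


# Four pixels described by a single feature: two similar pairs.
edges = knn_graph([(0.0,), (0.1,), (5.0,), (5.2,)], k=1)
```

    The resulting edge set plays the role of the neighborhood system over which a smoothness term would be defined, in place of the usual 4-connected pixel grid.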

Liangji Fang - One of the best experts on this subject based on the ideXlab platform.

  • EdgeStereo: A Context Integrated Residual Pyramid Network for Stereo Matching
    Asian Conference on Computer Vision, 2018
    Co-Authors: Xiao Song, Xu Zhao, Liangji Fang
    Abstract:

    Recent convolutional neural networks, especially end-to-end disparity estimation models, achieve remarkable performance on the Stereo Matching task. However, existing methods, even those with a complicated cascade structure, may fail in textureless regions, at boundaries, and on tiny details. To address these problems, we propose a multi-task network, EdgeStereo, that is composed of a backbone disparity network and an edge sub-network. Given a binocular image pair, our model enables end-to-end prediction of both the disparity map and the edge map. We design a context pyramid to encode multi-scale context information in the disparity branch, followed by a compact residual pyramid for cascaded refinement. To further preserve subtle details, our EdgeStereo model integrates edge cues by feature embedding and edge-aware smoothness loss regularization. Comparative results demonstrate that Stereo Matching and edge detection can help each other in the unified model. Furthermore, our method achieves state-of-the-art performance on both the KITTI Stereo and Scene Flow benchmarks, which demonstrates the effectiveness of our design.
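
    An edge-aware smoothness term can be sketched as a disparity-gradient penalty that is down-weighted wherever the edge map changes. The exponential weighting, the parameter `beta`, and the function name below are assumptions; the abstract names the loss but does not give its formula.

```python
import math


def edge_aware_smoothness(disp, edge, beta=1.0):
    """Mean of |disparity gradient| * exp(-beta * |edge-map gradient|)
    over horizontal and vertical neighbor pairs.

    `disp` and `edge` are 2D lists of equal shape. Disparity jumps that
    coincide with edge-map changes are penalized less than jumps inside
    smooth regions.
    """
    h, w = len(disp), len(disp[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # horizontal neighbor
                weight = math.exp(-beta * abs(edge[y][x + 1] - edge[y][x]))
                total += abs(disp[y][x + 1] - disp[y][x]) * weight
                count += 1
            if y + 1 < h:  # vertical neighbor
                weight = math.exp(-beta * abs(edge[y + 1][x] - edge[y][x]))
                total += abs(disp[y + 1][x] - disp[y][x]) * weight
                count += 1
    return total / count if count else 0.0
```

    With this weighting, a disparity discontinuity costs less when the edge branch also reports a transition at the same location, which is one way the two tasks can support each other.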

  • EdgeStereo: A Context Integrated Residual Pyramid Network for Stereo Matching
    arXiv: Computer Vision and Pattern Recognition, 2018
    Co-Authors: Xiao Song, Xu Zhao, Liangji Fang
    Abstract:

    Recently, convolutional neural networks (CNNs) have greatly advanced Stereo Matching, and end-to-end Stereo methods in particular achieve the best performance. However, less attention has been paid to encoding context information, simplifying the two-stage disparity learning pipeline, and improving details in disparity maps. We focus on these problems. First, we propose a one-stage context pyramid based residual pyramid network (CP-RPN) for disparity estimation, in which a context pyramid is embedded to encode multi-scale context cues explicitly. Next, we design a CNN-based multi-task learning network called EdgeStereo to recover missing details in disparity maps, utilizing mid-level features from the edge detection task. In EdgeStereo, CP-RPN is integrated with a proposed edge detector, HED$_\beta$, based on two-fold multi-task interactions. The end-to-end EdgeStereo outputs the edge map and disparity map directly from a Stereo pair without any post-processing or regularization. We find that the edge detection task and the Stereo Matching task can help each other in our EdgeStereo framework. Comprehensive experiments on Stereo benchmarks such as Scene Flow and KITTI 2015 show that our method achieves state-of-the-art performance.

Philip H S Torr - One of the best experts on this subject based on the ideXlab platform.

  • Domain-Invariant Stereo Matching Networks
    European Conference on Computer Vision, 2020
    Co-Authors: Feihu Zhang, Ruigang Yang, Victor Adrian Prisacariu, Benjamin W Wah, Philip H S Torr
    Abstract:

    State-of-the-art Stereo Matching networks have difficulties in generalizing to new unseen environments due to significant domain differences, such as color, illumination, contrast, and texture. In this paper, we aim at designing a domain-invariant Stereo Matching network (DSMNet) that generalizes well to unseen scenes. To achieve this goal, we propose i) a novel “domain normalization” approach that regularizes the distribution of learned representations to allow them to be invariant to domain differences, and ii) an end-to-end trainable structure-preserving graph-based filter for extracting robust structural and geometric representations that can further enhance domain-invariant generalizations. When trained on synthetic data and generalized to real test sets, our model performs significantly better than all state-of-the-art models. It even outperforms some deep neural network models (e.g. MC-CNN [61]) fine-tuned with test-domain data. The code is available at https://github.com/feihuzhang/DSMNet.
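
    One plausible reading of "domain normalization" is a two-step normalization of a feature map: standardize each channel over its spatial positions, then scale each pixel's feature vector to unit length across channels. The abstract only says that the distribution of learned representations is regularized, so the two steps, the `eps` constant, and the function name below are assumptions rather than the published layer.

```python
import math


def domain_normalize(feat, eps=1e-5):
    """Normalize a single feature map `feat[c][y][x]` (nested lists).

    Step 1 (assumed): per-channel standardization over all spatial
    positions, which removes channel-wise scale and offset.
    Step 2 (assumed): per-pixel L2 normalization across channels.
    Returns a new nested list; the input is not modified.
    """
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = math.sqrt(var + eps)
        out.append([[(v - mean) / std for v in row] for row in channel])

    num_channels = len(out)
    height, width = len(out[0]), len(out[0][0])
    for y in range(height):
        for x in range(width):
            norm = math.sqrt(
                sum(out[c][y][x] ** 2 for c in range(num_channels)) + eps
            )
            for c in range(num_channels):
                out[c][y][x] /= norm
    return out
```

    Under these assumptions, rescaling or offsetting a channel (a crude stand-in for a change in contrast or illumination) leaves the output almost unchanged, which matches the stated goal of invariance to domain differences.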

  • Domain-Invariant Stereo Matching Networks
    arXiv: Computer Vision and Pattern Recognition, 2019
    Co-Authors: Feihu Zhang, Ruigang Yang, Victor Adrian Prisacariu, Benjamin W Wah, Philip H S Torr
    Abstract:

    State-of-the-art Stereo Matching networks have difficulties in generalizing to new unseen environments due to significant domain differences, such as color, illumination, contrast, and texture. In this paper, we aim at designing a domain-invariant Stereo Matching network (DSMNet) that generalizes well to unseen scenes. To achieve this goal, we propose i) a novel "domain normalization" approach that regularizes the distribution of learned representations to allow them to be invariant to domain differences, and ii) a trainable non-local graph-based filter for extracting robust structural and geometric representations that can further enhance domain-invariant generalizations. When trained on synthetic data and generalized to real test sets, our model performs significantly better than all state-of-the-art models. It even outperforms some deep learning models (e.g. MC-CNN) fine-tuned with test-domain data.

Ruigang Yang - One of the best experts on this subject based on the ideXlab platform.

  • Domain-Invariant Stereo Matching Networks
    European Conference on Computer Vision, 2020
    Co-Authors: Feihu Zhang, Ruigang Yang, Victor Adrian Prisacariu, Benjamin W Wah, Philip H S Torr
    Abstract:

    State-of-the-art Stereo Matching networks have difficulties in generalizing to new unseen environments due to significant domain differences, such as color, illumination, contrast, and texture. In this paper, we aim at designing a domain-invariant Stereo Matching network (DSMNet) that generalizes well to unseen scenes. To achieve this goal, we propose i) a novel “domain normalization” approach that regularizes the distribution of learned representations to allow them to be invariant to domain differences, and ii) an end-to-end trainable structure-preserving graph-based filter for extracting robust structural and geometric representations that can further enhance domain-invariant generalizations. When trained on synthetic data and generalized to real test sets, our model performs significantly better than all state-of-the-art models. It even outperforms some deep neural network models (e.g. MC-CNN [61]) fine-tuned with test-domain data. The code is available at https://github.com/feihuzhang/DSMNet.

  • Domain-Invariant Stereo Matching Networks
    arXiv: Computer Vision and Pattern Recognition, 2019
    Co-Authors: Feihu Zhang, Ruigang Yang, Victor Adrian Prisacariu, Benjamin W Wah, Philip H S Torr
    Abstract:

    State-of-the-art Stereo Matching networks have difficulties in generalizing to new unseen environments due to significant domain differences, such as color, illumination, contrast, and texture. In this paper, we aim at designing a domain-invariant Stereo Matching network (DSMNet) that generalizes well to unseen scenes. To achieve this goal, we propose i) a novel "domain normalization" approach that regularizes the distribution of learned representations to allow them to be invariant to domain differences, and ii) a trainable non-local graph-based filter for extracting robust structural and geometric representations that can further enhance domain-invariant generalizations. When trained on synthetic data and generalized to real test sets, our model performs significantly better than all state-of-the-art models. It even outperforms some deep learning models (e.g. MC-CNN) fine-tuned with test-domain data.

  • Stereo Matching with Color-Weighted Correlation, Hierarchical Belief Propagation, and Occlusion Handling
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009
    Co-Authors: Qingxiong Yang, Liang Wang, Ruigang Yang, Henrik Stewenius, David Nister
    Abstract:

    In this paper, we formulate a Stereo Matching algorithm with careful handling of disparity, discontinuity, and occlusion. The algorithm works with a global Matching Stereo model based on an energy-minimization framework. The global energy contains two terms, the data term and the smoothness term. The data term is first approximated by a color-weighted correlation, then refined in occluded and low-texture areas in a repeated application of a hierarchical loopy belief propagation algorithm. The experimental results are evaluated on the Middlebury data sets, showing that our algorithm is the top performer among all the algorithms listed there.
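
    The two-term global energy described here can be sketched on a single scanline. The data term below is a plain absolute intensity difference, swapped in for the color-weighted correlation of the paper, and the truncated-linear smoothness term, the weight `lam`, the truncation `trunc`, and the out-of-bounds cost `oob_cost` are all assumptions for illustration.

```python
def scanline_energy(left, right, disp, lam=1.0, trunc=2, oob_cost=1.0):
    """Energy = data term + lam * smoothness term for one scanline.

    Pixel x in `left` is matched to pixel x - disp[x] in `right`.
    Data term: absolute intensity difference (a stand-in, not the
    color-weighted correlation). Matches that fall outside the right
    image pay a fixed `oob_cost`.
    Smoothness term: truncated absolute difference between the
    disparities of neighboring pixels.
    """
    data = 0.0
    for x, d in enumerate(disp):
        xr = x - d
        if 0 <= xr < len(right):
            data += abs(left[x] - right[xr])
        else:
            data += oob_cost

    smooth = 0.0
    for x in range(len(disp) - 1):
        smooth += min(abs(disp[x + 1] - disp[x]), trunc)

    return data + lam * smooth


# The left scanline equals the right scanline shifted by one pixel,
# so a constant disparity of 1 should have lower energy than 0.
left = [0, 0, 9, 9, 0]
right = [0, 9, 9, 0, 0]
good = scanline_energy(left, right, [1, 1, 1, 1, 1])
bad = scanline_energy(left, right, [0, 0, 0, 0, 0])
```

    An optimizer such as belief propagation searches for the disparity assignment that minimizes this kind of energy; the sketch only evaluates it for a given assignment.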

  • Stereo Matching with Color-Weighted Correlation, Hierarchical Belief Propagation, and Occlusion Handling
    Computer Vision and Pattern Recognition, 2006
    Co-Authors: Qingxiong Yang, Liang Wang, Ruigang Yang, Henrik Stewenius, David Nister
    Abstract:

    In this paper, we formulate an algorithm for the Stereo Matching problem with careful handling of disparity, discontinuity, and occlusion. The algorithm works with a global Matching Stereo model based on an energy-minimization framework. The global energy contains two terms, the data term and the smoothness term. The data term is first approximated by a color-weighted correlation, then refined in occluded and low-texture areas in a repeated application of a hierarchical loopy belief propagation algorithm. The experimental results are evaluated on the Middlebury data set, showing that our algorithm is the top performer.

Xiao Song - One of the best experts on this subject based on the ideXlab platform.

  • EdgeStereo: A Context Integrated Residual Pyramid Network for Stereo Matching
    Asian Conference on Computer Vision, 2018
    Co-Authors: Xiao Song, Xu Zhao, Liangji Fang
    Abstract:

    Recent convolutional neural networks, especially end-to-end disparity estimation models, achieve remarkable performance on the Stereo Matching task. However, existing methods, even those with a complicated cascade structure, may fail in textureless regions, at boundaries, and on tiny details. To address these problems, we propose a multi-task network, EdgeStereo, that is composed of a backbone disparity network and an edge sub-network. Given a binocular image pair, our model enables end-to-end prediction of both the disparity map and the edge map. We design a context pyramid to encode multi-scale context information in the disparity branch, followed by a compact residual pyramid for cascaded refinement. To further preserve subtle details, our EdgeStereo model integrates edge cues by feature embedding and edge-aware smoothness loss regularization. Comparative results demonstrate that Stereo Matching and edge detection can help each other in the unified model. Furthermore, our method achieves state-of-the-art performance on both the KITTI Stereo and Scene Flow benchmarks, which demonstrates the effectiveness of our design.

  • EdgeStereo: A Context Integrated Residual Pyramid Network for Stereo Matching
    arXiv: Computer Vision and Pattern Recognition, 2018
    Co-Authors: Xiao Song, Xu Zhao, Liangji Fang
    Abstract:

    Recently, convolutional neural networks (CNNs) have greatly advanced Stereo Matching, and end-to-end Stereo methods in particular achieve the best performance. However, less attention has been paid to encoding context information, simplifying the two-stage disparity learning pipeline, and improving details in disparity maps. We focus on these problems. First, we propose a one-stage context pyramid based residual pyramid network (CP-RPN) for disparity estimation, in which a context pyramid is embedded to encode multi-scale context cues explicitly. Next, we design a CNN-based multi-task learning network called EdgeStereo to recover missing details in disparity maps, utilizing mid-level features from the edge detection task. In EdgeStereo, CP-RPN is integrated with a proposed edge detector, HED$_\beta$, based on two-fold multi-task interactions. The end-to-end EdgeStereo outputs the edge map and disparity map directly from a Stereo pair without any post-processing or regularization. We find that the edge detection task and the Stereo Matching task can help each other in our EdgeStereo framework. Comprehensive experiments on Stereo benchmarks such as Scene Flow and KITTI 2015 show that our method achieves state-of-the-art performance.