Abstract
Disparity estimation is the process of recovering depth information from the left and right views of a scene. A recent approach based on a convolutional neural network (CNN) has achieved state-of-the-art performance on this task. However, that network is limited in measuring small and large disparities, which compromises the accuracy of its results. In this paper, we propose a multi-resolution framework with a three-phase strategy for generating high-quality disparity maps that handles both small and large displacements while retaining the details of the scene. The first phase up- and down-samples the stereo images to several resolutions, which improves matching between CNN feature maps by providing scaled information for objects of various sizes and distances. The second phase uses a deep CNN to estimate a disparity map from each resampled version, with each version suited to a specific range of disparities. In the third phase, the best-fitting disparity map is adaptively selected. To the best of our knowledge, our framework is the first to exploit multiple resolutions of the stereo pair with a convolutional neural network for disparity estimation. The proposed method achieves a significant performance gain: on the Sintel dataset, the mean absolute error is reduced from 5.66, the DispNetC result, to 3.40.
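The three phases can be illustrated with a minimal NumPy sketch. It is not the implementation evaluated in the paper. The function names (`resize_to`, `multires_disparity`, `photometric_error`, `select_best`) are hypothetical, the `estimate` callable stands in for the deep CNN, nearest-neighbour resampling is used for brevity, and the per-pixel photometric selection rule is an assumption, since the abstract does not specify how the best-fitting map is chosen. The one step that follows from geometry alone is the rescaling: a disparity estimated at scale s must be divided by s to express it in original-resolution pixels.

```python
import numpy as np

def resize_to(img, shape):
    """Nearest-neighbour resampling of a 2-D array to `shape` (illustrative only)."""
    h, w = img.shape
    rows = np.minimum((np.arange(shape[0]) * h / shape[0]).astype(int), h - 1)
    cols = np.minimum((np.arange(shape[1]) * w / shape[1]).astype(int), w - 1)
    return img[np.ix_(rows, cols)]

def multires_disparity(left, right, estimate, scales=(0.5, 1.0, 2.0)):
    """Phases 1 and 2: resample the pair, estimate disparity at each scale,
    and map every estimate back to the original resolution."""
    h, w = left.shape
    candidates = []
    for s in scales:
        shape = (max(1, round(h * s)), max(1, round(w * s)))
        # `estimate` is a stand-in for the CNN (assumption).
        d = estimate(resize_to(left, shape), resize_to(right, shape))
        # Disparities scale with image width, so divide by s after resizing.
        candidates.append(resize_to(d, (h, w)) / s)
    return np.stack(candidates)

def photometric_error(left, right, disp):
    """Absolute difference between left(x) and right(x - d).
    Assumes a rectified pair with positive disparities."""
    w = left.shape[1]
    xs = np.clip(np.arange(w)[None, :] - np.rint(disp).astype(int), 0, w - 1)
    return np.abs(left - np.take_along_axis(right, xs, axis=1))

def select_best(left, right, candidates):
    """Phase 3 (assumed rule): per pixel, keep the candidate with the
    lowest photometric error."""
    errors = np.stack([photometric_error(left, right, d) for d in candidates])
    best = np.argmin(errors, axis=0)
    fused = np.take_along_axis(candidates, best[None], axis=0)[0]
    return fused, best
```

As a usage example, consider a toy estimator that always predicts 2 pixels, applied to a pair whose true disparity is 4 pixels. The half-resolution candidate is rescaled to 2 / 0.5 = 4 and is the one selected, which mirrors the idea that a down-sampled input brings large disparities within the range an estimator handles well.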