WRD-Net: Water Reflection Detection Using A Parallel Attention Transformer


The architecture of the proposed WRD-Net.

Abstract
In contrast to symmetry detection, Water Reflection Detection (WRD) is far less studied. We treat this topic as a symmetry-axis point-prediction task that outputs a set of points by implicitly learning Gaussian heat maps and explicitly learning numerical coordinates. We first collect a new data set, the Water Reflection Scene Data Set (WRSD). We then introduce a novel Water Reflection Detection Network, WRD-Net, built on a series of deliberately designed Parallel Attention Vision Transformer blocks with an Atrous Spatial Pyramid (ASP-PAViT). Each block captures both local and global features at multiple scales. To our knowledge, neither the WRSD nor WRD-Net has been used for water reflection detection before. To derive the axis of symmetry, we perform Principal Component Analysis (PCA) on the predicted points. Experimental results show that WRD-Net outperforms its counterparts, achieving a true positive rate of 0.823 against the human annotation.

Keywords: Water Reflection Detection, Symmetry Detection, Line Detection, Multi-scale Deep Networks, Parallel Attention Transformer
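The abstract states that the symmetry axis is derived by running PCA on the set of predicted points. A minimal sketch of that step, assuming the network outputs N (x, y) point coordinates (the function name and return convention are illustrative, not taken from the paper's code):

```python
import numpy as np

def axis_from_points(points):
    """Fit a symmetry axis to predicted points via PCA.

    points: (N, 2) array-like of predicted (x, y) coordinates.
    Returns (centroid, direction): a point on the axis and a
    unit direction vector along it.
    """
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    centered = pts - centroid
    # The axis direction is the first principal component:
    # the eigenvector of the scatter matrix with the largest eigenvalue.
    scatter = centered.T @ centered
    eigvals, eigvecs = np.linalg.eigh(scatter)
    direction = eigvecs[:, np.argmax(eigvals)]
    return centroid, direction
```

Because a water-reflection axis is close to horizontal in most scenes, the first principal component of the predicted points lies along the axis while the second captures the (small) prediction noise perpendicular to it.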
Water Reflection Scene Data Set (WRSD)
Statistics of the Data Set

Statistics of the WRSD images, including the distribution of the lengths of all ground-truth axes (a), the distribution of the angles of all ground-truth axes (b), and the distribution of the mean y-coordinate values of the points annotated in each image (c).

Experimental Results

Comparison of the baselines and WRD-Net in terms of different performance metrics.

Comparison of the TP Rate values derived using (a) image feature-based baselines and (b) learning-based baselines against our approach in terms of different combined thresholds.


Each row shows the results derived using nine baselines and our method. The yellow line indicates the detected axis, while the blue line indicates the ground-truth axis. The two values shown below each image are the angle and the distance computed between the detected axis and the ground-truth axis.
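The qualitative comparison above scores each detection by the angle and distance between the detected axis and the ground-truth axis, and the TP-rate curves sweep combined thresholds on the two. A sketch of these two measures and the combined-threshold decision, assuming each axis is represented by a point on it and a direction vector (the threshold values are placeholders, not the paper's settings):

```python
import numpy as np

def angle_between(d1, d2):
    """Acute angle in degrees between two undirected axis directions."""
    d1 = np.asarray(d1, float) / np.linalg.norm(d1)
    d2 = np.asarray(d2, float) / np.linalg.norm(d2)
    # abs(): an axis and its reverse describe the same line.
    cos = abs(np.dot(d1, d2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def point_line_distance(p, line_point, direction):
    """Perpendicular distance from point p to the infinite line."""
    d = np.asarray(direction, float) / np.linalg.norm(direction)
    v = np.asarray(p, float) - np.asarray(line_point, float)
    return abs(v[0] * d[1] - v[1] * d[0])  # 2-D cross-product magnitude

def is_true_positive(pred, gt, ang_thresh=5.0, dist_thresh=10.0):
    """Count a detection as TP only if BOTH the angle and the distance
    to the ground truth fall under the combined thresholds
    (threshold values here are illustrative assumptions)."""
    (pc, pd), (gc, gd) = pred, gt
    ang = angle_between(pd, gd)
    dist = point_line_distance(gc, pc, pd)
    return ang <= ang_thresh and dist <= dist_thresh
```

Sweeping `ang_thresh` and `dist_thresh` jointly over a grid and averaging `is_true_positive` over the test set yields the TP-rate curves shown in the comparison plots.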