Construction robots are increasingly used in indoor and outdoor building decoration, aiming to automatically accomplish
manually performed tasks efficiently, thus reducing the dependence on human labor and saving time. Fixing putty on
walls is a labor-intensive and slow process, so automating it with construction robots is of practical significance. While
putty is being fixed on walls, putty bulges emerge at various positions within the working space. To perform this task, the robots
must autonomously determine the putty bulge positions within the working area. Integrating visual attention mechanisms into
convolutional neural networks has been shown to enhance their feature extraction capability. We proposed a deep learning
model for regressing the spatial positions of the putty bulge terminal points. Two novel visual attention modules were designed
and integrated into the model’s backbone. To enhance the extraction of semantic features and better model
the channel dependencies, a residual channel attention module (RCAM) was proposed. A lightweight spatial attention module
(LSAM) was proposed to maximize the weights of significant spatial information so the model can localize the bulge terminal
points more accurately. The features generated by the attention modules at multiple scales were fused by a proposed attention
feature fusion module (AFFM) to accomplish the putty bulge terminal points regression task. Our experiments showed that
fusing the hierarchical feature maps extracted by the proposed attention modules performs significantly better than the traditional
learning scheme that directly propagates the feature maps throughout the network architecture.
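
The internal structure of RCAM, LSAM, and AFFM is not specified above, so the following is only a minimal NumPy sketch of the general idea: channel attention with a residual connection, a spatial gate, and multi-scale fusion followed by coordinate regression. Every function name, the squeeze-and-excitation style gate, the parameter-free spatial gate, the pooling-based fusion, and the choice of two terminal points are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def residual_channel_attention(x, w1, w2):
    """Assumed RCAM-like block. x has shape (C, H, W)."""
    squeezed = x.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)   # bottleneck with ReLU
    gate = sigmoid(w2 @ hidden)               # per-channel weight in (0, 1)
    return x + x * gate[:, None, None]        # residual (skip) connection

def spatial_attention(x):
    """Assumed LSAM-like block: parameter-free gate per spatial location."""
    descriptor = 0.5 * (x.mean(axis=0) + x.max(axis=0))  # (H, W)
    return x * sigmoid(descriptor)[None, :, :]

def fuse(features, out_hw):
    """Assumed AFFM-like fusion: average-pool to a common size, then concatenate.

    Each input height and width must be divisible by the target size.
    """
    oh, ow = out_hw
    pooled = []
    for f in features:
        c, h, w = f.shape
        blocks = f.reshape(c, oh, h // oh, ow, w // ow)
        pooled.append(blocks.mean(axis=(2, 4)))
    return np.concatenate(pooled, axis=0)

rng = np.random.default_rng(0)
attended = []
# Three hypothetical backbone stages with shapes (C, H, W).
for c, h, w in [(8, 32, 32), (16, 16, 16), (32, 8, 8)]:
    x = rng.standard_normal((c, h, w))
    w1 = rng.standard_normal((c // 4, c)) * 0.1
    w2 = rng.standard_normal((c, c // 4)) * 0.1
    attended.append(spatial_attention(residual_channel_attention(x, w1, w2)))

fused = fuse(attended, (8, 8))                          # shape (56, 8, 8)
# Linear head regressing 2 terminal points as (x, y) pairs (assumed count).
head = rng.standard_normal((4, fused.shape[0])) * 0.1
points = (head @ fused.mean(axis=(1, 2))).reshape(2, 2)
```

In this sketch the attended maps from all three scales reach the regression head, in contrast to a plain network where only the final feature map would.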