Flood detection in crisis and disaster management is significantly facilitated by the analysis of synthetic aperture radar (SAR) imagery. Traditional flood detection techniques rely more on SAR image pairs than on optical imagery. However, SAR images carry limited visual information and are affected by pervasive speckle noise and similar backscatter signals, which present formidable obstacles to accurately identifying water bodies and extracting change features. Consequently, the performance of existing methods remains unsatisfactory. This paper addresses this challenge by focusing on the differences between SAR image pairs and introducing a semantic token-based transformer network, denoted SemT-Former, to improve flood detection accuracy. SemT-Former prioritizes changes of interest rather than attempting to fully comprehend the entire image scene. This is achieved by integrating temporal-wise feature representation and introducing a class token that captures high-level segmentation associated with changes in water bodies. These innovations strengthen the model’s capacity to discriminate between genuine changes in water bodies and spurious changes induced by similar signals or speckle noise. The effectiveness of SemT-Former is evaluated through a case study in Khartoum, Sudan, focusing on flood detection and the estimation of damaged farmland near river confluences. Experimental results demonstrate that SemT-Former outperforms existing methods, exhibiting a 90.6% improvement in F1-score and an 88.5% improvement in IoU. This underscores SemT-Former as a promising solution for precise and effective flood mapping from SAR images.
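
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how F1-score and IoU are commonly computed for the positive (changed or flooded) class of a binary change mask. It uses the standard textbook definitions and is not taken from the paper's evaluation code. The function name `f1_and_iou`, the toy masks, and the convention of returning 0.0 when no positive pixels exist are assumptions made for illustration.

```python
def f1_and_iou(pred, truth):
    """Return (F1, IoU) for the positive class of two binary masks.

    `pred` and `truth` are equally shaped nested lists of 0/1 values
    (hypothetical inputs; 1 marks a changed/flooded pixel).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # predicted change, real change
            elif p and not t:
                fp += 1          # predicted change, no real change
            elif t and not p:
                fn += 1          # missed a real change

    f1_den = 2 * tp + fp + fn
    iou_den = tp + fp + fn
    # Assumed convention: report 0.0 when the denominator is zero.
    f1 = 2 * tp / f1_den if f1_den else 0.0
    iou = tp / iou_den if iou_den else 0.0
    return f1, iou


# Toy example: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
f1, iou = f1_and_iou(pred, truth)
print(f"F1 = {f1:.3f}, IoU = {iou:.3f}")
```

Because both metrics ignore true negatives, they are not inflated by the large unchanged background that typically dominates flood change maps.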