Towards Underwater Camouflaged Object Tracking: Benchmark and Baselines, by Chunhui Zhang and 5 other authors
Abstract: Over the past decade, significant progress has been made in visual object tracking, largely due to the availability of large-scale datasets. However, existing tracking datasets focus primarily on open-air scenarios, which greatly limits the development of object tracking in underwater environments. To bridge this gap, we propose the first large-scale multimodal underwater camouflaged object tracking dataset, namely UW-COT220. Based on the proposed dataset, this paper first comprehensively evaluates current advanced visual object tracking methods, as well as SAM- and SAM2-based trackers, in challenging underwater environments. Our findings highlight the improvements of SAM2 over SAM, demonstrating its enhanced ability to handle the complexities of underwater camouflaged objects. Furthermore, we propose a novel vision-language tracking framework called VL-SAM2, based on the video foundation model SAM2. Experimental results demonstrate that VL-SAM2 achieves state-of-the-art performance on the UW-COT220 dataset. The dataset and code are available at this https URL.
Submission history
From: Chunhui Zhang
[v1]
Wed, 25 Sep 2024 13:10:03 UTC (574 KB)
[v2]
Mon, 20 Jan 2025 13:01:46 UTC (686 KB)