Text-based Aerial-Ground Person Retrieval
Text-based Aerial-Ground Person Retrieval (TAG-PR) is a recently proposed image retrieval task: given a textual description, retrieve images of the matching person captured from disparate aerial and ground viewpoints. The work introduces the TAG-PEDES dataset, which is constructed from public benchmarks and uses automatically generated textual descriptions designed to be robust to view heterogeneity. It also introduces TAG-CLIP, a retrieval framework that handles viewpoint discrepancies with a mixture-of-experts module that learns both view-specific and view-agnostic features. TAG-CLIP has been evaluated on TAG-PEDES as well as on existing benchmarks, with results that indicate its effectiveness. Both the dataset and the code are available on GitHub to support further research.
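To make the mixture-of-experts idea more concrete, the following is a minimal, generic sketch and not the TAG-CLIP implementation. It assumes one shared expert standing in for view-agnostic features and a small set of softmax-gated experts standing in for view-specific features. The class name `ViewMoE`, the helper functions, the random weights, and the additive fusion are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ViewMoE:
    """Hypothetical sketch: a shared expert plus gated view-specific experts."""

    def __init__(self, dim, num_view_experts=2, seed=0):
        rng = random.Random(seed)
        # Shared expert: applied to every input (view-agnostic path).
        self.shared = random_matrix(dim, dim, rng)
        # Gated experts: e.g. one could specialise in aerial views, one in ground views.
        self.experts = [random_matrix(dim, dim, rng) for _ in range(num_view_experts)]
        # Gate: maps the input feature to one score per gated expert.
        self.gate = random_matrix(num_view_experts, dim, rng)

    def forward(self, x):
        gate_weights = softmax(linear(self.gate, x))
        agnostic = linear(self.shared, x)
        specific = [0.0] * len(x)
        for g, expert in zip(gate_weights, self.experts):
            out = linear(expert, x)
            specific = [s + g * o for s, o in zip(specific, out)]
        # Assumed fusion: simple sum of the two paths.
        fused = [a + s for a, s in zip(agnostic, specific)]
        return fused, gate_weights


moe = ViewMoE(dim=4, num_view_experts=2, seed=0)
fused, gates = moe.forward([0.5, -1.0, 0.25, 2.0])
print(len(fused), gates)
```

In this sketch the gate produces weights that sum to 1, so each input feature is routed softly across the gated experts while the shared expert always contributes. How TAG-CLIP actually routes, trains, and fuses its experts is not described in this summary.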
— via World Pulse Now AI Editorial System
