MambaRefine-YOLO: A Dual-Modality Small Object Detector for UAV Imagery

arXiv (cs.CV) • Tuesday, November 25, 2025 at 5:00:00 AM

Positive • Artificial Intelligence

MambaRefine-YOLO is a dual-modality small object detector designed for Unmanned Aerial Vehicle (UAV) imagery, where low resolution and background clutter make small objects hard to detect. The model combines a Dual-Gated Complementary Mamba fusion module (DGC-MFM) with a Hierarchical Feature Aggregation Neck (HFAN) and reports a state-of-the-art mean Average Precision (mAP) of 83.2% on the DroneVehicle dataset.
The result matters because small object detection underpins UAV applications such as surveillance, agriculture, and environmental monitoring. Fusing RGB and infrared data is credited with improving both accuracy and efficiency, setting a new benchmark for aerial imagery analysis.
MambaRefine-YOLO reflects a broader trend in AI and machine learning toward integrating multiple data modalities to overcome the limits of traditional detection methods. It aligns with ongoing research on object detection under difficult conditions, including adverse weather and complex environments, where robust algorithms are needed to advance UAV technology.
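
As a rough illustration of what gated fusion of two modalities can look like, the sketch below blends an RGB feature vector and an infrared feature vector using per-channel sigmoid gates computed from both inputs. This is a generic gated-fusion pattern, not the paper's DGC-MFM, whose internals are not described here. The names `gated_fusion`, `w_rgb`, and `w_ir` are hypothetical, and the weights would normally be learned rather than set by hand.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(rgb, ir, w_rgb, w_ir):
    """Fuse two equal-length feature vectors with per-channel gates.

    Assumption (not from the source): each gate is a sigmoid of a linear
    map over the concatenated RGB and infrared features. w_rgb and w_ir
    hold one weight row per channel, each row of length 2 * len(rgb).
    """
    if len(rgb) != len(ir):
        raise ValueError("rgb and ir must have the same number of channels")
    both = list(rgb) + list(ir)  # concatenated context seen by every gate
    fused = []
    for c in range(len(rgb)):
        g_rgb = sigmoid(sum(w * x for w, x in zip(w_rgb[c], both)))
        g_ir = sigmoid(sum(w * x for w, x in zip(w_ir[c], both)))
        # Each modality contributes in proportion to its own gate.
        fused.append(g_rgb * rgb[c] + g_ir * ir[c])
    return fused


# With all-zero weights every gate equals 0.5, so the result is a plain average.
rgb_feat = [0.2, 1.0]
ir_feat = [0.8, 3.0]
zero_w = [[0.0] * 4 for _ in range(2)]
fused_feat = gated_fusion(rgb_feat, ir_feat, zero_w, zero_w)
```

The point of the gates is that the blend can shift toward whichever modality is more informative for a given channel, for example toward infrared in low light, instead of using a fixed average.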

— via World Pulse Now AI Editorial System

Continue Reading

A Tri-Modal Dataset and a Baseline System for Tracking Unmanned Aerial Vehicles

arXiv (cs.CV) • a day ago

Positive • Artificial Intelligence

A new dataset named MM-UAV has been introduced for tracking unmanned aerial vehicles (UAVs) with a multi-modal approach that combines RGB, infrared, and event signals. The dataset covers over 30 challenging scenarios with 1,321 synchronized sequences and more than 2.8 million annotated frames, addressing the limitations of single-modality tracking in difficult conditions.

A Theory-Inspired Framework for Few-Shot Cross-Modal Sketch Person Re-Identification

arXiv (cs.CV) • a day ago

Positive • Artificial Intelligence

A new framework called KTCAA has been introduced for few-shot cross-modal sketch person re-identification, aiming to bridge the gap between hand-drawn sketches and RGB surveillance images. The framework targets domain discrepancy and perturbation invariance, and it proposes two components, Alignment Augmentation and Knowledge Transfer Catalyst, to improve model robustness and alignment.

One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control

arXiv (cs.CV) • a day ago

Positive • Artificial Intelligence

One4D has been introduced as a unified framework for 4D generation and reconstruction that produces dynamic 4D content as synchronized RGB frames and pointmaps. The framework uses a Unified Masked Conditioning mechanism to handle varying sparsities of conditioning frames, allowing seamless transitions between 4D generation from a single image and reconstruction from full videos or sparse frames.

Enhancing UAV Search under Occlusion using Next Best View Planning

arXiv (cs.CV) • a day ago

Positive • Artificial Intelligence

An optimized planning strategy has been developed for unmanned aerial vehicle (UAV) search and rescue missions in occluded environments such as dense forests. The strategy optimizes camera positioning and perspective so that UAVs capture clearer ground views during critical missions following natural disasters.

Roadside Monocular 3D Detection Prompted by 2D Detection

arXiv (cs.CV) • a day ago

Positive • Artificial Intelligence

The Promptable 3D Detector (Pro3D) has been introduced for roadside monocular 3D detection, the task of identifying objects in RGB frames and predicting their 3D attributes, such as bird's-eye-view locations. The detector uses 2D detections as prompts to improve the accuracy and efficiency of 3D detection.

SPAGS: Sparse-View Articulated Object Reconstruction from Single State via Planar Gaussian Splatting

arXiv (cs.CV) • a day ago

Positive • Artificial Intelligence

A new framework for articulated object reconstruction has been proposed that uses planar Gaussian Splatting to reconstruct 3D objects from sparse-view RGB images captured from a single state. The method introduces a Gaussian information field to optimize viewpoint selection and employs a coarse-to-fine optimization strategy for depth estimation and part segmentation.

Meta Policy Switching for Secure UAV Deconfliction in Adversarial Airspace

arXiv (cs.LG) • a day ago

Positive • Artificial Intelligence

A new framework for autonomous UAV navigation has been proposed that uses meta-policy switching to improve resilience against adversarial attacks that manipulate sensor inputs. The approach uses a discounted Thompson sampling mechanism to dynamically select robust policies, addressing the limitations of traditional reinforcement learning methods in adversarial airspace.