arXiv:2511.09558v1 Announce Type: cross 
Abstract: Large Vision Models trained on internet-scale data have demonstrated strong capabilities in segmenting and semantically understanding object parts, even in cluttered, crowded scenes. However, while these models can direct a robot toward the general region of an object, they lack the geometric understanding required to precisely control dexterous robotic hands for 3D grasping. To overcome this, our key insight is to leverage simulation with a force-closure grasp generation pipeline that understands the local geometries of the hand and object in the scene. Because this pipeline is slow and requires ground-truth observations, the resulting data is distilled into a diffusion model that operates in real time on camera point clouds. By combining the global semantic understanding of internet-scale models with the geometric precision of simulation-based, locally aware force closure, IFG achieves high-performance semantic grasping without any manually collected training data. For visualizations, visit our website at https://ifgrasping.github.io/
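
The abstract relies on force closure without defining it. As a rough illustration only, the sketch below checks the textbook planar case: two point contacts with Coulomb friction are in force closure when the line joining them lies strictly inside both friction cones. This is not IFG's pipeline, which handles a multi-fingered hand in 3D; the function names, the 2D setting, and the friction coefficient `mu` are all assumptions made for the example.

```python
import math


def in_friction_cone(direction, inward_normal, mu):
    """Return True if `direction` lies strictly inside the friction cone
    of half-angle atan(mu) around `inward_normal` (both are 2D vectors)."""
    dx, dy = direction
    nx, ny = inward_normal
    d_norm = math.hypot(dx, dy)
    n_norm = math.hypot(nx, ny)
    if d_norm == 0 or n_norm == 0:
        return False  # degenerate input: coincident contacts or zero normal
    cos_angle = (dx * nx + dy * ny) / (d_norm * n_norm)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.acos(cos_angle) < math.atan(mu)


def two_contact_force_closure(p1, n1, p2, n2, mu):
    """Planar two-contact test (hypothetical helper, not from the paper).

    p1, p2: contact points; n1, n2: normals pointing into the object.
    The grasp passes if the segment p1->p2 is inside the cone at p1
    and the segment p2->p1 is inside the cone at p2.
    """
    line = (p2[0] - p1[0], p2[1] - p1[1])
    reverse = (-line[0], -line[1])
    return in_friction_cone(line, n1, mu) and in_friction_cone(reverse, n2, mu)
```

For example, `two_contact_force_closure((-1, 0), (1, 0), (1, 0), (-1, 0), 0.5)` describes an antipodal pinch on opposite faces of a square and passes the test, while moving the second contact to an adjacent face at `(0, 1)` with normal `(0, -1)` fails it for the same `mu`.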

A new study presents IFG (Internet-Scale Guidance for Functional Grasping Generation), which improves robotic grasping by combining large vision models with a simulation-based grasp generation pipeline. The approach lets robots perform high-performance semantic grasping in real time without manually collected training data, addressing the limited geometric understanding of existing vision models.

IFG: Internet-Scale Guidance for Functional Grasping Generation
