MURPHY: Multi-Turn GRPO for Self Correcting Code Generation

arXiv — cs.LG · Wednesday, November 12, 2025 at 5:00:00 AM
Murphy advances the reasoning capabilities of large language models through a multi-turn reflective optimization framework. Building on Group Relative Policy Optimization (GRPO), Murphy adds iterative self-correction, letting a model progressively refine its own outputs across turns, a capability that matters for complex decision-making tasks where single-shot GRPO has struggled. On code generation benchmarks with model families such as Qwen and OLMo, Murphy consistently outperforms GRPO, achieving up to an 8% relative gain in pass@1. Beyond demonstrating Murphy's effectiveness, the results point to further headroom for reinforcement learning frameworks that use verifiable rewards to improve model performance.
— via World Pulse Now AI Editorial System
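
To make the idea concrete, here is a minimal sketch of what a multi-turn, group-relative self-correction loop could look like. This is not the paper's algorithm: the rollout logic, the feedback format, and the helper names (generate, run_unit_tests, multi_turn_rollout) are assumptions introduced for illustration; only the GRPO-style group-relative advantage and the verifiable pass/fail-style reward come from the summary above.

```python
# Sketch of a multi-turn self-correction rollout with GRPO-style advantages.
# All helpers are hypothetical placeholders, not the MURPHY implementation.

import random
from dataclasses import dataclass


@dataclass
class Attempt:
    turn: int
    code: str
    reward: float  # verifiable reward, e.g. fraction of unit tests passed


def generate(prompt: str, feedback: str | None = None) -> str:
    """Placeholder for an LLM call; returns a candidate program."""
    return f"# candidate for {prompt!r} (feedback: {feedback!r})"


def run_unit_tests(code: str) -> float:
    """Placeholder verifiable reward: fraction of hidden tests passed."""
    return random.random()


def multi_turn_rollout(prompt: str, max_turns: int = 3) -> Attempt:
    """Let the model iteratively revise its answer using test feedback."""
    feedback = None
    best = Attempt(turn=0, code="", reward=0.0)
    for turn in range(1, max_turns + 1):
        code = generate(prompt, feedback)
        reward = run_unit_tests(code)
        if reward > best.reward:
            best = Attempt(turn=turn, code=code, reward=reward)
        if reward == 1.0:  # all tests pass: stop early
            break
        feedback = f"only {reward:.0%} of tests passed; revise"
    return best


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style advantage: reward centred and scaled within the sampled group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 or 1.0
    return [(r - mean) / std for r in rewards]


if __name__ == "__main__":
    prompt = "write a function that reverses a string"
    group = [multi_turn_rollout(prompt) for _ in range(8)]  # G rollouts per prompt
    advantages = group_relative_advantages([a.reward for a in group])
    for attempt, adv in zip(group, advantages):
        print(f"turn={attempt.turn} reward={attempt.reward:.2f} advantage={adv:+.2f}")
    # A real trainer would feed these advantages into a clipped policy-gradient update.
```

In a real setup the rollouts would be genuine model samples scored by hidden unit tests, and the advantages would weight the token-level policy-gradient loss; the sketch only shows how multi-turn revision and group-relative scoring fit together.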


Recommended Readings
Advantage Shaping as Surrogate Reward Maximization: Unifying Pass@K Policy Gradients
Positive · Artificial Intelligence
The article reconciles two distinct approaches to policy gradient optimization for the Pass@K objective in reinforcement learning: direct REINFORCE-style methods and advantage-shaping techniques that modify GRPO. It shows that the two are two sides of the same coin and interprets hard-example up-weighting modifications as reward-level regularization. It also provides a recipe for deriving both existing and new advantage-shaping methods, offering insights into RLVR (reinforcement learning with verifiable rewards) policy gradient optimization beyond the initial focus on Pass@K.
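
As a rough illustration of the Pass@K framing, the sketch below computes the standard unbiased pass@k estimator over a group of sampled solutions and applies a toy advantage-shaping rule with hard-example up-weighting. The shaping rule is illustrative only and is not taken from the paper; binary verifiable rewards are assumed.

```python
# Toy Pass@K estimator and illustrative advantage shaping (not the paper's rules).

from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples drawn, c of them correct, k kept."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def shaped_advantages(rewards: list[int], k: int) -> list[float]:
    """Illustrative reward-level shaping: centre rewards on the group's pass@k
    baseline and up-weight hard prompts (groups with few successes)."""
    n, c = len(rewards), sum(rewards)
    baseline = pass_at_k(n, c, k)   # group-level pass@k baseline
    weight = 1.0 - c / n            # hard-example up-weighting (toy choice)
    return [weight * (r - baseline) for r in rewards]


if __name__ == "__main__":
    rewards = [1, 0, 0, 0, 0, 0, 0, 0]  # 1 of 8 samples passes the tests
    print(f"pass@4 estimate: {pass_at_k(8, 1, 4):.3f}")
    print("shaped advantages:", [round(a, 3) for a in shaped_advantages(rewards, 4)])
```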