Partial Action Replacement: Tackling Distribution Shift in Offline MARL
The recent publication 'Partial Action Replacement: Tackling Distribution Shift in Offline MARL' highlights a significant advance in offline multi-agent reinforcement learning (MARL). The study identifies the evaluation of out-of-distribution (OOD) joint actions as a core obstacle and proposes Partial Action Replacement (PAR) as a solution. By updating only a subset of agents' actions while the remaining agents keep their logged dataset actions, PAR mitigates the distribution shift that typically complicates offline MARL. The research introduces Soft-Partial Conservative Q-Learning (SPaCQL), which uses PAR and dynamically adjusts its strategy according to the uncertainty of the value estimates. The theoretical analysis shows that, under factorized behavior policies, the distribution shift scales linearly with the number of deviating agents, yielding tighter value-error bounds. Empirical results further support the effectiveness of SPaCQL, demonstrating its superiority…
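To make the core idea concrete, below is a minimal sketch of partial action replacement as described above. The function name, the array representation of joint actions, and the choice of which agents deviate are illustrative assumptions for this note, not the paper's actual implementation; SPaCQL's uncertainty-based weighting is not shown.

```python
import numpy as np

def partial_action_replacement(dataset_joint_action, policy_joint_action, deviating_agents):
    """Build the joint action used for value evaluation under PAR (illustrative sketch).

    Only agents listed in `deviating_agents` have their actions replaced by
    policy-proposed actions; every other agent keeps its logged dataset action,
    so the evaluated joint action differs from the behavior data in at most
    len(deviating_agents) coordinates.
    """
    joint = np.array(dataset_joint_action, copy=True)
    for i in deviating_agents:
        joint[i] = policy_joint_action[i]
    return joint

# Example: 4 agents, only agent 2 deviates from the logged joint action.
logged = np.array([0.1, -0.3, 0.7, 0.2])    # actions stored in the offline dataset
proposed = np.array([0.4, 0.0, -0.5, 0.9])  # actions sampled from the learned policies
evaluated = partial_action_replacement(logged, proposed, deviating_agents=[2])
# evaluated -> [0.1, -0.3, -0.5, 0.2]
```

Because only the chosen agents' coordinates differ from the logged joint action, the degree of deviation grows with the number of replaced agents, which is consistent with the linear scaling of the distribution shift mentioned above.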
— via World Pulse Now AI Editorial System