SCALAR: Benchmarking SAE Interaction Sparsity in Toy LLMs
SCALAR (Sparse Connectivity Assessment of Latent Activation Relationships) is a newly introduced benchmark for evaluating interaction sparsity in sparse autoencoders (SAEs), a property central to mechanistic interpretability of neural networks. Prior evaluations have focused on the performance of individual SAEs without measuring how features interact across layers. SCALAR fills this gap by comparing SAE variants, including TopK SAEs, Jacobian SAEs (JSAEs), and the newly proposed Staircase SAEs. The findings show that Staircase SAEs substantially improve relative sparsity, outperforming TopK SAEs by 59.67% in feedforward layers and 63.15% in transformer blocks; JSAEs also show some improvement but struggle with full transformer blocks. The work underscores interaction sparsity as an important evaluation axis for SAEs, paving the way for more efficient neural network designs and deeper insight into how these models function.
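The quantity at the heart of such a benchmark is how sparsely one SAE's latents influence the next SAE's latents through the intervening network component. Below is a minimal sketch of that idea, assuming a pair of TopK SAEs wrapped around a toy MLP; the `TopKSAE` class, the dimensions, and the near-zero threshold are illustrative assumptions, not SCALAR's actual implementation.

```python
import torch

torch.manual_seed(0)

# Illustrative toy dimensions (not taken from the paper).
d_model, d_sae, k = 16, 64, 8

# A toy MLP standing in for a transformer feedforward layer.
mlp = torch.nn.Sequential(
    torch.nn.Linear(d_model, 4 * d_model),
    torch.nn.GELU(),
    torch.nn.Linear(4 * d_model, d_model),
)

class TopKSAE(torch.nn.Module):
    """Minimal TopK SAE: encode, keep the k largest latents, decode."""
    def __init__(self, d_model: int, d_sae: int, k: int):
        super().__init__()
        self.enc = torch.nn.Linear(d_model, d_sae)
        self.dec = torch.nn.Linear(d_sae, d_model)
        self.k = k

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        z = self.enc(x)
        topk = torch.topk(z, self.k, dim=-1)
        sparse = torch.zeros_like(z)
        sparse.scatter_(-1, topk.indices, topk.values)  # zero out all but top-k
        return sparse

sae_in, sae_out = TopKSAE(d_model, d_sae, k), TopKSAE(d_model, d_sae, k)

def latent_to_latent(z_in: torch.Tensor) -> torch.Tensor:
    """Map upstream SAE latents through the MLP to downstream SAE latents."""
    x = sae_in.dec(z_in)           # decode back to the residual stream
    return sae_out.encode(mlp(x))  # re-encode the MLP output

x = torch.randn(d_model)
z_in = sae_in.encode(x).detach()

# Latent-to-latent Jacobian: entry (i, j) measures how downstream
# feature i responds to upstream feature j.
J = torch.autograd.functional.jacobian(latent_to_latent, z_in)

# Interaction sparsity: fraction of (near-)zero latent-to-latent derivatives.
sparsity = (J.abs() < 1e-6).float().mean().item()
print(f"fraction of near-zero interactions: {sparsity:.3f}")
```

Under this framing, a higher fraction of near-zero Jacobian entries means sparser feature-to-feature interactions, which is the kind of quantity the reported percentage improvements compare across SAE variants.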
— via World Pulse Now AI Editorial System