Iterated Population Based Training with Task-Agnostic Restarts

arXiv — cs.LG · Thursday, November 13, 2025 at 5:00:00 AM
The recent introduction of Iterated Population Based Training (IPBT) marks a significant advance in hyperparameter optimization (HPO) for neural networks. The approach dynamically adjusts hyperparameters through task-agnostic restarts, using time-varying Bayesian optimization to reinitialize the search. Evaluations across eight image classification and reinforcement learning tasks show that IPBT matches or outperforms five previous PBT variants and other HPO methods, such as random search and ASHA, without requiring additional resources or changes to its own hyperparameters. The results also underscore how strongly the frequency of hyperparameter updates can influence performance. The source code for IPBT is publicly available, encouraging further research and application in the field; a minimal sketch of the restart loop follows below.
— via World Pulse Now AI Editorial System
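
To make the mechanism concrete, here is a minimal, illustrative sketch of a PBT-style loop with periodic task-agnostic restarts. The search space, the toy train_step, the population size, and the restart interval are all placeholder assumptions, and uniform resampling stands in for the time-varying Bayesian optimization the paper actually uses to reinitialize hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_hparams():
    """Draw a fresh configuration from a placeholder search space."""
    return {"lr": 10 ** rng.uniform(-4, -1), "momentum": rng.uniform(0.8, 0.99)}

def train_step(score, hp):
    """Stand-in for a chunk of training; real code would update a network."""
    return score + hp["lr"] * rng.normal(1.0, 0.3)

population = [{"hp": sample_hparams(), "score": 0.0} for _ in range(8)]

for step in range(1, 101):
    for member in population:
        member["score"] = train_step(member["score"], member["hp"])

    # Exploit/explore as in standard PBT: the weakest members copy a top
    # performer and perturb its hyperparameters.
    population.sort(key=lambda m: m["score"], reverse=True)
    for loser in population[-2:]:
        winner = population[rng.integers(0, 2)]
        loser["hp"] = {k: v * rng.uniform(0.8, 1.2) for k, v in winner["hp"].items()}
        loser["score"] = winner["score"]

    # Task-agnostic restart: periodically reinitialize the hyperparameter search.
    # IPBT proposes the new configurations with time-varying Bayesian optimization;
    # random resampling here is only a placeholder for that step.
    if step % 25 == 0:
        for member in population:
            member["hp"] = sample_hparams()

print(max(m["score"] for m in population))
```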


Recommended Readings
Mitigating Negative Flips via Margin Preserving Training
Positive · Artificial Intelligence
Minimizing inconsistencies in AI systems is crucial for reducing overall error rates. In image classification, negative flips occur when an updated model misclassifies samples that the previous model classified correctly. The issue intensifies when new training classes are added, which can shrink the margin between classes and introduce conflicting patterns. To address this, a novel approach is proposed that preserves the original model's margins while improving performance, using a margin-calibration term to enhance class separation.
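
The summary does not give the exact form of the margin-calibration term, so the following PyTorch sketch is only one plausible reading: standard cross-entropy plus a hinge penalty whenever the updated model's margin drops below the old model's margin on samples the old model already classified correctly. The function names and the alpha weight are illustrative assumptions, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def margin(logits, labels):
    """Margin = correct-class logit minus the strongest competing logit."""
    correct = logits.gather(1, labels.unsqueeze(1)).squeeze(1)
    masked = logits.clone()
    masked.scatter_(1, labels.unsqueeze(1), float("-inf"))
    return correct - masked.max(dim=1).values

def margin_calibrated_loss(new_logits, old_logits, labels, alpha=1.0):
    """Cross-entropy plus a penalty for shrinking margins on samples the
    previous model got right (a hypothetical stand-in for the paper's term)."""
    ce = F.cross_entropy(new_logits, labels)
    old_correct = old_logits.argmax(dim=1) == labels
    gap = margin(old_logits, labels) - margin(new_logits, labels)
    penalty = F.relu(gap)[old_correct].mean() if old_correct.any() else ce.new_zeros(())
    return ce + alpha * penalty
```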
Potent but Stealthy: Rethink Profile Pollution against Sequential Recommendation via Bi-level Constrained Reinforcement Paradigm
Positive · Artificial Intelligence
The paper titled 'Potent but Stealthy: Rethink Profile Pollution against Sequential Recommendation via Bi-level Constrained Reinforcement Paradigm' addresses vulnerabilities in sequential recommenders, particularly to adversarial attacks. It highlights the Profile Pollution Attack (PPA), which subtly contaminates user interactions to induce mispredictions. The authors propose a new method called CREAT, which combines bi-level optimization with reinforcement learning to enhance the stealthiness and effectiveness of such attacks, overcoming limitations of previous methods.
DiAReL: Reinforcement Learning with Disturbance Awareness for Robust Sim2Real Policy Transfer in Robot Control
Positive · Artificial Intelligence
The paper titled 'DiAReL: Reinforcement Learning with Disturbance Awareness for Robust Sim2Real Policy Transfer in Robot Control' introduces a disturbance-augmented Markov decision process (DAMDP) to enhance reinforcement learning in robotic control. It addresses the challenges of sim2real transfer, where models trained in simulation often fail to perform effectively in the real world due to discrepancies in system dynamics. The proposed approach aims to improve the robustness and stability of control responses in robotic systems.
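
As a rough illustration of what disturbance awareness can look like in code, the wrapper below appends an estimated disturbance vector to each observation so a policy can condition on it. It assumes a flat Box observation, uses the Gymnasium wrapper API, and replaces the paper's estimator with a trivial placeholder; it is not the authors' DAMDP implementation.

```python
import numpy as np
import gymnasium as gym

class DisturbanceAugmentedEnv(gym.Wrapper):
    """Append a disturbance estimate to each observation (sketch of the DAMDP idea).

    Assumes a flat Box observation; a full implementation would also extend
    observation_space and plug in a real disturbance estimator.
    """

    def __init__(self, env, disturbance_dim):
        super().__init__(env)
        self.disturbance_dim = disturbance_dim
        self._estimate = np.zeros(disturbance_dim, dtype=np.float32)

    def _estimate_disturbance(self, obs, reward):
        # Placeholder: real code would compare predicted vs. observed dynamics.
        return 0.9 * self._estimate

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        self._estimate = np.zeros(self.disturbance_dim, dtype=np.float32)
        return np.concatenate([obs, self._estimate]), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self._estimate = self._estimate_disturbance(obs, reward)
        return np.concatenate([obs, self._estimate]), reward, terminated, truncated, info
```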
Thinker: Training LLMs in Hierarchical Thinking for Deep Search via Multi-Turn Interaction
Positive · Artificial Intelligence
The article presents Thinker, a hierarchical thinking model designed to enhance the reasoning capabilities of large language models (LLMs) through multi-turn interactions. Unlike previous methods that relied on end-to-end reinforcement learning without supervision, Thinker allows for a more structured reasoning process by breaking down complex problems into manageable sub-problems. Each sub-problem is represented in both natural language and logical functions, improving the coherence and rigor of the reasoning process.
LDC: Learning to Generate Research Idea with Dynamic Control
Positive · Artificial Intelligence
Recent advancements in large language models (LLMs) highlight their potential in automating scientific research ideation. Current methods often produce ideas that do not meet expert standards of novelty, feasibility, and effectiveness. To address these issues, a new framework is proposed that combines Supervised Fine-Tuning (SFT) and controllable Reinforcement Learning (RL) to enhance the quality of generated research ideas through a two-stage approach.
SemanticNN: Compressive and Error-Resilient Semantic Offloading for Extremely Weak Devices
Positive · Artificial Intelligence
The article presents SemanticNN, a novel semantic codec designed for extremely weak embedded devices in the Internet of Things (IoT). It addresses the challenges of integrating artificial intelligence (AI) on such devices, which often face resource limitations and unreliable network conditions. SemanticNN focuses on achieving semantic-level correctness despite bit-level errors, utilizing a Bit Error Rate (BER)-aware decoder and a Soft Quantization (SQ)-based encoder to enhance collaborative inference offloading.
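
The summary names a Soft Quantization (SQ)-based encoder; one common way to realize soft quantization is a softmax-weighted codebook assignment with a straight-through estimator, sketched below in PyTorch. The codebook, the temperature, and the straight-through choice are assumptions made for illustration, not details taken from the paper.

```python
import torch

def soft_quantize(features, codebook, temperature=1.0):
    """Differentiable quantization: soft codebook mixing for gradients,
    nearest-code lookup for the values actually transmitted.

    features: (batch, dim) tensor; codebook: (num_codes, dim) tensor.
    """
    dists = torch.cdist(features, codebook)               # (batch, num_codes)
    weights = torch.softmax(-dists / temperature, dim=1)  # soft assignments
    soft = weights @ codebook                             # differentiable path
    hard = codebook[dists.argmin(dim=1)]                  # codes sent to the server
    # Straight-through: forward pass uses hard codes, gradients flow through soft.
    return soft + (hard - soft).detach()
```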
LampQ: Towards Accurate Layer-wise Mixed Precision Quantization for Vision Transformers
Positive · Artificial Intelligence
The paper titled 'LampQ: Towards Accurate Layer-wise Mixed Precision Quantization for Vision Transformers' presents a new method for quantizing pre-trained Vision Transformer models. The proposed Layer-wise Mixed Precision Quantization (LampQ) addresses limitations in existing quantization methods, such as coarse granularity and metric scale mismatches. By employing a type-aware Fisher-based metric, LampQ aims to enhance both the efficiency and accuracy of quantization in various tasks, including image classification and object detection.
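
To illustrate the flavor of a Fisher-based, layer-wise bit allocation, the sketch below scores each parameter tensor by its mean squared gradient (an empirical-Fisher stand-in) and then greedily promotes the most sensitive layers to a higher bit width under an average-bit budget. LampQ's type-aware rescaling and its actual allocation procedure are omitted; the names and the greedy policy are assumptions.

```python
import torch

def layer_sensitivities(model, loss):
    """Empirical-Fisher proxy: mean squared gradient per named parameter tensor."""
    model.zero_grad()
    loss.backward()
    return {name: p.grad.pow(2).mean().item()
            for name, p in model.named_parameters() if p.grad is not None}

def assign_bit_widths(sensitivities, budget_bits, low=4, high=8):
    """Greedy allocation: promote the most sensitive layers to the high bit
    width while the average number of bits stays within budget."""
    ordered = sorted(sensitivities, key=sensitivities.get, reverse=True)
    bits = {name: low for name in ordered}
    for name in ordered:
        bits[name] = high
        if sum(bits.values()) / len(bits) > budget_bits:
            bits[name] = low  # promoting this layer would exceed the budget
    return bits
```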
Behaviour Policy Optimization: Provably Lower Variance Return Estimates for Off-Policy Reinforcement Learning
Positive · Artificial Intelligence
The paper titled 'Behaviour Policy Optimization: Provably Lower Variance Return Estimates for Off-Policy Reinforcement Learning' addresses the high variance of return estimates in reinforcement learning algorithms. It shows that a well-designed behavior policy can collect off-policy data that yields lower-variance return estimates than on-policy data, implying that on-policy data collection is not optimal with respect to variance. The authors extend this insight to online reinforcement learning, where policy evaluation and improvement occur simultaneously.
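
The core claim, that a suitable behavior policy yields lower-variance return estimates than on-policy sampling, can be seen in a toy one-step example with importance sampling. Everything below (the two-action problem, the reward noise, the hand-picked behavior policy) is an illustrative assumption, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-step problem: two actions with very different reward noise.
reward_mean = np.array([1.0, 0.0])
reward_std = np.array([0.1, 5.0])
target_pi = np.array([0.9, 0.1])   # policy whose return we want to estimate

def estimate_return(behavior_pi, n=10_000):
    """Importance-sampled estimate of E_pi[r] from data gathered by behavior_pi."""
    actions = rng.choice(2, size=n, p=behavior_pi)
    rewards = rng.normal(reward_mean[actions], reward_std[actions])
    weights = target_pi[actions] / behavior_pi[actions]
    return float(np.mean(weights * rewards))

# A behavior policy that oversamples the noisy, rarely-taken action produces a
# lower-variance estimator than sampling from the target policy itself.
on_policy = [estimate_return(target_pi) for _ in range(200)]
off_policy = [estimate_return(np.array([0.65, 0.35])) for _ in range(200)]
print("on-policy estimate std:      ", np.std(on_policy))
print("behavior-policy estimate std:", np.std(off_policy))
```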