No-Regret Gaussian Process Optimization of Time-Varying Functions

arXiv — stat.ML · Thursday, December 4, 2025 at 5:00:00 AM
  • A novel method for optimizing time-varying rewards with Gaussian Process bandit algorithms has been proposed, addressing the difficulty of achieving no-regret under pure bandit feedback when the objective changes over time. The method, termed W-SparQ-GP-UCB, injects uncertainty into past observations so that they remain useful under current conditions, allowing for more effective optimization in dynamic environments (an illustrative sketch of this idea follows the summary).
  • This development is significant as it offers a solution to the challenges faced in sequential optimization of black-box functions, particularly in scenarios where objectives change over time. By relaxing the strict bandit setting, the new approach enhances the ability to make informed decisions based on previously observed data.
  • The introduction of advanced Gaussian Process methodologies highlights a growing trend in the field of artificial intelligence, where the focus is shifting towards scalable and interpretable models. This aligns with ongoing research efforts to improve time-series forecasting and the emulation of complex systems, indicating a broader movement towards integrating sophisticated statistical techniques in practical applications.
— via World Pulse Now AI Editorial System
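
The following is a minimal, hypothetical sketch of a time-varying GP-UCB loop with uncertainty injection, written only to illustrate the general mechanism the summary describes. It is not the W-SparQ-GP-UCB algorithm from the paper: the RBF kernel, the linear age-based noise-inflation schedule, the UCB weight beta, and the toy moving-peak reward are all assumptions made for illustration.

```python
# Illustrative sketch only: a generic GP-UCB loop in which older observations
# receive inflated noise ("uncertainty injection"), so the posterior gradually
# discounts stale data. This is NOT the paper's W-SparQ-GP-UCB algorithm.
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel on a 1-D domain."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, noise_var, x_query):
    """GP posterior mean/std with per-observation noise variances."""
    K = rbf_kernel(x_obs, x_obs) + np.diag(noise_var)
    Ks = rbf_kernel(x_query, x_obs)
    Kss = rbf_kernel(x_query, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(Kss) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 1e-12))

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200)                 # candidate arms
base_noise, drift_rate, beta = 1e-3, 5e-2, 2.0    # assumed constants

def reward(x, t):
    """Toy time-varying objective: a slowly moving bump plus noise."""
    peak = 0.5 + 0.3 * np.sin(0.15 * t)
    return np.exp(-30.0 * (x - peak) ** 2) + 1e-2 * rng.standard_normal()

xs, ys, ts = [], [], []
for t in range(50):
    if xs:
        x_obs, y_obs, t_obs = map(np.array, (xs, ys, ts))
        # Uncertainty injection: noise grows with the age of each observation,
        # so old data constrain the posterior less and less.
        noise = base_noise + drift_rate * (t - t_obs)
        mu, sd = gp_posterior(x_obs, y_obs, noise, grid)
    else:
        mu, sd = np.zeros_like(grid), np.ones_like(grid)
    x_next = grid[np.argmax(mu + beta * sd)]      # UCB acquisition rule
    xs.append(x_next); ys.append(reward(x_next, t)); ts.append(t)

print(f"last query: {xs[-1]:.3f}, last reward: {ys[-1]:.3f}")
```

In this toy version, the noise-inflation schedule plays the role the summary attributes to uncertainty injection: rather than discarding old data, the model keeps it but trusts it less as it ages.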


Continue Reading
DAO-GP: Drift-Aware Online Non-Linear Regression Gaussian Process
Positive · Artificial Intelligence
A new method called DAO-GP (Drift-Aware Online Gaussian Process) has been proposed to address the challenges posed by concept drift in real-world datasets, which can significantly impact predictive accuracy. This innovative approach enhances Gaussian Process models by allowing dynamic adjustments to hyperparameters in response to evolving data distributions, thereby improving model performance in online settings.
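
For readers curious how a drift-aware online GP might look in code, here is a minimal, hypothetical sketch, not the DAO-GP method itself: an online GP regressor over a sliding window that re-selects its kernel lengthscale by marginal likelihood whenever recent prediction error rises. The window size, the error-based drift trigger, and the candidate lengthscale grid are all illustrative assumptions.

```python
# Hypothetical sketch of drift-aware online GP regression (not DAO-GP itself):
# keep a sliding window of data and re-select the kernel lengthscale by
# marginal likelihood when recent prediction error spikes (a crude drift signal).
import numpy as np

def rbf(a, b, ell):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

def log_marginal_likelihood(x, y, ell, noise=1e-2):
    K = rbf(x, x, ell) + noise * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.sum(np.log(np.diag(L)))

def predict(x_tr, y_tr, x_new, ell, noise=1e-2):
    K = rbf(x_tr, x_tr, ell) + noise * np.eye(len(x_tr))
    return rbf(np.atleast_1d(x_new), x_tr, ell) @ np.linalg.solve(K, y_tr)

rng = np.random.default_rng(1)
window, ell, errors = 40, 0.3, []
xs, ys = [], []
for t in range(300):
    x_t = rng.uniform(0, 1)
    # Abrupt concept drift at t = 150: the underlying function changes.
    f = np.sin(6 * x_t) if t < 150 else np.cos(12 * x_t)
    y_t = f + 0.05 * rng.standard_normal()
    if len(xs) > 5:
        x_tr, y_tr = np.array(xs[-window:]), np.array(ys[-window:])
        err = abs(predict(x_tr, y_tr, x_t, ell)[0] - y_t)
        errors.append(err)
        # Drift trigger: if recent error grows, re-select the lengthscale
        # from a small candidate grid by marginal likelihood.
        if len(errors) > 10 and np.mean(errors[-5:]) > 3 * np.mean(errors[:-5]):
            ell = max((log_marginal_likelihood(x_tr, y_tr, c), c)
                      for c in (0.05, 0.1, 0.2, 0.3, 0.5))[1]
            errors.clear()
    xs.append(x_t); ys.append(y_t)

print(f"final lengthscale after drift: {ell}")
```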