LogPurge: Log Data Purification for Anomaly Detection via Rule-Enhanced Filtering

arXiv — cs.LG · Wednesday, November 19, 2025 at 5:00:00 AM
  • LogPurge introduces a novel framework for log data purification aimed at enhancing anomaly detection through a two-stage, rule-enhanced filtering pipeline (an illustrative sketch follows below).
  • The significance of LogPurge lies in its potential to streamline log anomaly detection: it reduces reliance on costly human labeling and improves the accuracy of identifying system failures and security threats, both crucial for maintaining service reliability and performance.
— via World Pulse Now AI Editorial System
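To make the purification idea concrete, here is a minimal sketch of rule-based log filtering in Python. The rule patterns, the `purge` function, and the split into "clean" and "suspect" pools are illustrative assumptions; the summary above does not describe LogPurge's actual rule set or stages.

```python
import re

# Hypothetical filtering rules; LogPurge's real rules are not given in the
# summary, so these patterns are placeholders for the rule-enhanced stage.
SUSPECT_PATTERNS = [
    re.compile(r"error|fail|timeout|denied", re.IGNORECASE),
]

def purge(logs):
    """Split raw logs into a purified 'clean' pool, usable for training an
    anomaly detector without human labels, and a rule-flagged 'suspect' pool."""
    clean, suspect = [], []
    for line in logs:
        if any(p.search(line) for p in SUSPECT_PATTERNS):
            suspect.append(line)
        else:
            clean.append(line)
    return clean, suspect

logs = [
    "2025-11-19 05:00:01 connection established from 10.0.0.7",
    "2025-11-19 05:00:02 heartbeat ok",
    "2025-11-19 05:00:03 ERROR disk write timeout on /dev/sda",
]
clean, suspect = purge(logs)
print(len(clean), "clean;", len(suspect), "suspect")
```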


Recommended Readings
Known Meets Unknown: Mitigating Overconfidence in Open Set Recognition
Positive · Artificial Intelligence
Open Set Recognition (OSR) is a critical area in machine learning that involves not only classifying known categories but also rejecting unknown samples. A significant challenge arises when unknown samples resemble known classes, leading to overconfidence in model predictions and misclassifications. This paper introduces a framework designed to mitigate overconfidence through a two-component system: a perturbation-based uncertainty estimation module and an unknown detection module that employs distinct classifiers.
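A minimal sketch of the perturbation-based uncertainty idea: repeatedly perturb an input with small noise, run the classifier, and treat high prediction variance as a signal to reject the sample as unknown. The toy linear model, noise scale, and rejection threshold are all assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def classifier(x, W):
    # Stand-in for a trained closed-set classifier (logits = W @ x).
    return softmax(W @ x)

def perturbation_uncertainty(x, W, n=32, sigma=0.1):
    """Uncertainty as the variance of predictions under small input
    perturbations; unstable predictions suggest an unknown sample."""
    probs = np.stack([classifier(x + rng.normal(0, sigma, x.shape), W)
                      for _ in range(n)])
    return probs.var(axis=0).mean(), probs.mean(axis=0)

W = rng.normal(size=(3, 4))   # toy 3-class linear model
x = rng.normal(size=4)
unc, mean_probs = perturbation_uncertainty(x, W)
THRESHOLD = 0.01              # hypothetical rejection threshold
decision = "unknown" if unc > THRESHOLD else int(mean_probs.argmax())
print(f"uncertainty={unc:.4f}, decision={decision}")
```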
SpiderGen: Towards Procedure Generation For Carbon Life Cycle Assessments with Generative AI
Positive · Artificial Intelligence
SpiderGen is a new workflow that utilizes large language models (LLMs) to enhance the process of conducting Life Cycle Assessments (LCAs) for consumer products. These assessments are crucial for understanding the environmental impact of goods, particularly in the context of greenhouse gas (GHG) emissions. SpiderGen integrates traditional LCA methodologies with the advanced reasoning capabilities of LLMs to produce graphical representations known as Product Category Rules Process Flow Graphs (PCR PFGs). The effectiveness of SpiderGen was evaluated against 65 real-world LCA documents.
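One way to picture the graphical output: a Process Flow Graph can be modeled as a directed graph of process steps with per-step emissions, which an LLM workflow would emit and an LCA tool would then aggregate. The node structure, field names, and GHG figures below are purely illustrative; the summary does not specify SpiderGen's graph schema.

```python
from dataclasses import dataclass, field

@dataclass
class PFGNode:
    # One process step in a Product Category Rules Process Flow Graph.
    name: str
    ghg_kg_co2e: float = 0.0               # illustrative per-step GHG value
    inputs: list["PFGNode"] = field(default_factory=list)

def total_ghg(node, seen=None):
    """Sum GHG contributions over the upstream flow graph."""
    seen = seen if seen is not None else set()
    if id(node) in seen:
        return 0.0
    seen.add(id(node))
    return node.ghg_kg_co2e + sum(total_ghg(n, seen) for n in node.inputs)

raw = PFGNode("raw material extraction", ghg_kg_co2e=2.1)
mfg = PFGNode("manufacturing", ghg_kg_co2e=1.4, inputs=[raw])
ship = PFGNode("distribution", ghg_kg_co2e=0.3, inputs=[mfg])
print(f"cradle-to-gate GHG: {total_ghg(ship):.1f} kg CO2e")
```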
Node-Level Uncertainty Estimation in LLM-Generated SQL
Positive · Artificial Intelligence
A new framework for detecting errors in SQL generated by large language models (LLMs) has been introduced, focusing on estimating uncertainty at the node level within the query's abstract syntax tree (AST). The method employs a semantically aware labeling algorithm to assess node correctness and utilizes a classifier to predict error probabilities for each node. This approach allows for precise diagnostics, significantly improving error detection compared to traditional token log-probabilities across various databases and datasets.
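A rough sketch of node-level scoring: parse the SQL into a token tree and attach an error probability to each node. The feature set and the `error_probability` stub are hypothetical stand-ins for the paper's trained classifier; only the parse-then-score-per-node shape follows the summary.

```python
import sqlparse  # pip install sqlparse

def node_features(token):
    # Minimal stand-in features; the paper's classifier uses richer signals.
    return [len(token.value), token.ttype in sqlparse.tokens.Keyword]

def error_probability(feats):
    # Hypothetical scoring stub replacing the trained node classifier.
    length, is_keyword = feats
    return 0.05 if is_keyword else min(0.9, 0.02 * length)

sql = "SELECT name FROM users WHERE age > 30"
stmt = sqlparse.parse(sql)[0]
for tok in stmt.flatten():           # leaf nodes of the parsed statement
    if tok.is_whitespace:
        continue
    p = error_probability(node_features(tok))
    print(f"{tok.value!r:12} p_error={p:.2f}")
```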
Scaling Textual Gradients via Sampling-Based Momentum
Positive · Artificial Intelligence
The article discusses the challenges and potential of scaling prompt optimization using LLM-provided textual gradients. While this method has proven effective for automatic prompt engineering, problems emerge as training data grows: context-length limits are hit, and long-context degradation yields diminishing returns. The authors propose Textual Stochastic Gradient Descent with Momentum (TSGD-M), which uses momentum sampling to improve training stability and scalability.
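A toy sketch of the momentum-sampling idea: rather than concatenating all past feedback into one ever-growing prompt, sample a few past textual gradients with recency-weighted probabilities. The LLM call is stubbed out, and the decay and batch parameters are assumptions, not the paper's values.

```python
import random

random.seed(0)

def llm_textual_gradient(prompt, minibatch):
    # Stub for an LLM call that critiques the prompt on a minibatch;
    # a real system would query a model API here.
    return f"critique of {prompt!r} on {len(minibatch)} examples"

def tsgd_m(prompt, data, steps=3, batch_size=2, decay=0.7):
    """Textual SGD with momentum sampling: past gradients are re-sampled
    with recency weights instead of all being packed into the context."""
    history = []
    for _ in range(steps):
        batch = random.sample(data, batch_size)
        history.append(llm_textual_gradient(prompt, batch))
        weights = [decay ** (len(history) - 1 - i) for i in range(len(history))]
        sampled = random.choices(history, weights=weights, k=min(2, len(history)))
        prompt = f"{prompt} | revised using: {sampled}"
    return prompt

print(tsgd_m("Summarize the log entry.", ["e1", "e2", "e3", "e4"]))
```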
ReflexGrad: Three-Way Synergistic Architecture for Zero-Shot Generalization in LLM Agents
Positive · Artificial Intelligence
ReflexGrad is a new architecture designed to enhance zero-shot generalization in large language model (LLM) agents. It integrates three mechanisms: hierarchical TODO decomposition for strategic planning, history-aware causal reflection for identifying failure causes, and gradient-based optimization for systematic improvement. This approach allows agents to learn from experiences without needing task-specific training, marking a significant advancement in reinforcement learning and decision-making.
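The three mechanisms can be pictured as a single agent loop: decompose the goal into TODOs, act, and on failure reflect over history to refine the next attempt. Everything below (the `decompose`, `reflect`, and `act` stubs and the toy environment) is a hypothetical skeleton of that loop, not ReflexGrad's implementation.

```python
def decompose(goal):
    # Stub for hierarchical TODO decomposition (an LLM call in practice).
    return [f"{goal}: step {i}" for i in range(1, 3)]

def reflect(history):
    # Stub for history-aware causal reflection over failed attempts.
    return f"likely cause of failure: {history[-1]}" if history else ""

def act(todo, hint):
    # Stub environment step; fails without a hint, succeeds with one.
    return bool(hint)

def reflexgrad(goal, max_iters=4):
    history, hint = [], ""
    for todo in decompose(goal):
        for _ in range(max_iters):
            if act(todo, hint):
                break
            history.append(todo)
            hint = reflect(history)   # feedback steers the next attempt
    return hint

print(reflexgrad("retrieve the mug from the cabinet"))
```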
MalRAG: A Retrieval-Augmented LLM Framework for Open-set Malicious Traffic Identification
Positive · Artificial Intelligence
MalRAG is a novel retrieval-augmented framework designed for the fine-grained identification of open-set malicious traffic in cybersecurity. As cyber threats continuously evolve, the ability to detect both known and new types of malicious traffic is paramount. This framework utilizes a frozen large language model (LLM) to construct a comprehensive traffic knowledge database, employing adaptive retrieval and prompt engineering techniques to enhance identification capabilities.
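A minimal retrieval-augmented sketch of this pattern: embed a traffic description, retrieve the closest entries from a knowledge base, and assemble a prompt for a frozen LLM. The toy knowledge entries, bag-of-words embedding, and prompt template are assumptions; MalRAG's adaptive retrieval is far richer than this.

```python
from collections import Counter
import math

KNOWLEDGE = {  # toy traffic-knowledge entries standing in for the real DB
    "C2 beaconing": "periodic small outbound packets to one external host",
    "port scan": "many destination ports probed from a single source",
    "benign web": "bursty HTTPS to popular CDN domains",
}

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=2):
    q = embed(query)
    ranked = sorted(KNOWLEDGE.items(),
                    key=lambda kv: cosine(q, embed(kv[1])), reverse=True)
    return ranked[:k]

flow = "single source probed many destination ports in 2 seconds"
prompt = f"Given evidence {retrieve(flow)}, classify this flow: {flow}"
print(prompt)  # this prompt would be sent to the frozen LLM
```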
Can Machines Think Like Humans? A Behavioral Evaluation of LLM Agents in Dictator Games
Neutral · Artificial Intelligence
The study titled 'Can Machines Think Like Humans? A Behavioral Evaluation of LLM Agents in Dictator Games' investigates the prosocial behaviors of Large Language Model (LLM) agents. It examines how different personas influence these behaviors and benchmarks them against human actions. The findings indicate that assigning human-like identities to LLMs does not guarantee human-like decision-making, revealing significant variability in alignment with human behavior across different model architectures.
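The experimental protocol can be sketched as a small loop: assign each agent a persona, elicit a dictator-game allocation, and compare giving against a human baseline. The model call is a stub, the personas and amounts are invented, and the human reference figure is an illustrative round number, not the study's benchmark.

```python
import statistics

def llm_allocate(persona, endowment=100):
    # Stub for querying an LLM agent; a real study calls a model API
    # with a persona-conditioned dictator-game prompt.
    fixed = {"altruist": 50, "economist": 10, "default": 30}
    return fixed.get(persona, 30)

HUMAN_MEAN_GIVEN = 28  # illustrative human baseline, not the paper's figure

personas = ["altruist", "economist", "default"]
given = [llm_allocate(p) for p in personas]
print(f"LLM mean given: {statistics.mean(given):.1f} "
      f"vs human baseline ~{HUMAN_MEAN_GIVEN}")
```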
Encoding and Understanding Astrophysical Information in Large Language Model-Generated Summaries
Neutral · Artificial Intelligence
Large Language Models (LLMs) have shown remarkable capabilities in generalizing across various domains and modalities. This study explores their potential to encode astrophysical information typically derived from scientific measurements. The research focuses on two primary questions: the impact of prompting on the codification of physical quantities by LLMs and the linguistic aspects crucial for encoding the physics represented by these measurements. Sparse autoencoders are utilized to extract interpretable features from the text.
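For readers unfamiliar with the feature-extraction step, here is a minimal sparse autoencoder of the general kind used to pull interpretable features from model activations. The dimensions, L1 coefficient, and random data are illustrative assumptions; the study's actual architecture and inputs are not specified in the summary.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal sparse autoencoder: an overcomplete hidden layer whose
    activations are pushed toward sparsity by an L1 penalty."""
    def __init__(self, d_in=64, d_hidden=256):
        super().__init__()
        self.enc = nn.Linear(d_in, d_hidden)
        self.dec = nn.Linear(d_hidden, d_in)

    def forward(self, x):
        z = torch.relu(self.enc(x))   # non-negative, encouraged to be sparse
        return self.dec(z), z

torch.manual_seed(0)
model = SparseAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(32, 64)              # stand-in for LLM summary embeddings
for _ in range(5):
    recon, z = model(x)
    loss = ((recon - x) ** 2).mean() + 1e-3 * z.abs().mean()  # L1 sparsity
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"final loss: {loss.item():.4f}")
```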