SpellForger: Prompting Custom Spell Properties In-Game using BERT supervised-trained model

arXiv — cs.CL · Friday, November 21, 2025 at 5:00:00 AM
  • SpellForger is a new game that enables players to craft custom spells through natural language prompts, leveraging a supervised-trained BERT model to interpret those prompts in real time (a rough sketch of the idea appears below).
  • The development of SpellForger signifies a notable advancement in the application of AI in gaming, potentially transforming how players interact with game mechanics and fostering a more engaging and personalized gaming environment.
— via World Pulse Now AI Editorial System
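
A minimal sketch of the prompt-to-properties idea described above, under assumptions: the property names, model head, and example prompt are illustrative and not the paper's implementation. A BERT encoder maps a free-text spell prompt to a few numeric and categorical spell properties.

import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class SpellPropertyHead(nn.Module):
    """BERT encoder plus small heads predicting illustrative spell properties."""
    def __init__(self, encoder_name="bert-base-uncased", num_elements=4):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.damage = nn.Linear(hidden, 1)              # continuous property (hypothetical)
        self.cooldown = nn.Linear(hidden, 1)            # continuous property (hypothetical)
        self.element = nn.Linear(hidden, num_elements)  # categorical property (hypothetical)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token representation
        return {
            "damage": self.damage(cls).squeeze(-1),
            "cooldown": self.cooldown(cls).squeeze(-1),
            "element_logits": self.element(cls),
        }

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = SpellPropertyHead()
batch = tokenizer(["a slow but devastating fireball"], return_tensors="pt",
                  padding=True, truncation=True)
with torch.no_grad():
    props = model(batch["input_ids"], batch["attention_mask"])

Such heads would be fit with supervised labels (designer-annotated spell parameters); the choice of properties and loss is game-specific.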


Continue Reading
PersonaDrift: A Benchmark for Temporal Anomaly Detection in Language-Based Dementia Monitoring
Neutral · Artificial Intelligence
The paper introduces PersonaDrift, a synthetic benchmark aimed at evaluating machine learning methods for detecting behavioral changes in people living with dementia (PLwD). It simulates 60-day interaction logs based on real PLwD, focusing on user responses to a digital reminder system. The benchmark highlights two significant changes: flattened sentiment and increased repetition in communication, which caregivers have noted as critical indicators of cognitive decline.
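
The two signals named above lend themselves to simple per-user checks. A rough sketch under assumed inputs (a 60-day list of daily sentiment scores and response strings); the windows and thresholds are invented for illustration and are not the benchmark's detectors.

from collections import Counter
import statistics

def flattened_sentiment(scores, window=14, drop_ratio=0.5):
    """Flag if sentiment variance in the last window falls well below the first window."""
    early, late = scores[:window], scores[-window:]
    return statistics.pvariance(late) < drop_ratio * statistics.pvariance(early)

def increased_repetition(responses, window=14, rise=1.5):
    """Flag if the share of repeated responses grows markedly from the first to the last window."""
    def repeat_rate(chunk):
        counts = Counter(chunk)
        return sum(c - 1 for c in counts.values()) / max(len(chunk), 1)
    return repeat_rate(responses[-window:]) > rise * max(repeat_rate(responses[:window]), 1e-6)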
Steering Evaluation-Aware Language Models to Act Like They Are Deployed
Positive · Artificial Intelligence
Large language models (LLMs) can detect when they are being evaluated, which may lead to behavior that compromises safety evaluations. This paper introduces a steering vector technique that suppresses evaluation awareness, allowing LLMs to behave as if they are deployed during assessments. The study uses a two-step process: the model is first trained to exhibit evaluation-aware behavior, such as using Python type hints in evaluation settings, and is then steered to act as it would in deployment.
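
Activation steering of this kind is commonly implemented by adding a fixed direction to a transformer layer's hidden states during generation. A hedged sketch follows; the layer choice, sign, and scale are assumptions, not the paper's exact recipe.

import torch

def add_steering_hook(model_layer, steering_vector, alpha=-4.0):
    """Shift the layer's hidden-state output along a fixed direction on every forward pass."""
    direction = steering_vector / steering_vector.norm()

    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * direction.to(hidden.dtype).to(hidden.device)
        if isinstance(output, tuple):
            return (hidden,) + output[1:]
        return hidden

    return model_layer.register_forward_hook(hook)

# The direction is typically taken as the difference of mean activations on
# "evaluation-style" vs. "deployment-style" prompts at the same layer, e.g.:
# handle = add_steering_hook(model.model.layers[15], eval_minus_deploy_direction)
# ...generate text...
# handle.remove()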
QueryGym: A Toolkit for Reproducible LLM-Based Query Reformulation
Positive · Artificial Intelligence
QueryGym is a new Python toolkit designed for large language model (LLM)-based query reformulation. It aims to provide a unified framework that enhances retrieval effectiveness by allowing consistent implementation, execution, and comparison of various LLM-based methods. The toolkit includes a Python API, a retrieval-agnostic interface for integration with backends like Pyserini and PyTerrier, and a centralized prompt management system.
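
The underlying pattern is straightforward: an LLM rewrites the query, and a pluggable retriever interface keeps the reformulation backend-agnostic. The names below are generic illustrations of that pattern, not QueryGym's actual API.

from typing import Protocol

class Retriever(Protocol):
    """Any backend (e.g., a Pyserini or PyTerrier wrapper) exposing a search method."""
    def search(self, query: str, k: int) -> list[str]: ...

REWRITE_PROMPT = "Rewrite this search query to be clearer and more specific:\n{q}"

def reformulate_and_search(query: str, llm_call, retriever: Retriever, k: int = 10):
    """llm_call: any function mapping a prompt string to a completion string."""
    rewritten = llm_call(REWRITE_PROMPT.format(q=query)).strip()
    return rewritten, retriever.search(rewritten, k=k)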
Beyond Bias Scores: Unmasking Vacuous Neutrality in Small Language Models
Neutral · Artificial Intelligence
The article discusses the rapid adoption of Small Language Models (SLMs) and the ethical implications surrounding their use. It introduces the Vacuous Neutrality Framework (VaNeu), a new evaluation paradigm designed to assess the fairness of SLMs before deployment. The framework evaluates model robustness across various stages, revealing vulnerabilities in models that initially appear unbiased. This study represents the first large-scale audit of SLMs in the 0.5-5B parameter range.
BioBench: A Blueprint to Move Beyond ImageNet for Scientific ML Benchmarks
Positive · Artificial Intelligence
BioBench is introduced as an open ecology vision benchmark that addresses the limitations of ImageNet in predicting performance on scientific imagery. It encompasses 9 application-driven tasks, 4 taxonomic kingdoms, and 6 acquisition modalities, totaling 3.1 million images. The benchmark aims to enhance ecological research by providing a unified platform for evaluating visual representation quality in ecological tasks.
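
A common way such benchmarks gauge representation quality is to freeze the backbone, fit a lightweight probe per task, and aggregate task scores. The sketch below illustrates that protocol under assumptions; it is not necessarily BioBench's exact evaluation procedure.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def probe_score(train_feats, train_labels, test_feats, test_labels):
    """Fit a linear probe on frozen backbone features and report test accuracy."""
    clf = LogisticRegression(max_iter=1000).fit(train_feats, train_labels)
    return accuracy_score(test_labels, clf.predict(test_feats))

def benchmark(tasks):
    """tasks: dict mapping task name -> (X_train, y_train, X_test, y_test) feature splits."""
    scores = {name: probe_score(*splits) for name, splits in tasks.items()}
    scores["mean"] = float(np.mean(list(scores.values())))
    return scores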
Machine Learning Epidemic Predictions Using Agent-based Wireless Sensor Network Models
Positive · Artificial Intelligence
The study addresses the challenge of insufficient epidemiological data in wireless sensor networks (WSNs) for modeling and predicting the spread of viruses and malware. An agent-based implementation of the SEIRV model was used to generate synthetic datasets, on which several machine learning algorithms were trained. The results showed promising accuracy in predicting infected and recovered nodes, indicating the potential of machine learning in epidemic forecasting.
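
An illustrative sketch of the overall workflow (the study's agent-based SEIRV simulator, features, and algorithms are not reproduced here; the toy data below merely stands in for simulator output): fit a regressor on synthetic epidemic runs to predict infected and recovered node counts.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Toy inputs standing in for simulation parameters: infection rate, recovery rate, network size.
X = rng.uniform([0.05, 0.1, 50], [0.5, 0.5, 500], size=(1000, 3))
# Toy targets standing in for simulator output: infected and recovered node counts.
y = np.column_stack([X[:, 2] * X[:, 0] / (X[:, 0] + X[:, 1]),
                     X[:, 2] * X[:, 1] / (X[:, 0] + X[:, 1])])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out runs:", model.score(X_te, y_te))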
Exploration of Summarization by Generative Language Models for Automated Scoring of Long Essays
Positive · Artificial Intelligence
This research investigates the use of generative language models for the automated scoring of long essays, addressing the limitations of BERT and similar models that are restricted to 512 tokens. The study found significant improvements in scoring accuracy, with the Quadratic Weighted Kappa (QWK) score rising from 0.822 to 0.8878 using the Learning Agency Lab Automated Essay Scoring 2.0 dataset.
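
Quadratic Weighted Kappa, the agreement metric cited above, can be computed with scikit-learn; the scores in this snippet are made-up examples, not the paper's data.

from sklearn.metrics import cohen_kappa_score

human_scores = [2, 3, 4, 4, 1, 5, 3]   # made-up rater scores
model_scores = [2, 3, 3, 4, 2, 5, 3]   # made-up model scores
qwk = cohen_kappa_score(human_scores, model_scores, weights="quadratic")
print(f"QWK = {qwk:.3f}")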