ISS-Geo142: A Benchmark for Geolocating Astronaut Photography from the International Space Station

arXiv (cs.CV)
Monday, November 24, 2025 at 5:00:00 AM
  • The introduction of ISS
  • The development of ISS
— via World Pulse Now AI Editorial System

Continue Reading
Parrot: Persuasion and Agreement Robustness Rating of Output Truth -- A Sycophancy Robustness Benchmark for LLMs
Neutral | Artificial Intelligence
The study introduces PARROT, a framework for measuring how much the accuracy of large language models (LLMs) degrades under social pressure, with a focus on sycophancy. By comparing responses to neutral prompts with responses to authoritatively stated false claims, PARROT aims to quantify confidence shifts and classify failure modes. The evaluation covers 22 models on 1,302 questions spanning 13 domains.
Large Language Models for Sentiment Analysis to Detect Social Challenges: A Use Case with South African Languages
Positive | Artificial Intelligence
Recent research has explored the application of large language models (LLMs) to sentiment analysis in South African languages, focusing on their ability to detect social challenges through social media posts. The study evaluates the zero-shot performance of GPT-3.5, GPT-4, Llama 2, PaLM 2, and Dolly 2 in classifying sentiment polarity across topics in English, Sepedi, and Setswana.
Non-Parametric Probabilistic Robustness: A Conservative Metric with Optimized Perturbation Distributions
Positive | Artificial Intelligence
A new approach to probabilistic robustness in deep learning, termed non-parametric probabilistic robustness (NPPR), learns optimized perturbation distributions directly from data rather than relying on fixed distributions. The method aims to improve the evaluation of model robustness under distributional uncertainty, a significant limitation of existing probabilistic robustness frameworks.
Evaluating Large Language Models for Diacritic Restoration in Romanian Texts: A Comparative Study
Positive | Artificial Intelligence
A recent study evaluated how well various large language models (LLMs) restore diacritics in Romanian texts, highlighting the importance of automatic diacritic restoration for text processing in languages rich in diacritical marks. Models tested included OpenAI's GPT-3.5 and GPT-4 and Google's Gemini 1.0 Pro, among others, with GPT-4o achieving notable accuracy.