MedGEN-Bench: Contextually entangled benchmark for open-ended multimodal medical generation

Recent advancements in artificial intelligence (AI) and high-throughput testing have unveiled the stability limits of organic redox flow batteries, showcasing the potential of these technologies to enhance scientific research and innovation.

via Phys.org — AI & Machine Learning

WIRED — AI (Latest) · a day ago

AI’s Hacking Skills Are Approaching an ‘Inflection Point’

Neutral · Artificial Intelligence

AI models are increasingly proficient at identifying software vulnerabilities, prompting experts to suggest that the tech industry must reconsider its software development practices. This advancement indicates a significant shift in the capabilities of AI technologies, particularly in cybersecurity.

via WIRED — AI (Latest)

arXiv — cs.CL · 2 days ago

Cross-Cultural Expert-Level Art Critique Evaluation with Vision-Language Models

Neutral · Artificial Intelligence

A new evaluation framework for assessing the cultural interpretation capabilities of Vision-Language Models (VLMs) has been introduced, focusing on cross-cultural art critique. This tri-tier framework includes automated metrics, rubric-based scoring, and calibration against human ratings, revealing a 5.2% reduction in mean absolute error in cultural understanding assessments.

via arXiv — cs.CL

arXiv — cs.CV · 2 days ago

A Highly Efficient Diversity-based Input Selection for DNN Improvement Using VLMs

Positive · Artificial Intelligence

A recent study has introduced Concept-Based Diversity (CBD), a highly efficient metric for image inputs that utilizes Vision-Language Models (VLMs) to enhance the performance of Deep Neural Networks (DNNs) through improved input selection. This approach addresses the computational intensity and scalability issues associated with traditional diversity-based selection methods.

via arXiv — cs.CV

arXiv — cs.CL · 2 days ago

Explaining Generalization of AI-Generated Text Detectors Through Linguistic Analysis

Neutral · Artificial Intelligence

A recent study published on arXiv investigates the generalization capabilities of AI-generated text detectors, revealing that while these detectors perform well on in-domain benchmarks, they often fail to generalize across various generation conditions, such as unseen prompts and different model families. The research employs a comprehensive benchmark involving multiple prompting strategies and large language models to analyze performance variance through linguistic features.

via arXiv — cs.CL

arXiv — cs.CL · 2 days ago

Principled Design of Interpretable Automated Scoring for Large-Scale Educational Assessments

Positive · Artificial Intelligence

A recent study has introduced a principled design for interpretable automated scoring systems aimed at large-scale educational assessments, addressing the growing demand for transparency in AI-driven evaluations. The proposed framework, AnalyticScore, emphasizes four principles of interpretability: Faithfulness, Groundedness, Traceability, and Interchangeability (FGTI).

via arXiv — cs.CL

arXiv — cs.CV · 2 days ago

Semantic Misalignment in Vision-Language Models under Perceptual Degradation

Neutral · Artificial Intelligence

Recent research reports significant semantic misalignment in Vision-Language Models (VLMs) under perceptual degradation, studied through controlled visual perception challenges on the Cityscapes dataset. While traditional segmentation metrics show only moderate declines, VLMs exhibit severe failures in downstream tasks, including hallucinations and inconsistent safety judgments.

via arXiv — cs.CV

arXiv — cs.CV · 2 days ago

CoMa: Contextual Massing Generation with Vision-Language Models

Positive · Artificial Intelligence

The CoMa project has introduced an automated framework for generating building massing from functional requirements and site context, addressing the complexities of architectural design. The framework is supported by the newly developed CoMa-20K dataset, which includes detailed geometries and contextual data.

via arXiv — cs.CV
