Upwork study shows AI agents excel with human partners but fail independently

VentureBeat — AI | Thursday, November 13, 2025 at 6:30:00 PM
  • Upwork's study highlights that AI agents often fail to perform basic tasks alone, yet they show remarkable improvement when paired with human experts, achieving up to a 70% increase in project completion rates. This research is based on an evaluation of over 300 real client projects posted on Upwork's platform.
  • The findings challenge the notion of fully autonomous AI agents and suggest that human collaboration is essential for maximizing productivity, which bears directly on the role of AI in the workforce.
  • The study emphasizes the importance of human
— via World Pulse Now AI Editorial System

Recommended Readings
DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains
Neutral | Artificial Intelligence
The evaluation of discourse-level translation in expert domains is currently inadequate, despite its importance for knowledge dissemination. Existing methods focus mainly on segment-level accuracy and fluency, neglecting discourse coherence and terminological precision. To address this, DiscoX has been introduced as a benchmark for Chinese-English translation, featuring 200 curated texts from various domains, with an average length of over 1700 tokens. Additionally, Metric-S, a new evaluation method, provides detailed assessments and shows strong alignment with human judgments.
Building the Web for Agents: A Declarative Framework for Agent-Web Interaction
Positive | Artificial Intelligence
VOIX is a declarative framework designed to improve how AI agents interact with web interfaces. It lets developers define actions and states through simple HTML tags, giving AI agents reliable, privacy-preserving capabilities. In a study involving 16 developers, participants quickly created diverse agent-enabled web applications, highlighting the framework's practicality and effectiveness.