To Think or Not to Think: The Hidden Cost of Meta-Training with Excessive CoT Examples

arXiv — cs.LG — Monday, December 8, 2025 at 5:00:00 AM
  • Recent research highlights the limitations of excessive Chain-of-Thought (CoT) examples in meta-training large language models (LLMs), revealing that while CoT prompting enhances reasoning capabilities, too many CoT examples can degrade performance on novel tasks. The study introduces CoT-Recipe, a method for balancing CoT and non-CoT examples during meta-training, which improves accuracy on new tasks by up to 300%, even when no CoT examples appear in context.
  • This development is crucial as it addresses the challenges faced by LLMs in adapting to unfamiliar tasks, ensuring that models can leverage existing knowledge more effectively. By optimizing the training process, the findings may lead to more robust AI systems capable of better reasoning and problem-solving.
  • The exploration of CoT methodologies reflects a broader trend in AI research focused on enhancing reasoning capabilities across various models, including Vision-Language Models (VLMs) and the application of curriculum techniques. As the field evolves, the balance between structured reasoning and flexibility in learning remains a pivotal discussion, influencing future advancements in AI technologies.
— via World Pulse Now AI Editorial System
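The core idea of balancing CoT and non-CoT examples can be illustrated with a toy batch sampler. CoT-Recipe's actual procedure is not described in the summary, so the function name, the fixed-fraction mixing scheme, and the data format below are all illustrative assumptions, not the paper's method.

```python
import random

def build_metatrain_batch(cot_pool, plain_pool, cot_fraction, batch_size, seed=0):
    """Sample a meta-training batch with a fixed fraction of CoT examples.

    cot_pool / plain_pool: lists of (prompt, target) pairs whose targets
    do / do not include a chain-of-thought rationale. Hypothetical sketch;
    the real CoT-Recipe mixing rule may differ.
    """
    rng = random.Random(seed)
    n_cot = round(batch_size * cot_fraction)
    batch = rng.sample(cot_pool, n_cot) + rng.sample(plain_pool, batch_size - n_cot)
    rng.shuffle(batch)  # interleave CoT and plain examples
    return batch

# Toy pools: targets with and without an explicit rationale.
cot = [(f"q{i}", f"reasoning... answer{i}") for i in range(10)]
plain = [(f"q{i}", f"answer{i}") for i in range(10)]

batch = build_metatrain_batch(cot, plain, cot_fraction=0.25, batch_size=8)
n_cot = sum("reasoning" in tgt for _, tgt in batch)
print(n_cot, len(batch))  # 2 8
```

Tuning `cot_fraction` is the knob the paper's finding warns about: pushing it toward 1.0 corresponds to the "excessive CoT" regime the study found harmful on novel tasks.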


Continue Reading
Are generative AI text annotations systematically biased?
Neutral — Artificial Intelligence
A recent study investigates bias in generative AI text annotations, replicating manual annotations from Boukes (2024) using various Generative Large Language Models (GLLMs) including Llama3.1, Llama3.3, GPT4o, and Qwen2.5. The findings indicate that while GLLMs achieve adequate F1 scores, they exhibit systematic bias, aligning more closely with each other than with manual annotations, which leads to different downstream results.
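The bias signature described here — adequate F1 against the human gold standard, but higher inter-model agreement than model-human agreement — can be checked with a few lines of stdlib Python. The labels below are toy data, not figures from the study.

```python
def binary_f1(gold, pred):
    """F1 for binary labels, computed from true/false positives and negatives."""
    tp = sum(g == p == 1 for g, p in zip(gold, pred))
    fp = sum(g == 0 and p == 1 for g, p in zip(gold, pred))
    fn = sum(g == 1 and p == 0 for g, p in zip(gold, pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def agreement(a, b):
    """Fraction of items on which two annotators give the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

manual = [1, 0, 1, 1, 0, 0, 1, 0]        # hypothetical human annotations
model_a = [1, 0, 1, 0, 0, 1, 1, 1]       # hypothetical GLLM outputs
model_b = [1, 0, 1, 0, 0, 1, 1, 0]

f1_a = binary_f1(manual, model_a)         # adequate score vs. the gold standard
shared_bias = agreement(model_a, model_b) > agreement(model_a, manual)
print(round(f1_a, 2), shared_bias)        # 0.67 True
```

When `shared_bias` holds across many model pairs, the models' errors are correlated rather than random — the systematic bias the study reports, which is why downstream conclusions can shift even at acceptable F1.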
Deep transfer learning for image classification: a survey
Neutral — Artificial Intelligence
A comprehensive survey on deep transfer learning for image classification has been published, highlighting the effectiveness of deep neural networks like CNNs and transformers in scenarios where large labeled datasets are unavailable. The survey emphasizes the importance of transfer learning in enhancing performance under such constraints.
Why Chain of Thought Fails in Clinical Text Understanding
Neutral — Artificial Intelligence
A systematic study has revealed that chain-of-thought (CoT) prompting, which is often used to enhance reasoning in large language models (LLMs), fails to improve performance in clinical text understanding. The research assessed 95 advanced LLMs across 87 real-world clinical tasks, finding that 86.3% of models experienced performance degradation in CoT settings, particularly with electronic health records that are lengthy and fragmented.
UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs
Positive — Artificial Intelligence
The introduction of UniQL, a unified post-training quantization and low-rank compression framework, addresses the challenges of deploying large language models (LLMs) on mobile platforms, which often face limitations in memory and computational resources. This framework allows for on-device configurable pruning rates, enhancing the adaptability of edge LLMs.
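Combining quantization with low-rank compression can be sketched as keeping a small full-precision low-rank component and quantizing the residual. UniQL's actual decomposition, bit allocation, and pruning mechanism are not described in the summary, so this NumPy sketch is an assumption about the general technique, not the framework's implementation.

```python
import numpy as np

def quant_lowrank(W, rank, n_bits=4):
    """Approximate W as (full-precision low-rank part) + (quantized residual).

    Illustrative sketch only: a rank-`rank` SVD component is kept in full
    precision and the residual is uniformly quantized to `n_bits`.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]          # low-rank part
    R = W - L                                         # residual to quantize
    levels = 2 ** (n_bits - 1) - 1
    scale = np.abs(R).max() / levels or 1.0           # symmetric uniform scale
    q = np.clip(np.round(R / scale), -levels - 1, levels)
    return L + q * scale                              # dequantized approximation

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W_hat = quant_lowrank(W, rank=8)
err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(round(err, 2))
```

The `rank` and `n_bits` arguments are the kind of per-device knobs an adaptive edge framework could expose: a tighter memory budget trades a lower rank or fewer bits against reconstruction error.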
Optimal and Diffusion Transports in Machine Learning
Neutral — Artificial Intelligence
A recent survey on optimal and diffusion transports in machine learning highlights the significance of time-evolving probability distributions in various applications, including sampling, neural network optimization, and token distribution analysis in large language models. The study emphasizes the transition from Eulerian to Lagrangian representations, which introduces both challenges and opportunities for crafting effective density evolutions.