Three-layer framework for agentic reasoning: foundational, self-evolving, multi-agent
Agentic Reasoning for Large Language Models
Abstract
Agentic reasoning marks a paradigm shift by reframing LLMs as autonomous agents that plan, act, and learn through continual interaction. The survey organizes the field by environmental dynamics into three layers: foundational agentic reasoning, which establishes core single-agent capabilities (planning, tool use, and search) in stable environments; self-evolving agentic reasoning, which studies how agents refine those capabilities through feedback, memory, and adaptation; and collective multi-agent reasoning, which extends intelligence to collaborative settings. Across all three layers, the survey distinguishes in-context reasoning (interaction at test time) from post-training reasoning (optimization through reinforcement learning).
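To make the first two layers concrete, the following is a minimal sketch of a plan, act, observe loop with a persistent memory. It is not the survey's method: the `ToyAgent` class, the `TOOLS` registry, and the rule-based `plan` function are hypothetical stand-ins, and a real system would call an LLM where the rules are.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; names and behaviors are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
    "upper": lambda arg: arg.upper(),
}


@dataclass
class ToyAgent:
    # Memory persists across tasks as (task, tool, argument, observation).
    memory: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def plan(self, task: str) -> Tuple[str, str]:
        """Stand-in for an LLM planner: choose a tool and argument by rule."""
        if "+" in task:
            return "add", task
        return "upper", task

    def run(self, task: str) -> str:
        # Self-evolving layer (toy version): reuse a stored observation
        # instead of acting again when the task has been seen before.
        for past_task, _tool, _arg, observation in self.memory:
            if past_task == task:
                return observation
        # Foundational layer: plan, act through a tool, observe the result.
        tool, arg = self.plan(task)
        observation = TOOLS[tool](arg)
        self.memory.append((task, tool, arg, observation))
        return observation
```

Calling `ToyAgent().run("2+3")` plans the `add` tool, executes it, and stores the result; a second call with the same task is answered from memory without invoking the tool again.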
Key Contributions
- Structured framework organizing agentic reasoning across foundational, self-evolving, and multi-agent layers
- Distinction between in-context reasoning (test-time) and post-training reasoning (RL optimization)
- Comprehensive review spanning science, robotics, healthcare, autonomous research, and mathematics
- Identification of open challenges: personalization, long-horizon interaction, governance
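The distinction between in-context and post-training reasoning can be illustrated with a toy example. The sketch below is an assumption-laden simplification, not an algorithm from the survey: the action names, the `REWARD` table, and the incremental value update (used here in place of a real reinforcement learning method) are all hypothetical.

```python
from typing import Dict, List, Tuple

ACTIONS = ["search", "guess"]
# Hypothetical environment feedback: searching pays off, guessing does not.
REWARD: Dict[str, float] = {"search": 1.0, "guess": 0.0}


def choose(weights: Dict[str, float], context: List[Tuple[str, float]]) -> str:
    """Pick the highest-scoring action.

    `weights` play the role of model parameters; `context` holds feedback
    supplied at test time, which shifts the scores without changing weights.
    """
    scores = dict(weights)
    for action, reward in context:
        scores[action] += reward
    return max(ACTIONS, key=lambda a: scores[a])


def post_train(weights: Dict[str, float], epochs: int = 3, lr: float = 0.5) -> Dict[str, float]:
    """Return new weights moved toward observed rewards.

    A simple incremental value update standing in for RL optimization.
    """
    updated = dict(weights)
    for _ in range(epochs):
        for action in ACTIONS:
            updated[action] += lr * (REWARD[action] - updated[action])
    return updated


base = {"search": 0.0, "guess": 0.5}            # initial bias toward guessing
in_context = choose(base, [("search", 1.0)])    # feedback in context, weights fixed
trained = post_train(base)                      # weights change, context empty
after_training = choose(trained, [])
```

In the in-context case the behavior changes only while the feedback is present in the context, and `base` is left untouched; in the post-training case the change is stored in the weights and persists with an empty context.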
Methodology
The survey organizes the literature along three dimensions: environmental dynamics, reasoning scales, and real-world applications across multiple domains.
Results
The survey establishes agentic reasoning as a paradigm shift that lets LLMs operate effectively in dynamic settings through systematic planning and adaptation mechanisms. It covers applications across science, robotics, healthcare, and mathematics.
Limitations
- The focus is primarily on text-based reasoning
- Multi-agent governance and safety remain open problems
- Long-horizon interaction capabilities are still developing
Source: Agentic Reasoning for Large Language Models by Tianxin Wei et al.