Artificial Intelligence — Theses

Evolving beliefs with evidence. Confidence changes over time as new research arrives.

Thesis 1: Agentic reasoning will consolidate around a standard stack (VLM + tool use + memory + RL) by end of 2027

The three-layer agent model (foundational, self-evolving, multi-agent) and the three-paradigm tool-use framework (prompting, SFT, RL) are already becoming canonical. Protocol standardization (ACP, MCP, A2A) is moving multi-agent systems from research demos to interoperable infrastructure. The convergence is too coordinated to reverse.
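The stack this thesis describes can be sketched as a single loop: a model proposes an action, tools execute it, and memory records the trajectory. This is a minimal illustration, not any of the surveyed frameworks; all names (`Tool`, `AgentMemory`, `run_agent`, the scripted policy) are hypothetical.

```python
# Minimal sketch of the consolidating agent stack: a model proposes an
# action, tools execute it, and a memory records the trajectory.
# All names here are illustrative, not from any surveyed framework.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Tool:
    name: str
    fn: Callable[[str], str]

@dataclass
class AgentMemory:
    entries: list = field(default_factory=list)
    def write(self, record: str) -> None:
        self.entries.append(record)
    def read(self, k: int = 3) -> list:
        return self.entries[-k:]   # naive recency-based retrieval

def run_agent(task, policy, tools, memory, max_steps=5):
    """`policy` stands in for the (V)LM: it maps (task, recent memory)
    to either ('call', tool_name, arg) or ('answer', text)."""
    registry = {t.name: t for t in tools}
    for _ in range(max_steps):
        decision = policy(task, memory.read())
        if decision[0] == "answer":
            return decision[1]
        _, tool_name, arg = decision
        result = registry[tool_name].fn(arg)
        memory.write(f"{tool_name}({arg}) -> {result}")
    return None

# Toy demo: a scripted "policy" that calls a calculator tool, then
# answers from memory. eval() is acceptable only in this toy setting.
calc = Tool("calc", fn=lambda expr: str(eval(expr)))

def scripted_policy(task, recent):
    if not recent:                       # nothing tried yet: use the tool
        return ("call", "calc", task)
    return ("answer", recent[-1].split("-> ")[1])

memory = AgentMemory()
print(run_agent("2 + 3", scripted_policy, [calc], memory))  # prints 5
```

The RL layer of the stack would replace the scripted policy with a trained one; the loop shape stays the same, which is why a standard stack is plausible.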

Confidence: 7/10

Supporting evidence:

  • Three major consolidation papers in Q1 2026 define shared vocabulary and frameworks. Evidence: strong (Agentic Reasoning, Autonomous Agents, Tool Use)
  • ACP, MCP, A2A protocol standardization signals infrastructure maturity. Evidence: moderate (Frontier)
  • Agent memory formalized as write-manage-read loop with 5 mechanism families. Evidence: strong (Memory Survey)
  • 60-benchmark taxonomy and three-tier evaluation framework consolidating. Evidence: strong (Autonomous Agents)
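The write-manage-read formalization can be sketched as three operations over a store. The eviction rule and substring retrieval below are placeholder assumptions for illustration, not any of the survey's five mechanism families; `WMRMemory` is a hypothetical name.

```python
# Sketch of the write-manage-read memory loop. The capacity-based
# eviction in `manage` and the substring match in `read` are
# illustrative placeholders; real systems use richer mechanisms
# (summarization, relevance scoring, vector retrieval).
class WMRMemory:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.store = []

    def write(self, item: str) -> None:
        self.store.append(item)
        self.manage()                    # manage runs on every write

    def manage(self) -> None:
        # Placeholder policy: evict oldest entries beyond the budget.
        while len(self.store) > self.capacity:
            self.store.pop(0)

    def read(self, query: str) -> list:
        # Placeholder retrieval: substring match instead of embeddings.
        return [s for s in self.store if query in s]

m = WMRMemory(capacity=2)
for obs in ["saw door A", "opened door A", "saw door B"]:
    m.write(obs)
print(m.read("door A"))   # only entries that survived eviction
```

The point of the formalization is that write, manage, and read vary independently, which is what makes a taxonomy of five mechanism families possible.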

Challenging evidence:

  • Long-horizon interaction (multi-step plans spanning hours/days) still unsolved — could fragment approaches
  • Benchmark-to-deployment gap suggests current frameworks are necessary but insufficient
  • Multi-agent governance unsolved — may force divergent stacks per use case

Evolution:

  • Apr 5, 2026 — Initial thesis at 7/10. Three survey papers in one quarter is unusually coordinated convergence, but the gap between benchmarks and deployment keeps confidence below 8.

Depends on: agentic-reasoning, llm-tool-use, agent-memory-architectures, reinforcement-learning-for-agents

Would change if: A fundamentally different architecture (not LLM-based) achieves superior agent performance, or protocol fragmentation prevents interoperability by end of 2027.


Thesis 2: Mechanistic interpretability will fail to keep pace with model capabilities, creating a widening safety gap

Despite being named a 2026 breakthrough technology, the interpretability toolchain is reactive. Circuit tracing works on current models but scaling to trillion-parameter multimodal systems is an open problem. Capabilities advance quarterly; interpretability advances yearly.

Confidence: 8/10

Supporting evidence:

  • 40 researchers from major labs warn they may be losing the ability to understand advanced models. Evidence: strong (Mech Interp 2026)
  • Scaling circuit tracing to trillion-parameter models is listed as an open problem. Evidence: strong (Frontier)
  • Tracing circuits in multi-modal models (vision + language) remains unsolved. Evidence: moderate (Anthropic Circuit Tracing)
  • 6 alignment failure modes documented; Alignment Trilemma shows no single approach guarantees safety. Evidence: strong (Safety 2026)

Challenging evidence:

  • Anthropic's Transformer Circuits Thread (2021-2026) has produced qualitative leaps, not just incremental progress
  • Circuit tracing tools now open-sourced, lowering the barrier for broader research community
  • Chain-of-thought monitoring has already caught misbehavior — practical tools exist even if incomplete

Evolution:

  • Apr 5, 2026 — Initial thesis at 8/10. The 40-researcher warning is the strongest signal. The open-sourcing of tools is encouraging but the gap between "can trace a reasoning path" and "can guarantee alignment" is enormous.

Depends on: mechanistic-interpretability, circuit-tracing, agent-safety-alignment

Would change if: Automated interpretability tools achieve real-time scaling to frontier models, or a formal verification method for neural networks emerges.


Thesis 3: Evolutionary code generation (AlphaEvolve pattern) will become a standard optimization tool in every major tech company by 2028

AlphaEvolve surpassed Strassen's 57-year-old matrix multiplication algorithm and recovered 0.7% of Google's global compute through evolutionary task scheduling. The move to semantic evolution (Gemini 2.5 Pro rewriting program logic, not just tuning parameters) is a qualitative shift. The open-source OpenEvolve reimplementation democratizes access.
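The pattern the thesis bets on is a propose-evaluate-select loop gated by an automated evaluator. The sketch below is not AlphaEvolve: a random numeric perturbation stands in for the LLM's semantic rewrites, and the "program" is just a coefficient vector scored against a toy cost function; all names (`evaluator`, `mutate`, `evolve`) are hypothetical.

```python
# Skeleton of the evolutionary loop AlphaEvolve instantiates: propose
# candidate variants, score them with an automated evaluator, keep the
# best. A Gaussian perturbation stands in for the LLM's semantic
# rewrites; the "program" is a coefficient vector.
import random

def evaluator(candidate):
    # Automated evaluator: lower is better. Stands in for a measured
    # cost (e.g. scheduler efficiency); here, distance to a hidden
    # optimum chosen for the demo.
    target = [3.0, -1.0]
    return sum((c - t) ** 2 for c, t in zip(candidate, target))

def mutate(candidate, rng, scale=0.5):
    # Perturb one coordinate; AlphaEvolve instead rewrites code.
    i = rng.randrange(len(candidate))
    child = list(candidate)
    child[i] += rng.gauss(0, scale)
    return child

def evolve(seed_candidate, generations=200, rng=None):
    rng = rng or random.Random(0)        # seeded for reproducibility
    best, best_score = seed_candidate, evaluator(seed_candidate)
    for _ in range(generations):
        child = mutate(best, rng)
        score = evaluator(child)
        if score < best_score:           # greedy (1+1) selection
            best, best_score = child, score
    return best, best_score

best, score = evolve([0.0, 0.0])
print(round(score, 3))
```

The loop only works because `evaluator` is cheap and automatic, which is exactly the constraint flagged in the challenging evidence: domains without such an evaluator cannot close the loop.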

Confidence: 6/10

Supporting evidence:

  • 0.7% Google global compute recovery validates commercial viability at scale. Evidence: strong (AlphaEvolve)
  • Beating Strassen's 1969 algorithm demonstrates ceiling-breaking capability. Evidence: strong (AlphaEvolve)
  • OpenEvolve open-sourcing broadens access beyond Google. Evidence: moderate (Frontier)
  • Semantic evolution (rewriting logic, not parameters) is a qualitative shift over prior approaches. Evidence: strong (AlphaEvolve)

Challenging evidence:

  • Only works for problems with automated evaluators — limits applicability
  • Cannot yet discover fundamentally new paradigms, only optimizes within known frameworks
  • Interpretability of discovered algorithms is poor — enterprises may resist opaque optimizations
  • Single source (Google DeepMind) — no independent replication yet

Evolution:

  • Apr 5, 2026 — Initial thesis at 6/10. The production deployment is compelling but the automated-evaluator constraint limits how many domains this applies to. "Every major tech company" is ambitious — 2028 may be too soon for non-Google adoption.

Depends on: evolutionary-algorithm-discovery

Would change if: OpenEvolve produces comparable results outside Google's infrastructure, or if the automated-evaluator constraint proves intractable for most enterprise use cases.
