Text2Touch: LLM-Designed Reward Functions for Tactile In-Hand Rotation

  • Source: https://arxiv.org/abs/2509.07445
  • Type: paper
  • ArXiv ID: 2509.07445
  • Date Ingested: 2026-04-05T20:00:00Z
  • Tags: tactile-sensing, llm-reward-design, reinforcement-learning, dexterous-manipulation

Key Contribution

LLMs autonomously design reward functions for in-hand rotation policies that run on a real Allegro Hand fitted with TacTip tactile sensors. The work demonstrates that language models can bridge the gap between high-level task descriptions and low-level tactile manipulation rewards.

Summary

Text2Touch introduces a framework in which large language models (LLMs) automatically generate reward functions for training dexterous manipulation policies. Hand-engineering reward signals is traditionally one of the hardest parts of RL for manipulation; instead, the system prompts an LLM with a natural language task description and has it design an appropriate reward function.

Technical Pipeline

  1. Task specification: Natural language description of desired manipulation behavior (e.g., "rotate the object 90 degrees clockwise")
  2. LLM reward design: LLM generates a reward function incorporating tactile sensor readings, joint states, and object pose
  3. Policy training: RL agent trained in simulation using the LLM-designed reward
  4. Sim-to-real transfer: Trained policy deployed on physical Allegro Hand with TacTip sensors
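
To make step 2 concrete, the sketch below shows the general shape of a reward function that combines object pose change, tactile contact, and joint state. It is an illustration only, not the reward produced in the paper: the function name, the `RewardWeights` fields, the weight values, and the choice of terms are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; an LLM-generated reward would pick its own terms and scales.
    rotation: float = 1.0
    contact: float = 0.1
    joint_velocity: float = 0.01


def rotation_reward(
    angular_velocity: Sequence[float],
    target_axis: Sequence[float],
    fingertip_contacts: Sequence[bool],
    joint_velocities: Sequence[float],
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Illustrative reward mixing object motion, tactile contact, and joint state."""
    # Object angular velocity projected onto the desired rotation axis.
    spin = sum(w * a for w, a in zip(angular_velocity, target_axis))
    # Fraction of fingertips whose tactile sensor currently reports contact.
    contact_fraction = sum(fingertip_contacts) / len(fingertip_contacts)
    # Sum of squared joint velocities, used as a penalty on fast joint motion.
    effort = sum(v * v for v in joint_velocities)
    return (
        weights.rotation * spin
        + weights.contact * contact_fraction
        - weights.joint_velocity * effort
    )
```

In a setup like this, the RL agent in step 3 would call such a function once per simulation step; the three weighted terms correspond to the object pose, tactile, and joint-state inputs listed in step 2.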

Hardware

  • Allegro Hand: 4-finger dexterous hand (16 DOF)
  • TacTip sensors: Biomimetic optical tactile sensors on fingertips providing contact geometry through internal pin deformation tracking
  • Real-world deployment: Policies trained with LLM rewards successfully transfer to physical hardware
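
As a rough illustration of what "pin deformation tracking" yields, the sketch below turns tracked pin positions into a simple contact summary. It is a simplified assumption, not the tactile processing used in the paper: the function names, the pixel threshold, and the premise that pins arrive already matched between the reference and current frames are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def pin_displacements(reference: Sequence[Point], current: Sequence[Point]) -> List[Point]:
    """Per-pin (dx, dy) displacement, assuming both lists use the same pin order."""
    if len(reference) != len(current):
        raise ValueError("reference and current must contain the same number of pins")
    return [(cx - rx, cy - ry) for (rx, ry), (cx, cy) in zip(reference, current)]


def contact_summary(
    reference: Sequence[Point],
    current: Sequence[Point],
    threshold_px: float = 1.0,  # hypothetical contact threshold in pixels
) -> Dict[str, object]:
    """Summarise pin motion as a contact flag, mean magnitude, and mean shear vector."""
    disp = pin_displacements(reference, current)
    magnitudes = [math.hypot(dx, dy) for dx, dy in disp]
    mean_magnitude = sum(magnitudes) / len(magnitudes)
    mean_dx = sum(dx for dx, _ in disp) / len(disp)
    mean_dy = sum(dy for _, dy in disp) / len(disp)
    return {
        "in_contact": mean_magnitude > threshold_px,
        "mean_magnitude": mean_magnitude,
        "shear": (mean_dx, mean_dy),
    }
```

A per-fingertip flag such as `in_contact` is the kind of tactile feature a reward function or policy observation could consume.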

Key Results

  • LLM-designed rewards achieve comparable or superior performance to manually designed rewards
  • Tactile information is critical: the LLM incorporates tactile features into its reward design without being told to
  • Zero-shot reward generation: no iterative tuning needed for new tasks
  • Successfully demonstrates real-world in-hand rotation with tactile feedback

Significance

This work represents a convergence of two major trends: (1) LLMs as automated engineers for robotics, and (2) tactile sensing for dexterous manipulation. By removing the reward-engineering bottleneck, it lowers the barrier to training new manipulation skills. That the LLMs leverage tactile signals in their reward designs suggests they have internalized useful priors about contact-rich manipulation.
