AutoTTS: Automated Discovery of Test-Time Scaling Strategies for Large Language Models
Large language models often waste computation during inference because their test-time scaling strategies are designed by hand. AutoTTS replaces this manual process with automated discovery of strategies that improve accuracy while reducing inference cost.
The Manual Design Problem
Test-time scaling (TTS) improves LLM performance by allocating additional computation during inference. Current approaches require researchers to hand-craft reasoning patterns and tune heuristics by intuition, leaving most of the computation-allocation space unexplored and yielding suboptimal strategies.
Manual design forces researchers to guess which combinations of branching, pruning, and stopping will work best, and these guesses often miss better strategies that automated search can find.
AutoTTS Framework
AutoTTS replaces manual heuristic design with environment-driven automated discovery. Instead of designing individual TTS strategies, researchers design environments where optimal strategies emerge automatically.
The framework formulates width-depth TTS as controller synthesis over pre-collected reasoning trajectories. Controllers decide when to:
- Branch reasoning paths
- Continue current paths
- Probe for quality signals
- Prune weak paths
- Stop computation
This approach evaluates strategies cheaply without repeated LLM calls, making discovery tractable.
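The controller loop above can be sketched as a replay over cached trajectories. This is a minimal illustration, not the paper's actual implementation: the `Step`, `threshold_controller`, and `replay` names, the threshold parameters, and the cost accounting are all assumptions. The key property it demonstrates is that a candidate strategy is scored entirely from pre-collected data, with no live LLM calls.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Action(Enum):
    BRANCH = auto()    # fork a new reasoning path
    CONTINUE = auto()  # extend the current path
    PRUNE = auto()     # drop a weak path
    STOP = auto()      # end computation

@dataclass
class Step:
    quality: float  # cached probe signal (e.g. a verifier score) recorded offline
    tokens: int     # token cost of generating this step

def threshold_controller(step, depth, n_paths, params):
    """A simple width-depth controller: branch on promising steps,
    prune weak ones, and stop past a depth budget."""
    if depth >= params["max_depth"]:
        return Action.STOP
    if step.quality < params["prune_below"]:
        return Action.PRUNE
    if step.quality > params["branch_above"] and n_paths < params["max_width"]:
        return Action.BRANCH
    return Action.CONTINUE

def replay(controller, trajectory, params):
    """Score a controller by replaying one pre-collected trajectory.
    Returns (depth_reached, total_token_cost) with no live LLM calls."""
    n_paths, cost = 1, 0
    for depth, step in enumerate(trajectory):
        cost += step.tokens
        action = controller(step, depth, n_paths, params)
        if action is Action.STOP:
            return depth + 1, cost
        if action is Action.PRUNE:
            n_paths = max(1, n_paths - 1)
        elif action is Action.BRANCH:
            n_paths += 1
            cost += step.tokens  # assume a branch re-incurs the step's cost
    return len(trajectory), cost
```

Because `replay` only reads cached quality signals and token counts, thousands of candidate controllers can be evaluated per second.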
Key Technical Innovations
Beta Parameterization: Keeps the search space tractable by constraining each controller parameter to a bounded range.
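One way to realize such a parameterization is to draw each controller threshold from a Beta distribution, which confines every parameter to the open interval (0, 1) by construction. This is a hedged sketch: the `sample_controller_params` function and the specific Beta priors are illustrative assumptions, not the paper's exact scheme.

```python
import random

def sample_controller_params(spec, seed=0):
    """Draw each named threshold from Beta(a, b); every sampled value
    lies strictly inside (0, 1), so the search space stays bounded."""
    rng = random.Random(seed)
    return {name: rng.betavariate(a, b) for name, (a, b) in spec.items()}

# Hypothetical priors: skew the prune threshold low and the branch threshold high.
spec = {
    "prune_below":  (2.0, 5.0),
    "branch_above": (5.0, 2.0),
}
params = sample_controller_params(spec)
assert all(0.0 < v < 1.0 for v in params.values())
```

Shaping the (a, b) pairs lets a search procedure bias sampling toward sensible regions without ever producing an out-of-range parameter.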
Fine-grained Execution Traces: Provides detailed feedback that helps agents diagnose why specific TTS programs fail, improving discovery efficiency.
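A trace of this kind might record every controller decision alongside the signal that triggered it, so a failed candidate can be diagnosed after the fact. The `ExecutionTrace` class and its heuristic `diagnose` rules below are illustrative assumptions, not AutoTTS's actual trace format.

```python
from dataclasses import dataclass, field

@dataclass
class TraceEvent:
    depth: int
    action: str     # "BRANCH" | "CONTINUE" | "PRUNE" | "STOP"
    signal: float   # quality signal observed at this decision

@dataclass
class ExecutionTrace:
    events: list = field(default_factory=list)

    def log(self, depth, action, signal):
        self.events.append(TraceEvent(depth, action, signal))

    def diagnose(self):
        """Heuristic post-mortem over the decision sequence, so a search
        agent can see *why* a candidate controller failed."""
        actions = [e.action for e in self.events]
        if not actions:
            return "empty trace"
        if actions[-1] != "STOP":
            return "exhausted budget without an explicit STOP"
        if actions.count("PRUNE") > len(actions) // 2:
            return "pruned most paths; prune threshold likely too aggressive"
        return "ok"
```

Feeding such diagnoses back to the search agent turns opaque score differences into actionable failure descriptions.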
Environment Construction: Creates discovery environments with tractable control spaces and frequent, cheap feedback for TTS search.
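Inside such an environment, discovery can reduce to a loop that samples candidate controllers and scores each one against the cached data. The sketch below assumes a scalarized accuracy-minus-cost objective; the `discovery_loop` name, the objective, and the penalty weight are all illustrative choices, not the paper's method.

```python
import random

def discovery_loop(evaluate, sample_params, n_candidates=50, lam=0.01, seed=0):
    """Sample candidate controllers, score each one on pre-collected
    trajectories via `evaluate` (which returns an (accuracy, cost) pair),
    and keep the best accuracy-cost tradeoff. No live LLM calls needed."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_candidates):
        params = sample_params(rng)
        accuracy, cost = evaluate(params)
        score = accuracy - lam * cost  # simple scalarized objective
        if score > best_score:
            best, best_score = params, score
    return best, best_score
```

Because each `evaluate` call is a cheap replay rather than an LLM invocation, feedback arrives frequently enough to search hundreds of candidates per run.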
Implementation Results
AutoTTS discovered strategies that outperform manually designed baselines on mathematical reasoning benchmarks. The automated approach:
- Improves accuracy-cost tradeoffs over strong manual baselines
- Generalizes to held-out benchmarks and different model scales
- Completes discovery in 160 minutes for $39.90
The discovered strategies work across different problem types and model sizes, demonstrating robust generalization beyond training conditions.
Development Impact
AutoTTS shifts LLM optimization from manual strategy design to automated discovery. This approach:
- Reduces development time from weeks to hours
- Explores strategy spaces humans cannot efficiently search
- Produces strategies that generalize across benchmarks and models
- Costs less than manual experimentation cycles
Next Steps
Implement AutoTTS by setting up the discovery environment for your specific use case. Define your reasoning trajectory collection process, establish probe signals for quality assessment, and configure the controller synthesis parameters. The framework’s modular design allows adaptation to different reasoning tasks beyond mathematical problems.
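The setup steps above could be captured in a single configuration object. This is purely a hypothetical sketch: every key, path, and value below is an illustrative placeholder, not part of any published AutoTTS API.

```python
# Hypothetical discovery configuration; all names and values are illustrative.
config = {
    "trajectories": "traces/train.jsonl",  # pre-collected reasoning trajectories
    "probe": "verifier_score",             # cheap per-step quality signal
    "controller": {
        "max_depth": 32,                   # depth budget before forced stop
        "max_width": 8,                    # cap on concurrent reasoning paths
        "param_priors": {                  # Beta(a, b) priors per threshold
            "prune_below":  (2.0, 5.0),
            "branch_above": (5.0, 2.0),
        },
    },
}
```

Swapping the trajectory file and probe signal is the main adaptation needed to move from mathematical reasoning to another task family.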