LLM AutotunerΒΆ

Automated parameter tuning for LLM inference engines (SGLang, vLLM).

Key Features:

  • Multiple Deployment Modes: Docker, Local (direct GPU), Kubernetes/OME

  • Optimization Strategies: Grid search, Random search, Bayesian optimization

  • SLO-Aware Scoring: Exponential penalties for constraint violations (see the sketch after this list)

  • Intelligent GPU Scheduling: Per-GPU efficiency metrics and resource pooling

  • Web UI: React frontend with real-time monitoring

  • Agent Assistant: LLM-powered assistant for task management
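
To illustrate the SLO-aware scoring idea named above, here is a minimal sketch of what an exponentially penalized score might look like. The function name slo_aware_score, the use of TTFT as the constrained metric, and the penalty constant penalty_k are illustrative assumptions, not the project's actual API.

    import math

    def slo_aware_score(throughput: float, ttft: float,
                        slo_ttft: float, penalty_k: float = 4.0) -> float:
        """Reward throughput; penalize SLO violations exponentially.

        Hypothetical sketch: names and the penalty form are assumptions.
        """
        # Relative violation: 0 while within the SLO, grows once exceeded.
        violation = max(0.0, ttft / slo_ttft - 1.0)
        # Exponential penalty shrinks the score sharply as violations grow,
        # so a slightly slower but SLO-compliant config can win.
        return throughput * math.exp(-penalty_k * violation)

    # A config within its TTFT target keeps its full throughput score;
    # one 20% over the target loses more than half of it.
    print(slo_aware_score(throughput=1200.0, ttft=0.45, slo_ttft=0.5))  # 1200.0
    print(slo_aware_score(throughput=1500.0, ttft=0.60, slo_ttft=0.5))  # ~674

The exponential form (as opposed to a hard cutoff) lets the tuner rank near-miss configurations instead of discarding them outright, which keeps search strategies like Bayesian optimization informed near the constraint boundary.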

Indices and tablesΒΆ