Installation Guide

Complete installation instructions for the LLM Autotuner.

System Requirements

  • OS: Linux (Ubuntu 20.04+ recommended)

  • Python: 3.8+

  • GPU: NVIDIA GPU with CUDA support (for inference)

  • Memory: 16GB+ RAM recommended
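
A quick way to check these from a terminal (a minimal sketch; lsb_release and nvidia-smi assume the lsb-release package and the NVIDIA driver are already installed):

# OS release and Python version
lsb_release -d
python3 --version

# GPU visibility and driver version
nvidia-smi

# Available RAM
free -h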

Manual Install

If you prefer a manual installation, follow these steps:

# Clone repository
git clone <repository-url>
cd autotuner

# Create virtual environment
python3 -m venv env
source env/bin/activate

# Install dependencies
pip install -r requirements.txt
pip install genai-bench

# Install frontend dependencies
cd frontend && npm install && cd ..

# Create data directory
mkdir -p ~/.local/share/autotuner

# Start Redis (for background jobs)
docker run -d -p 6379:6379 redis:alpine
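
To confirm the Redis container is accepting connections, ping it through the container itself (a sketch that assumes this is the only running container based on the redis:alpine image):

# Look up the container started above and ping it
docker exec "$(docker ps -q --filter ancestor=redis:alpine)" redis-cli ping
# Expected output: PONG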

Deployment Mode Setup

Local Mode (Direct GPU)

To run inference servers directly on a local GPU without Docker, install SGLang or vLLM:

# Install SGLang (quote the extras so the shell does not expand the brackets)
pip install "sglang[all]"

# Or install vLLM
pip install vllm
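
Whichever engine you choose, a quick import check confirms it landed in the active virtual environment (a minimal sketch; both packages expose __version__ in recent releases):

# Print the installed version of whichever engine you installed
python -c "import sglang; print(sglang.__version__)"
python -c "import vllm; print(vllm.__version__)"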

OME Mode (Kubernetes)

See the Kubernetes Guide for cluster setup instructions.

Configuration

Environment Variables

Create a .env file in the project root:

# Server ports
SERVER_PORT=8000
FRONTEND_PORT=5173

# Redis
REDIS_HOST=localhost
REDIS_PORT=6379

# Model path (Docker mode)
DOCKER_MODEL_PATH=/mnt/data/models

# Proxy (if needed)
HTTP_PROXY=http://proxy:port
HTTPS_PROXY=http://proxy:port
NO_PROXY=localhost,127.0.0.1

# HuggingFace token (for gated models)
HF_TOKEN=your_token_here
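
To sanity-check these settings before starting the services, you can export the .env values into your shell and verify the HuggingFace token (a sketch; huggingface-cli ships with huggingface_hub, and recent versions read the HF_TOKEN environment variable):

# Export every variable in .env into the current bash session
set -a; source .env; set +a

# Should print the account the token belongs to
huggingface-cli whoami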

Database

The SQLite database is created automatically at:

~/.local/share/autotuner/autotuner.db
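
Once the backend has run at least once, you can inspect the schema with the sqlite3 CLI (an optional check; install sqlite3 from your package manager if it is missing):

# List the tables created by the backend
sqlite3 ~/.local/share/autotuner/autotuner.db ".tables"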

Starting Services

# Activate environment
source env/bin/activate

# Start backend + ARQ worker
./scripts/start_dev.sh

# Start frontend (separate terminal)
cd frontend && npm run dev

Default ports:

  • Frontend: http://localhost:5173

  • Backend API: http://localhost:8000

  • API Docs: http://localhost:8000/docs

Verification

# Check backend health
curl http://localhost:8000/api/system/health

# Expected: {"status":"healthy","database":"ok","redis":"ok"}
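
You can also confirm that the frontend responds and that all three services are listening on their default ports (a quick sketch using curl and ss):

# Frontend dev server should answer with HTTP headers
curl -I http://localhost:5173

# Backend, frontend, and Redis should all be listening
ss -ltn | grep -E ':(8000|5173|6379)'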

Troubleshooting

See the Troubleshooting guide for detailed fixes.

Common issues:

  • Redis not running → docker run -d -p 6379:6379 redis:alpine

  • GPU not accessible → Check NVIDIA drivers and the Docker runtime (see the checks after this list)

  • Port conflicts → Update ports in .env file
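
For the GPU case, two quick checks separate driver problems from container-runtime problems (a sketch; the CUDA image tag is only an example, and the container check requires the NVIDIA Container Toolkit):

# Driver check on the host
nvidia-smi

# GPU visibility from inside a container
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi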