Built for VS Code.
Powered by Local Inference.
The Predic extension doesn't just call an API. It orchestrates a local AI pipeline: it connects to a running KoboldCPP instance that serves our specialized GGUF models over localhost, with no network round trips and minimal latency overhead.
- Real-time Ghost Text completion
- Automatic model switching based on file type
- Low-RAM mode (4 GB minimum)
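As a sketch of how the Ghost Text layer could talk to the localhost API, the helpers below build a completion request and parse the response in the shape used by KoboldCPP's generate endpoint. The function names and default parameters are illustrative assumptions, not the extension's actual source.

```typescript
// Sketch of the JSON request/response shapes for KoboldCPP's
// /api/v1/generate endpoint on the localhost API (port 5001).
// Function names and defaults are illustrative, not from Predic's source.

interface CompletionRequest {
  prompt: string;
  max_length: number;      // tokens to generate
  temperature: number;     // low temperature keeps ghost text deterministic
  stop_sequence: string[]; // stop at newline for single-line completions
}

interface CompletionResponse {
  results: { text: string }[];
}

// Build the JSON body for a ghost-text completion request.
function buildCompletionRequest(prefix: string, maxTokens = 48): CompletionRequest {
  return {
    prompt: prefix,
    max_length: maxTokens,
    temperature: 0.2,
    stop_sequence: ["\n"],
  };
}

// Extract the completion text from a KoboldCPP response body.
function parseCompletion(body: CompletionResponse): string {
  return body.results[0]?.text ?? "";
}

// Usage inside the extension, against a running KoboldCPP instance:
// const res = await fetch("http://localhost:5001/api/v1/generate", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(buildCompletionRequest(documentPrefix)),
// });
// const ghostText = parseCompletion(await res.json());
```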
# Architecture Diagram
```
VS Code Extension
  Listener / UI Layer
        │  JSON request/response
        ▼
Localhost API (port 5001)
        │
        ▼
KoboldCPP / Llama.cpp
        │
        ▼
Predic GGUF Models
```
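The "automatic model switching" step in the pipeline above can be sketched as a simple mapping from the active file's extension to a GGUF model name. The model filenames below are placeholders for illustration, not the actual Predic model names.

```typescript
// Sketch of file-type-based model switching. The model filenames
// are placeholders; swap in whatever GGUF files KoboldCPP is serving.

const MODEL_BY_EXTENSION: Record<string, string> = {
  ".py": "predic-python.gguf",     // placeholder name
  ".ts": "predic-typescript.gguf", // placeholder name
  ".rs": "predic-rust.gguf",       // placeholder name
};

const DEFAULT_MODEL = "predic-general.gguf"; // placeholder fallback

// Pick a model based on the active file's extension.
function modelForFile(filename: string): string {
  const dot = filename.lastIndexOf(".");
  const ext = dot >= 0 ? filename.slice(dot).toLowerCase() : "";
  return MODEL_BY_EXTENSION[ext] ?? DEFAULT_MODEL;
}
```

Keeping the mapping as plain data makes it easy to extend from the extension's settings without touching the dispatch logic.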