Built for VS Code.
Powered by Local Inference.
The Predic extension doesn't just call an API. It orchestrates a local AI pipeline: it connects to a running KoboldCPP instance that serves our specialized GGUF models over localhost, with no network round trips and minimal latency overhead.
- Real-time Ghost Text completion
- Automatic model switching based on file type
- Low-RAM mode (4 GB minimum)
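As a sketch of how the Ghost Text layer could talk to the localhost API, the helpers below build a completion request and parse the response in the shape used by KoboldCPP's generate endpoint. The function names and default parameters are illustrative assumptions, not the extension's actual source.

```typescript
// Sketch of the JSON request/response shapes for KoboldCPP's
// /api/v1/generate endpoint on the localhost API (port 5001).
// Function names and defaults are illustrative, not from Predic's source.

interface CompletionRequest {
  prompt: string;
  max_length: number;      // tokens to generate
  temperature: number;     // low temperature keeps ghost text deterministic
  stop_sequence: string[]; // stop at newline for single-line completions
}

interface CompletionResponse {
  results: { text: string }[];
}

// Build the JSON body for a ghost-text completion request.
function buildCompletionRequest(prefix: string, maxTokens = 48): CompletionRequest {
  return {
    prompt: prefix,
    max_length: maxTokens,
    temperature: 0.2,
    stop_sequence: ["\n"],
  };
}

// Extract the completion text from a KoboldCPP response body.
function parseCompletion(body: CompletionResponse): string {
  return body.results[0]?.text ?? "";
}

// Usage inside the extension, against a running KoboldCPP instance:
// const res = await fetch("http://localhost:5001/api/v1/generate", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(buildCompletionRequest(documentPrefix)),
// });
// const ghostText = parseCompletion(await res.json());
```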
# Architecture Diagram
```
VS Code Extension
  Listener / UI Layer
        │  JSON request/response
        ▼
Localhost API (port 5001)
        │
        ▼
KoboldCPP / Llama.cpp
        │
        ▼
Predic GGUF Models
```
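The "automatic model switching" step in the pipeline above can be sketched as a simple mapping from the active file's extension to a GGUF model name. The model filenames below are placeholders for illustration, not the actual Predic model names.

```typescript
// Sketch of file-type-based model switching. The model filenames
// are placeholders; swap in whatever GGUF files KoboldCPP is serving.

const MODEL_BY_EXTENSION: Record<string, string> = {
  ".py": "predic-python.gguf",     // placeholder name
  ".ts": "predic-typescript.gguf", // placeholder name
  ".rs": "predic-rust.gguf",       // placeholder name
};

const DEFAULT_MODEL = "predic-general.gguf"; // placeholder fallback

// Pick a model based on the active file's extension.
function modelForFile(filename: string): string {
  const dot = filename.lastIndexOf(".");
  const ext = dot >= 0 ? filename.slice(dot).toLowerCase() : "";
  return MODEL_BY_EXTENSION[ext] ?? DEFAULT_MODEL;
}
```

Keeping the mapping as plain data makes it easy to extend from the extension's settings without touching the dispatch logic.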