Documentation
Master your offline workflow. From GGUF setup to VS Code integration.
Features
Fully Offline
Zero data egress. Your code never leaves localhost.
GGUF Native
Runs quantized models on consumer CPUs and GPUs.
Context Aware
Analyzes surrounding code for accurate suggestions.
Hot Swappable
Switch between React and Python models instantly.
Extension Preview
import React from 'react';

const Button = () => {
  // Predic Suggestion:
  return <button className="bg-blue-500">Click Me</button>;
};
Requirements
Core Engine Needed
Predic is a frontend extension. You need a backend "Runner" to handle the AI processing.
1. KoboldCPP
Recommended
2. GGUF Model
Required
3. VS Code
v1.85+
Installation
1. Installation Guide
- Download KoboldCPP (Latest Release)
- Download a GGUF Model (e.g. ReaPredic-7B)
- Get the 'Predic' extension from VS Code Marketplace
2. Setup Guide
- Launch KoboldCPP.exe on your machine
- Load your downloaded GGUF model inside KoboldCPP
- Ensure the API is running on port 5001 (default)
- Restart VS Code to activate the extension
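Once KoboldCPP is serving on port 5001, the extension talks to it over HTTP. The sketch below shows roughly what such a request looks like, assuming KoboldCPP's KoboldAI-compatible `/api/v1/generate` route; exact field names can vary between KoboldCPP versions, so treat this as illustrative rather than Predic's actual implementation.

```typescript
// Shape of a completion request to a local KoboldCPP runner
// (assumed fields; check your KoboldCPP version's API docs).
interface CompletionRequest {
  prompt: string;
  max_length: number;
  temperature: number;
}

function buildCompletionRequest(
  prompt: string,
  maxTokens = 128,
): { url: string; body: CompletionRequest } {
  return {
    // Default endpoint from the Configuration section below.
    url: "http://127.0.0.1:5001/api/v1/generate",
    body: { prompt, max_length: maxTokens, temperature: 0.2 },
  };
}

// Example call (requires KoboldCPP running locally with a model loaded).
async function complete(prompt: string): Promise<string> {
  const { url, body } = buildCompletionRequest(prompt);
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const data = await res.json();
  // KoboldAI-style responses nest the text under results[0].text.
  return data.results?.[0]?.text ?? "";
}
```

If this request succeeds from a terminal or script, the extension should connect without further setup.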
3. How to Use
- Use the Chat Box (left sidebar) to generate code
- Get inline ghost-text completions while typing
- Right-click to fix bugs automatically
- Highlight code to get AI explanations
Configuration
| Setting | Default | Description |
|---|---|---|
| predic.endpoint | http://127.0.0.1:5001 | The API URL of your model runner. |
| predic.maxTokens | 128 | Maximum number of tokens generated per suggestion. |
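Both settings can be overridden in VS Code's `settings.json`, for example:

```json
{
  "predic.endpoint": "http://127.0.0.1:5001",
  "predic.maxTokens": 128
}
```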
Known Issues
High RAM Usage
Loading 13B models requires at least 16GB of system RAM. If you experience crashes, try using a Q4_K_S quant or switch to a 7B model.
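The 16GB figure follows from the size of the quantized weights themselves. A rough back-of-envelope estimate, using assumed average bits-per-weight values for common GGUF quants (real files vary with the quant mix and metadata):

```typescript
// Approximate average bits per weight for common GGUF quant types
// (assumed ballpark figures, not exact per-file values).
const bitsPerWeight: Record<string, number> = {
  Q4_K_S: 4.5,
  Q8_0: 8.5,
};

// Weight size in GB for a model with `params` parameters.
function estimateModelGB(params: number, quant: string): number {
  return (params * bitsPerWeight[quant]) / 8 / 1e9;
}

// A 13B model at Q4_K_S is about 7.3 GB of weights alone,
// before the KV cache, OS, and editor take their share.
```

This is why a Q4_K_S quant or a 7B model is the practical fallback on 16GB machines.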
First Token Latency
On standard HDDs, the model might take 1-2 seconds to "wake up" for the first suggestion. SSDs are highly recommended.
Release Notes
v1.0.0
August 03, 2025
- Initial release of the Predic extension.
- Inline code completion only.
- Uses GPT-2 only.
- Powered by Xenova/transformers.js.
v2.0.0 Latest
December 10, 2025
- Complete offline code completion.
- Powered by the KoboldCPP local inference API.
- Chat interface inside the VS Code sidebar.
- Support for GGUF quantized models.
v2.1.0 (Planned)
Will add our own fine-tuned, language-specific models.