Documentation
Master your offline workflow. From GGUF setup to VS Code integration.
Features
Fully Offline
Zero data egress. Your code never leaves localhost.
GGUF Native
Runs quantized models on consumer CPUs and GPUs.
Context Aware
Analyzes surrounding code for accurate suggestions.
Hot Swappable
Switch between React and Python models instantly.
Extension Preview
import React from 'react';

const Button = () => {
  // Predic Suggestion:
  return <button className="bg-blue-500">Click Me</button>;
};
Requirements
Core Engine Needed
Predic is a frontend extension. You need a backend "Runner" to handle the AI processing.
1. KoboldCPP
Recommended
2. GGUF Model
Required
3. VS Code
v1.85+
Installation
1. Installation Guide
- Download KoboldCPP (Latest Release)
- Download a GGUF Model (e.g. ReaPredic-7B)
- Get the 'Predic' extension from VS Code Marketplace
2. Setup Guide
- Launch KoboldCPP.exe on your machine
- Load your downloaded GGUF model inside KoboldCPP
- Ensure the API is running on port 5001 (default)
- Restart VS Code to activate the extension
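Once KoboldCPP is serving on port 5001, the extension talks to it over HTTP. The sketch below shows roughly what such a request looks like, assuming KoboldCPP's KoboldAI-compatible `/api/v1/generate` route; exact field names can vary between KoboldCPP versions, so treat this as illustrative rather than Predic's actual implementation.

```typescript
// Shape of a completion request to a local KoboldCPP runner
// (assumed fields; check your KoboldCPP version's API docs).
interface CompletionRequest {
  prompt: string;
  max_length: number;
  temperature: number;
}

function buildCompletionRequest(
  prompt: string,
  maxTokens = 128,
): { url: string; body: CompletionRequest } {
  return {
    // Default endpoint from the Configuration section below.
    url: "http://127.0.0.1:5001/api/v1/generate",
    body: { prompt, max_length: maxTokens, temperature: 0.2 },
  };
}

// Example call (requires KoboldCPP running locally with a model loaded).
async function complete(prompt: string): Promise<string> {
  const { url, body } = buildCompletionRequest(prompt);
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const data = await res.json();
  // KoboldAI-style responses nest the text under results[0].text.
  return data.results?.[0]?.text ?? "";
}
```

If this request succeeds from a terminal or script, the extension should connect without further setup.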
3. How to Use
- Use the Chat Box (left sidebar) to generate code
- Get inline ghost-text completions while typing
- Right-click to fix bugs automatically
- Highlight code to get AI explanations
Configuration
| Setting | Default | Description |
|---|---|---|
| predic.endpoint | http://127.0.0.1:5001 | The API URL of your model runner. |
| predic.maxTokens | 128 | Maximum number of tokens generated per suggestion. |
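Both settings can be overridden in VS Code's `settings.json`, for example:

```json
{
  "predic.endpoint": "http://127.0.0.1:5001",
  "predic.maxTokens": 128
}
```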
Known Issues
High RAM Usage
Loading 13B models requires at least 16GB of system RAM. If you experience crashes, try using a Q4_K_S quant or switch to a 7B model.
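The 16GB figure follows from the size of the quantized weights themselves. A rough back-of-envelope estimate, using assumed average bits-per-weight values for common GGUF quants (real files vary with the quant mix and metadata):

```typescript
// Approximate average bits per weight for common GGUF quant types
// (assumed ballpark figures, not exact per-file values).
const bitsPerWeight: Record<string, number> = {
  Q4_K_S: 4.5,
  Q8_0: 8.5,
};

// Weight size in GB for a model with `params` parameters.
function estimateModelGB(params: number, quant: string): number {
  return (params * bitsPerWeight[quant]) / 8 / 1e9;
}

// A 13B model at Q4_K_S is about 7.3 GB of weights alone,
// before the KV cache, OS, and editor take their share.
```

This is why a Q4_K_S quant or a 7B model is the practical fallback on 16GB machines.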
First Token Latency
On standard HDDs, the model might take 1-2 seconds to "wake up" for the first suggestion. SSDs are highly recommended.
Release Notes
v1.0.0
August 03, 2025
- Initial release of the Predic extension.
- Inline code completion only.
- Uses GPT-2 only.
- Powered by Xenova/transformers.js.
v2.0.0 Latest
December 10, 2025
- Complete offline code completion.
- Powered by the KoboldCPP local inference API.
- Chat interface inside the VS Code sidebar.
- Support for GGUF quantized models.
v2.1.0 (Planned)
Will add our own fine-tuned, language-specific models.