packages/kilo-vscode/docs/mercury-next-edit-testing.html
A walk-through guide for Kilo Code reviewers to validate the new Next Edit Suggestion integration powered by Mercury Edit 2 from Inception Labs.
Mercury Edit 2 is a code-edit model from Inception Labs. Unlike FIM completion, it predicts the user's next multi-line edit given the current file, cursor position, and recent edit history. It typically responds in under 250 ms.
This PR adds Mercury Next Edit as a new, opt-in autocomplete option in Kilo Code. It lives alongside everything that's already shipping: Codestral FIM and Mercury Edit 2 via the Kilo gateway are bit-for-bit unchanged. Selecting Mercury Next Edit (Inception) from the model dropdown switches to a separate render pipeline.
You'll need to pull this PR's branch and run the extension in a development VSCode window ("Extension Development Host"). The whole loop is about 3 minutes once you have the prerequisites.
- VS Code (a version that satisfies engines.vscode)
- Bun: brew install bun or bun.sh
- GitHub CLI (gh): optional but makes the PR checkout one command

From an empty directory:
gh repo clone Kilo-Org/kilocode
cd kilocode
gh pr checkout 10536
Or without gh:
git clone https://github.com/Kilo-Org/kilocode.git
cd kilocode
git fetch origin pull/10536/head:mercury-next-edit-integration
git checkout mercury-next-edit-integration
bun install
(First install pulls the full monorepo — takes 30–60 seconds.)
cd packages/kilo-vscode
bun run watch
Leave that terminal running. It rebuilds the extension on every save and runs the TypeScript compiler in watch mode.
From a separate terminal (or your IDE launcher):
code /path/to/kilocode
Inside that VSCode window, press F5 (or Run → Start Debugging). A second VSCode window opens, titled [Extension Development Host]. That window has this PR's build of the kilocode extension loaded.
Pre-push
turbo typecheck may fail on packages that need Java (JetBrains plugin). That's an environment issue, not a code issue, and it is not relevant to the NES feature. Use --no-verify on any local pushes if you hit it.
In the Dev Host window: File → Open Folder… → choose packages/kilo-vscode/docs/nes-examples/ inside this same repo. That gives you the 20 self-contained test files described below.
Settings (Cmd+,) in the Dev Host, search kilo-code.new.autocomplete:
- model → Mercury Next Edit (Inception), not "Mercury Edit 2", which is the classic FIM-via-gateway option
- nextEdit.apiKey → paste your sk_… Inception API key (or set INCEPTION_API_KEY env before launching)
- enableAutoTrigger → ✓ (already the default)

In the Dev Host: View → Output → in the dropdown, select "Kilo Code · Next Edit". Every request, response, and render decision is logged here with timestamps. Keep this panel visible while testing; it's the single best diagnostic.
You're set. Skip to the test cases below.
In VSCode Settings (Cmd+,), search kilo-code.new.autocomplete:
| Setting | Value |
|---|---|
| model | Mercury Next Edit (Inception), not "Mercury Edit 2", which is the original FIM-via-gateway option |
| nextEdit.apiKey | your Inception API key (sk_...); also accepts INCEPTION_API_KEY env var |
| enableAutoTrigger | ✓ (default) |
| nextEdit.baseUrl | (optional) override the API base, defaults to https://api.inceptionlabs.ai/v1 |
| nextEdit.debug | (optional) mirror diagnostic logs to DevTools console |
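To make the settings table concrete, here is a minimal sketch of how the key and base URL could be resolved. The names `NextEditSettings` and `resolveNextEditConfig` are hypothetical, and the precedence (setting first, then the INCEPTION_API_KEY env var) is an assumption; the table only says both sources are accepted.

```typescript
// Hypothetical shape mirroring the nextEdit.* settings in the table above.
interface NextEditSettings {
  apiKey?: string;
  baseUrl?: string;
  debug?: boolean;
}

interface ResolvedNextEditConfig {
  apiKey: string | undefined;
  baseUrl: string;
  debug: boolean;
}

// Default taken from the settings table.
const DEFAULT_BASE_URL = "https://api.inceptionlabs.ai/v1";

// Assumption: an explicit setting wins over the environment variable.
function resolveNextEditConfig(
  settings: NextEditSettings,
  env: Record<string, string | undefined>,
): ResolvedNextEditConfig {
  const fromSetting = settings.apiKey?.trim();
  const fromEnv = env["INCEPTION_API_KEY"]?.trim();
  return {
    // Empty strings fall through to the next source, then to undefined.
    apiKey: fromSetting || fromEnv || undefined,
    // Strip trailing slashes so paths can be appended with a single "/".
    baseUrl: (settings.baseUrl?.trim() || DEFAULT_BASE_URL).replace(/\/+$/, ""),
    debug: settings.debug ?? false,
  };
}
```

When neither source yields a key, `apiKey` is `undefined`, which would correspond to the "no API key resolved" skip line mentioned in the troubleshooting table.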
To watch the pipeline live: View → Output in the Dev Host, choose the "Kilo Code · Next Edit" channel.
The AutocompleteServiceManager instantiates both providers up front. Provider registration with vscode.languages.registerInlineCompletionItemProvider is driven by the configured model:
- inception/mercury-next-edit → NES provider (this PR's new pipeline)

The NES provider, per keystroke:

1. Builds the context: [cursor − 5, cursor + 10] + 3–5 recently-viewed-snippet ranges (from the shared RecentlyVisitedRangesService) + the last 5 debounced unidiffs (from a new per-file EditHistoryTracker).
2. Sends it as a role: "user" message to POST /v1/edit/completions with max_tokens: 512.
3. Renders the result: same-line diff → InlineCompletionItem; off-cursor diff → NextEditSuggestionManager with a decoration + Tab/Esc keybinding gated on a context flag (kilo-code.nextEdit.hasPendingSuggestion).

Each test below is a self-contained file at packages/kilo-vscode/docs/nes-examples/ in this repo. Open that folder in the Extension Development Host (step 5 above), then work through the cases. Place your cursor where indicated, wait ~300 ms idle, and observe.
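The per-keystroke context and request described in the architecture notes can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: it assumes the [cursor − 5, cursor + 10] window is measured in lines, and the `messages`/`content` field names in the request body are guesses (the guide only confirms a role: "user" message and max_tokens: 512). All function names are hypothetical.

```typescript
// Zero-based, inclusive line range.
interface LineWindow {
  startLine: number;
  endLine: number;
}

// Assumption: the window is in lines, clamped to the document bounds.
function editableWindow(
  cursorLine: number,
  lineCount: number,
  before = 5,
  after = 10,
): LineWindow {
  const lastLine = Math.max(lineCount - 1, 0);
  return {
    startLine: Math.max(cursorLine - before, 0),
    endLine: Math.min(cursorLine + after, lastLine),
  };
}

// Keep only the newest `limit` diffs (the guide mentions the last 5).
function lastDiffs(diffs: readonly string[], limit = 5): string[] {
  return diffs.slice(-limit);
}

// Hypothetical request shape; field names other than role and max_tokens
// are assumptions.
interface EditCompletionRequest {
  messages: { role: "user"; content: string }[];
  max_tokens: number;
}

function buildRequest(prompt: string): EditCompletionRequest {
  return {
    messages: [{ role: "user", content: prompt }],
    max_tokens: 512,
  };
}
```

How the window, snippets, and diffs are laid out inside the prompt string is not described in this guide, so the sketch leaves `prompt` opaque.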
Tip: keep this page open in a separate window from the Dev Host — the descriptions below would otherwise leak into Mercury's prompt context and bias the test.
Render-path legend:
- same-line ghost: appears as inline ghost text at the cursor (Tab accepts)
- off-cursor decoration: renders away from the cursor; first Tab jumps, second Tab applies
- suppressed: negative case, nothing should render
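As a rough mental model of the three render paths in the legend, the decision could look like the sketch below. This is a deliberate simplification: it decides by line only and ignores the cursor column, whereas the real pipeline may also consider where on the line the cursor sits. `ProposedEdit` and `classifyRenderPath` are hypothetical names.

```typescript
type RenderPath = "same-line ghost" | "off-cursor decoration" | "suppressed";

// Hypothetical description of a proposed edit (zero-based, inclusive lines).
interface ProposedEdit {
  startLine: number;
  endLine: number;
  oldText: string;
  newText: string;
}

function classifyRenderPath(edit: ProposedEdit, cursorLine: number): RenderPath {
  // A proposal that changes nothing should never be rendered.
  if (edit.newText === edit.oldText) {
    return "suppressed";
  }
  // Assumption: an edit that starts on the cursor line is shown as ghost text;
  // anything that starts elsewhere becomes a decoration the user jumps to.
  return edit.startLine === cursorLine ? "same-line ghost" : "off-cursor decoration";
}
```

Under this model, completing `return tot` on the cursor line is a same-line ghost, while rewriting body lines below a renamed signature is an off-cursor decoration.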
def factorial(n):
    if n <= 1:
        return 1
Cursor
The empty indented line at the end of factorial (column 4).
Expected
Ghost text proposing the recursive case (e.g. return n * factorial(n - 1)). Tab accepts.
COLOR_RED = "#ff0000"
COLOR_GREEN = "#00ff00"
COLOR_BLUE =
Cursor
End of line 3 (right after =).
Expected
Ghost text appending a hex color like "#0000ff".
def calculate_total(items):
    total = 0
    for item in items:
        total += item.price
    return tot
Cursor
End of file (after return tot).
Expected
Ghost text completing the identifier (likely al → total).
def calculate_total(items):
    total = 0
    for item in items:

    return total
Cursor
The empty indented line inside the for loop (column 8).
Expected
Ghost text proposing the accumulator update.
class Stack:
    def __init__(self):
        self.items = []
    def push(self, item):
        self.items.append(item)
    def pop(self):

    def peek(self):
        return self.items[-1] if self.items else None
Cursor
Empty indented line inside pop (column 8).
Expected
Ghost text proposing a body consistent with the symmetric push.
def compute_user_score(u, w):
    base = u * 10
    bonus = w * 5
    penalty = u - w
    return base + bonus - penalty

def compute_user_score(user_id, weight):
    base = u * 10
    bonus = w * 5
    penalty = u - w
    return base + bonus - penalty
Cursor
End of the renamed signature line (def compute_user_score(user_id, weight):).
Expected
Strikethrough on the body lines below + ghost showing the renamed body. First Tab jumps, second applies.
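The "first Tab jumps, second applies" behavior can be pictured as a small state machine. The sketch below is illustrative only: the state names are hypothetical, the "none" state stands in for the kilo-code.nextEdit.hasPendingSuggestion context flag being false, and treating Esc as "dismiss" is an assumption.

```typescript
type SuggestionState = "none" | "pending" | "jumped";
type Key = "Tab" | "Escape";
type Action = "jump" | "apply" | "dismiss" | "passthrough";

interface Transition {
  state: SuggestionState;
  action: Action;
}

function onKey(state: SuggestionState, key: Key): Transition {
  // No pending suggestion: the keybinding is inactive and the key behaves normally.
  if (state === "none") {
    return { state: "none", action: "passthrough" };
  }
  // Assumption: Esc discards the suggestion from either active state.
  if (key === "Escape") {
    return { state: "none", action: "dismiss" };
  }
  // First Tab moves the cursor to the suggestion; second Tab applies it.
  return state === "pending"
    ? { state: "jumped", action: "jump" }
    : { state: "none", action: "apply" };
}
```

For the rename case above, two consecutive Tab presses walk through pending → jumped → none, producing "jump" and then "apply".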
def sum_prices(items):
    total = 0
    for item in items:
    return total
Cursor
End of total = 0.
Expected
Decoration on the broken for-loop area showing the corrected body (insertion + replacement combined).
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)
result = fib
Cursor
End of the file (after result = fib).
Expected
Ghost text extending the identifier and supplying a call, e.g. onacci(10).
class Queue:
    def __init__(self):
        self.items = []
    def enqueue(self, item):
        self.items.append(item)
    def peek(self):
        return self.items[0] if self.items else None
    def size(self):
        return len(self.items)
    def is_empty(self):
        return not self.items
    def dequeue(self):

    def clear(self):
        self.items.clear()
Cursor
Empty indented line inside dequeue (column 8).
Expected
Ghost text proposing a FIFO pop, e.g. return self.items.pop(0).
def multiply(a: int, b: int) -> int:
    return a * b
def subtract(a: int, b: int) -> int:
    return a - b
def add(a, b):
    return a + b
Cursor
End of def add(a, b): (the only un-annotated function).
Expected
Strikethrough on the signature line + ghost showing the typed version (def add(a: int, b: int) -> int:). May render same-line or off-cursor depending on where on the line you clicked.
import datetime
def parse_iso_datetime(s):
    """Parse an ISO 8601 datetime string into a datetime.datetime."""
    return datetime.datetime.fromisoformat(s)
def parse_iso_date(s):

    return datetime.date.fromisoformat(s)
Cursor
Empty indented line under def parse_iso_date(s): (column 4).
Expected
Ghost text inserting a one-line docstring matching the sibling's style.
def add(a: int, b: int) -> int:
    """Return the sum of two integers."""
    return a + b
def multiply(a: int, b: int) -> int:
    """Return the product of two integers."""
    return a * b
def subtract(a: int, b: int) -> int:
    """Return a minus b."""
    return a - b
Cursor
End of return a + b.
Expected
Nothing. The code is already correct — either Mercury returns an identical reply or our suppression branch drops the proposal. Channel should show "no-op" or skip lines, never a render.
Failure mode
Any visible suggestion that just replays the existing code is a false positive worth reporting.
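One way to think about the suppression branch in this negative case is a no-op check that compares the proposed text with what is already in the file. The sketch below is an assumption about how such a check could work, not the PR's implementation; in particular, ignoring line-ending and trailing-whitespace differences is a choice made here for illustration, and `isNoOp` is a hypothetical name.

```typescript
// Normalize line endings and strip trailing whitespace on every line, so
// cosmetic differences do not count as a real edit (an assumption).
function normalize(text: string): string {
  return text
    .replace(/\r\n/g, "\n")
    .split("\n")
    .map((line) => line.replace(/\s+$/, ""))
    .join("\n");
}

// True when applying the proposal would leave the code effectively unchanged.
function isNoOp(current: string, proposed: string): boolean {
  return normalize(current) === normalize(proposed);
}
```

A proposal for which `isNoOp` returns true would be dropped before rendering, which matches the expectation that the channel shows a "no-op" or skip line and never a render.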
interface User {
  id: number;
  name: string;
  active: boolean;
}
function getActiveUserNames(users: User[]): string[] {
  return users
}
const sample: User[] = [
  { id: 1, name: "ada", active: true },
  { id: 2, name: "lin", active: false },
  { id: 3, name: "rin", active: true },
];
console.log(getActiveUserNames(sample));
Cursor
End of return users inside getActiveUserNames.
Expected
Ghost text completing the chain, e.g. .filter(u => u.active).map(u => u.name).
function double(x: number): number {
  return x * 2;
}
function add(a, b) {
  return a + b;
}
function negate(x: number): number {
  return -x;
}
function main(): void {
  console.log(double(3));
  console.log(add(2, 4));
  console.log(negate(7));
}
main();
Cursor
End of file (after main();).
Expected
Decoration on the add(a, b) signature proposing the typed version.
declare const React: {
  useState: <T>(initial: T) => [T, (next: T) => void];
};
function Counter(): JSX.Element {
  const [count, setCount] = React.useState<number>(0);
  function handleClick() {

  }
  return (
    <div>
      <p>Count: {count}</p>
      <button onClick={handleClick}>Increment</button>
    </div>
  );
}
export default Counter;
Cursor
Empty indented line inside handleClick (column 4).
Expected
Ghost text incrementing count via setCount.
package main
import (
	"fmt"
	"os"
)
func loadConfig(path string) ([]byte, error) {
	data, err := os.ReadFile(path)

	return data, nil
}
func main() {
	cfg, err := loadConfig("config.json")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(cfg))
}
Cursor
Empty line right after data, err := os.ReadFile(path).
Expected
Ghost text proposing the canonical if err != nil { return nil, err }.
package main
import "fmt"
type Rectangle struct {
	Width  float64
	Height float64
}
func (r Rectangle) Perimeter() float64 {
	return 2 * (r.Width + r.Height)
}
func (r Rectangle) Area() float64 {

}
func main() {
	r := Rectangle{Width: 3, Height: 4}
	fmt.Println("perimeter:", r.Perimeter())
	fmt.Println("area:", r.Area())
}
Cursor
Empty indented line inside Area().
Expected
Ghost text computing area from Width and Height.
package main
import "fmt"
func main() {
	ch := make(chan int)
	go func() {

	}()
	for v := range ch {
		fmt.Println("got:", v)
	}
}
Cursor
Empty indented line inside the goroutine.
Expected
Ghost text producing values onto the channel and closing it.
enum Shape {
    Circle(f64),
    Square(f64),
    Rectangle(f64, f64),
    Triangle(f64, f64),
}
fn area(s: &Shape) -> f64 {
    match s {
        Shape::Circle(r) => std::f64::consts::PI * r * r,
        Shape::Square(side) => side * side,

    }
}
fn main() {
    let shapes = vec![
        Shape::Circle(1.0),
        Shape::Rectangle(2.0, 3.0),
        Shape::Triangle(4.0, 5.0),
    ];
    for s in &shapes {
        println!("area = {}", area(s));
    }
}
Cursor
Empty indented line inside the match body, after the Square arm.
Expected
Ghost text adding the missing Rectangle and Triangle arms.
fn parse_int(s: &str) -> Option<i32> {
    let n = s.trim()
    Some(n * 2)
}
fn main() {
    let inputs = [" 21 ", "not-a-number", "10"];
    for s in &inputs {
        match parse_int(s) {
            Some(v) => println!("{} -> {}", s, v),
            None => println!("{} -> skipped", s),
        }
    }
}
Cursor
End of let n = s.trim() (no semicolon yet).
Expected
Ghost text continuing the chain into a parsed i32.
fn longest(a: &str, b: &str) -> &str {
    if a.len() >= b.len() {
        a
    } else {
        b
    }
}
fn main() {
    let s1 = String::from("hello world");
    let s2 = String::from("hi");
    let out = longest(&s1, &s2);
    println!("longest = {}", out);
}
Cursor
End of file.
Expected
Decoration on the fn longest signature proposing lifetime annotations.
async function fetchUser(id) {
    try {

    } catch (err) {
        console.error("fetchUser failed", err);
        return null;
    }
}
async function main() {
    const user = await fetchUser(42);
    console.log("user:", user);
}
main();
Cursor
Empty indented line inside the try { block (column 8).
Expected
Ghost text completing the fetch + json parse.
const app = {
    get: (_path, _handler) => app,
    post: (_path, _handler) => app,
    listen: (_port, cb) => cb && cb(),
};
const users = [
    { id: 1, name: "ada" },
    { id: 2, name: "lin" },
];
app.get("/users/:id", (req, res) => {

});
app.post("/users", (req, res) => {
    const user = { id: users.length + 1, name: req.body.name };
    users.push(user);
    res.status(201).json(user);
});
app.listen(3000, () => console.log("listening on :3000"));
Cursor
Empty indented line inside the GET handler (column 4).
Expected
Ghost text proposing a get-by-id (lookup, 404, json response).
SELECT
c.name,
SUM(o.total) AS total_spent
FROM orders o
WHERE o.created_at >= '2026-01-01'
GROUP BY c.name
ORDER BY total_spent DESC
LIMIT 10;
Cursor
End of the line FROM orders o.
Expected
Ghost text completing the JOIN against customers.
SELECT id, email
FROM users
WHERE
ORDER BY last_login_at DESC;
Cursor
End of the bare WHERE line.
Expected
Ghost text proposing a predicate.
# Mercury Edit 2 — Quick Notes
Mercury Edit 2 is a small, fast model trained to predict the user's
next single edit given the current file, cursor position, and recent
edit history. It targets latency under 200 ms on typical files and
returns a unified-diff-like patch scoped to a window around the cursor.
Unlike chat-style completions, the model is biased toward minimal,
local changes — finishing a function body, fixing a typo, propagating
a rename — rather than generating new files from scratch.
Cursor
End of the last sentence.
Expected
Nothing. If Mercury does propose a prose continuation it counts as a soft fail — we don't want a code model writing README content.
If nothing happens when you type, open View → Output → "Kilo Code · Next Edit" and watch the log. The pipeline is verbose enough that 90% of issues are obvious from the first few lines.
| Symptom | Likely cause | Fix |
|---|---|---|
| No log lines at all | Wrong model selected, or Dev Host wasn't reloaded after rebuild | Cmd+R in the Dev Host; confirm model = Mercury Next Edit (Inception) |
| skip — no API key resolved | Setting not saved | Re-paste the key in nextEdit.apiKey, press Enter, reload |
| <- 401 Unauthorized | Wrong key or wrong tier | Verify the key at platform.inceptionlabs.ai |
| <- 400 Bad Request | Prompt-shape regression (we shouldn't ship this, but if it happens during dev) | Capture the response body from the channel and ping the integration owner |
| Suggestion shown for a wrong-looking model | Selecting "Mercury Edit 2" routes through the classic FIM provider, not NES — that's by design (the old behavior is preserved) | Switch to "Mercury Next Edit (Inception)" to use the new pipeline |
| Inline ghost text never appears, but logs show RENDER | Another extension (Copilot, Tabnine) is winning the inline-completion race | Temporarily disable conflicting extensions in the Dev Host |
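If you prefer to triage from a copied log, the table above can be encoded as a small lookup. This is a convenience sketch, not part of the extension: the exact log line format is an assumption, so the patterns match only the fragments quoted in the table, and `triage` is a hypothetical name.

```typescript
interface Hint {
  cause: string;
  fix: string;
}

// Patterns are based on the log fragments quoted in the table; the real
// channel output may be formatted differently.
const RULES: ReadonlyArray<{ pattern: RegExp; hint: Hint }> = [
  {
    pattern: /no API key resolved/,
    hint: { cause: "Setting not saved", fix: "Re-paste the key in nextEdit.apiKey, press Enter, reload" },
  },
  {
    pattern: /<- 401\b/,
    hint: { cause: "Wrong key or wrong tier", fix: "Verify the key at platform.inceptionlabs.ai" },
  },
  {
    pattern: /<- 400\b/,
    hint: { cause: "Prompt-shape regression", fix: "Capture the response body from the channel and ping the integration owner" },
  },
  {
    // Only meaningful when no ghost text was visible despite the RENDER line.
    pattern: /\bRENDER\b/,
    hint: { cause: "Another extension is winning the inline-completion race", fix: "Temporarily disable conflicting extensions in the Dev Host" },
  },
];

function triage(logLines: readonly string[]): Hint | undefined {
  if (logLines.length === 0) {
    return {
      cause: "Wrong model selected, or Dev Host wasn't reloaded after rebuild",
      fix: "Cmd+R in the Dev Host; confirm model = Mercury Next Edit (Inception)",
    };
  }
  // First matching rule wins; errors are listed before the RENDER rule.
  for (const rule of RULES) {
    if (logLines.some((line) => rule.pattern.test(line))) {
      return rule.hint;
    }
  }
  return undefined;
}
```

Paste the channel output as an array of lines; an `undefined` result means none of the known symptoms matched and the log needs a closer read.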
Mercury Next Edit integration for Kilo Code — prepared by the Inception Labs team.
Questions / bug reports: [email protected].