Mercury Next Edit — Testing Playground

packages/kilo-vscode/docs/mercury-next-edit-testing.html

A walk-through guide for Kilo Code reviewers to validate the new Next Edit Suggestion integration powered by Mercury Edit 2 from Inception Labs.

What is Mercury Next Edit?

Mercury Edit 2 is a code-edit model from Inception Labs. Unlike FIM completion, it predicts the user's next multi-line edit given the current file, cursor position, and recent edit history. It typically responds in under 250 ms.

This PR adds Mercury Next Edit as a new, opt-in autocomplete option in Kilo Code. It lives alongside everything that's already shipping — Codestral FIM and Mercury Edit 2 via the Kilo gateway are bit-for-bit unchanged. Selecting Mercury Next Edit (Inception) from the model dropdown switches to a separate render pipeline:

  • Same-line predictions render as inline ghost text (just like FIM — Tab accepts).
  • Off-cursor predictions render as a decoration (red strikethrough + green ghost annotation) at the predicted edit location. First Tab teleports the cursor there; second Tab applies.
  • After any accept, the integration immediately re-triggers Mercury so the user can walk a refactor with repeated Tab presses ("Tab-Tab-Tab").
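
The two-step Tab behavior for off-cursor predictions can be pictured as a small state machine. The sketch below is illustrative only: the names (Suggestion, EditorState, onTab, onEscape) are hypothetical and not taken from the PR, the editor is modeled as plain data rather than through the VS Code API, and treating Esc as "dismiss" is an assumption.

```typescript
// Hypothetical model of the off-cursor Tab flow; names are illustrative, not from the PR.
interface Suggestion {
  line: number;          // 0-based line where the predicted edit starts
  replacement: string[]; // lines that replace the original line(s)
  replacedCount: number; // how many original lines are replaced
}

interface EditorState {
  lines: string[];
  cursorLine: number;
  pending: Suggestion | null;
}

type TabResult = "jumped" | "applied" | "passthrough";

// First Tab moves the cursor to the suggestion; second Tab applies it.
function onTab(state: EditorState): TabResult {
  const s = state.pending;
  if (s === null) return "passthrough"; // nothing pending: Tab behaves normally
  if (state.cursorLine !== s.line) {
    state.cursorLine = s.line;          // first Tab: jump to the predicted location
    return "jumped";
  }
  state.lines.splice(s.line, s.replacedCount, ...s.replacement); // second Tab: apply
  state.pending = null;                 // a real integration would re-trigger here
  return "applied";
}

// Assumption: Esc dismisses the pending suggestion without touching the document.
function onEscape(state: EditorState): void {
  state.pending = null;
}
```

After an "applied" result, the integration described above would immediately request the next prediction, which is what makes the repeated Tab walk possible.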

Install the PR locally

You'll need to pull this PR's branch and run the extension in a development VSCode window ("Extension Development Host"). The whole loop is about 3 minutes once you have the prerequisites.

Prerequisites

  • VSCode ≥ 1.105.1 (matches kilocode's engines.vscode)
  • Bun ≥ 1.3.13 (the build script checks the version) — install via brew install bun or bun.sh
  • GitHub CLI (gh) — optional but makes the PR checkout one command
  • An Inception API key — create one at platform.inceptionlabs.ai if you don't already have one

1. Check out the PR branch

From an empty directory:

gh repo clone Kilo-Org/kilocode
cd kilocode
gh pr checkout 10536

Or without gh:

git clone https://github.com/Kilo-Org/kilocode.git
cd kilocode
git fetch origin pull/10536/head:mercury-next-edit-integration
git checkout mercury-next-edit-integration

2. Install dependencies

bun install

(First install pulls the full monorepo — takes 30–60 seconds.)

3. Start the dev build

cd packages/kilo-vscode
bun run watch

Leave that terminal running. It rebuilds the extension on every save and runs the TypeScript compiler in watch mode.

4. Open kilocode in VSCode and launch the Extension Development Host

From a separate terminal (or your IDE launcher):

code /path/to/kilocode

Inside that VSCode window, press F5 (or Run → Start Debugging). A second VSCode window opens, titled [Extension Development Host]. That window has this PR's build of the kilocode extension loaded.

The pre-push turbo typecheck may fail on packages that need Java (the JetBrains plugin). That is an environment problem, not a code problem, and it is unrelated to the NES feature. If you hit it, pass --no-verify on local pushes.

5. Open the test playground

In the Dev Host window: File → Open Folder… → choose packages/kilo-vscode/docs/nes-examples/ inside this same repo. That gives you the 20 self-contained test files described below.

6. Configure NES

Open Settings (Cmd+,) in the Dev Host and search for kilo-code.new.autocomplete:

  • model → Mercury Next Edit (Inception), not "Mercury Edit 2", which is the classic FIM-via-gateway option
  • nextEdit.apiKey → paste your sk_… Inception API key (or set the INCEPTION_API_KEY environment variable before launching)
  • enableAutoTrigger → ✓ (already the default)

7. Watch the pipeline live

In the Dev Host: View → Output → in the dropdown, select "Kilo Code · Next Edit". Every request, response, and render decision is logged here with timestamps. Keep this panel visible while testing — it's the single best diagnostic.

You're set. Skip to the test cases below.

Enabling the feature (settings reference)

In VSCode Settings (Cmd+,), search kilo-code.new.autocomplete:

| Setting | Value |
| --- | --- |
| model | Mercury Next Edit (Inception), not "Mercury Edit 2", which is the original FIM-via-gateway option |
| nextEdit.apiKey | your Inception API key (sk_...); also accepts the INCEPTION_API_KEY env var |
| enableAutoTrigger | ✓ (default) |
| nextEdit.baseUrl | (optional) overrides the API base; defaults to https://api.inceptionlabs.ai/v1 |
| nextEdit.debug | (optional) mirrors diagnostic logs to the DevTools console |

To watch the pipeline live: View → Output in the Dev Host, choose the "Kilo Code · Next Edit" channel.
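
As a rough illustration of how these settings might combine, the sketch below resolves an API key and a request endpoint. It is not the PR's code: resolveConfig and its types are hypothetical, the environment is passed in as a plain object, and the precedence (setting first, then INCEPTION_API_KEY) is an assumption. The endpoint path comes from the POST /v1/edit/completions call described in the next section.

```typescript
// Hypothetical helper: names and precedence are assumptions, not the PR's implementation.
const DEFAULT_BASE_URL = "https://api.inceptionlabs.ai/v1";

interface NextEditSettings {
  apiKey?: string;  // value of the nextEdit.apiKey setting
  baseUrl?: string; // value of the nextEdit.baseUrl setting
}

interface ResolvedConfig {
  apiKey: string | null; // null corresponds to the "no API key resolved" skip
  endpoint: string;
}

function resolveConfig(
  settings: NextEditSettings,
  env: Record<string, string | undefined>,
): ResolvedConfig {
  // Assumed precedence: an explicit setting wins over the environment variable.
  const fromSetting = settings.apiKey?.trim();
  const fromEnv = env["INCEPTION_API_KEY"]?.trim();
  const apiKey = fromSetting || fromEnv || null;

  // Strip trailing slashes so the join never produces "//edit/completions".
  const base = (settings.baseUrl?.trim() || DEFAULT_BASE_URL).replace(/\/+$/, "");
  return { apiKey, endpoint: `${base}/edit/completions` };
}
```

Calling resolveConfig({}, {}) yields a null key, which lines up with the "no API key resolved" row in the troubleshooting table.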

How the integration works

The AutocompleteServiceManager instantiates both providers up front. Provider registration with vscode.languages.registerInlineCompletionItemProvider is driven by the configured model:

  • inception/mercury-next-edit → NES provider (this PR's new pipeline)
  • anything else → classic FIM provider (unchanged)
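
The model-driven choice can be sketched as follows. This models only the decision and the swap. The ProviderSwitch class and its register callback are hypothetical stand-ins for whatever the AutocompleteServiceManager does around vscode.languages.registerInlineCompletionItemProvider, so the sketch runs without the editor.

```typescript
// Illustrative only: models the provider choice, not the real registration code.
type ProviderKind = "nes" | "fim";

const NES_MODEL_ID = "inception/mercury-next-edit";

// Only the NES model id selects the new pipeline; everything else stays on classic FIM.
function selectProvider(configuredModel: string | undefined): ProviderKind {
  return configuredModel === NES_MODEL_ID ? "nes" : "fim";
}

// Hypothetical switch that re-registers only when the selected kind actually changes.
class ProviderSwitch {
  private active: ProviderKind | null = null;
  private disposeActive: (() => void) | null = null;
  private readonly register: (kind: ProviderKind) => () => void;

  // `register` stands in for provider registration and returns a disposer.
  constructor(register: (kind: ProviderKind) => () => void) {
    this.register = register;
  }

  update(configuredModel: string | undefined): ProviderKind {
    const next = selectProvider(configuredModel);
    if (next !== this.active) {
      this.disposeActive?.(); // unregister the previously active provider
      this.disposeActive = this.register(next);
      this.active = next;
    }
    return next;
  }
}
```

Keeping the decision in one pure function makes it easy to confirm that any model other than the NES id still routes to the unchanged FIM path.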

The NES provider, per keystroke:

  1. Debounces 250 ms (skipped for explicit invocations).
  2. Builds a Mercury prompt: current file + cursor + an editable region [cursor − 5, cursor + 10] + 3–5 recently-viewed-snippet ranges (from the shared RecentlyVisitedRangesService) + the last 5 debounced unidiffs (from a new per-file EditHistoryTracker).
  3. Sends a single role: "user" message to POST /v1/edit/completions with max_tokens: 512.
  4. Parses the triple-backtick fenced reply, strips Mercury's sentinel tokens, computes the minimal line-diff against the current document.
  5. Branches: same-line diff → InlineCompletionItem; off-cursor diff → NextEditSuggestionManager with a decoration + Tab/Esc keybinding gated on a context flag (kilo-code.nextEdit.hasPendingSuggestion).
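
Steps 2 to 4 can be approximated with a few pure functions. This is a sketch under stated assumptions, not the PR's implementation: the helper names are hypothetical, clamping the editable region to the document bounds is assumed, the request body shows only the fields named above (the model field and any others are omitted because they are not stated here), and the sentinel strings are passed in as a parameter because their real spellings are not given in this guide.

```typescript
// Sketch of steps 2-4; helper names are hypothetical.
const LINES_BEFORE = 5;
const LINES_AFTER = 10;
const FENCE = "`".repeat(3); // triple backtick, built indirectly to keep this block fence-safe

// Editable region [cursor - 5, cursor + 10], clamped to the document (0-based, inclusive).
function editableRegion(cursorLine: number, lineCount: number): { start: number; end: number } {
  const last = Math.max(0, lineCount - 1);
  return {
    start: Math.max(0, cursorLine - LINES_BEFORE),
    end: Math.min(last, cursorLine + LINES_AFTER),
  };
}

// Request body with a single user message and max_tokens 512.
// Fields the source does not state (such as the model name) are left out on purpose.
function buildRequestBody(prompt: string) {
  return { messages: [{ role: "user" as const, content: prompt }], max_tokens: 512 };
}

// Return the lines inside the first fenced block, with sentinel tokens removed,
// or null when the reply contains no complete fence.
function parseFencedReply(reply: string, sentinels: string[]): string[] | null {
  const open = reply.indexOf(FENCE);
  if (open === -1) return null;
  const bodyStart = reply.indexOf("\n", open); // skip the fence line and its language tag
  if (bodyStart === -1) return null;
  const close = reply.indexOf(FENCE, bodyStart + 1);
  if (close === -1) return null;
  let body = reply.slice(bodyStart + 1, close);
  for (const token of sentinels) body = body.split(token).join("");
  return body.replace(/\n$/, "").split("\n");
}
```

The parsed lines would then be compared against the current document to produce the diff that step 5 branches on.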

Test cases

Each test below is a self-contained file at packages/kilo-vscode/docs/nes-examples/ in this repo. Open that folder in the Extension Development Host (step 5 above), then work through the cases. Place your cursor where indicated, wait ~300 ms idle, and observe.
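
The ~300 ms idle wait leaves a little headroom over the provider's 250 ms debounce. As a rough picture of that debounce, including the bypass for explicit invocations, here is a minimal sketch. RequestDebouncer and the injected Schedule function are hypothetical names chosen for illustration; the timer is injected so the example has no editor or runtime dependency.

```typescript
// Hypothetical debouncer; not the PR's implementation.
type Cancel = () => void;
type Schedule = (fn: () => void, delayMs: number) => Cancel;

const DEBOUNCE_MS = 250;

class RequestDebouncer {
  private cancelPending: Cancel | null = null;
  private readonly schedule: Schedule;

  constructor(schedule: Schedule) {
    this.schedule = schedule;
  }

  // Automatic triggers wait for 250 ms of quiet; explicit invocations run immediately.
  trigger(run: () => void, explicit: boolean): void {
    this.cancelPending?.(); // a newer trigger always supersedes the pending one
    this.cancelPending = null;
    if (explicit) {
      run();
      return;
    }
    this.cancelPending = this.schedule(() => {
      this.cancelPending = null;
      run();
    }, DEBOUNCE_MS);
  }
}
```

In an editor, the schedule argument would typically wrap setTimeout and clearTimeout; a fake scheduler makes the behavior easy to check in isolation.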

Tip: keep this page open in a separate window from the Dev Host — the descriptions below would otherwise leak into Mercury's prompt context and bias the test.

Render-path legend:

  • same-line: ghost appears as inline ghost text at the cursor (Tab accepts)
  • off-cursor: decoration renders away from the cursor; first Tab jumps, second Tab applies
  • suppressed: negative case; nothing should render
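
To make the three render paths concrete, the sketch below classifies a proposed rewrite of a region against its current lines. It is an approximation, not the PR's logic: the diff is a simple common-prefix and common-suffix trim that yields one contiguous hunk (not a full LCS diff), the names are hypothetical, and the rule "a hunk that starts on the cursor line is same-line" is an assumption.

```typescript
// Hypothetical classification of a proposal into the three render paths.
interface Hunk {
  start: number;     // 0-based index of the first differing line in the original
  removed: string[]; // original lines replaced by the hunk
  added: string[];   // proposed lines
}

type RenderPath = "same-line" | "off-cursor" | "suppressed";

// Trim the common prefix and suffix; whatever remains is one contiguous hunk.
function diffLines(current: string[], proposed: string[]): Hunk | null {
  const shortest = Math.min(current.length, proposed.length);
  let prefix = 0;
  while (prefix < shortest && current[prefix] === proposed[prefix]) prefix++;

  let suffix = 0;
  while (
    suffix < shortest - prefix &&
    current[current.length - 1 - suffix] === proposed[proposed.length - 1 - suffix]
  ) {
    suffix++;
  }

  const removed = current.slice(prefix, current.length - suffix);
  const added = proposed.slice(prefix, proposed.length - suffix);
  if (removed.length === 0 && added.length === 0) return null; // identical: no-op
  return { start: prefix, removed, added };
}

function classify(current: string[], proposed: string[], cursorLine: number): RenderPath {
  const hunk = diffLines(current, proposed);
  if (hunk === null) return "suppressed";
  // Assumption: a hunk beginning on the cursor line is shown as inline ghost text.
  return hunk.start === cursorLine ? "same-line" : "off-cursor";
}
```

Under these assumptions, a reply identical to the current text falls into the suppressed path, which is the behavior the negative test cases below check for.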

Python — core tests

01 — Finish a recursive function body [same-line]

def factorial(n):
    if n <= 1:
        return 1

Cursor

The empty indented line at the end of factorial (column 4).

Expected

Ghost text proposing the recursive case (e.g. return n * factorial(n - 1)). Tab accepts.

02 — Pattern continuation [same-line]

COLOR_RED = "#ff0000"
COLOR_GREEN = "#00ff00"
COLOR_BLUE =

Cursor

End of line 3 (right after =).

Expected

Ghost text appending a hex color like "#0000ff".

03 — Mid-identifier completion [same-line]

def calculate_total(items):
    total = 0
    for item in items:
        total += item.price
    return tot

Cursor

End of file (after return tot).

Expected

Ghost text completing the identifier (likely al, which turns tot into total).

04 — Loop body inference [same-line]

def calculate_total(items):
    total = 0
    for item in items:

    return total

Cursor

The empty indented line inside the for loop (column 8).

Expected

Ghost text proposing the accumulator update.

05 — Sibling method body [same-line]

class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):

    def peek(self):
        return self.items[-1] if self.items else None

Cursor

Empty indented line inside pop (column 8).

Expected

Ghost text proposing a body consistent with the symmetric push.

Python — advanced

07 — Multi-line rename refactor [off-cursor]

def compute_user_score(u, w):
    base = u * 10
    bonus = w * 5
    penalty = u - w
    return base + bonus - penalty

def compute_user_score(user_id, weight):
    base = u * 10
    bonus = w * 5
    penalty = u - w
    return base + bonus - penalty

Cursor

End of the renamed signature line (def compute_user_score(user_id, weight):).

Expected

Strikethrough on the body lines below + ghost showing the renamed body. First Tab jumps, second applies.

08 — Mixed insert + replace [off-cursor]

def sum_prices(items):
    total = 0
    for item in items:
    return total

Cursor

End of total = 0.

Expected

Decoration on the broken for-loop area showing the corrected body (insertion + replacement combined).

10 — Mid-token completion [same-line]

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

result = fib

Cursor

End of the file (after result = fib).

Expected

Ghost text extending the identifier and supplying a call, e.g. onacci(10).

11 — Stub method with implemented siblings [same-line]

class Queue:
    def __init__(self):
        self.items = []

    def enqueue(self, item):
        self.items.append(item)

    def peek(self):
        return self.items[0] if self.items else None

    def size(self):
        return len(self.items)

    def is_empty(self):
        return not self.items

    def dequeue(self):

    def clear(self):
        self.items.clear()

Cursor

Empty indented line inside dequeue (column 8).

Expected

Ghost text proposing a FIFO pop, e.g. return self.items.pop(0).

12 — Type annotation insertion [same-line / off-cursor]

def multiply(a: int, b: int) -> int:
    return a * b

def subtract(a: int, b: int) -> int:
    return a - b

def add(a, b):
    return a + b

Cursor

End of def add(a, b): (the only un-annotated function).

Expected

Strikethrough on the signature line + ghost showing the typed version (def add(a: int, b: int) -> int:). May render same-line or off-cursor depending on where on the line you clicked.

13 — Docstring generation [same-line]

import datetime

def parse_iso_datetime(s):
    """Parse an ISO 8601 datetime string into a datetime.datetime."""
    return datetime.datetime.fromisoformat(s)

def parse_iso_date(s):

    return datetime.date.fromisoformat(s)

Cursor

Empty indented line under def parse_iso_date(s): (column 4).

Expected

Ghost text inserting a one-line docstring matching the sibling's style.

14 — No-op suppression [suppressed]

def add(a: int, b: int) -> int:
    """Return the sum of two integers."""
    return a + b

def multiply(a: int, b: int) -> int:
    """Return the product of two integers."""
    return a * b

def subtract(a: int, b: int) -> int:
    """Return a minus b."""
    return a - b

Cursor

End of return a + b.

Expected

Nothing. The code is already correct — either Mercury returns an identical reply or our suppression branch drops the proposal. Channel should show "no-op" or skip lines, never a render.

Failure mode

Any visible suggestion that just replays the existing code is a false positive worth reporting.

TypeScript

ts_07 — Array transform completion [same-line]

interface User {
    id: number;
    name: string;
    active: boolean;
}

function getActiveUserNames(users: User[]): string[] {
    return users
}

const sample: User[] = [
    { id: 1, name: "ada", active: true },
    { id: 2, name: "lin", active: false },
    { id: 3, name: "rin", active: true },
];

console.log(getActiveUserNames(sample));

Cursor

End of return users inside getActiveUserNames.

Expected

Ghost text completing the chain, e.g. .filter(u => u.active).map(u => u.name).

ts_08 — Param type annotations [off-cursor]

function double(x: number): number {
    return x * 2;
}

function add(a, b) {
    return a + b;
}

function negate(x: number): number {
    return -x;
}

function main(): void {
    console.log(double(3));
    console.log(add(2, 4));
    console.log(negate(7));
}

main();

Cursor

End of file (after main();).

Expected

Decoration on the add(a, b) signature proposing the typed version.

ts_09 — React event handler [same-line]

declare const React: {
    useState: <T>(initial: T) => [T, (next: T) => void];
};

function Counter(): JSX.Element {
    const [count, setCount] = React.useState<number>(0);

    function handleClick() {

    }

    return (
        <div>
            <p>Count: {count}</p>
            <button onClick={handleClick}>Increment</button>
        </div>
    );
}

export default Counter;

Cursor

Empty indented line inside handleClick (column 8).

Expected

Ghost text incrementing count via setCount.

Go

go_07 — Error handling block [same-line]

package main

import (
	"fmt"
	"os"
)

func loadConfig(path string) ([]byte, error) {
	data, err := os.ReadFile(path)

	return data, nil
}

func main() {
	cfg, err := loadConfig("config.json")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(cfg))
}

Cursor

Empty line right after data, err := os.ReadFile(path).

Expected

Ghost text proposing the canonical if err != nil { return nil, err }.

go_08 — Struct method body [same-line]

package main

import "fmt"

type Rectangle struct {
	Width float64
	Height float64
}

func (r Rectangle) Perimeter() float64 {
	return 2 * (r.Width + r.Height)
}

func (r Rectangle) Area() float64 {

}

func main() {
	r := Rectangle{Width: 3, Height: 4}
	fmt.Println("perimeter:", r.Perimeter())
	fmt.Println("area:", r.Area())
}

Cursor

Empty indented line inside Area().

Expected

Ghost text computing area from Width and Height.

go_09 — Goroutine + channel [same-line]

package main

import "fmt"

func main() {
	ch := make(chan int)

	go func() {

	}()

	for v := range ch {
		fmt.Println("got:", v)
	}
}

Cursor

Empty indented line inside the goroutine.

Expected

Ghost text producing values onto the channel and closing it.

Rust

rs_07 — Match-arm completion [same-line]

enum Shape {
    Circle(f64),
    Square(f64),
    Rectangle(f64, f64),
    Triangle(f64, f64),
}

fn area(s: &Shape) -> f64 {
    match s {
        Shape::Circle(r) => std::f64::consts::PI * r * r,
        Shape::Square(side) => side * side,

    }
}

fn main() {
    let shapes = vec![
        Shape::Circle(1.0),
        Shape::Rectangle(2.0, 3.0),
        Shape::Triangle(4.0, 5.0),
    ];
    for s in &shapes {
        println!("area = {}", area(s));
    }
}

Cursor

Empty indented line inside the match body, after the Square arm.

Expected

Ghost text adding the missing Rectangle and Triangle arms.

rs_08 — Result/Option chaining [same-line]

fn parse_int(s: &str) -> Option<i32> {
    let n = s.trim()
    Some(n * 2)
}

fn main() {
    let inputs = [" 21 ", "not-a-number", "10"];
    for s in &inputs {
        match parse_int(s) {
            Some(v) => println!("{} -> {}", s, v),
            None => println!("{} -> skipped", s),
        }
    }
}

Cursor

End of let n = s.trim() (no semicolon yet).

Expected

Ghost text continuing the chain into a parsed i32.

rs_09 — Lifetime annotations [off-cursor]

fn longest(a: &str, b: &str) -> &str {
    if a.len() >= b.len() {
        a
    } else {
        b
    }
}

fn main() {
    let s1 = String::from("hello world");
    let s2 = String::from("hi");
    let out = longest(&s1, &s2);
    println!("longest = {}", out);
}

Cursor

End of file.

Expected

Decoration on the fn longest signature proposing lifetime annotations.

JavaScript

js_07 — Async/await fetch [same-line]

async function fetchUser(id) {
    try {

    } catch (err) {
        console.error("fetchUser failed", err);
        return null;
    }
}

async function main() {
    const user = await fetchUser(42);
    console.log("user:", user);
}

main();

Cursor

Empty indented line inside the try { block (column 8).

Expected

Ghost text completing the fetch + json parse.

js_08 — Express GET handler [same-line]

const app = {
    get: (_path, _handler) => app,
    post: (_path, _handler) => app,
    listen: (_port, cb) => cb && cb(),
};

const users = [
    { id: 1, name: "ada" },
    { id: 2, name: "lin" },
];

app.get("/users/:id", (req, res) => {

});

app.post("/users", (req, res) => {
    const user = { id: users.length + 1, name: req.body.name };
    users.push(user);
    res.status(201).json(user);
});

app.listen(3000, () => console.log("listening on :3000"));

Cursor

Empty indented line inside the GET handler (column 4).

Expected

Ghost text proposing a get-by-id (lookup, 404, json response).

SQL

sql_07 — Missing JOIN [same-line]

SELECT
    c.name,
    SUM(o.total) AS total_spent
FROM orders o

WHERE o.created_at >= '2026-01-01'
GROUP BY c.name
ORDER BY total_spent DESC
LIMIT 10;

Cursor

End of the line FROM orders o.

Expected

Ghost text completing the JOIN against customers.

sql_08 — WHERE filter [same-line]

SELECT id, email
FROM users
WHERE
ORDER BY last_login_at DESC;

Cursor

End of the bare WHERE line.

Expected

Ghost text proposing a predicate.

Markdown (negative case)

md_07 — Prose should stay quiet [suppressed]

# Mercury Edit 2 — Quick Notes

Mercury Edit 2 is a small, fast model trained to predict the user's
next single edit given the current file, cursor position, and recent
edit history. It targets latency under 200 ms on typical files and
returns a unified-diff-like patch scoped to a window around the cursor.

Unlike chat-style completions, the model is biased toward minimal,
local changes — finishing a function body, fixing a typo, propagating
a rename — rather than generating new files from scratch.

Cursor

End of the last sentence.

Expected

Nothing. If Mercury does propose a prose continuation it counts as a soft fail — we don't want a code model writing README content.

Troubleshooting

If nothing happens when you type, open View → Output → "Kilo Code · Next Edit" and watch the log. The pipeline logs enough detail that most issues are obvious from the first few lines.

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| No log lines at all | Wrong model selected, or Dev Host wasn't reloaded after rebuild | Cmd+R in the Dev Host; confirm model = Mercury Next Edit (Inception) |
| skip — no API key resolved | Setting not saved | Re-paste the key in nextEdit.apiKey, press Enter, reload |
| <- 401 Unauthorized | Wrong key or wrong tier | Verify the key at platform.inceptionlabs.ai |
| <- 400 Bad Request | Prompt-shape regression (we shouldn't ship this, but if it happens during dev) | Capture the response body from the channel and ping the integration owner |
| Suggestion shown for a wrong-looking model | Selecting "Mercury Edit 2" routes through the classic FIM provider, not NES; that's by design (the old behavior is preserved) | Switch to "Mercury Next Edit (Inception)" to use the new pipeline |
| Inline ghost text never appears, but logs show RENDER | Another extension (Copilot, Tabnine) is winning the inline-completion race | Temporarily disable conflicting extensions in the Dev Host |

Feedback we'd love

  • Where the prediction was wrong but the UX was correct. Note the file + cursor position + what Mercury proposed. Helps us tune the model.
  • Where the UX got in the way. Tab semantics, decoration appearance, chained-prediction timing, anything that felt clumsy compared to other NES products you've used.
  • Performance regressions in classic FIM autocomplete. The PR is supposed to leave the classic path untouched — if Codestral or Mercury Edit 2 (FIM) feel different in this build, that's a regression we want to know about.
  • Things you tried that aren't in this doc. The 20 tests are a starting point, not a contract. Real codebases will be different.

Mercury Next Edit integration for Kilo Code — prepared by the Inception Labs team.
Questions / bug reports: [email protected].