How Autocomplete Works in Continue

Timing Optimization for Autocomplete

To display suggestions quickly without sending too many requests, we do the following:

  • Debouncing: If you are typing quickly, we won't make a request on each keystroke. Instead, we wait until you pause.
  • Caching: If your cursor is in a position that we've already generated a completion for, this completion is reused. For example, if you backspace, we'll be able to immediately show the suggestion you saw before.
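
The two techniques above can be sketched together as follows. This is a minimal illustration, not Continue's actual implementation: the `DebouncedCompleter` class, the `Completer` type, the 150 ms default delay, and the choice to key the cache on the text before the cursor are all assumptions made for the example.

```typescript
// Hypothetical sketch: every name here is illustrative, not part of Continue's API.
type Completer = (prefix: string) => Promise<string>;

class DebouncedCompleter {
  private cache = new Map<string, string>();
  private latestRequestId = 0;
  private complete: Completer;
  private delayMs: number;

  constructor(complete: Completer, delayMs: number = 150) {
    this.complete = complete; // function that actually calls the model
    this.delayMs = delayMs; // assumed debounce delay
  }

  // Resolves to a suggestion, or undefined if a newer request superseded this one.
  async request(prefix: string): Promise<string | undefined> {
    const requestId = ++this.latestRequestId;

    // Caching: a position we have already completed is answered immediately.
    const cached = this.cache.get(prefix);
    if (cached !== undefined) return cached;

    // Debouncing: wait briefly, then give up if another keystroke arrived meanwhile.
    await new Promise<void>((resolve) => setTimeout(resolve, this.delayMs));
    if (requestId !== this.latestRequestId) return undefined;

    const suggestion = await this.complete(prefix);
    this.cache.set(prefix, suggestion);
    return suggestion;
  }
}
```

In this sketch, three rapid calls to `request` trigger a single model call for the last prefix, and returning to that prefix later (for example after a backspace) is served from the cache without waiting.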

Context Retrieval from Your Codebase

Continue uses a number of retrieval methods to find relevant snippets from your codebase to include in the prompt.

Filtering and Post-Processing AI Suggestions

Language models aren't perfect, but adjusting their output can bring them much closer. We do extensive post-processing on responses before displaying a suggestion, including:

  • Removing special tokens
  • Stopping early when the model starts regenerating existing code, to avoid long, irrelevant output
  • Fixing indentation for proper formatting
  • Occasionally discarding low-quality responses, such as those with excessive repetition
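
A few of these steps can be sketched as a single function. This is a rough illustration under stated assumptions, not the filter Continue ships: the `postProcess` name, the `SPECIAL_TOKENS` list, and the repetition threshold are all hypothetical, and indentation fixing is left out because it depends on editor state.

```typescript
// Hypothetical sketch: token list and thresholds are assumptions for illustration.
const SPECIAL_TOKENS = ["<|endoftext|>", "<fim_prefix>", "<fim_suffix>", "<fim_middle>"];

// Returns the cleaned suggestion, or undefined if it should be discarded.
function postProcess(raw: string, linesAfterCursor: string[]): string | undefined {
  // 1. Remove special tokens that leaked into the output.
  let text = raw;
  for (const token of SPECIAL_TOKENS) {
    text = text.split(token).join("");
  }

  // 2. Stop early once the model starts regenerating the code that
  //    already follows the cursor (matched on the next non-blank line).
  const nextLine = linesAfterCursor.map((l) => l.trim()).find((l) => l.length > 0);
  const kept: string[] = [];
  for (const line of text.split("\n")) {
    if (nextLine !== undefined && line.trim() === nextLine) break;
    kept.push(line);
  }

  // 3. Discard empty or highly repetitive responses.
  const nonEmpty = kept.map((l) => l.trim()).filter((l) => l.length > 0);
  if (nonEmpty.length === 0) return undefined;
  const distinct = new Set(nonEmpty).size;
  if (nonEmpty.length >= 4 && distinct / nonEmpty.length < 0.5) return undefined;

  return kept.join("\n").trimEnd();
}
```

For example, if the model returns `return a + b;` followed by a closing brace that already exists after the cursor, this sketch keeps only the first line.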

You can learn more about how it works in the Autocomplete deep dive.

<Info> **Looking for AI that predicts your next changes or additions?** Check out [Next Edit](/ide-extensions/autocomplete/next-edit), an experimental feature that proactively suggests code changes before you even start typing, going beyond traditional autocomplete to anticipate entire code modifications. </Info>