doc/development/ai_features/_index.md
GitLab Duo features are powered by AI models and integrations. This document provides an overview of developing with AI features in GitLab.
For detailed instructions on setting up GitLab Duo licensing in your development environment, see GitLab Duo licensing for local development.
The following are the main steps to go from a fresh computer without GDK to a fully working AI development environment.
Follow the instructions in the GitLab Development Kit to set up GitLab Duo for local development purposes. These instructions describe how to fulfill prerequisites in your local environment and set up core backend components.
If you already have a GDK installed, you still must refer to the GitLab Development Kit instructions to set up DAP with the right environment variables, NGINX, your Anthropic key and more.
gitlab:duo:setup task
Run the gitlab:duo:setup Rake task to seed a test group and a project with GitLab Duo features enabled.
[!note] This task is idempotent and skips reseeding if the gitlab-duo group already exists. To force reseeding from this task, set GITLAB_DUO_RESEED=1. For details on the seeds used, see Development seed files.
This ensures that your instance or group has the correct licenses, settings, and feature flags to test GitLab Duo features locally. Below are several options. If you are unsure, use option 1.
[!note] Duo Core add-on is always created when running this script.
Be sure to run the Rake task from the GitLab Rails root directory (typically /path/to/gdk/gitlab), not from the GDK root directory.
GitLab.com mode
GITLAB_SIMULATE_SAAS=1 bundle exec 'rake gitlab:duo:setup'
This creates a test group called gitlab-duo, which contains a project called test.
Alternatively, if you want to add GitLab Duo Pro licenses for the group instead (which only enables a subset of features), you can run:
GITLAB_SIMULATE_SAAS=1 bundle exec 'rake gitlab:duo:setup[duo_pro]'
To test only Duo Core features, you can run:
GITLAB_SIMULATE_SAAS=1 bundle exec 'rake gitlab:duo:setup[duo_core]'
GitLab Self-Managed / Dedicated mode
GITLAB_SIMULATE_SAAS=0 bundle exec 'rake gitlab:duo:setup'
This creates a test group called gitlab-duo, which contains a project called test.
Alternatively, if you want to add the GitLab Duo Pro add-on for the instance instead (which only enables a subset of features), you can run:
GITLAB_SIMULATE_SAAS=0 bundle exec 'rake gitlab:duo:setup[duo_pro]'
To test only Duo Core features, you can run:
GITLAB_SIMULATE_SAAS=0 bundle exec 'rake gitlab:duo:setup[duo_core]'
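The invocations above all follow one pattern: a GITLAB_SIMULATE_SAAS value plus an optional add-on argument. As a minimal sketch of that pattern, the following Python composes the command line. The helper name duo_setup_command is hypothetical and is not part of GDK or GitLab; it only assembles the strings shown on this page.

```python
from typing import Optional

# Add-on arguments that appear in the commands above; None means "no argument".
KNOWN_ADD_ONS = {None, "duo_pro", "duo_core"}


def duo_setup_command(saas: bool, add_on: Optional[str] = None, reseed: bool = False) -> str:
    """Build the shell command line for the gitlab:duo:setup Rake task.

    Hypothetical helper for illustration only. Run the resulting command
    from the GitLab Rails root directory, not from the GDK root.
    """
    if add_on not in KNOWN_ADD_ONS:
        raise ValueError(f"unknown add-on argument: {add_on!r}")

    env = [f"GITLAB_SIMULATE_SAAS={1 if saas else 0}"]
    if reseed:
        # Forces reseeding even if the gitlab-duo group already exists.
        env.append("GITLAB_DUO_RESEED=1")

    task = "gitlab:duo:setup" + (f"[{add_on}]" if add_on else "")
    return f"{' '.join(env)} bundle exec 'rake {task}'"


if __name__ == "__main__":
    print(duo_setup_command(saas=True))
    print(duo_setup_command(saas=False, add_on="duo_pro"))
```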
After the script finishes without errors, go to gitlab-duo/test and verify that you can see GitLab Duo Chat. Send a question to Chat and make sure there are no errors.
In most cases, you can run the ai-services script to reset your GDK environment variables, which may be enough to fix any errors that occurred.
Error A9999 is a catch-all error. The most common cause is an incorrectly configured AI Gateway URL; set it up as described in the AI Gateway installation instructions.
If the URL is correct, check that the tests pass in the gitlab-ai-gateway repository with make test and that gdk tail gitlab-ai-gateway shows no errors.
Error A1003 usually relates to permissions: either an invalid or missing Anthropic token, or a misconfiguration of gcloud.
In Agentic Chat, authentication errors can occur without producing an A1003 error. Use gdk tail duo-workflow-service to make sure the workflow service runs without issues. If you see an authentication error, get a new Anthropic key and re-run the ai-setup script.
- Restart the background jobs with gdk restart rails-background-jobs. If that doesn't work, try gdk kill and then gdk start.
- To run completions synchronously instead of through Sidekiq, temporarily alter the perform_for method in the Llm::CompletionWorker class by changing perform_async to perform_inline.
- Add export FETCH_MODEL_SELECTION_DATA_FROM_LOCAL=1 to your env.runit file, so that your GDK fetches model information from your local AI Gateway rather than the cloud-connected AI Gateway.
Apply the following feature flags to any AI feature work:
- A general flag (ai_global_switch) that applies to all other AI features. It's enabled by default.
See the feature flag tracker epic for the list of all feature flags and how to use them.
You can push feature flags to AI Gateway. This helps you gradually roll out user-facing changes even if the feature resides in AI Gateway. See the following example:
# Push a feature flag state to AI Gateway.
Gitlab::AiGateway.push_feature_flag(:new_prompt_template, user)
Later, you can use the feature flag state in AI Gateway in the following way:
from ai_gateway.feature_flags import is_feature_enabled

# Check if the feature flag "new_prompt_template" is enabled.
if is_feature_enabled('new_prompt_template'):
    # Build a prompt from the new prompt template.
    pass
else:
    # Build a prompt from the old prompt template.
    pass
IMPORTANT: When cleaning up, remove the feature flag in the AI Gateway repository before removing it in the GitLab-Rails repository. If you remove the flag in the GitLab-Rails repository first, the feature flag in AI Gateway is disabled immediately because disabled is its default state, which can cause surprising behavior.
IMPORTANT: Cleaning up the feature flag in AI Gateway will immediately distribute the change to all GitLab instances, including GitLab.com, GitLab Self-Managed, and GitLab Dedicated.
Technical details:
When push_feature_flag runs on an enabled feature flag, the name of the flag is cached in the current context,
which is later attached to the x-gitlab-enabled-feature-flags HTTP header when GitLab-Sidekiq/Rails sends requests to AI Gateway.
When frontend clients (for example, VS Code Extension or LSP) request a User JWT (UJWT) for direct AI Gateway communication, GitLab returns the UJWT together with headers (including x-gitlab-enabled-feature-flags).
Frontend clients must regenerate the UJWT upon expiration. Backend changes, such as feature flag updates through ChatOps, make the header values stale. These header values are refreshed at the next UJWT generation.
Similarly, we also have push_frontend_feature_flag to push feature flags to frontend.
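To make the header mechanism described above concrete, here is a minimal Python sketch of how a service receiving the request could turn x-gitlab-enabled-feature-flags into a lookup. It is not the AI Gateway implementation: the function names are hypothetical, and the comma-separated value format is an assumption.

```python
from typing import Mapping, Set

HEADER_NAME = "x-gitlab-enabled-feature-flags"


def parse_enabled_flags(headers: Mapping[str, str]) -> Set[str]:
    """Return the set of flag names found in the request headers.

    Assumption: the header value is a comma-separated list of flag names.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(HEADER_NAME, "")
    return {flag.strip() for flag in raw.split(",") if flag.strip()}


def flag_enabled(name: str, enabled: Set[str]) -> bool:
    """A flag absent from the header is treated as disabled (the default state)."""
    return name in enabled


if __name__ == "__main__":
    flags = parse_enabled_flags(
        {"X-Gitlab-Enabled-Feature-Flags": "new_prompt_template, other_flag"}
    )
    print(flag_enabled("new_prompt_template", flags))
    print(flag_enabled("removed_flag", flags))
```

Because a flag that is missing from the header reads as disabled, removing the flag on the Rails side first switches the behavior off immediately, which is why the cleanup order matters.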
To connect to the AI provider API using the Abstraction Layer, use an extendable
GraphQL API called aiAction.
The input accepts key/value pairs, where the key is the action that needs to
be performed. We only allow one AI action per mutation request.
Example of a mutation:
mutation {
  aiAction(input: {summarizeComments: {resourceId: "gid://gitlab/Issue/52"}}) {
    clientMutationId
  }
}
As an example, assume we want to build an "explain code" action. To do this, we extend the input with a new key,
explainCode. The mutation would look like this:
mutation {
  aiAction(
    input: {
      explainCode: { resourceId: "gid://gitlab/MergeRequest/52", code: "foo() { console.log() }" }
    }
  ) {
    clientMutationId
  }
}
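As a minimal sketch of the one-action-per-request rule, the following Python builds the JSON body for an aiAction request and rejects input that carries zero or several action keys. The helper name build_ai_action_body is hypothetical, and the input type name AiActionInput is an assumption based on common GraphQL naming, not something stated on this page.

```python
import json
from typing import Any, Dict

# Assumption: the mutation's input type is named AiActionInput.
MUTATION = """
mutation($input: AiActionInput!) {
  aiAction(input: $input) {
    clientMutationId
  }
}
"""


def build_ai_action_body(actions: Dict[str, Dict[str, Any]]) -> str:
    """Return a JSON request body carrying exactly one AI action.

    `actions` maps an action key (for example "explainCode") to its arguments.
    """
    if len(actions) != 1:
        raise ValueError(f"expected exactly one AI action, got {len(actions)}")
    return json.dumps({"query": MUTATION, "variables": {"input": actions}})


if __name__ == "__main__":
    body = build_ai_action_body(
        {
            "explainCode": {
                "resourceId": "gid://gitlab/MergeRequest/52",
                "code": "foo() { console.log() }",
            }
        }
    )
    print(body)
```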
The GraphQL API then uses the Anthropic Client to send the response.
The API requests to AI providers are handled in a background job. We therefore do not keep the request alive and the Frontend needs to match the request to the response from the subscription.
[!warning] Determining the right response to a request can cause problems when only userId and resourceId are used. For example, when two AI features use the same userId and resourceId, both subscriptions receive each other's responses. To prevent this interference, we introduced the clientSubscriptionId.
To match a response on the aiCompletionResponse subscription, you can provide a clientSubscriptionId to the aiAction mutation.
- The clientSubscriptionId should be unique per feature and within a page, so that it does not interfere with other AI features. We recommend using a UUID.
- If the clientSubscriptionId is provided as part of the aiAction mutation, it is used for broadcasting the aiCompletionResponse.
- If the clientSubscriptionId is not provided, only userId and resourceId are used for the aiCompletionResponse.
As an example mutation for summarizing comments, we provide a randomId as part of the mutation:
mutation {
  aiAction(
    input: {
      summarizeComments: { resourceId: "gid://gitlab/Issue/52" }
      clientSubscriptionId: "randomId"
    }
  ) {
    clientMutationId
  }
}
In our component, we then listen on the aiCompletionResponse using the userId, resourceId and clientSubscriptionId ("randomId"):
subscription aiCompletionResponse(
  $userId: UserID
  $resourceId: AiModelID
  $clientSubscriptionId: String
) {
  aiCompletionResponse(
    userId: $userId
    resourceId: $resourceId
    clientSubscriptionId: $clientSubscriptionId
  ) {
    content
    errors
  }
}
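The interference that clientSubscriptionId prevents can be reproduced with a small in-memory model. This Python sketch is not GitLab's subscription code: the Broadcaster class and its methods are hypothetical, and they only illustrate why adding clientSubscriptionId to the routing key separates two features that share a userId and resourceId.

```python
import uuid
from typing import Callable, Dict, List, Optional, Tuple

Key = Tuple[str, str, Optional[str]]


class Broadcaster:
    """Toy stand-in for a subscription layer keyed by the three IDs."""

    def __init__(self) -> None:
        self._listeners: Dict[Key, List[Callable[[str], None]]] = {}

    def subscribe(self, user_id: str, resource_id: str,
                  callback: Callable[[str], None],
                  client_subscription_id: Optional[str] = None) -> None:
        key = (user_id, resource_id, client_subscription_id)
        self._listeners.setdefault(key, []).append(callback)

    def broadcast(self, user_id: str, resource_id: str, content: str,
                  client_subscription_id: Optional[str] = None) -> None:
        key = (user_id, resource_id, client_subscription_id)
        for callback in self._listeners.get(key, []):
            callback(content)


def demo() -> Tuple[List[str], List[str]]:
    user, issue = "gid://gitlab/User/123", "gid://gitlab/Issue/52"
    summary_inbox: List[str] = []
    explain_inbox: List[str] = []
    bus = Broadcaster()

    # Without a clientSubscriptionId both features share one key,
    # so the summary response also lands in the other feature's inbox.
    bus.subscribe(user, issue, summary_inbox.append)
    bus.subscribe(user, issue, explain_inbox.append)
    bus.broadcast(user, issue, "summary (shared key)")

    # With one UUID per feature, a response reaches only its own listener.
    summary_id, explain_id = str(uuid.uuid4()), str(uuid.uuid4())
    bus.subscribe(user, issue, summary_inbox.append, summary_id)
    bus.subscribe(user, issue, explain_inbox.append, explain_id)
    bus.broadcast(user, issue, "summary (own key)", summary_id)
    return summary_inbox, explain_inbox


if __name__ == "__main__":
    print(demo())
```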
The subscription for Chat behaves differently.
To avoid many concurrent subscriptions, subscribe only after the mutation is sent, by using skip().
When working with the aiAction mutation, several ID parameters are used for routing requests and responses correctly. Example values for these parameters:
- userId: "gid://gitlab/User/123"
- clientSubscriptionId: "9f5dedb3-c58d-46e3-8197-73d653c71e69"
- resourceId: "gid://gitlab/Issue/164723626"
- Project ID: "gid://gitlab/Project/278964"
The following graph uses VertexAI as an example. You can use different providers.
flowchart TD
A[GitLab frontend] -->B[AiAction GraphQL mutation]
B --> C[Llm::ExecuteMethodService]
C --> D[One of services, for example: Llm::GenerateSummaryService]
D -->|scheduled| E[AI worker:Llm::CompletionWorker]
E -->F[::Gitlab::Llm::Completions::Factory]
F -->G[#96;::Gitlab::Llm::VertexAi::Completions::...#96; class using #96;::Gitlab::Llm::Templates::...#96; class]
G -->|calling| H[Gitlab::Llm::VertexAi::Client]
H --> |response| I[::Gitlab::Llm::GraphqlSubscriptionResponseService]
I --> J[GraphqlTriggers.ai_completion_response]
J --> K[::GitlabSchema.subscriptions.trigger]
We strive to optimize AI components, such as prompts, input/output parsers, and tools/function-calling, for each LLM. However, diverging the components for each model could increase maintenance overhead. Hence, it's generally advised to reuse existing components for multiple models as long as doing so doesn't degrade feature quality. Here are the rules of thumb:
For example, we can apply a Claude-specific CoT optimization to other models such as Mixtral, as long as it doesn't degrade quality.
Refer to the secure coding guidelines for Artificial Intelligence (AI) features.