doc/development/webhooks.md
This page is a developer guide for GitLab webhooks.
Webhooks POST JSON data about an event or change that happened in GitLab to a webhook receiver. Using webhooks, customers are notified when certain changes happen instead of needing to poll the API.
The following is a high-level description of what happens when a webhook is triggered and executed.
sequenceDiagram
Web or API node->>+Database: Fetch data for payload
Database-->>-Web or API node: Build payload
Note over Web or API node,Database: Webhook triggered
Web or API node->>Sidekiq: Queue webhook execution
Sidekiq->>+Remote webhook receiver: POST webhook payload
Remote webhook receiver-)-Database: Save response in WebHookLog
Note over Database,Remote webhook receiver: Webhook executed
Webhooks are resource-oriented. For example, "emoji" webhooks are triggered whenever an emoji is awarded or revoked.
To add webhook support for a resource:
Add a new column to the web_hooks table. The new column must be:
- A boolean.
- Not null.
- Named in the format <resource>_events.
- false by default.
Example of the #change method in a migration:
def change
  add_column :web_hooks, :emoji_events, :boolean, null: false, default: false
end
Add support for the new webhook to TriggerableHooks.available_triggers.
Add to the list of triggerable_hooks in ProjectHook, GroupHook, or SystemHook, depending
on whether the webhook should be configurable for projects, groups, or the GitLab instance.
See project, group and system hooks for guidance.
Add frontend support for a new checkbox in the webhook settings form in app/views/shared/web_hooks/_form.html.haml.
Add support for testing the new webhook in TestHooks::ProjectService and TestHooks::SystemService.
TestHooks::GroupService does not need to be updated because it only
executes ProjectService.
Define the webhook payload.
Update GitLab to trigger the webhook.
Add REST API support:
- Update API::ProjectHooks, API::GroupHooks, and API::SystemHooks to support the argument.
- Update API::Entities::ProjectHook and API::Entities::GroupHook to support the new field (system hooks use the generic API::Entities::Hook).
Use the following to help you decide whether your webhook should be configurable for projects, groups, or a GitLab instance.
Group webhooks are a Premium-licensed feature. All code related to triggering group webhooks,
or building payloads for webhooks that are configurable only for groups, must be in the ee/ directory.
Project and group webhooks are triggered by calling #execute_hooks on a project or group.
The #execute_hooks method is passed the webhook payload and the name of the webhook type.
For example:
project.execute_hooks(payload, :emoji_hooks)
When #execute_hooks is called on a single project or group, the trigger automatically bubbles up to ancestor groups, which also execute. This allows groups to be configured to receive webhooks for events that happen in any of their subgroups or projects.
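The bubbling behavior described above can be pictured with a small model in plain Ruby. This is only a sketch and not GitLab's implementation: the HookOwner class, its parent link, and the deliveries array are hypothetical names introduced for illustration.

```ruby
# Toy model of hook bubbling. HookOwner is a hypothetical stand-in for a
# project or group; it is not a GitLab class.
class HookOwner
  attr_reader :name, :parent, :hooks

  def initialize(name, parent: nil, hooks: [])
    @name = name
    @parent = parent
    @hooks = hooks # hook types this owner is configured to receive
  end

  # Records a delivery for this owner if it has a hook of the given type,
  # then passes the same payload up to the parent. Returns all deliveries.
  def execute_hooks(payload, hook_type, deliveries = [])
    deliveries << { receiver: name, payload: payload } if hooks.include?(hook_type)
    parent ? parent.execute_hooks(payload, hook_type, deliveries) : deliveries
  end
end

root     = HookOwner.new('root-group', hooks: [:emoji_hooks])
subgroup = HookOwner.new('subgroup', parent: root)
project  = HookOwner.new('project', parent: subgroup, hooks: [:emoji_hooks])

deliveries = project.execute_hooks({ object_kind: 'emoji' }, :emoji_hooks)
deliveries.map { |delivery| delivery[:receiver] } # => ["project", "root-group"]
```

The subgroup has no hook of this type, so it receives nothing, but the trigger still passes through it to the root group.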
When the method is called on:
Building a payload can be expensive because it generally requires that we load more records from the database,
so check #has_active_hooks? on the project before triggering the webhook
(support for a similar method for groups is tracked in issue #517890).
The method returns true if either:
Example:
def execute_emoji_hooks
  return unless project.has_active_hooks?(:emoji_hooks)

  payload = Gitlab::DataBuilder::Emoji.build(emoji)
  project.execute_hooks(payload, :emoji_hooks)
end
When webhooks for projects are triggered, system webhooks configured for the webhook type are executed automatically.
You can also trigger a system hook through SystemHooksService if the webhook is not also configurable for projects.
Example:
SystemHooksService.new.execute_hooks_for(user, :create)
You need to update SystemHooksService to have it build data for the resource.
Webhook payloads must accurately represent the state of data at the time of the event. Care should be taken to avoid problems that arise due to race conditions or concurrent processes changing the state of data, which would lead to inaccurate payloads being sent to webhook receivers.
Some tips to do this:
Both of these points mean the payload must generally be built in-request and not async using Sidekiq.
The exception would be if a payload always contained only immutable data, but this is generally not the case.
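The difference between building a payload in-request and building it later can be shown with a minimal sketch. The TrackedIssue struct and build_issue_payload method are hypothetical and exist only for this illustration; the lambda stands in for a queued job.

```ruby
# Hypothetical mutable record; stands in for any model.
TrackedIssue = Struct.new(:title, :state)

# Hypothetical payload builder that reads the record's current state.
def build_issue_payload(issue, action)
  {
    object_kind: 'issue',
    action: action,
    object_attributes: { title: issue.title, state: issue.state }
  }
end

issue = TrackedIssue.new('Fix login', 'opened')

# In-request: the payload is built at the moment of the event.
in_request_payload = build_issue_payload(issue, 'open')

# Deferred: only a reference is queued and the payload is built later.
deferred_job = -> { build_issue_payload(issue, 'open') }

# Another process closes the issue before the deferred job runs.
issue.state = 'closed'

in_request_payload[:object_attributes][:state] # => "opened" (state at event time)
deferred_job.call[:object_attributes][:state]  # => "closed" (state at job run time)
```

The deferred payload claims the issue was closed when the open event happened, which is the kind of inaccuracy the guidance above is meant to prevent.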
A webhook payload is the JSON data POSTed to a webhook receiver.
See existing webhook payloads documented in the webhook events documentation.
Sensitive data should never be included in webhook payloads. This includes secrets and non-public user emails (private user emails are redacted automatically through User#hook_attrs).
Building webhook payloads must be very performant, so every new property added to a webhook payload must be justified against the overhead of retrieving it from the database.
Consider, on balance, whether it would be better for a minority of customers to fetch some data about an object from the API after receiving a smaller webhook than for all customers to receive the data in the webhook payload. In this scenario, there is a delay between when the webhook is built and when the customer retrieves the extra data. The delay can mean the API data and the webhook data represent different states in time.
Objects should define a #hook_attrs method to return the object attributes for the webhook payload.
The attributes in #hook_attrs must be defined with static keys. The method must return
a specific set of attributes and not just the attributes returned by #attributes or #as_json.
Otherwise, all future attributes of the model will be included in webhook payloads
(see issue 440384).
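A minimal sketch of why static keys matter, in plain Ruby. The Vehicle class and its attributes hash are hypothetical stand-ins for an ActiveRecord model, and owner_api_token is an invented column name.

```ruby
# Hypothetical model backed by a plain hash, standing in for an
# ActiveRecord model and its #attributes.
class Vehicle
  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
  end

  # Unsafe: every attribute, including ones added later, ends up in the payload.
  def leaky_hook_attrs
    attributes
  end

  # Safe: only an explicit, static set of keys is exposed.
  def hook_attrs
    {
      make: attributes[:make],
      color: attributes[:color]
    }
  end
end

# A later schema change adds a sensitive column (hypothetical name).
car = Vehicle.new({ make: 'Toyota', color: 'grey', owner_api_token: 'secret' })

car.leaky_hook_attrs.key?(:owner_api_token) # => true, the new column leaks
car.hook_attrs.key?(:owner_api_token)       # => false, the payload is unchanged
```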
A module or class in Gitlab::DataBuilder:: should compose the full payload. The full payload usually
includes associated objects.
See payload schema for the structure of the full payload.
For example:
# An object defines #hook_attrs:
class Car < ApplicationRecord
  def hook_attrs
    {
      make: make,
      color: color
    }
  end
end

# A Gitlab::DataBuilder module or class composes the full webhook payload:
module Gitlab
  module DataBuilder
    module Car
      extend self

      def build(car, action)
        {
          object_kind: 'car',
          action: action,
          object_attributes: car.hook_attrs,
          driver: car.driver.hook_attrs # Calling #hook_attrs on associated data
        }
      end
    end
  end
end

# Building the payload:
Gitlab::DataBuilder::Car.build(car, 'start')
Historically there has been a lot of inconsistency between the payload schemas of different types of webhooks.
Going forward, unless the payload for a new type of webhook should resemble an existing one for consistency reasons (for example, a webhook for a new issuable), the schema for new webhooks must follow these rules:
- The top-level payload must contain these properties:
  - "object_kind", the kind of object in snake case. Example: "merge_request".
  - "action", a domain-specific verb of what just happened, using present tense. Examples: "create", "assign", "update" or "revoke". This helps receivers to identify and handle different kinds of changes that happen to an object when webhooks are triggered at different points in the object's lifecycle.
  - "object_attributes", contains the attributes of the object after the event. These attributes are generated from #hook_attrs.
- Associated data must be top-level in the payload and not nested in "object_attributes".
- If the payload includes a record of changed attribute values, these must be in a top-level "changes" object.
A JSON schema description of the above:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "description": "Recommended GitLab webhook payload schema",
  "type": "object",
  "properties": {
    "object_kind": {
      "type": "string",
      "description": "Kind of object in snake case. Example: merge_request",
      "pattern": "^([a-zA-Z]+(_[a-zA-Z]+)*)$"
    },
    "action": {
      "type": "string",
      "description": "A domain-specific verb of what just happened to the object, using present tense. Examples: create, revoke"
    },
    "object_attributes": {
      "type": "object",
      "description": "Attributes of the object after the event"
    },
    "changes": {
      "type": "object",
      "description": "Optional object attributes that were changed during the event",
      "patternProperties": {
        ".+": {
          "type": "object",
          "properties": {
            "previous": {
              "description": "Value of attribute before the event"
            },
            "current": {
              "description": "Value of attribute after the event"
            }
          },
          "required": ["previous", "current"]
        }
      }
    }
  },
  "required": ["object_kind", "action", "object_attributes"]
}
Example of a webhook payload for an imaginary Car object
that follows the above payload schema:
{
  "object_kind": "car",
  "action": "start",
  "object_attributes": {
    "make": "Toyota",
    "color": "grey"
  },
  "driver": {
    "name": "Kaya",
    "age": 18
  }
}
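As a rough illustration, the schema rules can be checked in plain Ruby without a JSON Schema library. The payload_schema_errors method and the OBJECT_KIND_PATTERN constant are hypothetical names and not part of GitLab. The sketch covers only the rules listed in the schema above.

```ruby
require 'json'

# Mirrors the "pattern" of "object_kind" in the schema above.
OBJECT_KIND_PATTERN = /\A[a-zA-Z]+(_[a-zA-Z]+)*\z/

# Returns a list of problems; an empty list means the payload follows the
# recommended schema. Hypothetical helper, for illustration only.
def payload_schema_errors(payload)
  errors = []

  kind = payload['object_kind']
  unless kind.is_a?(String) && kind.match?(OBJECT_KIND_PATTERN)
    errors << 'object_kind must be a snake case string'
  end
  errors << 'action must be a string' unless payload['action'].is_a?(String)
  errors << 'object_attributes must be an object' unless payload['object_attributes'].is_a?(Hash)

  # "changes" is optional, but each entry needs both previous and current.
  changes = payload.fetch('changes', {})
  if changes.is_a?(Hash)
    changes.each do |attribute, change|
      unless change.is_a?(Hash) && change.key?('previous') && change.key?('current')
        errors << "changes.#{attribute} must have previous and current"
      end
    end
  else
    errors << 'changes must be an object'
  end

  errors
end

EXAMPLE_PAYLOAD = JSON.parse(<<~PAYLOAD)
  {
    "object_kind": "car",
    "action": "start",
    "object_attributes": { "make": "Toyota", "color": "grey" },
    "driver": { "name": "Kaya", "age": 18 }
  }
PAYLOAD

payload_schema_errors(EXAMPLE_PAYLOAD) # => []
```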
If your payload should include a list of attribute changes of an object, add the ReportableChanges module to the model.
The module collects all changes to attribute values from the time the object is loaded
through to all subsequent saves. This can be useful where there
are multiple save operations on an object in a given request context and
final hooks need access to the cumulative delta, not just that of the
most recent save.
See payload schema for how to include attribute changes in the payload.
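The idea of a cumulative delta across several saves can be sketched in plain Ruby. The TrackedRecord class below is hypothetical and the real ReportableChanges module may work differently. In particular, omitting attributes that end up back at their loaded value is a choice made for this sketch.

```ruby
# Hypothetical tracker that remembers attribute values from load time.
class TrackedRecord
  def initialize(attributes)
    @attributes = attributes.dup
    @loaded = attributes.dup # values at the time the object was loaded
  end

  # Each call stands in for one save of the record.
  def update(new_values)
    @attributes.merge!(new_values)
    self
  end

  # Cumulative delta since load, in the previous/current shape used by the
  # "changes" object of the payload schema.
  def reportable_changes
    @attributes.each_with_object({}) do |(key, current), changes|
      previous = @loaded[key]
      changes[key] = { previous: previous, current: current } unless previous == current
    end
  end
end

record = TrackedRecord.new({ title: 'Draft', state: 'opened' })
record.update({ title: 'WIP' })    # first save
record.update({ title: 'Final' })  # second save

record.reportable_changes
# => { title: { previous: "Draft", current: "Final" } }
```

A hook that looked only at the most recent save would report the change from WIP to Final and miss the original Draft value.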
Some types of webhooks are triggered millions of times a day on GitLab.com.
Loading additional data for the webhook payload must be performant because we need to build payloads in-request and not on Sidekiq. On GitLab.com, this also means additional data for the payload is loaded from the PostgreSQL primary because webhooks are triggered following a database write.
To minimize data requests when building a webhook payload:
You might need to preload data on a record that has already been loaded.
In this case, you can use ActiveRecord::Associations::Preloader.
If the associated data is only needed to build the webhook payload, only preload this associated
data after the #has_active_hooks? check has passed.
A good working example of this in our codebase is
Gitlab::DataBuilder::Pipeline.
For example:
# Working with an issue that has been loaded
issue = Issue.first
# Imagine we have performed the #has_active_hooks? check and now are building the webhook payload.
# Doing this will perform N+1 database queries:
# issue.notes.map(&:author).map(&:name)
#
# Instead, first preload the associations to avoid the N+1
ActiveRecord::Associations::Preloader.new(records: [issue], associations: { notes: :author }).call
issue.notes.map(&:author).map(&:name)
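The effect of preloading on the number of queries can be illustrated without a database. The FakeDatabase class below is hypothetical. It only counts lookups, so that the N+1 pattern and the preloaded pattern can be compared.

```ruby
# Hypothetical in-memory store that counts how many lookups are made.
class FakeDatabase
  attr_reader :query_count

  def initialize(authors_by_note_id)
    @authors_by_note_id = authors_by_note_id
    @query_count = 0
  end

  # One lookup per note: the N+1 pattern.
  def author_for(note_id)
    @query_count += 1
    @authors_by_note_id.fetch(note_id)
  end

  # One lookup for all notes: the preloaded pattern.
  def authors_for(note_ids)
    @query_count += 1
    @authors_by_note_id.slice(*note_ids)
  end
end

def author_names_n_plus_one(db, note_ids)
  note_ids.map { |id| db.author_for(id) }
end

def author_names_preloaded(db, note_ids)
  authors = db.authors_for(note_ids)
  note_ids.map { |id| authors.fetch(id) }
end

slow_db = FakeDatabase.new({ 1 => 'Ana', 2 => 'Ben', 3 => 'Cy' })
author_names_n_plus_one(slow_db, [1, 2, 3])
slow_db.query_count # => 3

fast_db = FakeDatabase.new({ 1 => 'Ana', 2 => 'Ben', 3 => 'Cy' })
author_names_preloaded(fast_db, [1, 2, 3])
fast_db.query_count # => 1
```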
We cannot make breaking changes to webhook payloads.
If a webhook receiver might encounter errors due to a change to a webhook payload, the change is a breaking one.
Only additive changes can be made, where new properties are added.
Breaking changes include:
- A change to the value of the "object_kind" property.
- A change to the value of the "action" property.
If the value of a property other than "object_kind" or "action" must change, for example due to feature removal, set the value to null, {}, or [] rather than remove the property.
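One way to reason about these rules is to compare the payload for the same event before and after a code change. The following is a sketch under one reading of the rules above: removing a property is breaking, and so is a changed top-level "object_kind" or "action" value. The breaking_payload_changes method is a hypothetical helper and not part of GitLab.

```ruby
# Compares two versions of a payload (as hashes) and lists the changes that
# this sketch treats as breaking. Added properties are never reported.
def breaking_payload_changes(old_payload, new_payload, path = [])
  problems = []

  old_payload.each do |key, old_value|
    key_path = path + [key]

    unless new_payload.key?(key)
      problems << "removed property #{key_path.join('.')}"
      next
    end

    new_value = new_payload[key]

    if path.empty? && %w[object_kind action].include?(key.to_s) && old_value != new_value
      problems << "changed value of #{key}"
    elsif old_value.is_a?(Hash) && new_value.is_a?(Hash)
      # Recurse so that removed nested properties are also reported.
      problems.concat(breaking_payload_changes(old_value, new_value, key_path))
    end
  end

  problems
end

old_payload = {
  'object_kind' => 'car',
  'action' => 'start',
  'object_attributes' => { 'make' => 'Toyota', 'color' => 'grey' }
}

# Additive change, with a retired value set to nil instead of removed.
additive = old_payload.merge(
  'object_attributes' => { 'make' => 'Toyota', 'color' => nil },
  'driver' => { 'name' => 'Kaya' }
)
breaking_payload_changes(old_payload, additive) # => []

# Removing a nested property is reported.
removal = old_payload.merge('object_attributes' => { 'make' => 'Toyota' })
breaking_payload_changes(old_payload, removal)
# => ["removed property object_attributes.color"]
```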
When writing a unit test for the DataBuilder class, assert that the number of database queries needed to build the payload is as expected.
You can do this by measuring the number of queries using QueryRecorder and then comparing against that number in the spec, to ensure that the query count does not change without a conscious choice. Also see preloading of associated data.
Also test the scenarios where the webhook should be triggered (or not triggered), to assert that it triggers correctly.
You can configure the webhook URL to one provided by https://webhook.site to view the full webhook headers and payloads generated when the webhook is triggered.
In addition to the usual reviewers for code review, changes to webhooks should be reviewed by a backend team member from Import & Integrate.