docs/subsystems/notifications.md
This is a design document aiming to provide context for developers working on Zulip's email notifications and mobile push notifications code paths. We recommend first becoming familiar with sending messages; this document expands on the details of the email/mobile push notifications code path.
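A few corner cases are worth understanding when designing this sort of notifications system, notably the idle desktop problem, the hard disconnect problem, and soft disconnects, all of which come up in the flow described below.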
As a reminder, the relevant part of the flow for sending messages is as follows:
- `do_send_messages` is the synchronous message-sending code path. It
  passes the following data in its `send_event_on_commit` call:
  - The `UserMessage` table's `flags` structure, which is in turn passed
    into `send_event_on_commit` for each user receiving the message.
  - `online_push_user_ids` and `stream_notify_user_ids`, which are
    included in the main event dictionary.
  - The `presence_idle_user_ids` set, containing the subset of
    recipient users who can potentially receive notifications, but have
    not interacted with a Zulip client in the last few minutes. (Users
    who have interacted recently generally will not receive a
    notification unless the `enable_online_push_notifications` flag is
    enabled.) This data structure ignores users for whom the message is
    not notifiable, which is important to keep it from containing
    thousands of user IDs for messages to large channels with few
    currently active users.
- For notifiable messages, an event is pushed onto the
  `missedmessage_mobile_notifications` and/or `missedmessage_emails`
  queues. This message-processing logic has notable extra logic not
  present when processing normal events, including details like
  splicing `flags` to customize event payloads per-user.
  - Users in `presence_idle_user_ids` are always considered idle:
    the variable name means "users who are idle because of
    presence". This is how we solve the idle desktop problem; users
    with an idle desktop are treated the same as users who aren't
    logged in for this check.
  - A user who has just left will not yet be in
    `presence_idle_user_ids` (because it takes a few minutes of being
    idle for Zulip clients to declare to the server that the user is
    actually idle), and so without an additional mechanism, messages
    sent shortly after a user leaves would never trigger a
    notification (!).
  - That additional mechanism is to also notify when
    `receiver_is_off_zulip` returns `True`; this function checks whether
    the user has any current events system clients registered to
    receive message events. This check is done immediately (handling
    soft disconnects, for example, where the user closes their last
    Zulip tab and we get the `DELETE /events/{queue_id}` request).
  - The `receiver_is_off_zulip` check is effectively repeated when
event queues are garbage-collected (in missedmessage_hook) by
looking for whether the queue being garbage-collected was the only
one; this second check solves the hard disconnect problem, resulting in
notifications for these hard-disconnect cases usually coming 10
    minutes late.
  - The notification decision logic is contained in the
    `zerver/lib/notification_data.py` class methods. The module has
    unit tests for all possible situations in
    `test_notification_data.py`.
  - `maybe_enqueue_notifications_for_message_update` is responsible for
    triggering notifications in cases like a mention added during
    message editing. `test_message_edit_notifications.py` covers all
    the cases around editing a message to add/remove a mention.
- Client-side logic in the web app (`web/src/notifications.js`) handles
  notifications by inspecting the `flags` fields that
were spliced into message events by the Tornado system, as well as
  the user's notification settings.
- The queue processors for those queues do the work to generate an
  email (`zerver/lib/email_notifications.py`) or mobile
  (`zerver/lib/push_notifications.py`) notification. We describe
  this process in more detail for each system below, but it's
  important to know that it's normal for a message to sit in these
  queues for minutes (and in the future, possibly hours).
- The queue processor for email, `MissedMessageWorker`,
takes care to wait for 2 minutes (hopefully in the future this will be a
configuration setting) and starts a thread to batch together multiple
messages into a single email. These features are unnecessary
for mobile push notifications, because we can live-update those
details with a future notification, whereas emails cannot be readily
updated once sent. Zulip's email notifications are styled similarly
to GitHub's email notifications, with a clean, simple design that
makes replying from an email client possible (using the incoming
  email integration).
- The queue processor for mobile push notifications,
  `PushNotificationsWorker`, is a simple wrapper around the
  `push_notifications.py` code that actually sends the
  notification. This logic is somewhat complicated by having to track
  the number of unread push notifications to display on the mobile
  apps' badges, as well as using the mobile push notifications
  service for self-hosted systems.

The following important constraints are worth understanding about the structure of the system when thinking about changes to it:
- We want to avoid `send_event_on_commit()` pushing large amounts of
  per-user data to Tornado via RabbitMQ, for scalability reasons.
- Some checks need to be done in either `do_send_messages` or the
  queue processor logic. (For example, this means presence data
  should be checked in either `do_send_messages` or the queue
  processors, not in Tornado.)
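
To make the "is this user idle or offline?" decision from the flow above more concrete, here is a minimal sketch of how the checks combine. It is not Zulip's implementation: `ClientQueue`, `QueueRegistry`, and the `sketch_*` functions are hypothetical stand-ins invented for illustration, and the real `receiver_is_off_zulip` and `missedmessage_hook` operate on Tornado's actual event queue data structures. The sketch assumes that `presence_idle_user_ids` has already been computed before the event reaches this code, in line with the constraint that presence data is not checked in Tornado.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ClientQueue:
    """Hypothetical stand-in for one client's registered event queue."""

    queue_id: str
    user_id: int
    receives_message_events: bool = True


@dataclass
class QueueRegistry:
    """Hypothetical in-memory registry of event queues, keyed by user ID."""

    by_user: Dict[int, List[ClientQueue]] = field(default_factory=dict)

    def register(self, queue: ClientQueue) -> None:
        self.by_user.setdefault(queue.user_id, []).append(queue)

    def unregister(self, queue: ClientQueue) -> None:
        self.by_user[queue.user_id].remove(queue)


def sketch_receiver_is_off_zulip(registry: QueueRegistry, user_id: int) -> bool:
    """True if the user has no queue registered to receive message events."""
    queues = registry.by_user.get(user_id, [])
    return not any(q.receives_message_events for q in queues)


def sketch_should_enqueue(
    registry: QueueRegistry,
    user_id: int,
    notifiable: bool,
    presence_idle_user_ids: Set[int],
) -> bool:
    """Decide at send time whether to enqueue a missed-message notification."""
    if not notifiable:
        return False
    # Idle desktop problem: presence already says the user is idle.
    if user_id in presence_idle_user_ids:
        return True
    # Soft disconnect: the user's last client has already gone away.
    return sketch_receiver_is_off_zulip(registry, user_id)


def sketch_on_queue_gc(
    registry: QueueRegistry,
    queue: ClientQueue,
    undelivered_notifiable_ids: List[int],
) -> List[int]:
    """Repeat the check when a queue is garbage-collected (hard disconnect).

    Returns the message IDs that should now trigger a late notification.
    """
    registry.unregister(queue)
    if sketch_receiver_is_off_zulip(registry, queue.user_id):
        return list(undelivered_notifiable_ids)
    return []
```

In this sketch, a user with an open client who is not in `presence_idle_user_ids` gets no notification at send time; if their only queue is later garbage-collected, `sketch_on_queue_gc` returns the messages that queue never delivered, which corresponds to the late notifications described for the hard disconnect case.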