docs/contributing/code-style.md
This page documents code style policies that every Zulip developer should understand. We aim for this document to be short and focused only on details that cannot be easily enforced another way (e.g., through linters, automated tests, or subsystem design that makes classes of mistakes unlikely). This approach minimizes the cognitive load of ensuring a consistent coding style for both contributors and maintainers.
One can summarize Zulip's coding philosophy as a relentless focus on making the codebase easy to understand and difficult to make dangerous mistakes in (see the sections on dangerous constructs at the end of this page). The majority of work in any large software development project is understanding the existing code so one can debug or modify it, and investments in code readability usually end up paying for themselves when someone inevitably needs to debug or improve the code.
When there's something subtle or complex to explain or ensure in the implementation, we try hard to make it clear through a combination of clean and intuitive interfaces, well-named variables and functions, comments/docstrings, and commit messages (roughly in that order of priority -- if you can make something clear with a good interface, that's a lot better than writing a comment explaining how the bad interface works).
After an introduction to our lint tools and test suites, this document outlines some general conventions and practices applicable to all languages used in the codebase, as well as specific guidance on Python and JavaScript and TypeScript. (HTML and CSS are outlined in their own documentation.)
At the end of the document, you can read about dangerous constructs in Django and JavaScript and TypeScript that you should absolutely avoid.
Look at the surrounding code, or a similar part of the project, and try to do the same thing. If you think the other code has actively bad style, fix it (in a separate commit).
When in doubt, ask in #development help.
You can run all of the linters at once:
$ ./tools/lint
Note that that takes a little time. ./tools/lint runs many
lint checks in parallel, including:
To speed things up, you can pass specific files or directories to the linter:
$ ./tools/lint web/src/compose.ts
If you'd like, you can also set up a local Git commit hook that will lint only your changed files each time you commit:
$ ./tools/setup-git-repo
Clear, readable code is important for tests; familiarize yourself with our testing frameworks and testing philosophy so that you can write clean, readable tests. In-test comments about anything subtle that is being verified are appreciated.
You can run all of the tests like this:
$ ./tools/test-all
But consult our documentation on running tests, which covers more targeted approaches to commanding the test-runners.
What follows is language-neutral advice that is beyond the bounds of linters and automated tests.
We have an absolute hard limit on line length only for some files, but we should still avoid extremely long lines. A general guideline is: refactor stuff to get it under 85 characters, unless that makes the code a lot uglier, in which case it's fine to go up to 120 or so.
Remember to tag all user-facing strings for translation, whether the strings are in HTML templates or output by JavaScript/TypeScript that injects or modifies HTML (e.g., error messages).
When writing out state or log files, always pass an absolute path
through zulip_path (found in zproject/computed_settings.py), which
will do the right thing in both development and production.
Please don't put any passwords, secret access keys, etc. inline in the
code. Instead, use the get_secret function or the get_mandatory_secret
function in zproject/config.py to read secrets from /etc/zulip/secrets.conf.
See our docs on dependencies for discussion of rules about integrating third-party projects.
tools/lint --only=ruff,ruff-format --fix. You may find it helpful
to integrate Ruff with
your editor../script.py), so that the
interpreter is implicitly found from the shebang line, rather than
explicitly overridden (python script.py).[x, y] = xs # unnecessary
x, y = xs # better
x % (y,) rather than x % y, to avoid
ambiguity if y happens to be a tuple.Our JavaScript and TypeScript code is formatted with
Prettier. You can ask Prettier to reformat
all code via our linter tool with
tools/lint --only=prettier --fix. You can also integrate it with your
editor.
The best way to build complicated DOM elements is a Handlebars template
like web/templates/message_reactions.hbs. For simpler things you can
use jQuery DOM-building APIs like this:
const $new_tr = $('<tr />').attr('id', object.id);
Attach callback functions to events using jQuery code. For example:
$("body").on("click", ".move_message_button", function (e) {
// message-moving UI logic
}
That approach has multiple benefits:
onclick attribute, this is not
bound to the element like you might thinkDo not use onclick attributes in the HTML.
const and letAlways declare JavaScript variables using const or let rather than
var.
For functions that operate on arrays or JavaScript objects, you should
generally use modern
ECMAScript
primitives such as for … of
loops,
Array.prototype.{entries, every, filter, find, indexOf, map, some},
Object.{assign, entries, keys, values},
spread
syntax,
and so on. Our Babel configuration automatically transpiles and
polyfills these using core-js
when necessary. We used to use the
Underscore library, but that should be
avoided in new code.
See the documentation on HTML and CSS for guidance on conventions in those language.
Look out for Django code like this:
bars = Bar.objects.filter(...)
for bar in bars:
foo = bar.foo
# Make use of foo
...because it equates to:
bars = Bar.objects.filter(...)
for bar in bars:
foo = Foo.objects.get(id=bar.foo.id)
# Make use of foo
...which makes a database query for every Bar. While this may be fast
locally in development, it may be quite slow in production! Instead,
tell Django's QuerySet
API to
prefetch the data in the initial query:
bars = Bar.objects.filter(...).select_related()
for bar in bars:
foo = bar.foo # This doesn't take another query, now!
# Make use of foo
If you can't rewrite it as a single query, that's a sign that something is wrong with the database schema. So don't defer this optimization when performing schema changes, or else you may later find that it's impossible.
UserProfile.objects.get(), Client.objects.get(), etc.)In our Django code, never do direct UserProfile.objects.get(email=foo)
database queries. Instead always use get_user_profile_by_{email,id}.
There are 3 reasons for this:
.select_related() (see above!),
and thus will perform well when one later accesses related models
like the Realm.Similarly we have get_client and access_stream_by_id /
access_stream_by_name functions to fetch those commonly accessed
objects via remote cache.
Don't use Django model objects as keys in sets/dictionaries -- you will get unexpected behavior when dealing with objects obtained from different database queries:
For example, the following will, surprisingly, fail:
# Bad example -- will raise!
obj: UserProfile = get_user_profile_by_id(17)
some_objs = UserProfile.objects.get(id=17)
assert obj in set([some_objs])
You should work with the IDs instead:
obj: UserProfile = get_user_profile_by_id(17)
some_objs = UserProfile.objects.get(id=17)
assert obj.id in set([o.id for o in some_objs])
update_fieldsYou should always pass the update_fields keyword argument to .save()
when modifying an existing Django model object. By default, .save() will
overwrite every value in the column, which results in lots of race
conditions where unrelated changes made by one thread can be
accidentally overwritten by another thread that fetched its UserProfile
object before the first thread wrote out its change.
In most cases, we already have a function in zerver.actions with
a name like do_activate_user that will correctly handle lookups,
caching, and notifying running browsers via the event system about your
change. So please check whether such a function exists before writing
new code to modify a model object, since your new code has a good chance
of getting at least one of these things wrong.
Python allows datetime objects to not have an associated time zone, which can cause time-related bugs that are hard to catch with a test suite, or bugs that only show up during daylight saving time.
Good ways to make time-zone-aware datetimes are below. We import time zone
libraries as from datetime import datetime, timezone and
from django.utils.timezone import now as timezone_now.
Use:
timezone_now() to get a datetime when Django is available, such as
in zerver/.datetime.now(tz=timezone.utc) when Django is not available, such as
for bots and scripts.datetime.fromtimestamp(timestamp, tz=timezone.utc) if creating a
datetime from a timestamp. This is also available as
zerver.lib.timestamp.timestamp_to_datetime.datetime.strptime(date_string, format).replace(tzinfo=timezone.utc) if
creating a datetime from a formatted string that is in UTC.Idioms that result in time-zone-naive datetimes, and should be avoided, are
datetime.now() and datetime.fromtimestamp(timestamp) without a tz
parameter, datetime.utcnow() and datetime.utcfromtimestamp(), and
datetime.strptime(date_string, format) without replacing the tzinfo at
the end.
Additional notes:
time.time() to get timestamps can be cleaner than
dealing with datetimes.for...in statements to traverse arraysThat construct pulls in properties inherited from the prototype chain. Don't use it: [1], [2], [3]