.gemini/styleguide.md
These guidelines are meant to focus design discussions and help us create delightful developer experiences. They are guidelines, not rules: each decision should be debated in its own unique context. Some text is remixed from external references.
When developing APIs, start by designing end-to-end workflows, and only sketch out specific function/class signatures at the end.
It's okay to say no: just because someone asks for a feature doesn't mean we should do it. Every feature has a cost that goes beyond the initial CL: maintenance cost, documentation cost, and cognitive cost for our users (a sprawling API surface is a major usability issue).
In particular, in the Keras API, every new feature has to be maintained in perpetuity.
As such, we hold every proposed addition to the API to a high bar.
In addition, when saying yes to a request for supporting a new use case, remember that literally adding what the user/team requested is often not the optimal choice. Users are focused on their own specific use case, and we must counter this with a holistic and principled vision of the whole project (see: designing end-to-end workflows, not atomic functions/classes). Often, the right answer is to extend an existing feature. Find the natural place to integrate the new feature in existing APIs.
Always seek to minimize the cognitive load imposed on our users in the course of using our APIs.
Here are a few practical rules:
- If an argument requires users to understand the implementation (not just what the code is supposed to implement, like SGD in this case), then the argument should not be included in the public API. For example, `use_locking` in an optimizer should be avoided. An API is all about the problem it solves, not about how the code works in the background.
- Introduce as few new concepts as possible; ideally, everything is organized around a single universal mental model (in Keras, the `Layer`). Definitely avoid having more than 2 or 3 mental models underlying the workflows you design. Likewise, avoid having concepts that are mostly overlapping but subtly different, since the difference will be difficult to convey clearly and will confuse our users (like, say, `Network` and `Model` -- this is why we don't export `Network` as a public API).
- Prefer plain Python types over custom container types (counter-example: `TensorShape`, which also breaks established conventions of scientific Python). When using enums, make sure that their values are strings, so as to make it possible for users to pass plain strings (example: `data_format="channels_last"`, `padding="valid"`).
- List arguments explicitly rather than accepting a hyperparameter dict: avoid `MyLayer(hyperparameter_dict)`, instead use `MyLayer(units, activation=None, ...)`.

In particular, naming is important and difficult:
- Avoid `OverlyLongAndSpecificNamingPatterns`. If you find yourself with argument names that involve more than 3 subparts (e.g. `squared_operator_norm`), reconsider. Argument names should be intuitive and easy to remember.
- Avoid overly generic names (`x`, `variable`, `parameter`).
- Be consistent in your naming choices. This means both internal consistency (don't call `dim` what is called `axis` in other places, don't call `ndims` what is called `ndim` elsewhere) and consistency with established conventions for the problem domain (terms of art). Before settling on a name, make sure to look up existing names used by domain experts (or other APIs). In our case, argument names should be consistent with the broader scientific Python conventions, in particular NumPy.

Note that Keras uses the following naming rules:
- We use `num_*` for counters, though omitting an explicit counter is nicer when there is no ambiguity (e.g. `units`, `epochs`, `filters`).
- The number of dimensions of a tensor is its `ndim`. A specific dimension index is an `axis`. The number of dimensions in a linear projection (or similar) is `units`.
- We prefer nouns over action verbs for class names (`Normalization` and not `Normalize`, `Convolution` and not `Convolve`).
- Class names use capitalized camel case (e.g. `ClassName`), while functions and methods use snake case (e.g. `function_name`).
- For number suffixes (e.g. `alpha_1`), we put an underscore before the suffix in snake case. The capitalized equivalent would be e.g. `Alpha1`.
- We avoid abbreviations: `attention_scores` and not `attn_scores`. There are a couple of standardized exceptions to this rule, in particular `dim` for "dimension" and `num` for "number". These are sufficiently common that they are not ambiguous to a first-time reader.

Consider, for instance, the following API:

```python
MyConstructor(
    per_variable_sparsity_config=[
        'layer_1/kernel:0.8', 'layer_2/kernel:1.5'])
```
What's wrong with this? It forces users to learn a string mini-language (`'layer_name/weight_name:value'`) that cannot be type-checked, is not discoverable from the signature, and is easy to get subtly wrong.
Possible alternative:

```python
obj = MyConstructor()
obj.configure_sparsity(some_layer.kernel, value=0.8)
obj.configure_sparsity(some_other_layer.kernel, value=1.5)
```
What's nice about this? The configuration refers to actual weight objects rather than string-matched names, values are passed as typed arguments that can be validated, and sparsity can be configured incrementally, one call at a time.
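A minimal sketch of how the incremental-configuration alternative might be implemented. This is purely illustrative: the `FakeWeight` stand-in and the method bodies below are assumptions, not the actual Keras sparsity API.

```python
class FakeWeight:
    """Stand-in for a layer weight object (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name


class MyConstructor:
    """Sketch of the incremental-configuration design.

    Instead of parsing strings like 'layer_1/kernel:0.8', sparsity is
    attached to actual weight objects, one call at a time.
    """

    def __init__(self):
        # Maps weight objects to their sparsity values.
        self._sparsity = {}

    def configure_sparsity(self, weight, value):
        # Typed validation happens here, not in a string mini-language.
        if value < 0.0:
            raise ValueError(
                f"Expected a non-negative sparsity value, got {value}.")
        self._sparsity[weight] = value

    def get_sparsity(self, weight):
        return self._sparsity.get(weight)
```

Because the values arrive as real Python floats attached to real objects, mistakes fail loudly at the call site instead of silently inside a string parser.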
Don't increase the cognitive load of common use cases for the sake of niche use cases, even minimally. Make sure that advanced users have a path to support their use case, even if this path requires them to write their own plugins or other API extensions (in particular via subclassing). It is okay for advanced use cases not to be directly supported by the built-in API options.
Complex objects should be achievable by composing simple objects that take few arguments and do one thing reliably. There is a balance to strike between having complex signatures on fewer objects, and having more objects with simpler signatures. A good API has a reasonable number of objects, with reasonably simple signatures (see also: avoiding signatures with more than 6-7 arguments).
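As an illustration of this trade-off, compare one object with a sprawling signature against a composition of small single-purpose objects. The `Scale`, `Shift`, and `Compose` names below are hypothetical, invented for this sketch:

```python
class Scale:
    """Multiplies its input by a fixed factor; one job, one argument."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor


class Shift:
    """Adds a fixed offset to its input."""

    def __init__(self, offset):
        self.offset = offset

    def __call__(self, x):
        return x + self.offset


class Compose:
    """Chains simple transforms; complexity lives in the composition,
    not in any single signature."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x):
        for transform in self.transforms:
            x = transform(x)
        return x


# Instead of Transform(scale=2, shift=1, clip_min=..., clip_max=..., ...):
pipeline = Compose([Scale(2), Shift(1)])
# pipeline(3) -> 7
```

Each piece stays trivially testable, and users only pay for the concepts they actually use.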
Things that create state or side-effects should be classes. Functions should be stateless. For instance, layers that create weights should not be cast as functions, since it makes the weights (and other elements of state) hard to access, impossible to update, and forces reliance on a global state capturing the side effects of layer-functions.
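A minimal sketch of that distinction, using a toy `Dense`-like layer (not the actual Keras implementation): state-creating things are classes so their weights stay accessible and updatable, while stateless math remains a plain function.

```python
import random


def relu(x):
    """Stateless: no weights, no side effects, so a plain function is fine."""
    return max(0.0, x)


class ToyDense:
    """Creates state (a weight and a bias), so it is a class.

    The weights are attributes: users can inspect, update, and serialize
    them -- which would be impossible if this were a closure-based function
    hiding its state.
    """

    def __init__(self, scale=0.1):
        self.w = random.uniform(-scale, scale)
        self.b = 0.0

    def __call__(self, x):
        return relu(self.w * x + self.b)


layer = ToyDense()
layer.b = 1.0  # state is easy to access and update
```

Had `ToyDense` been written as a function creating its weight internally, callers could never read or modify `w` and `b` after the fact.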
Keep unrelated concerns in separate, modular APIs. For instance, the optimizer API or the layers API should not contain arguments for configuring distributed training; that should go into the distribution API.
Documentation and error messages are an integral part of the API. Good docs and helpful error messages are key to a delightful user experience.
Note that Keras uses the following rules for writing docstrings:
- Document constructor arguments in an `Arguments:` section in the class docstring, not in `__init__`.
- Users aren't calling the `MyLayer.__init__()` method as if it were a regular method; they are calling `MyLayer(...)`. We don't want to generate documentation for the `__init__()` method as a standalone method that needs to be called directly; that would be confusing. We also don't need `__init__()` docstrings that always start with "Initializes a MyLayer class.", which is useless information. Leaving `__init__()` without a docstring is the best practice.
- If the `Arguments:` section lived in `__init__`, it would force us to programmatically copy the `__init__` docstring when generating docs and concatenate it to the class docstring. This means that the `Arguments` section becomes the last thing in the docstring, which is bad.
- Start with a useful one-line description, e.g. "Applies Dropout to the input." No "Instantiates an ObscureName class instance."
- Follow with an extended description, e.g. "The Dropout layer randomly sets input units to 0 with a frequency of `rate` at each step during training time, which helps prevent overfitting. Inputs not set to 0 are scaled up by `1/(1 - rate)` such that the sum over all inputs is unchanged. [...]"
- Include an `Arguments` section; if the class has a `call` method that takes arguments, also a `Call arguments` section; and for a `Layer`, `Input shape` and `Output shape` sections.
- Attributes shared by all layers should be documented only once, e.g. the `dtype` attribute in the base `Layer` class.

The following would be a very poor error message:
```
AssertionError: '1 != 3'
```
In general, to validate user input, always use ValueError and avoid assert.
Also bad:
```
ValueError: Invalid target shape (600, 1).
```
The following is better, but still not sufficient, because it does not tell the user what they passed, and does not quite say how to fix it:
```
ValueError: categorical_crossentropy requires target.shape[1] == classes
```
Now, here's a good example, that says what was passed, what was expected, and how to fix the issue:
```
ValueError: You are passing a target array of shape (600, 1) while using as
loss `categorical_crossentropy`. `categorical_crossentropy` expects targets
to be binary matrices (1s and 0s) of shape (samples, classes). If your
targets are integer classes, you can convert them to the expected format via:

  from keras.utils import to_categorical
  y_binary = to_categorical(y_int)

Alternatively, you can use the loss function `sparse_categorical_crossentropy`
instead, which does expect integer targets.
```
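A sketch of validation code in this spirit. The `check_target_shape` helper is hypothetical, not a Keras function; the point is the shape of the message: say what was passed, what was expected, and how to fix it, via `ValueError` rather than `assert`.

```python
def check_target_shape(target_shape, num_classes):
    """Validates targets for a hypothetical categorical_crossentropy check.

    Raises a ValueError stating what was passed, what was expected,
    and how the user can fix the problem.
    """
    if len(target_shape) != 2 or target_shape[1] != num_classes:
        raise ValueError(
            f"You are passing a target array of shape {target_shape} while "
            "using as loss `categorical_crossentropy`. "
            "`categorical_crossentropy` expects targets to be binary matrices "
            f"(1s and 0s) of shape (samples, {num_classes}). "
            "If your targets are integer classes, convert them via "
            "`keras.utils.to_categorical(y_int)`, or use the loss "
            "`sparse_categorical_crossentropy`, which does expect integer "
            "targets.")
```

Note how the message embeds the actual offending value (`target_shape`), so the user never has to reproduce the failure just to see what went wrong.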
When performing code reviews on pull requests, you must strictly adhere to the following principles in addition to the API design guidelines above:
- **Question the Necessity of Changes:** Do not assume that the pull request changes are strictly necessary. Critically review the proposed changes to ensure they add real value, and point out any code that solves a non-existent problem or adds unnecessary complexity.
- **Call out "AI Slop":** Actively look for and identify "AI slop": generic, overly verbose, or hallucinated code that lacks context or violates best practices. If you suspect the code is AI slop, explicitly call it out.
- **Poke Holes in the Implementation:** Your goal is to critically test the logic. Actively search for and point out failing edge cases, race conditions, or unhandled exceptions in the implementation.
- **Demand Robustness:** Do not accept fragile code. If the proposed code is not robust enough or lacks proper error handling, explicitly tell the author why the current approach is brittle and what must be done to reinforce it.
- **Respect Existing Repo Patterns:** Before suggesting review comments (like asking authors to add boilerplate or specific patterns), actively check for existing design patterns across the repository. Do not suggest adding useless code or structures that contradict or fall outside the established Keras repo coding style.