Activation Functions

Activation functions introduce non-linearity into neural networks, allowing them to learn complex patterns. Without activations, stacked linear layers would collapse into a single linear transformation.
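
The collapse of stacked linear layers can be checked with a small standalone sketch. It uses plain C++ rather than libtorch, and reduces each layer to a one-dimensional affine map; the `Affine` type and the helper names are hypothetical, introduced only for this illustration.

```cpp
#include <algorithm>

// A 1-D "linear layer": y = w * x + b.
// Illustrative type only; this is not a libtorch class.
struct Affine {
    double w;
    double b;
    double operator()(double x) const { return w * x + b; }
};

// Composing two affine maps yields another affine map:
// second(first(x)) = (w2 * w1) * x + (w2 * b1 + b2).
Affine compose(const Affine& first, const Affine& second) {
    return Affine{second.w * first.w, second.w * first.b + second.b};
}

// ReLU: max(0, x).
double relu(double x) { return std::max(0.0, x); }

// With a ReLU between the layers, the result is no longer
// a single affine function of x.
double two_layers_with_relu(const Affine& first, const Affine& second, double x) {
    return second(relu(first(x)));
}
```

For any input, `second(first(x))` equals `compose(first, second)(x)`, so the second layer adds no expressive power on its own. With the ReLU in between, the two disagree whenever `first(x)` is negative.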

Common choices:

  • ReLU family (ReLU, LeakyReLU, PReLU, RReLU): Fast, widely used, good default choice
  • ELU family (ELU, SELU, CELU): Smoother than ReLU, can produce negative outputs
  • GELU/SiLU/Mish: Modern activations popular in transformers and advanced architectures
  • Sigmoid/Tanh: Classic activations, useful for output layers (probabilities, bounded outputs)
  • Softmax: Converts logits to a probability distribution (classification output)
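
As a rough comparison of the families listed above, the sketch below writes out the standard scalar formulas for ReLU, ELU, and SiLU in plain C++. It does not use libtorch; the function names are hypothetical, and the default `alpha = 1.0` for ELU is an assumption made for this example.

```cpp
#include <algorithm>
#include <cmath>

// ReLU: zero for negative inputs, identity otherwise.
double relu(double x) { return std::max(0.0, x); }

// ELU: identity for positive inputs, alpha * (exp(x) - 1) otherwise.
// Negative inputs map to values in (-alpha, 0) instead of exactly 0.
double elu(double x, double alpha = 1.0) {
    return x > 0.0 ? x : alpha * std::expm1(x);
}

// Logistic sigmoid: 1 / (1 + exp(-x)).
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// SiLU (Swish): x * sigmoid(x). Smooth, and slightly negative
// for negative inputs.
double silu(double x) { return x * sigmoid(x); }
```

Evaluating all three at `x = -1.0` shows the difference: `relu` returns exactly 0, while `elu` and `silu` return small negative values.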

ReLU

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Example:

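```cpp
// inplace(true) makes the module overwrite its input tensor instead of allocating a new output.
auto relu = torch::nn::ReLU(torch::nn::ReLUOptions().inplace(true));
```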

LeakyReLU

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

PReLU

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

RReLU

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

ReLU6

Like ReLU, but caps the output at 6: min(max(0, x), 6). Commonly used in mobile architectures such as MobileNet.

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:
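
The formula above can be written directly as a scalar function. This is a standalone sketch in plain C++, not the libtorch implementation, and the name `relu6` is chosen here only for illustration.

```cpp
#include <algorithm>

// ReLU6: clamp x to the range [0, 6], i.e. min(max(0, x), 6).
double relu6(double x) {
    return std::min(std::max(0.0, x), 6.0);
}
```

Negative inputs map to 0, inputs between 0 and 6 pass through unchanged, and larger inputs are capped at 6. In C++17, `std::clamp(x, 0.0, 6.0)` expresses the same computation.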

GLU

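Gated Linear Unit. Splits the input tensor into two halves, a and b, along the chosen dimension (whose size must be even), then computes a * sigmoid(b) element-wise. The output is half the size of the input along that dimension.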

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:
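
To illustrate the split-and-gate computation, here is a minimal sketch for a one-dimensional input in plain C++. It does not call libtorch; the function name `glu` and the choice of `std::vector<double>` as the container are assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// GLU over a 1-D input: the first half is the value a, the second
// half is the gate b, and the output is a * sigmoid(b) element-wise.
std::vector<double> glu(const std::vector<double>& input) {
    if (input.size() % 2 != 0) {
        throw std::invalid_argument("glu: input length must be even");
    }
    const std::size_t half = input.size() / 2;
    std::vector<double> out(half);
    for (std::size_t i = 0; i < half; ++i) {
        const double a = input[i];
        const double b = input[half + i];
        out[i] = a / (1.0 + std::exp(-b));  // a * sigmoid(b)
    }
    return out;
}
```

A gate value of 0 passes half of `a` through, a large positive gate passes almost all of it, and a large negative gate suppresses it.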

LogSigmoid

Applies log(sigmoid(x)) element-wise. Computing this as a single operation is numerically more stable than applying sigmoid and then log, because sigmoid(x) can underflow to 0 for large negative inputs.

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:
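
The stability difference can be seen in a small standalone comparison. The sketch below is plain C++, not libtorch code; the rewrite `min(x, 0) - log1p(exp(-|x|))` is one common stable formulation and is not claimed to be the one the library uses. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>

// Naive version: sigmoid first, then log. For large negative x,
// exp(-x) overflows, sigmoid becomes 0, and log(0) is -infinity.
double log_sigmoid_naive(double x) {
    return std::log(1.0 / (1.0 + std::exp(-x)));
}

// Stable version: log(sigmoid(x)) = min(x, 0) - log1p(exp(-|x|)).
// The argument of exp is never positive, so it cannot overflow.
double log_sigmoid_stable(double x) {
    return std::min(x, 0.0) - std::log1p(std::exp(-std::fabs(x)));
}
```

For moderate inputs the two functions agree closely. At an input such as `-1000.0`, the naive version returns negative infinity, while the stable version returns a finite value close to the input itself.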

ELU

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

SELU

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

CELU

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

GELU

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

SiLU (Swish)

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Mish

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Sigmoid

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Tanh

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Softmax

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Example:

```cpp
// Normalize along dimension 1, so the values along that dimension sum to 1.
auto softmax = torch::nn::Softmax(torch::nn::SoftmaxOptions(/*dim=*/1));
```
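
For intuition about what happens along the chosen dimension, the following standalone sketch computes softmax over a single vector of logits in plain C++. It does not call libtorch. The function name `softmax` and the max-subtraction step are choices made for this illustration, not a description of the library's internals.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax over one vector: exp(x_i) / sum_j exp(x_j).
// Subtracting the maximum first leaves the result unchanged
// mathematically but keeps exp from overflowing.
std::vector<double> softmax(const std::vector<double>& logits) {
    if (logits.empty()) {
        return {};
    }
    const double max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<double> out(logits.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        out[i] = std::exp(logits[i] - max_logit);
        sum += out[i];
    }
    for (double& value : out) {
        value /= sum;
    }
    return out;
}
```

The outputs are non-negative, sum to 1, and preserve the ordering of the logits, which is why the result can be read as a probability distribution over classes.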

Softmax2d

Applies Softmax over the channel dimension (C) at each spatial location of a 4D input tensor of shape (N, C, H, W), so the C values at every (h, w) position sum to 1.

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:
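
The per-location normalization can be sketched for a single sample in plain C++. This is an illustration rather than libtorch code: it assumes one sample stored row-major as (C, H, W) in a flat `std::vector<double>`, and the function name `softmax2d` is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Softmax across channels at each spatial location of one sample.
// Layout assumption: element (c, h, w) lives at index (c * H + h) * W + w.
std::vector<double> softmax2d(const std::vector<double>& input,
                              std::size_t C, std::size_t H, std::size_t W) {
    if (C == 0 || input.size() != C * H * W) {
        throw std::invalid_argument("softmax2d: shape does not match input size");
    }
    const std::size_t plane = H * W;  // number of spatial locations
    std::vector<double> out(input.size());
    for (std::size_t p = 0; p < plane; ++p) {
        // Largest value over channels at this location, for stability.
        double max_value = input[p];
        for (std::size_t c = 1; c < C; ++c) {
            max_value = std::max(max_value, input[c * plane + p]);
        }
        double sum = 0.0;
        for (std::size_t c = 0; c < C; ++c) {
            out[c * plane + p] = std::exp(input[c * plane + p] - max_value);
            sum += out[c * plane + p];
        }
        for (std::size_t c = 0; c < C; ++c) {
            out[c * plane + p] /= sum;
        }
    }
    return out;
}
```

Each spatial location is normalized independently, so the channel values at one (h, w) position sum to 1 regardless of the values at other positions.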

LogSoftmax

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Softmin

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Softplus

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Softshrink

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Softsign

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Hardshrink

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Hardtanh

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Tanhshrink

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members:

Threshold

{doxygenclass}
:members:
:undoc-members:
{doxygenclass}
:members:
:undoc-members: