Reward Functions

This module contains some useful reward functions, primarily intended for use with the [GRPOTrainer] and [RLOOTrainer].

accuracy_reward

[[autodoc]] rewards.accuracy_reward
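
The autodoc entry above carries the authoritative signature and behavior. As a rough illustration of what an accuracy-style reward looks like, here is a minimal standalone sketch. It is not the library's implementation: the name `sketch_accuracy_reward`, the conversational completion layout (`[[{"content": ...}]]`), the `solution` argument, and the naive string comparison of the last `\boxed{...}` answer are all assumptions made for the example. A real math reward would need proper expression parsing rather than string equality.

```python
import re

# Hypothetical helper, not part of the library: pull the last \boxed{...}
# answer out of a completion, falling back to the whole text.
def extract_boxed(text: str) -> str:
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else text


def sketch_accuracy_reward(completions, solution, **kwargs):
    """Return 1.0 when the extracted answer equals the reference, else 0.0.

    Assumes each completion is a list holding one message dict with a
    "content" key, and that `solution` is a list of reference strings.
    """
    rewards = []
    for completion, gold in zip(completions, solution):
        predicted = extract_boxed(completion[0]["content"])
        rewards.append(1.0 if predicted.strip() == gold.strip() else 0.0)
    return rewards


completions = [
    [{"content": "The answer is \\boxed{4}."}],
    [{"content": "\\boxed{5}"}],
]
print(sketch_accuracy_reward(completions, solution=["4", "4"]))  # [1.0, 0.0]
```

The `**kwargs` catch-all is there because a trainer may pass extra dataset columns to a reward function; the sketch simply ignores them.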

reasoning_accuracy_reward

[[autodoc]] rewards.reasoning_accuracy_reward
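
Going only by the name, one plausible reading of a reasoning-aware accuracy reward is "score the final answer and ignore whatever appears inside the reasoning block". The sketch below illustrates that reading and nothing more. The function name, the `end_tag` parameter, the `</think>` delimiter, and the rule that a completion with unterminated reasoning earns zero are all assumptions for the example; consult the autodoc entry above for the actual behavior.

```python
import re


def sketch_reasoning_accuracy_reward(completions, solution, end_tag="</think>", **kwargs):
    """Score only the text that follows the reasoning block.

    Hypothetical sketch: completions are assumed to be lists holding one
    message dict with a "content" key; `solution` holds reference strings.
    """
    rewards = []
    for completion, gold in zip(completions, solution):
        text = completion[0]["content"]
        if end_tag not in text:
            # Reasoning never closed, so there is no final answer to score.
            rewards.append(0.0)
            continue
        final = text.split(end_tag)[-1]
        boxed = re.findall(r"\\boxed\{([^{}]*)\}", final)
        predicted = boxed[-1] if boxed else final
        rewards.append(1.0 if predicted.strip() == gold.strip() else 0.0)
    return rewards


completions = [
    [{"content": "<think>maybe \\boxed{3}</think> \\boxed{4}"}],
    [{"content": "<think>so \\boxed{4}"}],
]
print(sketch_reasoning_accuracy_reward(completions, solution=["4", "4"]))  # [1.0, 0.0]
```

In the first completion the intermediate guess inside the reasoning block is ignored; in the second, the correct value appears only inside unterminated reasoning and is not credited.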

think_format_reward

[[autodoc]] rewards.think_format_reward
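
A format reward of this kind can be expressed as a single regular expression. The following is a minimal standalone sketch, assuming the expected format is "the completion starts with exactly one `<think>...</think>` block, followed by the answer". The name `sketch_think_format_reward` and the exact pattern are illustrative assumptions, not a copy of the library's code.

```python
import re

# Must start with <think>, contain no second <think>, and close with </think>.
THINK_PATTERN = re.compile(r"^<think>(?!.*<think>)(.*?)</think>.*$", re.DOTALL)


def sketch_think_format_reward(completions, **kwargs):
    """Return 1.0 for completions that follow the think format, else 0.0.

    Assumes each completion is a list holding one message dict with a
    "content" key.
    """
    contents = [completion[0]["content"] for completion in completions]
    return [1.0 if THINK_PATTERN.match(content) else 0.0 for content in contents]


completions = [
    [{"content": "<think>\nThis is my reasoning.\n</think>\nThis is my answer."}],
    [{"content": "<think>\nThis is my reasoning.\nThis is my answer."}],
    [{"content": "Answer first. <think>late reasoning</think>"}],
]
print(sketch_think_format_reward(completions))  # [1.0, 0.0, 0.0]
```

The negative lookahead rejects completions that open a second `<think>` tag, and `re.DOTALL` lets the reasoning span multiple lines.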

get_soft_overlong_punishment

[[autodoc]] rewards.get_soft_overlong_punishment
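
The `get_` prefix suggests a factory: it is configured once and returns the reward function that the trainer then calls. The sketch below shows that pattern with a piecewise-linear length penalty: zero up to a soft limit, a linear ramp from 0 to -1 inside a buffer zone, and -1 beyond the hard limit. The names `make_soft_overlong_punishment`, `max_completion_len`, `soft_punish_cache`, and the `completion_ids` argument are assumptions for this example; the autodoc entry above is the reference for the real parameters.

```python
def make_soft_overlong_punishment(max_completion_len: int, soft_punish_cache: int):
    """Build a length-penalty reward function (hypothetical sketch).

    Lengths up to (max_completion_len - soft_punish_cache) cost nothing,
    lengths inside the cache window are penalized linearly, and lengths
    above max_completion_len receive the full penalty of -1.0.
    """
    soft_limit = max_completion_len - soft_punish_cache

    def soft_overlong_punishment(completion_ids, **kwargs):
        rewards = []
        for ids in completion_ids:
            length = len(ids)
            if length <= soft_limit:
                rewards.append(0.0)
            elif length <= max_completion_len:
                rewards.append((soft_limit - length) / soft_punish_cache)
            else:
                rewards.append(-1.0)
        return rewards

    return soft_overlong_punishment


reward_fn = make_soft_overlong_punishment(max_completion_len=100, soft_punish_cache=20)
completion_ids = [[1] * 50, [1] * 90, [1] * 120]
print(reward_fn(completion_ids))  # [0.0, -0.5, -1.0]
```

Returning a closure keeps the two length settings out of the per-call signature, so the resulting function accepts only what a trainer would pass at reward time.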