docs/_core_features/thinking.md
{: .d-inline-block .no_toc }
New in 1.10 {: .label .label-green }
{{ page.description }} {: .fs-6 .fw-300 }
{: .no_toc .text-delta }
After reading this guide, you will know:
Extended Thinking gives supported models more time and a larger computation budget to deliberate before answering. It can improve results on multi-step tasks like coding, math, and logic, at the expense of latency and cost. Some providers can also return a thinking trace or signature alongside the final answer.
Use with_thinking to control models that support thinking. Some models think by default; for those, with_thinking tunes or disables thinking rather than turning it on.
chat = RubyLLM.chat(model: 'claude-opus-4.5')
  .with_thinking(effort: :high, budget: 8000)
response = chat.ask("What is 15 * 23?")
response.thinking&.text
response.thinking&.signature
response.content
with_thinking requires at least one of effort or budget:
chat.with_thinking(effort: :low)
chat.with_thinking(budget: 10_000)
chat.with_thinking(effort: :none)
Use effort to pick a qualitative depth (:low, :medium, :high) and budget for models that accept a token cap.
RubyLLM sends effort and budget exactly as provided. Check your provider's docs for supported values.
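The "at least one of effort or budget" rule can be sketched in plain Ruby. This is a minimal illustration, not RubyLLM's implementation: `normalize_thinking_options` is a hypothetical helper, and raising `ArgumentError` is an assumption about how a missing option might be reported.

```ruby
# Hypothetical helper, not part of RubyLLM: mirrors the documented rule that
# at least one of effort or budget must be given.
def normalize_thinking_options(effort: nil, budget: nil)
  if effort.nil? && budget.nil?
    # The error class is an assumption made for this sketch.
    raise ArgumentError, 'with_thinking requires effort or budget'
  end

  # Pass values through unchanged, dropping only the ones that were not given.
  { effort: effort, budget: budget }.compact
end

normalize_thinking_options(effort: :low)
normalize_thinking_options(budget: 10_000)
normalize_thinking_options(effort: :high, budget: 8000)
```

Note that `:none` is an ordinary value here and passes through like any other effort, which matches the idea that values are sent exactly as provided.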
Thinking content is delivered alongside normal content in streaming chunks:
chat = RubyLLM.chat(model: 'claude-opus-4.5')
  .with_thinking(effort: :medium)
chat.ask("Solve this step by step: What is 127 * 43?") do |chunk|
  print chunk.thinking&.text
  print chunk.content
end
Some providers only expose thinking in the final response. In those cases, response.thinking is populated after the stream completes, and chunk.thinking stays empty.
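Because thinking may arrive in chunks or only in the final response, it can help to collect both paths in one place. The sketch below uses stand-in structs (`Thinking`, `Chunk`, `Response`) and a hypothetical `StreamCollector` class so it runs without a provider; real chunks and responses come from RubyLLM and may carry more fields.

```ruby
# Stand-ins for illustration only; real chunks and responses come from RubyLLM.
Thinking = Struct.new(:text, :signature)
Chunk    = Struct.new(:thinking, :content)
Response = Struct.new(:thinking, :content)

# Hypothetical collector: keeps thinking and answer text apart while chunks arrive.
class StreamCollector
  attr_reader :thinking_text, :answer_text

  def initialize
    @thinking_text = +''
    @answer_text = +''
  end

  def add(chunk)
    # Either part of a chunk may be absent, so guard both.
    @thinking_text << chunk.thinking.text.to_s if chunk.thinking
    @answer_text << chunk.content.to_s
  end

  # Falls back to the final response when no thinking was streamed.
  def final_thinking(response)
    return @thinking_text unless @thinking_text.empty?

    response.thinking&.text.to_s
  end
end

collector = StreamCollector.new
[
  Chunk.new(Thinking.new('127 * 40 = 5080; ', nil), nil),
  Chunk.new(Thinking.new('127 * 3 = 381', nil), nil),
  Chunk.new(nil, '5461')
].each { |chunk| collector.add(chunk) }
```

In a real stream, `collector.add(chunk)` would be called inside the block passed to `chat.ask`, and `final_thinking` would be called with the returned response.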
When using acts_as_chat and acts_as_message, thinking output is persisted to the message table:
# Migration (generated automatically with new installs)
# t.text :thinking_text
# t.text :thinking_signature
# t.integer :thinking_tokens
response = chat_record.ask("Explain quantum entanglement")
response.thinking&.text
response.thinking_tokens
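As a rough picture of how thinking output lines up with the three columns, the sketch below maps a thinking object and a token count onto an attributes hash. `ThinkingValue` and `thinking_attributes` are hypothetical names used only for illustration; this is not how acts_as_message is implemented internally.

```ruby
# Stand-in for illustration; in RubyLLM the thinking object comes from the response.
ThinkingValue = Struct.new(:text, :signature)

# Hypothetical mapping onto the thinking_text, thinking_signature,
# and thinking_tokens columns. Missing thinking yields nil columns.
def thinking_attributes(thinking, thinking_tokens)
  {
    thinking_text: thinking&.text,
    thinking_signature: thinking&.signature,
    thinking_tokens: thinking_tokens
  }
end

thinking_attributes(ThinkingValue.new('Entangled pairs share a state.', 'sig-123'), 42)
thinking_attributes(nil, nil)
```

Using safe navigation keeps the mapping valid for models that return no thinking at all.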
When upgrading to 1.10, follow the [upgrade guide]({% link _advanced/upgrading.md %}#upgrade-to-1-10) to run the generator. If you prefer to write the migration manually, add the columns to your messages and tool_calls tables:
class AddThinkingToMessages < ActiveRecord::Migration[7.1]
  def change
    add_column :messages, :thinking_text, :text
    add_column :messages, :thinking_signature, :text
    add_column :messages, :thinking_tokens, :integer
    add_column :tool_calls, :thought_signature, :string
  end
end
- […] effort but may not return thinking text or signatures.
- […] `<think>` blocks inside content; RubyLLM extracts them after the response completes.
- […] with_thinking params. Non-magistral models warn if you pass them.
- […] effort: :none to disable thinking.
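For models that embed `<think>` blocks inside content, the separation step can be pictured with a small regex-based sketch. This is an illustrative approach under simple assumptions (well-formed, non-nested tags), not RubyLLM's actual extraction code; `THINK_BLOCK` and `split_think_blocks` are hypothetical names.

```ruby
# Matches a <think>...</think> block, including newlines inside it.
THINK_BLOCK = %r{<think>(.*?)</think>}m

# Returns [thinking_text, visible_content].
# thinking_text is nil when the content has no <think> block.
def split_think_blocks(content)
  thoughts = content.scan(THINK_BLOCK).flatten.map(&:strip)
  visible = content.gsub(THINK_BLOCK, '').strip
  [thoughts.empty? ? nil : thoughts.join("\n"), visible]
end

thinking, answer = split_think_blocks("<think>15 * 23 = 345</think>\nThe answer is 345.")
```

Running the split on the complete content, rather than on each chunk, avoids having to handle a tag that is cut in half across two chunks.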