Rich Text Descriptions

docs/plans/6.0-rich-text-descriptions.md

Status: Partially Complete (description_html shipped in 5.4, ActionText removal in 6.0)
Target: Spree 5.4 (description_html) + Spree 6.0 (ActionText removal)
Depends on: Admin SPA (6.0-admin-spa.md) for ActionText removal
Author: Damian + Claude
Last updated: 2026-03-21

Summary

Standardize how rich text content (product descriptions, category descriptions, policy bodies) is stored and served across Spree. Move away from ActionText as a storage backend in favor of plain HTML text columns, since the 6.0 admin uses Tiptap (not Trix). Unify the API response shape so every rich text field returns both description (plain text) and description_html (HTML).

Problem

Rich text storage is inconsistent across models:

| Model | Current storage | Translation | API output |
|---|---|---|---|
| Product | text column on spree_products | Mobility (table backend) | description → raw HTML string |
| Taxon | ActionText (action_text_rich_texts) | Mobility (action_text backend) | description → plain text, description_html → HTML |
| Policy | ActionText (action_text_rich_texts) | Mobility (action_text backend) | body → plain text, body_html → HTML |

This creates three problems:

  1. ActionText is a Trix companion, not a generic HTML store. It sanitizes HTML against Trix's allowlist, which strips valid Tiptap output (tables, colored text, custom attributes). It wraps content in <div class="trix-content"> and uses <action-text-attachment> tags for embeds. None of this is compatible with a Tiptap editor.

  2. Inconsistent API responses. Product returns raw HTML as description. Taxon returns plain text as description and HTML as description_html. Consumers can't predict what format they'll get.

  3. Inconsistent storage. Two different storage mechanisms (text column vs action_text_rich_texts table) with different querying, indexing, and translation strategies.

Key Decisions (do not deviate without discussion)

  • Store HTML in text columns. Tiptap produces HTML via editor.getHTML(). Store that directly. No ActionText, no Tiptap JSON.
  • HTML, not Tiptap JSON. HTML is universally consumable (storefronts, emails, search indexing, feeds, mobile apps). Tiptap JSON would force every consumer to depend on a Tiptap renderer. The admin is the only writer — optimizing storage for one writer at the cost of every reader is wrong.
  • Use Nokogiri for plain text extraction. Nokogiri::HTML.fragment(html).text.squish strips tags without any ActionText dependency.
  • Every rich text field returns two API attributes: description (plain text) and description_html (HTML). Same pattern for body / body_html on Policy.
  • Sanitize HTML on write. Server-side allowlist sanitization to prevent XSS, configured to permit Tiptap's output (headings, paragraphs, lists, tables, images, links, text marks).
  • Models with plain-text-only descriptions keep simple text columns. Promotion, PaymentMethod, Store, PriceList, CustomerGroup descriptions are short admin labels — no rich text, no _html variant.

Design Details

Storage

All rich text fields are text columns on the model's own table, translated via Mobility table backend:

ruby
class Spree::Product < Spree.base_class
  TRANSLATABLE_FIELDS = %i[name description slug meta_description meta_title].freeze
  translates(*TRANSLATABLE_FIELDS, column_fallback: !Spree.always_use_translations?)
end

class Spree::Taxon < Spree.base_class
  TRANSLATABLE_FIELDS = %i[name pretty_name description permalink].freeze
  translates(*TRANSLATABLE_FIELDS, column_fallback: !Spree.always_use_translations?)
  # Remove: translates :description, backend: :action_text
end

class Spree::Policy < Spree.base_class
  TRANSLATABLE_FIELDS = %i[name body].freeze
  translates(*TRANSLATABLE_FIELDS, column_fallback: !Spree.always_use_translations?)
  # Remove: translates :body, backend: :action_text
end

Sanitization concern

ruby
module Spree
  module SanitizableRichText
    extend ActiveSupport::Concern

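    class_methods do
      # Registers a before_save callback that sanitizes each listed attribute.
      # The parameter is named attrs so it does not shadow the record's own
      # #attributes method inside the callback block.
      def sanitizes_rich_text(*attrs)
        before_save do
          attrs.each do |attr|
            value = public_send(attr)
            next if value.blank?

            public_send(:"#{attr}=", Spree::RichTextSanitizer.sanitize(value))
          end
        end
      end
    end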
  end
end

ruby
module Spree
  class RichTextSanitizer
    ALLOWED_TAGS = %w[
      h1 h2 h3 h4 h5 h6 p br hr
      ul ol li
      table thead tbody tfoot tr th td
      a img
      strong em u s del sub sup
      blockquote pre code
      span div figure figcaption
    ].freeze

    ALLOWED_ATTRIBUTES = %w[
      href src alt title target rel
      class style
      colspan rowspan
      width height
    ].freeze

    def self.sanitize(html)
      Rails::HTML5::SafeListSanitizer.new.sanitize(
        html,
        tags: ALLOWED_TAGS,
        attributes: ALLOWED_ATTRIBUTES
      )
    end
  end
end

Usage in models:

ruby
class Spree::Product < Spree.base_class
  include Spree::SanitizableRichText
  sanitizes_rich_text :description
end

Plain text extraction

A shared helper for serializers using Nokogiri (no ActionText dependency):

ruby
module Spree
  module RichTextHelper
    def self.to_plain_text(html)
      return '' if html.blank?

      Nokogiri::HTML.fragment(html).text.squish
    end
  end
end

API serializers

Product (adding description_html, changing description to plain text):

ruby
class ProductSerializer < BaseSerializer
  typelize description: :string, description_html: :string

  attribute :description do |product|
    Spree::RichTextHelper.to_plain_text(product.description)
  end

  attribute :description_html do |product|
    product.description.to_s
  end
end

Category (no change to shape, but simplified implementation — no ActionText storage):

ruby
class CategorySerializer < BaseSerializer
  typelize description: :string, description_html: :string

  attribute :description do |category|
    Spree::RichTextHelper.to_plain_text(category.description)
  end

  attribute :description_html do |category|
    category.description.to_s
  end
end

Policy (same pattern with body / body_html):

ruby
class PolicySerializer < BaseSerializer
  typelize body: :string, body_html: :string

  attribute :body do |policy|
    Spree::RichTextHelper.to_plain_text(policy.body)
  end

  attribute :body_html do |policy|
    policy.body.to_s
  end
end

Admin API write

The admin API accepts HTML in the description (or body) parameter. The Tiptap editor calls editor.getHTML() on save and sends the result. No special endpoint or content type needed — it's just a string param.

Internal notes (Order, User)

has_rich_text :internal_note on Order and User stays on ActionText for now. These are admin-only fields edited within the Rails admin (or later in the Tiptap admin), never exposed in the Store API, and don't need translation. They can be migrated later if desired.

Migration Path

Phase 1: Sanitization concern + plain text helper

Add Spree::SanitizableRichText concern and Spree::RichTextHelper. Apply to Product immediately (it already stores HTML in a text column). No data migration needed.

Phase 2: Update Product serializer

  • Change description to return plain text via RichTextHelper.to_plain_text
  • Add description_html returning raw HTML
  • Update TypeScript types and Zod schemas

Phase 3: Migrate Taxon off ActionText

  1. Remove translates :description, backend: :action_text from Taxon
  2. Rake task: copy rich text content from action_text_rich_texts into the description column on spree_taxons (and Mobility translation records)
  3. Apply sanitizes_rich_text :description
  4. Update CategorySerializer to use RichTextHelper instead of .to_plain_text / .body.to_s

Phase 4: Migrate Policy off ActionText

Same as Phase 3, but for body field.

Phase 5: Cleanup

  • Remove orphaned action_text_rich_texts records for migrated models
  • ActionText gem stays as a dependency for models that still use it (Order internal_note, User internal_note, Metafields::RichText). Plain text extraction uses Nokogiri directly — no ActionText dependency for that path.
  • Run type generation pipeline (typelizer → Zod → OpenAPI → SDK tests)

Constraints on Current Work

  • Do not add new has_rich_text or backend: :action_text declarations. New rich text fields should use text columns + SanitizableRichText concern.
  • New serializers with rich text must return both field (plain text) and field_html (HTML).
  • Do not store Tiptap JSON. Always convert to HTML before saving.

Open Questions

  • style and class attributes in sanitizer allowlist. Allowing arbitrary style attributes opens the door to CSS-based attacks (e.g., position: fixed overlays). Allowing class could conflict with storefront CSS or be exploited if malicious classes are defined. May want to restrict both to specific values (e.g., Tiptap's text-align classes) or strip them entirely and rely on semantic HTML tags.
  • Image handling. Tiptap image nodes produce <img> tags. Need to decide whether images in descriptions are uploaded to Active Storage (and URLs rewritten) or stored as external URLs. This intersects with the admin SPA media management.
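
On the style question, one possible middle ground between allowing arbitrary style values and stripping the attribute is to filter declarations against a property and value allowlist. The sketch below is plain Ruby with no dependencies; the allowlist contents (text-align with four values) and the filter_style name are assumptions, since the plan leaves the exact set undecided.

```ruby
# Hypothetical allowlist: property name => permitted values.
ALLOWED_STYLES = {
  "text-align" => %w[left right center justify]
}.freeze

# Keeps only declarations whose property and value are both allowlisted.
# Anything it cannot parse is dropped, so malformed input fails closed.
def filter_style(style)
  kept = style.to_s.split(";").filter_map do |declaration|
    property, value = declaration.split(":", 2).map { |part| part.strip.downcase }
    next if property.nil? || value.nil?
    next unless ALLOWED_STYLES.fetch(property, []).include?(value)

    "#{property}: #{value}"
  end
  kept.join("; ")
end

puts filter_style("text-align: center; position: fixed; top: 0")
# prints "text-align: center"
```

Splitting on ";" is naive for values that contain quoted semicolons, but because only exact allowlisted values survive, such input is discarded rather than passed through.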

References

  • Related plan: 6.0-admin-spa.md (Tiptap editor in React admin)
  • Current ActionText models: Taxon (description), Policy (body), Order (internal_note), User (internal_note), Metafields::RichText (value)
  • Tiptap HTML output docs: https://tiptap.dev/docs/editor/guide/output#html