Imports & Exports

Overview

Spree provides a bulk data import and export system for managing large datasets. The system supports CSV file processing with configurable field mapping, asynchronous processing via background jobs, and real-time progress tracking in the admin interface.

Import/Export System Diagram

mermaid
erDiagram
    Import {
        string number
        string type
        string status
        bigint owner_id
        string owner_type
        bigint user_id
    }

    ImportMapping {
        string schema_field
        string file_column
    }

    ImportRow {
        integer row_number
        text data
        string status
        text validation_errors
    }

    ImportSchema {
        array fields
    }

    RowProcessor {
        hash attributes
    }

    Export {
        string number
        string type
        integer format
        jsonb search_params
    }

    Store {
        string name
    }

    Import ||--|| Store : "belongs to (owner)"
    Import ||--o{ ImportMapping : "has many"
    Import ||--o{ ImportRow : "has many"
    Import ||--|| ImportSchema : "uses"
    ImportRow ||--|| RowProcessor : "processed by"
    Export ||--|| Store : "belongs to"

Architecture

The import/export system uses several design patterns:

  1. Single Table Inheritance (STI): Import and Export types inherit from base classes
  2. State Machine: Imports progress through states (pending → mapping → completed_mapping → processing → completed, or failed)
  3. Schema Definition: ImportSchema classes define expected fields and validation
  4. Row Processors: Transform CSV rows into database records
  5. Event-Driven Processing: Background jobs handle heavy lifting asynchronously
  6. Registry Pattern: Types registered in Spree.import_types and Spree.export_types
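
The registry pattern from item 6 can be pictured with a small plain-Ruby sketch. The `Registry` module and the two stand-in classes are hypothetical names used only for illustration; the only thing taken from the text above is the idea of appending type classes to a list and looking them up later.

```ruby
# Hypothetical stand-in for a type registry such as Spree.import_types.
module Registry
  def self.import_types
    @import_types ||= []
  end

  # Look up a registered class by its full name, as it would be stored
  # in an STI `type` column.
  def self.find_import_type(type_name)
    import_types.find { |klass| klass.name == type_name }
  end
end

# Hypothetical import type classes.
class ProductsImport; end
class CustomersImport; end

Registry.import_types << ProductsImport
Registry.import_types << CustomersImport

puts Registry.find_import_type('CustomersImport').inspect
puts Registry.find_import_type('OrdersImport').inspect
```

An unregistered name simply returns `nil` here, which is one reason a registry is convenient: the set of selectable types is explicit rather than inferred from every loaded subclass.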

Exports

Exports generate CSV files from filtered database records.

Built-in Export Types

| Type | Description | Multi-line |
| --- | --- | --- |
| Spree::Exports::Products | Products with all variants | Yes |
| Spree::Exports::ProductTranslations | Product translations for non-default store locales | Yes |
| Spree::Exports::Orders | Orders with line items | Yes |
| Spree::Exports::Customers | Customer accounts | No |
| Spree::Exports::GiftCards | Gift cards | No |
| Spree::Exports::NewsletterSubscribers | Newsletter subscribers | No |
| Spree::Exports::CouponCodes | Promotion coupon codes | No |

Export Model

The base Spree::Export class provides:

ruby
module Spree
  class Export < Spree.base_class
    # Associations
    belongs_to :store
    belongs_to :user  # Admin who created export

    # Attachments
    has_one_attached :attachment  # Generated CSV file

    # Key methods (bodies omitted)
    def csv_headers; end        # Define column headers
    def scope; end              # Base query with store/vendor filtering
    def scope_includes; end     # Eager loading associations
    def records_to_export; end  # Apply ransack filters
    def multi_line_csv?; end    # True if records produce multiple rows
    def generate; end           # Create CSV and attach file
  end
end

Creating a Custom Exporter

Step 1: Create the Export Class

ruby
module Spree
  module Exports
    class Subscriptions < Spree::Export
      # Define CSV column headers
      def csv_headers
        %w[id email plan_name status created_at] + metafields_headers
      end

      # Eager load associations to avoid N+1 queries
      def scope_includes
        [:user, :plan, { metafields: :metafield_definition }]
      end

      # Override scope if needed (e.g., exclude cancelled)
      def scope
        super.where.not(status: 'cancelled')
      end

      # Set to true if each record produces multiple CSV rows
      def multi_line_csv?
        false
      end
    end
  end
end

Step 2: Add to_csv Method to Your Model

ruby
module Spree
  class Subscription < Spree.base_class
    def to_csv(store)
      [
        id,
        user&.email,
        plan&.name,
        status,
        created_at.iso8601
      ] + metafields_csv_values(store)
    end

    private

    def metafields_csv_values(store)
      Spree::MetafieldDefinition.for_resource_type(self.class.name).order(:namespace, :key).map do |definition|
        metafields.find { |m| m.metafield_definition_id == definition.id }&.value
      end
    end
  end
end

Step 3: Register the Export Type

ruby
Rails.application.config.after_initialize do
  Spree.export_types << Spree::Exports::Subscriptions
end

Step 4: Add Translations

yaml
en:
  spree:
    subscriptions: Subscriptions

Multi-line Exports

For exports where each record produces multiple rows (like products with variants):

ruby
module Spree
  module Exports
    class OrdersWithItems < Spree::Export
      def multi_line_csv?
        true
      end

      def csv_headers
        %w[order_number line_item_sku quantity price]
      end

      def scope_includes
        [line_items: :variant]
      end
    end
  end
end

ruby
# In the Order model
def to_csv(store)
  line_items.map do |item|
    [number, item.variant.sku, item.quantity, item.price]
  end
end
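
To make the difference between single-line and multi-line exports concrete, here is a minimal plain-Ruby sketch of how a generator might consume `to_csv`. The `Order` and `LineItem` structs and the `generate_csv` helper are hypothetical stand-ins, not Spree's implementation; the only behavior assumed from the text is that a multi-line record returns an array of rows rather than a single row.

```ruby
require 'csv'

# Hypothetical stand-ins for an order and its line items.
LineItem = Struct.new(:sku, :quantity, :price)
Order = Struct.new(:number, :line_items) do
  # Returns one array per line item, so one record yields several CSV rows.
  def to_csv(_store = nil)
    line_items.map { |item| [number, item.sku, item.quantity, item.price] }
  end
end

# Hypothetical generator: writes each record as one row, or as many rows
# when multi_line is true.
def generate_csv(records, headers:, multi_line:)
  CSV.generate do |csv|
    csv << headers
    records.each do |record|
      if multi_line
        record.to_csv.each { |row| csv << row }
      else
        csv << record.to_csv
      end
    end
  end
end

orders = [
  Order.new('R100', [LineItem.new('TSHIRT-S', 2, 29.99), LineItem.new('MUG', 1, 9.5)]),
  Order.new('R101', [LineItem.new('MUG', 3, 9.5)])
]

puts generate_csv(orders, headers: %w[order_number line_item_sku quantity price], multi_line: true)
```

Two orders with three line items in total produce a header plus three data rows.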

Export Filtering

Exports support Ransack filtering via search_params:

ruby
# In admin, users can filter before exporting
export = Spree::Exports::Products.new(
  store: current_store,
  user: current_user,
  search_params: { name_cont: 'shirt', status_eq: 'active' }.to_json,
  record_selection: 'filtered'  # or 'all' to ignore filters
)

Imports

Imports process CSV files to create or update database records.

Built-in Import Types

| Type | Description |
| --- | --- |
| Spree::Imports::Products | Products and variants |
| Spree::Imports::ProductTranslations | Product translations (matched by slug) |
| Spree::Imports::Customers | Customers |

Import Workflow

1. Upload CSV → pending
2. Auto-map columns → mapping
3. User confirms mapping → completed_mapping
4. Parse rows (CreateRowsJob) → processing
5. Process rows (ProcessRowsJob) → completed/failed
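
The workflow above can be read as a transition table. The following plain-Ruby sketch encodes it without any state machine library; the `TRANSITIONS` constant and `next_state` helper are hypothetical, and treating `failed` as reachable from any state is an assumption based on the final workflow step.

```ruby
# Allowed forward transitions, taken from the numbered workflow.
TRANSITIONS = {
  pending: [:mapping],
  mapping: [:completed_mapping],
  completed_mapping: [:processing],
  processing: [:completed]
}.freeze

# Returns the new state, or raises if the move is not allowed.
# Assumption: an import may fail from any state.
def next_state(current, target)
  return :failed if target == :failed

  allowed = TRANSITIONS.fetch(current, [])
  raise ArgumentError, "cannot go from #{current} to #{target}" unless allowed.include?(target)

  target
end

state = :pending
%i[mapping completed_mapping processing completed].each do |target|
  state = next_state(state, target)
end
puts state
```

Skipping a step, for example going from `pending` straight to `processing`, raises an `ArgumentError` in this sketch.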

Import Components

Import Model

ruby
module Spree
  class Import < Spree.base_class
    # Associations
    belongs_to :owner, polymorphic: true  # Store or Vendor
    belongs_to :user
    has_many :mappings    # Field mappings
    has_many :rows        # CSV rows to process

    # State machine
    state_machine initial: :pending do
      event :start_mapping do
        transition to: :mapping
      end
      event :complete_mapping do
        transition from: :mapping, to: :completed_mapping
      end
      event :start_processing do
        transition from: :completed_mapping, to: :processing
      end
      event :complete do
        transition from: :processing, to: :completed
      end
      event :fail do
        transition to: :failed
      end
    end

    # Key methods (bodies omitted)
    def import_schema; end        # Returns schema class instance
    def row_processor_class; end  # Returns processor class
    def schema_fields; end        # Fields from schema + metafields
    def mapping_done?; end        # All required fields mapped?
  end
end

Import Schema

Defines expected CSV fields:

ruby
module Spree
  class ImportSchema
    FIELDS = []

    def fields
      self.class::FIELDS
    end

    def required_fields
      # Go through #fields so subclasses' own FIELDS constant is used
      fields.select { |f| f[:required] }.map { |f| f[:name] }
    end

    def optional_fields
      fields.reject { |f| f[:required] }.map { |f| f[:name] }
    end
  end
end
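
One detail worth seeing in isolation is how `self.class::FIELDS` differs from a bare `FIELDS` reference when a subclass redefines the constant. The sketch below uses hypothetical `BaseSchema` and `SubscriptionsSchema` classes in plain Ruby; it illustrates Ruby's constant lookup rules and is not Spree's code.

```ruby
# Hypothetical base class mirroring the shape of a schema with a FIELDS constant.
class BaseSchema
  FIELDS = [].freeze

  # Resolves FIELDS on the receiver's class at runtime, so subclasses win.
  def fields
    self.class::FIELDS
  end

  def required_fields
    fields.select { |f| f[:required] }.map { |f| f[:name] }
  end

  # A bare constant reference is resolved lexically, so this always reads
  # BaseSchema::FIELDS, even when called on a subclass instance.
  def lexical_required_fields
    FIELDS.select { |f| f[:required] }.map { |f| f[:name] }
  end
end

class SubscriptionsSchema < BaseSchema
  FIELDS = [
    { name: 'email', required: true },
    { name: 'start_date' }
  ].freeze
end

schema = SubscriptionsSchema.new
puts schema.required_fields.inspect          # => ["email"]
puts schema.lexical_required_fields.inspect  # => []
```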

Import Mapping

Maps CSV columns to schema fields:

ruby
module Spree
  class ImportMapping < Spree.base_class
    belongs_to :import

    # Attributes
    # schema_field - target field name from schema
    # file_column  - CSV column header

    def try_to_auto_assign_file_column(csv_headers)
      # Matches by parameterized name comparison
      self.file_column = csv_headers.find do |header|
        header.parameterize.underscore == schema_field.parameterize.underscore
      end
    end
  end
end
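
The auto-assignment idea can be tried outside Rails with a small sketch. `parameterize` and `underscore` come from ActiveSupport, so the `normalize` helper below is a simplified, hypothetical approximation of that comparison, and `auto_map` is an illustrative name.

```ruby
# Simplified stand-in for `parameterize.underscore`: lowercase, collapse
# non-alphanumeric runs into underscores, trim leading/trailing underscores.
def normalize(name)
  name.to_s.downcase.gsub(/[^a-z0-9]+/, '_').gsub(/\A_+|_+\z/, '')
end

# Returns a hash of schema field => matching CSV header (or nil if none matches).
def auto_map(schema_fields, csv_headers)
  schema_fields.to_h do |field|
    [field, csv_headers.find { |header| normalize(header) == normalize(field) }]
  end
end

mapping = auto_map(%w[email plan_name start_date], ['Email', 'Plan Name', 'Notes'])
mapping.each { |field, column| puts "#{field} -> #{column.inspect}" }
```

Fields left as `nil` are the ones a user would have to map by hand before the mapping step can complete.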

Import Row

Represents a single CSV row:

ruby
module Spree
  class ImportRow < Spree.base_class
    belongs_to :import, counter_cache: :rows_count
    belongs_to :item, polymorphic: true, optional: true  # Created record

    # Attributes
    # row_number       - position in CSV
    # data             - JSON-serialized row data
    # status           - pending/processing/completed/failed
    # validation_errors - error message if failed

    def process!
      start_processing!
      self.item = import.row_processor_class.new(self).process!
      complete!
    rescue StandardError => e
      self.validation_errors = e.message
      fail!
    end

    def to_schema_hash
      # Maps CSV data using import.mappings
    end
  end
end
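
Since `to_schema_hash` is only outlined above, here is one way such a translation could look in plain Ruby. The `Mapping` struct and the standalone `to_schema_hash` function are hypothetical; the sketch assumes the row data is a JSON object keyed by CSV column header, as the attribute comments suggest.

```ruby
require 'json'

# Hypothetical stand-in for an ImportMapping record.
Mapping = Struct.new(:schema_field, :file_column)

# Translates raw row data (keyed by CSV header) into a hash keyed by schema field.
# Mappings without an assigned column are skipped.
def to_schema_hash(raw_json, mappings)
  data = JSON.parse(raw_json)
  mappings.each_with_object({}) do |mapping, result|
    next if mapping.file_column.nil?

    result[mapping.schema_field] = data[mapping.file_column]
  end
end

row_data = { 'Customer Email' => 'jane@example.com', 'Plan' => 'Gold', 'Notes' => 'vip' }.to_json
mappings = [
  Mapping.new('email', 'Customer Email'),
  Mapping.new('plan_name', 'Plan'),
  Mapping.new('status', nil)
]

attributes = to_schema_hash(row_data, mappings)
puts attributes['email']
puts attributes.keys.inspect
```

Unmapped CSV columns such as `Notes` never reach the row processor in this sketch, which keeps processors independent of the file's original headers.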

Row Processor

Transforms row data into database records:

ruby
module Spree
  module Imports
    module RowProcessors
      class Base
        def initialize(row)
          @row = row
          @import = row.import
          @attributes = row.to_schema_hash
        end

        attr_reader :row, :import, :attributes

        def process!
          raise NotImplementedError
        end
      end
    end
  end
end

Creating a Custom Importer

Step 1: Create the Import Class

ruby
module Spree
  module Imports
    class Subscriptions < Spree::Import
      def row_processor_class
        Spree::Imports::RowProcessors::Subscription
      end
    end
  end
end

Step 2: Define the Schema

ruby
module Spree
  module ImportSchemas
    class Subscriptions < Spree::ImportSchema
      FIELDS = [
        { name: 'email', label: 'Customer Email', required: true },
        { name: 'plan_name', label: 'Plan Name', required: true },
        { name: 'status', label: 'Status', required: true },
        { name: 'start_date', label: 'Start Date' },
        { name: 'billing_interval', label: 'Billing Interval' },
        { name: 'amount', label: 'Amount' },
        { name: 'currency', label: 'Currency' }
      ].freeze
    end
  end
end

Step 3: Create the Row Processor

ruby
module Spree
  module Imports
    module RowProcessors
      class Subscription < Base
        def process!
          user = find_or_create_user
          plan = find_plan

          subscription = Spree::Subscription.find_or_initialize_by(
            user: user,
            plan: plan
          )

          subscription.status = attributes['status'] if attributes['status'].present?
          subscription.start_date = parse_date(attributes['start_date']) if attributes['start_date'].present?
          subscription.billing_interval = attributes['billing_interval'] if attributes['billing_interval'].present?

          if attributes['amount'].present?
            currency = attributes['currency'].presence || import.store.default_currency
            subscription.set_price(currency, attributes['amount'])
          end

          subscription.save!
          subscription
        end

        private

        def find_or_create_user
          email = attributes['email'].strip.downcase
          Spree.user_class.find_or_create_by!(email: email)
        end

        def find_plan
          Spree::Plan.find_by!(name: attributes['plan_name'].strip)
        end

        def parse_date(date_string)
          Date.parse(date_string)
        rescue ArgumentError
          nil
        end
      end
    end
  end
end

Step 4: Register the Import Type

ruby
Rails.application.config.after_initialize do
  Spree.import_types << Spree::Imports::Subscriptions
end

Step 5: Add Translations

yaml
en:
  spree:
    subscriptions: Subscriptions

Products Import Schema

The built-in products import supports these fields:

Required Fields:

  • slug - Product URL slug
  • sku - Variant SKU
  • name - Product name
  • price - Variant price

Optional Fields:

  • status - Product status (active/draft/archived)
  • description - Product description
  • meta_title, meta_description, meta_keywords - SEO metadata
  • tags - Product tags
  • compare_at_price - Original price for sale display
  • currency - Price currency
  • width, height, depth, dimensions_unit - Dimensions
  • weight, weight_unit - Weight
  • available_on, discontinue_on - Availability dates
  • track_inventory - Enable inventory tracking
  • inventory_count, inventory_backorderable - Stock settings
  • tax_category, shipping_category - Category assignments
  • image1_src, image2_src, image3_src - Image URLs
  • option1_name, option1_value through option3_name, option3_value - Variant options
  • category1, category2, category3 - Taxon assignments (format: "Taxonomy -> Taxon -> Child Taxon")
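
As an illustration of the category format, the following sketch splits a value such as "Taxonomy -> Taxon -> Child Taxon" into its parts. The `parse_category` helper is hypothetical and only shows how the arrow-separated string can be interpreted; how Spree finds or creates the taxons is not covered here.

```ruby
# Splits "Taxonomy -> Taxon -> Child Taxon" into a taxonomy name and a taxon path.
# Returns nil for blank input.
def parse_category(value)
  parts = value.to_s.split('->').map(&:strip).reject(&:empty?)
  return nil if parts.empty?

  { taxonomy: parts.first, taxons: parts.drop(1) }
end

parsed = parse_category('Categories -> Clothing -> T-Shirts')
puts parsed[:taxonomy]
puts parsed[:taxons].inspect
```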

Handling Multi-Variant Products

The products import distinguishes two kinds of rows, based on whether option values are present:

  1. Master variant rows (no option values): Create/update the product and its master variant
  2. Non-master variant rows (with option values): Create additional variants for an existing product

csv
slug,sku,name,price,option1_name,option1_value,option2_name,option2_value
my-tshirt,TSHIRT-001,My T-Shirt,29.99,,,,
my-tshirt,TSHIRT-S-RED,My T-Shirt,29.99,Size,Small,Color,Red
my-tshirt,TSHIRT-M-RED,My T-Shirt,29.99,Size,Medium,Color,Red
my-tshirt,TSHIRT-L-RED,My T-Shirt,29.99,Size,Large,Color,Red
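
The master/variant distinction can be checked with a few lines of plain Ruby. The `master_row?` and `split_rows` helpers are hypothetical; the sketch only assumes, as stated above, that a row without option values describes the master variant.

```ruby
require 'csv'

OPTION_VALUE_COLUMNS = %w[option1_value option2_value option3_value].freeze

# A row with no option values is treated as the master variant row.
def master_row?(row)
  OPTION_VALUE_COLUMNS.all? { |column| row[column].to_s.strip.empty? }
end

# Returns [master_skus, variant_skus] for the given CSV text.
def split_rows(csv_text)
  master, variants = CSV.parse(csv_text, headers: true).partition { |row| master_row?(row) }
  [master.map { |row| row['sku'] }, variants.map { |row| row['sku'] }]
end

sample = <<~ROWS
  slug,sku,name,price,option1_name,option1_value,option2_name,option2_value
  my-tshirt,TSHIRT-001,My T-Shirt,29.99,,,,
  my-tshirt,TSHIRT-S-RED,My T-Shirt,29.99,Size,Small,Color,Red
  my-tshirt,TSHIRT-M-RED,My T-Shirt,29.99,Size,Medium,Color,Red
ROWS

master_skus, variant_skus = split_rows(sample)
puts master_skus.inspect
puts variant_skus.inspect
```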

Metafield Support

Both imports and exports support metafields dynamically:

Export: Metafield definitions are automatically added as CSV columns using the format metafield.{namespace}.{key}.

Import: Map CSV columns to metafield definitions. The system automatically detects columns matching the metafield pattern and updates the corresponding metafield values.
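
A short sketch of the column naming convention follows. The regular expression and helper names are hypothetical, and the pattern assumes that neither the namespace nor the key contains a dot.

```ruby
# Matches headers of the form metafield.{namespace}.{key}.
# Assumption: namespace and key contain no dots.
METAFIELD_COLUMN = /\Ametafield\.(?<namespace>[^.]+)\.(?<key>[^.]+)\z/

# Builds the export column header for a metafield definition.
def metafield_column(namespace, key)
  "metafield.#{namespace}.#{key}"
end

# Returns { namespace:, key: } for a metafield header, or nil for any other column.
def parse_metafield_column(header)
  match = METAFIELD_COLUMN.match(header.to_s)
  match && { namespace: match[:namespace], key: match[:key] }
end

header = metafield_column('custom', 'material')
puts header
puts parse_metafield_column(header).inspect
puts parse_metafield_column('price').inspect
```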


Background Jobs

CreateRowsJob

Parses the CSV and creates ImportRow records. Like every import job, it inherits from Spree::Imports::BaseJob, which sets queue_as Spree.queues.imports once for the whole pipeline:

ruby
module Spree
  module Imports
    class CreateRowsJob < Spree::Imports::BaseJob
      def perform(import_id)
        import = Spree::Import.find(import_id)
        # Stream CSV, batch insert rows
        # Then enqueue ProcessRowsJob via import.process_rows_async
      end
    end
  end
end

ProcessRowsJob

Fans the pending rows out into groups, each processed by a separate ProcessGroupJob:

ruby
module Spree
  module Imports
    class ProcessRowsJob < Spree::Imports::BaseJob
      BATCH_SIZE = 100

      def perform(import_id)
        import = Spree::Import.find(import_id)
        dispatch_groups(import)
      end

      private

      def dispatch_groups(import)
        # When the import defines a group_column (e.g. ProductTranslations groups
        # by slug) and that field is mapped, group rows by the column value.
        # Otherwise, fall back to slicing pending_and_failed rows into batches of
        # BATCH_SIZE. Either way: pluck the row IDs, set processing_groups_count /
        # completed_groups_count up front so workers can't complete prematurely,
        # then enqueue one ProcessGroupJob per group.
        import.rows.pending_and_failed.in_batches(of: BATCH_SIZE) do |batch|
          ProcessGroupJob.perform_later(import.id, batch.ids)
        end
      end
    end
  end
end
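
The two grouping strategies described in the comments above can be compared in plain Ruby. The `Row` struct and `build_groups` helper are hypothetical, and the batch size is shrunk to 2 so the result is easy to read.

```ruby
# Hypothetical stand-in for an import row with an ID and a slug value.
Row = Struct.new(:id, :slug)

SKETCH_BATCH_SIZE = 2 # small for illustration; the job above uses 100

# Returns an array of row-ID groups. With a group column, rows sharing a value
# stay together; otherwise rows are sliced into fixed-size batches.
def build_groups(rows, group_column: nil, batch_size: SKETCH_BATCH_SIZE)
  if group_column
    rows.group_by { |row| row[group_column] }.values.map { |group| group.map(&:id) }
  else
    rows.map(&:id).each_slice(batch_size).to_a
  end
end

rows = [
  Row.new(1, 'shirt'), Row.new(2, 'mug'), Row.new(3, 'shirt'),
  Row.new(4, 'cap'), Row.new(5, 'mug')
]

puts build_groups(rows, group_column: :slug).inspect # => [[1, 3], [2, 5], [4]]
puts build_groups(rows).inspect                      # => [[1, 2], [3, 4], [5]]
```

Grouping by column keeps all rows for one record in the same job, which matters when rows for the same slug must not be processed concurrently.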

ProcessGroupJob

Processes one group of rows and reports completion:

ruby
module Spree
  module Imports
    class ProcessGroupJob < Spree::Imports::BaseJob
      def perform(import_id, row_ids)
        import = Spree::Import.find(import_id)
        rows = import.rows.where(id: row_ids).pending_and_failed

        # Large imports process with events disabled and use bulk_process!;
        # otherwise each row is processed individually via row.process!.
        rows.each { |row| row.process! }

        check_import_completion(import)
      end

      private

      # Increments completed_groups_count, then completes the import only once the
      # last group has finished and no rows are still in flight.
      def check_import_completion(import)
        # completed_groups_count += 1
        import.complete! if import.completed_groups_count >= import.processing_groups_count &&
                            import.rows.in_flight.none?
      end
    end
  end
end
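
The completion check relies on the group counters being set before any worker starts. The in-memory sketch below shows why: the import only completes when the last group reports in. `ImportProgress` is a hypothetical class, and the `Mutex` stands in for whatever atomic increment the database provides.

```ruby
# Hypothetical in-memory model of the group counters on an import.
class ImportProgress
  attr_reader :processing_groups_count, :completed_groups_count, :status

  def initialize(processing_groups_count)
    @processing_groups_count = processing_groups_count
    @completed_groups_count = 0
    @status = :processing
    @lock = Mutex.new
  end

  # Called once per finished group. Completes the import only when every
  # group has reported and no rows are still being processed.
  def group_finished!(rows_in_flight: 0)
    @lock.synchronize do
      @completed_groups_count += 1
      if @completed_groups_count >= @processing_groups_count && rows_in_flight.zero?
        @status = :completed
      end
      @status
    end
  end
end

progress = ImportProgress.new(3)
statuses = Array.new(3) { progress.group_finished! }
puts statuses.inspect # => [:processing, :processing, :completed]
```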

GenerateJob (Exports)

Generates CSV files for exports:

ruby
module Spree
  module Exports
    class GenerateJob < Spree::BaseJob
      queue_as Spree.queues.exports

      def perform(export_id)
        export = Spree::Export.find_by_prefix_id!(export_id)
        export.generate
      end
    end
  end
end

Events

Import and export lifecycle events such as import.completed and export.created can be delivered as webhooks, so external systems can react when an asynchronous import or export finishes.

Import Events

| Event | Trigger |
| --- | --- |
| import.created | Import record created |
| import.completed | All rows processed |

Export Events

| Event | Trigger |
| --- | --- |
| export.created | Export record created (triggers generation) |

Import Row Events

| Event | Trigger |
| --- | --- |
| import_row.completed | Row processed successfully |
| import_row.failed | Row processing failed |

Configuration

Queue Configuration

ruby
# Configure job queues
Spree.queues.imports = :imports
Spree.queues.exports = :exports

Preferences

Imports support a configurable delimiter:

ruby
import.preferred_delimiter = ';'  # Default: ','
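
To see why the delimiter matters, the sketch below parses a semicolon-separated file with Ruby's standard CSV library, once with the matching `col_sep` and once with the default comma. The `parse_rows` helper is hypothetical and is not how Spree reads the preference internally.

```ruby
require 'csv'

# Hypothetical helper: parse CSV text using the delimiter configured on the import.
def parse_rows(text, delimiter: ',')
  CSV.parse(text, headers: true, col_sep: delimiter).map(&:to_h)
end

semicolon_file = "slug;sku;name;price\nmy-tshirt;TSHIRT-001;My T-Shirt;29.99\n"

# With the matching delimiter, each column is available by name.
parsed = parse_rows(semicolon_file, delimiter: ';')
puts parsed.first['price']

# With the default comma, the whole header line is read as one column name.
misparsed = parse_rows(semicolon_file)
puts misparsed.first.keys.inspect
```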

Key Files Reference

| File | Purpose |
| --- | --- |
| core/app/models/spree/import.rb | Base import model |
| core/app/models/spree/export.rb | Base export model |
| core/app/models/spree/import_schema.rb | Base schema class |
| core/app/models/spree/import_mapping.rb | Field mapping model |
| core/app/models/spree/import_row.rb | Row model with processing |
| core/app/models/spree/imports/products.rb | Products import type |
| core/app/models/spree/import_schemas/products.rb | Products schema |
| core/app/services/spree/imports/row_processors/base.rb | Base processor |
| core/app/services/spree/imports/row_processors/product_variant.rb | Products processor |
| core/app/models/spree/exports/products.rb | Products export |
| core/app/models/spree/exports/orders.rb | Orders export |
| core/app/models/spree/exports/customers.rb | Customers export |
| core/app/jobs/spree/imports/create_rows_job.rb | Row creation job |
| core/app/jobs/spree/imports/process_rows_job.rb | Row processing job |
| core/app/jobs/spree/exports/generate_job.rb | Export generation job |
| admin/app/controllers/spree/admin/imports_controller.rb | Admin imports controller |
| admin/app/controllers/spree/admin/exports_controller.rb | Admin exports controller |

Permissions

Access is controlled via CanCanCan:

ruby
# Allow admin to manage imports/exports
can :manage, Spree::Import
can :manage, Spree::Export

Records are filtered by current_ability, ensuring users can only export data they have access to.