mruby Architecture

doc/internal/architecture.md

<!-- summary: About mruby Architecture -->

This document provides a map of mruby's internals for developers who want to understand, debug, or contribute to the codebase.

Overview

mruby's execution pipeline:

```text
Ruby source → Parser → AST → Code Generator → Bytecode (irep)
                                                    ↓
                                                   VM → Result
```

Design priorities, in order: memory footprint, then performance, then readability.

Object Model

All heap-allocated Ruby objects share a common header (MRB_OBJECT_HEADER):

```text
struct RBasic (8 bytes on 64-bit)
┌──────────────┬─────┬──────────┬────────┬───────┐
│ RClass *c    │ tt  │ gc_color │ frozen │ flags │
│ (class ptr)  │ 8b  │ 3b       │ 1b     │ 20b   │
└──────────────┴─────┴──────────┴────────┴───────┘
```

Each object struct embeds this header and adds its own type-specific fields:

| Struct | Ruby Type | Extra Fields |
| --- | --- | --- |
| `RObject` | Object instances | `iv` (instance variables) |
| `RClass` | Class/Module | `iv`, `mt` (method table), `super` |
| `RString` | String | embedded or heap buffer, length |
| `RArray` | Array | embedded or heap buffer, length |
| `RHash` | Hash | hash table or k-v array |
| `RProc` | Proc/Lambda | irep or C function, environment |
| `RData` | C data wrapper | `void *data`, `mrb_data_type` |
| `RFiber` | Fiber | `mrb_context` |
| `RException` | Exception | `iv` |
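The embedding pattern can be sketched as follows. This is a minimal illustration, not mruby's actual definition: every `demo_`/`DEMO_` name is hypothetical, and the field widths are simply copied from the diagram above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags; the real enum has many more entries. */
enum demo_vtype { DEMO_TT_OBJECT = 1, DEMO_TT_STRING = 2 };

struct demo_class;  /* opaque in this sketch */

/* Shared header: a class pointer plus 32 bits of packed metadata. */
#define DEMO_OBJECT_HEADER      \
  struct demo_class *c;         \
  unsigned int tt : 8;          \
  unsigned int gc_color : 3;    \
  unsigned int frozen : 1;      \
  unsigned int flags : 20

/* Every object struct starts with the same header fields. */
struct demo_basic  { DEMO_OBJECT_HEADER; };
struct demo_string { DEMO_OBJECT_HEADER; size_t len; char *ptr; };

/* Because the header always comes first, any object pointer can be
   inspected through the base struct without knowing its concrete type. */
static enum demo_vtype demo_type_of(const void *obj) {
  assert(obj != NULL);
  return (enum demo_vtype)((const struct demo_basic *)obj)->tt;
}
```

The macro form keeps the header fields directly addressable (`s.tt`, `s.frozen`) in every struct, at the cost of relying on all structs sharing an identical leading layout.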

Immediate values (Integer, Symbol, true, false, nil) are encoded directly in mrb_value without heap allocation. The encoding depends on the boxing mode (see boxing.md).
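To illustrate what "encoded directly in the value" means, here is a sketch of one simple tagging scheme in which the low bit distinguishes integers from pointers. The scheme and all `demo_` names are assumptions made for illustration; mruby's real encodings differ per boxing mode and are described in boxing.md.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical word-sized value: low bit 1 = integer, low bit 0 = pointer. */
typedef uintptr_t demo_value;

static demo_value demo_box_int(intptr_t n) {
  return ((uintptr_t)n << 1) | 1u;   /* payload in the upper bits, tag in bit 0 */
}

static bool demo_is_int(demo_value v) { return (v & 1u) != 0; }

static intptr_t demo_unbox_int(demo_value v) {
  /* Assumes an arithmetic right shift for negative values, which is what
     common compilers provide (the C standard leaves it implementation-defined). */
  return (intptr_t)v >> 1;
}

static demo_value demo_box_ptr(void *p) {
  /* Heap objects are at least 2-byte aligned, so bit 0 is free for the tag. */
  assert(((uintptr_t)p & 1u) == 0);
  return (demo_value)p;
}
```

An integer boxed this way never touches the heap, which is why immediates cost no allocation and need no GC tracking.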

Objects must fit within 5 words (mrb_static_assert_object_size).

Virtual Machine

The VM is register-based, using two stacks: a value stack for registers (locals, temporaries, arguments) and a call info stack for tracking method/block call frames. Each method call pushes a mrb_callinfo frame with the method symbol, proc, PC, and argument counts.
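The relationship between the two stacks can be sketched like this. The structures are simplified, hypothetical stand-ins (`demo_callinfo` holds far fewer fields than `mrb_callinfo`), and registers are plain `int`s rather than `mrb_value`s.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_STACK_MAX 64
#define DEMO_CI_MAX    16

/* Hypothetical call frame. */
typedef struct {
  const char *mid;   /* method name (a symbol in a real VM) */
  size_t base;       /* value-stack index of this frame's register 0 */
  size_t pc;         /* program counter within the frame's code */
  int argc;          /* number of arguments passed */
} demo_callinfo;

typedef struct {
  int stack[DEMO_STACK_MAX];      /* value stack: registers of all frames */
  demo_callinfo ci[DEMO_CI_MAX];  /* call info stack */
  size_t ci_top;                  /* number of active frames */
} demo_vm;

/* Push a frame whose registers start at `base` on the value stack. */
static demo_callinfo *demo_push_frame(demo_vm *vm, const char *mid,
                                      size_t base, int argc) {
  assert(vm->ci_top < DEMO_CI_MAX && base < DEMO_STACK_MAX);
  demo_callinfo *ci = &vm->ci[vm->ci_top++];
  ci->mid = mid;
  ci->base = base;
  ci->pc = 0;
  ci->argc = argc;
  return ci;
}

static void demo_pop_frame(demo_vm *vm) {
  assert(vm->ci_top > 0);
  vm->ci_top--;
}

/* Register r of the current frame is a slot on the shared value stack. */
static int *demo_reg(demo_vm *vm, size_t r) {
  assert(vm->ci_top > 0);
  size_t idx = vm->ci[vm->ci_top - 1].base + r;
  assert(idx < DEMO_STACK_MAX);
  return &vm->stack[idx];
}
```

In this sketch a callee frame may start inside the caller's register window, so a value the caller placed in its register 2 is visible to the callee as register 0 without copying.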

The dispatch loop in mrb_vm_run() decodes opcodes and operates on registers. Method dispatch looks up the receiver's class method table (with a per-state method cache), then either calls a C function directly or pushes a new call frame for Ruby methods.
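A register-based dispatch loop has roughly the following shape. The four opcodes and the fixed four-byte instruction format are invented for this sketch; mruby's real instruction set and operand encoding are documented in opcode.md.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes with up to three register operands. */
enum { DEMO_OP_LOADI, DEMO_OP_MOVE, DEMO_OP_ADD, DEMO_OP_RETURN };

typedef struct { uint8_t op, a, b, c; } demo_insn;

/* Decodes instructions one by one and operates on the register file.
   Returns the value of the register named by DEMO_OP_RETURN. */
static int demo_run(const demo_insn *code, int *regs) {
  assert(code != NULL && regs != NULL);
  for (const demo_insn *pc = code;; pc++) {
    switch (pc->op) {
      case DEMO_OP_LOADI:  regs[pc->a] = pc->b;                     break; /* R[a] = b */
      case DEMO_OP_MOVE:   regs[pc->a] = regs[pc->b];               break; /* R[a] = R[b] */
      case DEMO_OP_ADD:    regs[pc->a] = regs[pc->b] + regs[pc->c]; break; /* R[a] = R[b] + R[c] */
      case DEMO_OP_RETURN: return regs[pc->a];                             /* return R[a] */
      default:             return -1;                                      /* unknown opcode: stop */
    }
  }
}
```

Operands name registers directly, so an expression such as `2 + 40` needs no push/pop traffic: two loads and one add that reads and writes register slots in place.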

Exception handling uses setjmp/longjmp (or C++ exceptions if configured). Rescue/ensure handler tables are stored in each irep and searched during stack unwinding.
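The setjmp/longjmp style of unwinding can be sketched as below. The `demo_state` struct and function names are hypothetical, and the sketch omits the per-irep handler tables; it only shows how a raise transfers control to the nearest protected region and how regions nest.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical interpreter state holding the innermost jump target. */
typedef struct {
  jmp_buf *jmp;      /* where a raise should land */
  const char *exc;   /* pending exception message */
} demo_state;

/* Raise: record the exception and unwind to the nearest protected region.
   Raising with no region active is treated as a bug in this sketch. */
static void demo_raise(demo_state *st, const char *msg) {
  assert(st->jmp != NULL);
  st->exc = msg;
  longjmp(*st->jmp, 1);
}

/* Runs `body` under protection.
   Returns NULL on success or the exception message if `body` raised. */
static const char *demo_protect(demo_state *st, void (*body)(demo_state *)) {
  jmp_buf here;
  jmp_buf *prev = st->jmp;   /* remember the outer handler so regions can nest */
  const char *result = NULL;
  st->exc = NULL;
  st->jmp = &here;
  if (setjmp(here) == 0) {
    body(st);                /* normal path */
  } else {
    result = st->exc;        /* reached via longjmp from demo_raise */
  }
  st->jmp = prev;            /* restore the outer handler on both paths */
  return result;
}

/* Two sample bodies for trying the sketch out. */
static void demo_ok_body(demo_state *st) { (void)st; }
static void demo_failing_body(demo_state *st) { demo_raise(st, "boom"); }
```

Saving and restoring the previous `jmp_buf` pointer is what lets an inner protected region hand control back to an outer one after it finishes.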

See vm.md for detailed VM internals and opcode.md for the full instruction set.

Garbage Collector

The GC uses tri-color incremental mark-and-sweep with an optional generational mode. Objects are colored white (unmarked), gray (marked, children pending), black (fully marked), or red (static/ROM).

The three-phase cycle (root scan, incremental marking, sweep) runs in small steps between VM instructions to avoid long pauses. Write barriers (mrb_field_write_barrier, mrb_write_barrier) maintain correctness during incremental marking.
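The interaction between incremental marking and a write barrier can be sketched with a toy object graph. Everything here is hypothetical (`demo_obj`, a fixed-size gray list, a forward-style barrier that grays the stored value); it is not a description of how `mrb_field_write_barrier` or `mrb_write_barrier` are implemented.

```c
#include <assert.h>
#include <stddef.h>

enum demo_color { DEMO_WHITE, DEMO_GRAY, DEMO_BLACK };

#define DEMO_MAX_CHILDREN 4
#define DEMO_GRAY_MAX     32

typedef struct demo_obj {
  enum demo_color color;
  struct demo_obj *child[DEMO_MAX_CHILDREN];
} demo_obj;

typedef struct {
  demo_obj *gray[DEMO_GRAY_MAX];  /* marked objects whose children are pending */
  size_t ngray;
} demo_gc;

/* White -> gray: the object is reachable but not yet scanned. */
static void demo_mark(demo_gc *gc, demo_obj *o) {
  if (o == NULL || o->color != DEMO_WHITE) return;
  assert(gc->ngray < DEMO_GRAY_MAX);
  o->color = DEMO_GRAY;
  gc->gray[gc->ngray++] = o;
}

/* One incremental step: scan a single gray object and blacken it.
   Returns 0 if there was nothing left to scan, 1 otherwise. */
static int demo_mark_step(demo_gc *gc) {
  if (gc->ngray == 0) return 0;
  demo_obj *o = gc->gray[--gc->ngray];
  for (size_t i = 0; i < DEMO_MAX_CHILDREN; i++) demo_mark(gc, o->child[i]);
  o->color = DEMO_BLACK;
  return 1;
}

/* Store with a write barrier: a black parent must never point at a white
   child, so the child is grayed at the moment the reference is written. */
static void demo_store(demo_gc *gc, demo_obj *parent, size_t slot, demo_obj *val) {
  assert(slot < DEMO_MAX_CHILDREN);
  parent->child[slot] = val;
  if (parent->color == DEMO_BLACK) demo_mark(gc, val);
}
```

Without the barrier in `demo_store`, an object attached to an already-black parent between two marking steps would stay white and be swept even though it is reachable.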

The GC arena protects newly created objects in C code. Heap regions (mrb_gc_add_region) support embedded systems with fixed memory banks.
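The arena idea can be modeled as a small stack of temporary roots with a save/restore index, which is the pattern that `mrb_gc_arena_save()` and `mrb_gc_arena_restore()` expose to C code. The model below is a hypothetical simplification: the `demo_` names, the fixed arena size, and the use of plain `int` objects are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ARENA_SIZE 100

/* Hypothetical arena: object pointers the collector treats as roots. */
typedef struct {
  void *slot[DEMO_ARENA_SIZE];
  int idx;                      /* number of protected objects */
} demo_arena;

/* Registers a new object so it survives until it is stored somewhere reachable. */
static void demo_arena_protect(demo_arena *ar, void *obj) {
  assert(ar->idx < DEMO_ARENA_SIZE);   /* a real VM would grow the arena or raise */
  ar->slot[ar->idx++] = obj;
}

static int demo_arena_save(const demo_arena *ar) { return ar->idx; }

static void demo_arena_restore(demo_arena *ar, int saved) {
  assert(saved >= 0 && saved <= ar->idx);
  ar->idx = saved;
}

/* Creates `n` temporaries in a loop without letting the arena fill up. */
static void demo_make_temporaries(demo_arena *ar, int *objs, int n) {
  for (int i = 0; i < n; i++) {
    int ai = demo_arena_save(ar);
    demo_arena_protect(ar, &objs[i]);  /* stands in for allocating an object */
    /* ... work with the temporary here ... */
    demo_arena_restore(ar, ai);        /* drop protection before the next iteration */
  }
}
```

Restoring the saved index inside the loop keeps arena usage constant no matter how many temporaries the loop creates.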

See gc.md for detailed GC internals, ../guides/gc-arena-howto.md for arena usage patterns, and ../guides/memory.md for memory management.

Compiler Pipeline

The compiler transforms Ruby source code through three stages:

  1. Parser (parse.y): Lrama/Bison grammar produces an AST of mrb_ast_node structures, tracking lexer state and local scopes.
  2. Code Generator (codegen.c): walks the AST and emits bytecode into mrb_irep structures (instruction sequence, literal pool, symbol table, child ireps).
  3. Execution: the irep is wrapped in an RProc and executed by the VM, or serialized to .mrb binary format.
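The code generation stage can be illustrated with a toy AST walker that emits register bytecode. The node types, opcodes, and `demo_irep` container are hypothetical and far simpler than `mrb_ast_node` and `mrb_irep`; the sketch only shows the walk-and-emit shape with a simple register allocation rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical AST: an integer literal or the sum of two subtrees. */
enum demo_node_type { DEMO_NODE_INT, DEMO_NODE_ADD };

typedef struct demo_node {
  enum demo_node_type type;
  int value;                          /* used by DEMO_NODE_INT (0..255 here) */
  const struct demo_node *lhs, *rhs;  /* used by DEMO_NODE_ADD */
} demo_node;

enum { DEMO_OP_LOADI, DEMO_OP_ADD };
typedef struct { uint8_t op, a, b, c; } demo_insn;

#define DEMO_ISEQ_MAX 32

/* Hypothetical stand-in for an irep: just an instruction sequence. */
typedef struct {
  demo_insn iseq[DEMO_ISEQ_MAX];
  size_t ilen;
} demo_irep;

static void demo_emit(demo_irep *irep, uint8_t op, uint8_t a, uint8_t b, uint8_t c) {
  assert(irep->ilen < DEMO_ISEQ_MAX);
  demo_insn *i = &irep->iseq[irep->ilen++];
  i->op = op; i->a = a; i->b = b; i->c = c;
}

/* Emits code that leaves the value of `n` in register `dst`.
   Registers above `dst` are free for temporaries. */
static void demo_codegen(demo_irep *irep, const demo_node *n, uint8_t dst) {
  if (n->type == DEMO_NODE_INT) {
    demo_emit(irep, DEMO_OP_LOADI, dst, (uint8_t)n->value, 0);  /* R[dst] = value */
  } else {
    demo_codegen(irep, n->lhs, dst);                            /* R[dst]   = lhs */
    demo_codegen(irep, n->rhs, (uint8_t)(dst + 1));             /* R[dst+1] = rhs */
    demo_emit(irep, DEMO_OP_ADD, dst, dst, (uint8_t)(dst + 1)); /* R[dst] += R[dst+1] */
  }
}
```

For the tree `1 + 2` with destination register 0, this emits two loads into registers 0 and 1 followed by one add into register 0.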

Alternative loading paths include mrb_load_string() (compile and run), mrb_load_irep() (load precompiled bytecode), and mrbc (ahead-of-time compilation).

See compiler.md for detailed compiler internals and opcode.md for the instruction set.

Source File Map

Core (src/)

| File | Responsibility |
| --- | --- |
| `vm.c` | Bytecode dispatch loop, method invocation |
| `state.c` | `mrb_state` init/close, irep management |
| `gc.c` | Garbage collector (mark-sweep, incremental) |
| `class.c` | Class/module definition, method tables |
| `object.c` | Core object operations |
| `variable.c` | Instance/class/global variables, object shapes |
| `proc.c` | Proc/Lambda/closure handling |
| `array.c` | Array implementation |
| `string.c` | String implementation (embedded, shared, heap) |
| `hash.c` | Hash implementation (open addressing) |
| `numeric.c` | Integer/Float arithmetic |
| `symbol.c` | Symbol table and interning |
| `range.c` | Range implementation |
| `error.c` | Exception creation, raise, backtrace |
| `kernel.c` | Kernel module methods |
| `load.c` | `.mrb` bytecode loading |
| `dump.c` | Bytecode serialization (write `.mrb`) |
| `print.c` | Print/puts/p output |
| `backtrace.c` | Stack trace generation |

Compiler (mrbgems/mruby-compiler/core/)

| File | Responsibility |
| --- | --- |
| `parse.y` | Yacc grammar → AST |
| `y.tab.c` | Generated parser (from `parse.y`) |
| `codegen.c` | AST → bytecode (irep) |
| `node.h` | AST node type definitions |

Key Headers (include/mruby/)

| Header | Contents |
| --- | --- |
| `mruby.h` | `mrb_state`, core API declarations |
| `value.h` | `mrb_value`, type enums, value macros |
| `object.h` | `RBasic`, `RObject`, object header |
| `class.h` | `RClass`, method table types |
| `string.h` | `RString`, string macros |
| `array.h` | `RArray`, array macros |
| `hash.h` | `RHash`, hash API |
| `data.h` | `RData`, C data wrapping |
| `irep.h` | `mrb_irep`, bytecode structures |
| `compile.h` | Compiler context, `mrb_load_string` |
| `boxing_*.h` | Value boxing implementations |

mrbgems System

Gems are the module system for mruby. Each gem lives in mrbgems/mruby-*/ and contains:

```text
mruby-example/
├── mrbgem.rake       gem specification (name, deps, bins)
├── src/              C source files
├── mrblib/           Ruby source files (compiled to bytecode)
├── include/          C headers
├── test/             mrbtest test files
└── bintest/          binary test files (CRuby)
```

At build time, gem Ruby files are compiled with mrbc and linked into libmruby.a. Gem initialization runs in dependency order via gem_init.c (auto-generated).
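"Dependency order" means every gem is initialized after the gems it depends on. The real ordering is worked out by the build system when it generates `gem_init.c`; the sketch below only illustrates the idea with a depth-first traversal over a hypothetical `demo_gem` table, and it assumes the dependency graph has no cycles.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_DEPS 4

/* Hypothetical gem record. */
typedef struct {
  const char *name;
  int deps[DEMO_MAX_DEPS];   /* indices of the gems this one depends on */
  int ndeps;
  int visited;
} demo_gem;

/* Depth-first visit: dependencies are appended to `order` before the gem itself. */
static void demo_visit(demo_gem *gems, int i, int *order, int *n) {
  if (gems[i].visited) return;
  assert(gems[i].ndeps <= DEMO_MAX_DEPS);
  gems[i].visited = 1;
  for (int d = 0; d < gems[i].ndeps; d++) {
    demo_visit(gems, gems[i].deps[d], order, n);
  }
  order[(*n)++] = i;
}

/* Fills `order` with gem indices so that every gem follows its dependencies.
   Returns the number of entries written. */
static int demo_init_order(demo_gem *gems, int count, int *order) {
  int n = 0;
  for (int i = 0; i < count; i++) demo_visit(gems, i, order, &n);
  return n;
}
```

Calling each gem's init function in the resulting order guarantees that anything a gem needs from its dependencies already exists when its own initializer runs.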

GemBoxes (mrbgems/*.gembox) define named collections of gems (e.g., default.gembox includes stdlib, stdlib-ext, stdlib-io, math, metaprog, and binary tools).