docs/JsonSchema.md
Loro supports multiple data structures and introduces many new concepts. Having only binary export formats would make it difficult for developers to understand the underlying processes. Better transparency leads to better developer experience. A human-readable JSON representation enables users to better understand and operate the document and to develop related tools.
To better understand this document, you may first need to understand how Loro stores historical editing data:
Note that, given the intended usage scenario, the JSON schema supports only backward compatibility, not forward compatibility.
The root object contains all Changes, Ops, and critical metadata like start/end versions and schema version.
We also extract the 64-bit integer PeerIDs into a list at the beginning of the document and replace them internally with incrementing indices starting from zero: 0, 1, 2, 3... This significantly reduces the document size and improves readability.
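The substitution described above can be sketched as follows. This is only an illustration, not Loro's actual encoder: the helper names `buildPeerTable`, `compressId`, and `expandId` are hypothetical, and assigning indices in order of first appearance is an assumption.

```typescript
// Hypothetical helper: builds the `peers` list and a PeerID -> index lookup.
// PeerIDs stay decimal strings because 64-bit integers do not fit safely
// in a JavaScript number.
function buildPeerTable(peerIds: string[]): { peers: string[]; peerIndex: Map<string, number> } {
  const peers: string[] = [];
  const peerIndex = new Map<string, number>();
  for (const peer of peerIds) {
    if (!peerIndex.has(peer)) {
      peerIndex.set(peer, peers.length); // next free index: 0, 1, 2, ...
      peers.push(peer);
    }
  }
  return { peers, peerIndex };
}

// Rewrites an ID such as "5@12345678901234567890" into its compact form "5@0".
function compressId(id: string, peerIndex: Map<string, number>): string {
  const at = id.indexOf("@");
  const peer = id.slice(at + 1);
  const idx = peerIndex.get(peer);
  if (idx === undefined) throw new Error(`unknown peer: ${peer}`);
  return `${id.slice(0, at)}@${idx}`;
}

// Reverses compressId using the `peers` list stored in the root object.
function expandId(id: string, peers: string[]): string {
  const at = id.indexOf("@");
  const idx = Number(id.slice(at + 1));
  const peer = peers[idx];
  if (peer === undefined) throw new Error(`peer index out of range: ${idx}`);
  return `${id.slice(0, at)}@${peer}`;
}
```

Because a long PeerID is written once in `peers` and referenced by a small index everywhere else, documents with many IDs shrink noticeably.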
{
"schema_version": number,
"start_version": Map<string, number>,
"peers": string[],
"changes": Change[],
}
schema_version: the version of the schema that the document is encoded with. It's 1 for the current specification.
start_version: the start Frontiers version of the document. It is represented as a map from the decimal string representation of PeerID to Counter.
peers: the list of peers in the document. We represent all PeerIDs as decimal strings to avoid exceeding JavaScript's number limit.
changes: the list of changes in the document.
Changes are crucial in the OpLog. A REG (Replay event graph) is a directed acyclic graph where each node is a Change, and each edge is a causal dependency between Changes. The metadata of the Changes helps us reconstruct the graph.
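As a rough sketch of how such a graph can be walked, the following orders Changes so that each one comes after its dependencies (Kahn's algorithm). The names `ChangeMeta` and `causalOrder` are hypothetical. It assumes, for simplicity, that every entry in `deps` is exactly the `id` of another Change in the list; if a dependency refers to an Op inside a Change, a lookup by counter range would be needed instead. Dependencies that point outside the list are ignored, which is also an assumption.

```typescript
// Only the fields needed for the graph; a full Change carries more metadata.
type ChangeMeta = { id: string; deps: string[] };

// Returns Change IDs in an order where every Change follows its deps.
function causalOrder(changes: ChangeMeta[]): string[] {
  const known = new Set(changes.map((c) => c.id));
  const pending = new Map<string, number>(); // id -> number of unmet deps
  const dependents = new Map<string, string[]>(); // id -> Changes depending on it
  for (const c of changes) {
    // Assumption: deps outside this list (e.g. older history) are skipped.
    const deps = c.deps.filter((d) => known.has(d));
    pending.set(c.id, deps.length);
    for (const d of deps) {
      const list = dependents.get(d) ?? [];
      list.push(c.id);
      dependents.set(d, list);
    }
  }
  const ready = changes.filter((c) => pending.get(c.id) === 0).map((c) => c.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = (pending.get(next) ?? 0) - 1;
      pending.set(next, left);
      if (left === 0) ready.push(next);
    }
  }
  if (order.length !== changes.length) throw new Error("cycle in deps: not a DAG");
  return order;
}
```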
You can also attach a commit message to a Change, as you would with a Git commit.
{
"id": string,
"timestamp": number,
"deps": OpID[],
"lamport": number,
"msg": string,
"ops": Op[]
}
type OpID = `${number}@${PeerID}`;
id: the string representation of the unique ID of each Change, in the form {Counter}@{PeerID}, that is, the Counter and the PeerID joined by the @ character. Here the PeerID is the index into the peers list of the root object.
timestamp: the Unix timestamp at which the Change was committed. Timestamps are not recorded by default.
deps: the list of causal dependencies of this Change; each item is an ID represented as a string.
lamport: the Lamport timestamp of the Change.
msg: the commit message.
ops: all of the Ops in the Change.
Operation (abbreviated as Op) is the most complex part of the document. Loro currently supports multiple containers: List, Map, RichText, Movable List, and Movable Tree. Each data structure has several different Ops.
In general, however, each Op is composed of the ContainerID of the container that created it, a counter, and the corresponding content of the Op.
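Since every Op carries a ContainerID string, tools usually need to take it apart. Below is a minimal parsing sketch for the two forms given in the ContainerID type that follows. The names `ParsedContainerId` and `parseContainerId` are hypothetical, and treating everything after the last `:` as the ContainerType is an assumption about how root names are delimited.

```typescript
type ParsedContainerId =
  | { kind: "root"; rootName: string; containerType: string }
  | { kind: "normal"; counter: number; peer: string; containerType: string };

function parseContainerId(cid: string): ParsedContainerId {
  if (!cid.startsWith("cid:")) throw new Error(`not a ContainerID: ${cid}`);
  const body = cid.slice("cid:".length);
  // Assumption: the ContainerType is whatever follows the last ":".
  const sep = body.lastIndexOf(":");
  if (sep < 0) throw new Error(`missing container type: ${cid}`);
  const head = body.slice(0, sep);
  const containerType = body.slice(sep + 1);
  if (head.startsWith("root-")) {
    return { kind: "root", rootName: head.slice("root-".length), containerType };
  }
  const at = head.indexOf("@");
  if (at <= 0) throw new Error(`missing counter or "@" in: ${cid}`);
  const counter = Number(head.slice(0, at));
  if (!Number.isInteger(counter)) throw new Error(`bad counter in: ${cid}`);
  // The peer stays a string: in this format it is an index into `peers`.
  return { kind: "normal", counter, peer: head.slice(at + 1), containerType };
}
```

The container type names used when calling it (for example `Map` or `List`) are illustrative; the function does not validate them.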
type Op = {
"container": ContainerID,
"counter": number,
"content": OpContent // Its detailed definition is elaborated below, with different types for different Containers.
};
type OpContent = ListOp | TextOp | MapOp | TreeOp | MovableListOp | UnknownOp;
type ContainerID =
| `cid:root-${string}:${ContainerType}`
| `cid:${number}@${PeerID}:${ContainerType}`;
container: the ContainerID of the container that created this Op, represented by a string that starts with cid:.
counter: the counter part of the OpID.
content: the semantic content of the Op; its fields differ depending on the Container.
The following describes the Op content for each container.
type ListOp = ListInsertOp | ListDeleteOp;
type ListInsertOp = {
"type": "insert",
"pos": number,
"value": LoroValue
}
type: insert.
pos: the index of the insert operation.
value: the insert content which is a list of LoroValue.
type ListDeleteOp = {
"type": "delete",
"pos": number,
"len": number,
"start_id": OpID
}
type: delete.
pos: the start index of the deletion.
len: the length of deleted content.
start_id: the string id of start element deleted.
type MovableListOp = ListInsertOp | ListDeleteOp | MovableListMoveOp | MovableListSetOp;
type ListInsertOp = {
"type": "insert",
"pos": number,
"value": LoroValue
}
type: insert.
pos: the index of the insert operation.
value: the insert content which is a list of LoroValue.
type ListDeleteOp = {
"type": "delete",
"pos": number,
"len": number,
"start_id": OpID
}
type: delete.
pos: the start index of the deletion.
len: the length of deleted content.
start_id: the string id of start element deleted.
type MovableListMoveOp = {
"type": "move",
"from": number,
"to": number,
"elem_id": ElemID
}
type ElemID = `L${number}@${PeerID}`
type: move.
from: the index of the element before it is moved.
to: the index the element is moved to, counted after the element has been taken out of its original position.
elem_id: the ID (described by lamport@peer) of the element moved.
type MovableListSetOp = {
"type": "set",
"elem_id": ElemID,
"value": LoroValue
}
type ElemID = `L${number}@${PeerID}`
type: set.
elem_id: the ID (described by lamport@peer) of the element replaced.
value: the value set.
type MapOp = MapInsertOp | MapDeleteOp;
type MapInsertOp = {
"type": "insert",
"key": string,
"value": LoroValue
}
type: insert.
key: the key of the insertion.
value: the value of the insertion.
type MapDeleteOp = {
"type": "delete",
"key": string
}
type: delete.
key: the key of the deletion.
type TextOp = TextInsertOp | TextDeleteOp | TextMarkOp | TextMarkEndOp;
type TextInsertOp = {
"type": "insert",
"pos": number,
"text": string
}
type: insert.
pos: the index of the insert operation. The position is based on the Unicode code point length.
text: the string of the insertion.
type TextDeleteOp = {
"type": "delete",
"pos": number,
"len": number,
"start_id": OpID
}
type: delete.
pos: the index of the deletion. The position is based on the Unicode code point length.
len: the length of the text deleted.
start_id: the string ID of the first element deleted.
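Because text positions count Unicode code points, they do not always match JavaScript string indices, which count UTF-16 code units. The sketch below shows one way to apply the insert and delete positions to a plain string. The function names are hypothetical, and it assumes `len` is non-negative.

```typescript
// Inserts `text` at a position counted in Unicode code points.
function insertAtCodePoint(doc: string, pos: number, text: string): string {
  const cps = Array.from(doc); // splits by code point, not by UTF-16 unit
  if (pos < 0 || pos > cps.length) throw new RangeError(`pos out of range: ${pos}`);
  return cps.slice(0, pos).join("") + text + cps.slice(pos).join("");
}

// Deletes `len` code points starting at code point position `pos`.
// Assumption: `len` is non-negative.
function deleteCodePoints(doc: string, pos: number, len: number): string {
  const cps = Array.from(doc);
  if (pos < 0 || len < 0 || pos + len > cps.length) {
    throw new RangeError(`range out of bounds: pos=${pos}, len=${len}`);
  }
  cps.splice(pos, len);
  return cps.join("");
}
```

For example, in the string `a😀b` the emoji is one code point but two UTF-16 units, so position 2 refers to the slot just before `b`, not to the middle of the emoji.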
type TextMarkOp = {
"type": "mark",
"start": number,
"end": number,
"style_key": string,
"style_value": LoroValue,
"info": number
}
type: mark.
start: the start index of the text to mark. The position is based on the Unicode code point length.
end: the end index of the text to mark. The position is based on the Unicode code point length.
style_key: the key of the style; it is customizable.
style_value: the value of the style; it is customizable.
info: the configuration of the style, which controls whether the style expands when new text is inserted around it.
type TextMarkEndOp = {
"type": "mark_end"
}
type: mark_end.
type TreeOp = TreeCreateOp | TreeMoveOp | TreeDeleteOp;
type TreeCreateOp = {
"type": "create",
"target": TreeID,
"parent": TreeID | null,
"fractional_index": string
}
type TreeID = `${number}@${PeerID}`
type: create.
target: the string format of the TreeID of the node created.
parent: the string format of TreeID or null. If it is null, the target node will be a root node.
fractional_index: the fractional index of the target node, in hex string format.
type TreeMoveOp = {
"type": "move",
"target": TreeID,
"parent": TreeID | null,
"fractional_index": string
}
type TreeID = `${number}@${PeerID}`
type: move.
target: the string format of target TreeID moved.
parent: the string format of TreeID or null. If it is null, the target node will be a root node.
fractional_index: the fractional index with hex string format of the target node.
type TreeDeleteOp = {
"type": "delete",
"target": TreeID
}
type TreeID = `${number}@${PeerID}`
type: delete.
target: the string format of target TreeID deleted.
To support forward compatibility, we have an unknown type. When an older version of Loro decodes an Op for a Container introduced in a newer version, it treats the Op as an unknown type and keeps its content in a more general form, such as binary or string. When a newer version later decodes such an unknown Op, it knows the true type and decodes it correctly.
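For tools that read this JSON, one reasonable way to respect this mechanism is to set unknown Ops aside and carry them through unchanged rather than dropping them. The sketch below illustrates that idea only; `SketchOp` and `partitionOps` are hypothetical names, and this is not a description of how Loro itself handles these Ops.

```typescript
// Minimal shape for this sketch; real Ops carry richer content.
type SketchOp = { container: string; counter: number; content: { type: string } };

// Splits Ops into those a reader can interpret and those marked "unknown".
// Unknown Ops are returned untouched so they can be written back as-is.
function partitionOps(ops: SketchOp[]): { understood: SketchOp[]; passthrough: SketchOp[] } {
  const understood: SketchOp[] = [];
  const passthrough: SketchOp[] = [];
  for (const op of ops) {
    (op.content.type === "unknown" ? passthrough : understood).push(op);
  }
  return { understood, passthrough };
}
```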
type UnknownOp = {
"type": "unknown",
"prop": number,
"value_type": string,
"value": `${EncodeValue}`
}
type: just an unknown type.
prop: a property of the encoded op; it's a number.
value_type: the type of EncodeValue.
value: common data types used in encoding, in JSON string format.
In this section, we introduce two kinds of Value in Loro. One is LoroValue, an enum of the data types supported by Loro, such as the values inserted into a List or Map.
The other is EncodedValue, which is used only in the encoding module for the unknown type.
These are the data types supported by Loro and their JSON formats:
null: null
Bool: true or false
F64: number (float)
I64: number or bigint (signed)
Binary: Uint8Array
String: string
List: Array<LoroValue>
Map: Map<string, LoroValue>
Container: the ID of the container, 🦜:cid:{Counter}@{PeerID}:{ContainerType} or 🦜:cid:root-{Name}:{ContainerType}
Note: Compared with the string format, we add the prefix 🦜: when encoding a ContainerID in the JSON format. This prevents a string that a user saved, and that merely looks like a ContainerID, from being misinterpreted as a ContainerID when decoding.
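A small sketch of how a reader might tell a container reference apart from an ordinary string using this prefix. The names `CONTAINER_PREFIX`, `toJsonContainerRef`, and `asContainerId` are hypothetical, and additionally requiring the remainder to start with `cid:` is an assumption made here for extra safety.

```typescript
const CONTAINER_PREFIX = "🦜:";

// Encodes a ContainerID ("cid:...") as it would appear inside a JSON value.
function toJsonContainerRef(cid: string): string {
  return CONTAINER_PREFIX + cid;
}

// Returns the ContainerID if the JSON string denotes a container reference,
// or null if it is an ordinary string value.
function asContainerId(value: string): string | null {
  if (!value.startsWith(CONTAINER_PREFIX)) return null;
  const rest = value.slice(CONTAINER_PREFIX.length);
  return rest.startsWith("cid:") ? rest : null;
}
```

With this check, a user string such as `cid:root-doc:Map` stays a plain string, while `🦜:cid:root-doc:Map` is recognized as a reference to a container.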
EncodedValue is the type Loro uses internally when encoding; users do not need to understand it in detail. It is designed specifically to handle schema mismatches that arise from forward and backward compatibility. In the JSON encoding schema, an EncodedValue is encoded as an object.