libs/mysql/serialization/readme.md
\page PageLibsMysqlSerialization Library: Serialization
<!--- Copyright (c) 2023, 2025, Oracle and/or its affiliates. // This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 2.0, as published by the Free Software Foundation. // This program is designed to work with certain software (including but not limited to OpenSSL) that is licensed under separate terms, as designated in a particular file or component or in included license documentation. The authors of MySQL hereby grant you an additional permission to link the program and your derivative works with the separately licensed software that they have either included with the program or referenced in the documentation. // This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License, version 2.0, for more details. // You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA --> <!-- MySQL Library: Serialization ============================ -->Code documentation: @ref GroupLibsMysqlSerialization.
The serialization framework provides methods for automatic serialization and deserialization of event fields defined by the API user.
The serialization framework is designed to expose a simple API that facilitates event definition, serialization, and deserialization. Instead of implementing serialization and deserialization functions, the user specifies definitions of the fields included in the packet. A field definition consists of:
Rather than introducing many new types for the encoding and decoding functions to ingest, the serialization framework reuses common STL types. The types supported by the serialization framework are:
Important design decisions are listed below:
Serialization framework types are encoded using the following formats:
Messages are formatted as follows:
{
<message_field> ::= <serialization_version_number> <message_format>
}
message_field consists of:
{
<message_format> ::= <serializable_field_size> <last_non_ignorable_field_id> { <type_field> }
<type_field> ::= <field_id> <field_data>
}
message_format consists of:
Type field consists of:
{
<field_data> ::= <fixlen_integer_format> | <varlen_integer_format> | <floating_point_integer_format> | <string_format> | <container_format> | <fixed_container_format> | <map_format> | <message_format>
<floating_point_integer_format> ::= <sp_floating_point_integer_format> | <dp_floating_point_integer_format>
}
field_data can be one of:
{
<map_format> ::= <number_elements> { <field_data> <field_data> }
}
Map format consists of:
{
<container_format> ::= <number_elements> { <field_data> }
}
Vector format consists of:
{
<fixed_container_format> ::= { <field_data> }+
}
Fixed-size array format consists of:
{
<string_format> ::= <string_length> { <character> }
}
String format consists of:
Variable-length integers are encoded using 1-9 bytes, depending on the value of a particular field. Bytes are always stored in little-endian (LE) byte order.
This format allows decoding to be implemented without looping through the consecutive bytes.
For each number, the rightmost byte encodes how many consecutive bytes represent the number: the byte count is equal to the number of trailing ones + 1. When the number itself uses at most 56 bits, the trailing ones are followed by a bit equal to "0", and the remaining bits store the number; otherwise the rightmost byte consists of eight ones and is followed by the full number. The special case for 57..64 bits allows us to use only 9 bytes to store numbers where the 63rd bit is "1". (An encoding without such a special case would require 10 bytes.) For readability, in the text below, we display bytes in big-endian order. Within each byte, we display the most significant bit first. The encoding is explained reading bits from right to left.
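The length rule above can be illustrated with a minimal decoding sketch for the unsigned case. This is not the library's implementation: the names `varlen_length` and `decode_varlen_unsigned` are hypothetical, and the sketch assumes the input vector holds the bytes in stored (little-endian) order, so the "rightmost" byte of the big-endian display is element 0. The length is read from the first stored byte alone, so no per-byte continuation checks are needed; the small bit-counting loop stands in for a count-trailing-ones instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: the encoded length is the number of trailing one bits
// in the first stored byte, plus one (so 1..9).
std::size_t varlen_length(std::uint8_t first_byte) {
  std::size_t ones = 0;
  while (ones < 8 && ((first_byte >> ones) & 1u) != 0) ++ones;
  return ones + 1;
}

// Hypothetical helper: decodes one unsigned value from the start of `in`.
std::uint64_t decode_varlen_unsigned(const std::vector<std::uint8_t>& in) {
  assert(!in.empty());
  const std::size_t n = varlen_length(in[0]);
  assert(in.size() >= n);
  if (n == 9) {
    // The first byte is all ones; the next 8 bytes hold the full number.
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < 8; ++i)
      value |= static_cast<std::uint64_t>(in[1 + i]) << (8 * i);
    return value;
  }
  // Assemble the n bytes as one little-endian word, then drop the n length
  // bits (n - 1 ones and the terminating zero).
  std::uint64_t packed = 0;
  for (std::size_t i = 0; i < n; ++i)
    packed |= static_cast<std::uint64_t>(in[i]) << (8 * i);
  return packed >> n;
}
```

Because the byte count is known before any payload is read, the two assembly loops have a fixed trip count and could be replaced by a single bounded memory copy on a little-endian host.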
unsigned integers: The encoded length of the integer is followed by the encoded value. Examples:
"00000111 11111111 11111011" - BE byte order, bit layout: most significant bit first
65535 is represented using 3 bytes. The rightmost byte ends with two trailing ones followed by a 0 (the 3 least significant bits, indicating that 3 bytes are used to store the number). The remaining bits store the value.
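The unsigned layout can be reproduced with a short encoding sketch. It is an illustration under stated assumptions rather than the library's code: `varlen_size` and `encode_varlen_unsigned` are hypothetical names, and the bytes are written one at a time so that the result is in stored (little-endian) order regardless of host endianness.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: 7 payload bits per byte for 1..8 bytes, and 9 bytes
// (marker byte plus the full 64-bit value) for anything wider than 56 bits.
std::size_t varlen_size(std::uint64_t value) {
  std::size_t bytes = 1;
  while (bytes < 9 && (value >> (7 * bytes)) != 0) ++bytes;
  return bytes;
}

// Hypothetical helper: returns the encoded bytes in stored (LE) order.
std::vector<std::uint8_t> encode_varlen_unsigned(std::uint64_t value) {
  const std::size_t n = varlen_size(value);
  std::vector<std::uint8_t> out(n);
  if (n == 9) {
    out[0] = 0xFF;  // eight trailing ones, no terminating zero bit
    for (std::size_t i = 0; i < 8; ++i)
      out[1 + i] = static_cast<std::uint8_t>(value >> (8 * i));
    return out;
  }
  // Shift the value left by n bits and fill the freed low bits with
  // (n - 1) ones followed by a single zero.
  const std::uint64_t prefix = (std::uint64_t{1} << (n - 1)) - 1;
  const std::uint64_t packed = (value << n) | prefix;
  for (std::size_t i = 0; i < n; ++i)
    out[i] = static_cast<std::uint8_t>(packed >> (8 * i));
  return out;
}
```

For 65535 the sketch produces the stored bytes `FB FF 07`, which is the example above read from right to left.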
signed integers: Signed integers are encoded such that both positive and negative numbers use fewer bits the smaller their magnitude is. The least significant bit is the sign and the remaining bits represent the number. If x is non-negative, we encode (x << 1), cast to unsigned. If x is negative, we encode ((-(x + 1) << 1) | 1), cast to unsigned. That is, we first add 1 to shift the range from -2^63..-1 to -(2^63-1)..0; then we negate the result to get a non-negative number in the range 0..2^63-1; then we shift the result left by 1 to make room for the sign bit; then we "or" with the sign bit. The resulting number is reinterpreted as unsigned and serialized accordingly.
"00001111 11111111 11110011" - BE byte order, bit layout: starting from the most significant bit
65535 is represented using 3 bytes. The rightmost byte ends with two trailing ones followed by a 0 (the 3 least significant bits, indicating that 3 bytes are used to store the number). The 0 is followed by one sign bit (equal to 0). The remaining bits store the value.
"00001111 11111111 11101011" - BE byte order, bit layout: starting from the most significant bit
-65535 is represented using 3 bytes. The rightmost byte ends with two trailing ones followed by a 0 (the 3 least significant bits, indicating that 3 bytes are used to store the number). The 0 is followed by one sign bit (equal to 1). The remaining bits store -(x + 1), which is 65534.
"00001111 11111111 11111011" - BE byte order, bit layout: starting from the most significant bit
-65536 is represented using 3 bytes. The rightmost byte ends with two trailing ones followed by a 0 (the 3 least significant bits, indicating that 3 bytes are used to store the number). The 0 is followed by one sign bit (equal to 1). The remaining bits store -(x + 1), which is 65535.
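The signed-to-unsigned mapping described above can be sketched as follows. The function names `signed_to_unsigned` and `unsigned_to_signed` are hypothetical and chosen only for this illustration; the result of the first function is what would then be serialized as an unsigned variable-length integer.

```cpp
#include <cstdint>

// Hypothetical helper: the sign goes into the least significant bit and the
// shifted magnitude into the remaining bits.
std::uint64_t signed_to_unsigned(std::int64_t x) {
  if (x >= 0) return static_cast<std::uint64_t>(x) << 1;
  // -(x + 1) lies in 0..2^63-1, so the negation cannot overflow,
  // even for the smallest 64-bit value.
  return (static_cast<std::uint64_t>(-(x + 1)) << 1) | 1u;
}

// Hypothetical helper: the inverse mapping, applied after decoding.
std::int64_t unsigned_to_signed(std::uint64_t u) {
  const std::int64_t magnitude = static_cast<std::int64_t>(u >> 1);
  return (u & 1u) != 0 ? -magnitude - 1 : magnitude;
}
```

With this mapping, 65535 becomes 131070, -65535 becomes 131069, and -65536 becomes 131071. All three fit in 21 payload bits, which is why each example above occupies 3 bytes.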