BSONEachRow

| Input | Output | Alias |
|-------|--------|-------|
| ✔     | ✔      |       |

Description {#description}

The BSONEachRow format parses data as a sequence of Binary JSON (BSON) documents without any separator between them. Each row is formatted as a single document and each column is formatted as a single BSON document field with the column name as a key.
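Because documents follow one another with no separator, a reader has to rely on the int32 size prefix that the BSON specification places at the start of every document. The following is a minimal sketch of that framing in plain Python, not ClickHouse's own implementation; the helper names `int32_field`, `document`, and `split_documents` are hypothetical and exist only for this illustration.

```python
import struct

def int32_field(name: str, value: int) -> bytes:
    """One BSON element of type \\x10 (int32): type byte, NUL-terminated key, value."""
    return b"\x10" + name.encode("utf-8") + b"\x00" + struct.pack("<i", value)

def document(elements: bytes) -> bytes:
    """Wrap elements in a BSON document: int32 total size, elements, trailing NUL.
    The size counts the 4-byte prefix itself and the trailing NUL."""
    return struct.pack("<i", 4 + len(elements) + 1) + elements + b"\x00"

def split_documents(stream: bytes) -> list:
    """Split a BSONEachRow-style byte stream into documents using the size prefixes."""
    documents = []
    pos = 0
    while pos < len(stream):
        if pos + 4 > len(stream):
            raise ValueError("truncated size prefix")
        (size,) = struct.unpack_from("<i", stream, pos)
        # The smallest valid document is 5 bytes: size prefix plus trailing NUL.
        if size < 5 or pos + size > len(stream):
            raise ValueError("invalid or truncated document")
        documents.append(stream[pos:pos + size])
        pos += size
    return documents

# Two rows, each a single document whose fields are keyed by column name.
row1 = document(int32_field("home_team_goals", 1) + int32_field("away_team_goals", 4))
row2 = document(int32_field("home_team_goals", 2) + int32_field("away_team_goals", 1))
stream = row1 + row2  # no separator between rows

rows = split_documents(stream)
print(len(rows), [len(r) for r in rows])
```

Each row here is 47 bytes: a 4-byte size, two 21-byte elements, and the closing NUL.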

Data types matching {#data-types-matching}

For output it uses the following correspondence between ClickHouse types and BSON types:

| ClickHouse type | BSON Type |
|-----------------|-----------|
| Bool | `\x08` boolean |
| Int8/UInt8/Enum8 | `\x10` int32 |
| Int16/UInt16/Enum16 | `\x10` int32 |
| Int32 | `\x10` int32 |
| UInt32 | `\x12` int64 |
| Int64/UInt64 | `\x12` int64 |
| Float32/Float64 | `\x01` double |
| Date/Date32 | `\x10` int32 |
| DateTime | `\x12` int64 |
| DateTime64 | `\x09` datetime |
| Decimal32 | `\x10` int32 |
| Decimal64 | `\x12` int64 |
| Decimal128 | `\x05` binary, `\x00` binary subtype, size = 16 |
| Decimal256 | `\x05` binary, `\x00` binary subtype, size = 32 |
| Int128/UInt128 | `\x05` binary, `\x00` binary subtype, size = 16 |
| Int256/UInt256 | `\x05` binary, `\x00` binary subtype, size = 32 |
| String/FixedString | `\x05` binary, `\x00` binary subtype, or `\x02` string if the setting `output_format_bson_string_as_string` is enabled |
| UUID | `\x05` binary, `\x04` uuid subtype, size = 16 |
| Array | `\x04` array |
| Tuple | `\x04` array |
| Named Tuple | `\x03` document |
| Map | `\x03` document |
| IPv4 | `\x10` int32 |
| IPv6 | `\x05` binary, `\x00` binary subtype |
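To make the output mapping concrete, here is a rough sketch that encodes a few scalar columns the way the table describes, covering only Bool, the small integer types, UInt32/Int64, and the float types. It illustrates the byte layout and is not ClickHouse's encoder; `encode_field`, `encode_row`, and the column `big_counter` are hypothetical names chosen for this example.

```python
import struct

# BSON type bytes as listed in the output table.
BSON_DOUBLE = b"\x01"
BSON_BOOLEAN = b"\x08"
BSON_INT32 = b"\x10"
BSON_INT64 = b"\x12"

def encode_field(name: str, clickhouse_type: str, value) -> bytes:
    """Encode one column value as a BSON element keyed by the column name."""
    key = name.encode("utf-8") + b"\x00"
    if clickhouse_type == "Bool":
        return BSON_BOOLEAN + key + (b"\x01" if value else b"\x00")
    if clickhouse_type in ("Int8", "UInt8", "Int16", "UInt16", "Int32"):
        # Narrow integers are widened to a 4-byte little-endian int32.
        return BSON_INT32 + key + struct.pack("<i", value)
    if clickhouse_type in ("UInt32", "Int64"):
        # UInt32 does not fit a signed int32, so it is written as int64.
        return BSON_INT64 + key + struct.pack("<q", value)
    if clickhouse_type in ("Float32", "Float64"):
        # Both float widths are written as an 8-byte double.
        return BSON_DOUBLE + key + struct.pack("<d", value)
    raise ValueError(f"type not covered by this sketch: {clickhouse_type}")

def encode_row(columns) -> bytes:
    """Encode (name, type, value) triples as one BSON document."""
    body = b"".join(encode_field(n, t, v) for n, t, v in columns)
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"

row = encode_row([
    ("season", "Int16", 2021),
    ("home_team_goals", "Int8", 1),
    ("big_counter", "UInt32", 4_000_000_000),  # hypothetical column
])
print(row.hex())
```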

For input it uses the following correspondence between BSON types and ClickHouse types:

| BSON Type | ClickHouse Type |
|-----------|-----------------|
| `\x01` double | Float32/Float64 |
| `\x02` string | String/FixedString |
| `\x03` document | Map/Named Tuple |
| `\x04` array | Array/Tuple |
| `\x05` binary, `\x00` binary subtype | String/FixedString/IPv6 |
| `\x05` binary, `\x02` old binary subtype | String/FixedString |
| `\x05` binary, `\x03` old uuid subtype | UUID |
| `\x05` binary, `\x04` uuid subtype | UUID |
| `\x07` ObjectId | String/FixedString |
| `\x08` boolean | Bool |
| `\x09` datetime | DateTime64 |
| `\x0A` null value | NULL |
| `\x0D` JavaScript code | String/FixedString |
| `\x0E` symbol | String/FixedString |
| `\x10` int32 | Int32/UInt32/Decimal32/IPv4/Enum8/Enum16 |
| `\x12` int64 | Int64/UInt64/Decimal64/DateTime64 |

Other BSON types are not supported. Additionally, it performs conversion between different integer types. For example, it is possible to insert a BSON int32 value into ClickHouse as UInt8.

Big integers and decimals such as Int128/UInt128/Int256/UInt256/Decimal128/Decimal256 can be parsed from a BSON Binary value with the \x00 binary subtype. In this case, the format will validate that the size of the binary data equals the size of the expected value.
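The size check described above can be sketched as follows. This assumes the wide integer is stored as little-endian two's complement, which matches the layout on the little-endian platforms the format targets. The functions `encode_int128_field` and `decode_big_integer` are hypothetical helpers for illustration, not part of ClickHouse.

```python
import struct

def encode_int128_field(name: str, value: int) -> bytes:
    """Encode an Int128 as a BSON binary element (\\x05) with subtype \\x00 and 16 bytes."""
    payload = value.to_bytes(16, "little", signed=True)
    return (b"\x05" + name.encode("utf-8") + b"\x00"
            + struct.pack("<i", len(payload))  # payload length
            + b"\x00"                          # binary subtype
            + payload)

def decode_big_integer(payload: bytes, expected_size: int, signed: bool) -> int:
    """Decode a wide integer, rejecting payloads whose size does not match the type."""
    if len(payload) != expected_size:
        raise ValueError(
            f"binary size {len(payload)} does not match expected size {expected_size}"
        )
    return int.from_bytes(payload, "little", signed=signed)

field = encode_int128_field("big", -2**100)
payload = field[-16:]  # the last 16 bytes are the integer itself

print(decode_big_integer(payload, 16, signed=True) == -2**100)
```

Passing the same 16-byte payload with `expected_size=32`, as an Int256 column would require, raises `ValueError` in this sketch.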

:::note
This format does not work properly on Big-Endian platforms.
:::

Example usage {#example-usage}

Inserting data {#inserting-data}

Use a BSON file named football.bson that contains the following data:

```text
    ┌───────date─┬─season─┬─home_team─────────────┬─away_team───────────┬─home_team_goals─┬─away_team_goals─┐
 1. │ 2022-04-30 │   2021 │ Sutton United         │ Bradford City       │               1 │               4 │
 2. │ 2022-04-30 │   2021 │ Swindon Town          │ Barrow              │               2 │               1 │
 3. │ 2022-04-30 │   2021 │ Tranmere Rovers       │ Oldham Athletic     │               2 │               0 │
 4. │ 2022-05-02 │   2021 │ Port Vale             │ Newport County      │               1 │               2 │
 5. │ 2022-05-02 │   2021 │ Salford City          │ Mansfield Town      │               2 │               2 │
 6. │ 2022-05-07 │   2021 │ Barrow                │ Northampton Town    │               1 │               3 │
 7. │ 2022-05-07 │   2021 │ Bradford City         │ Carlisle United     │               2 │               0 │
 8. │ 2022-05-07 │   2021 │ Bristol Rovers        │ Scunthorpe United   │               7 │               0 │
 9. │ 2022-05-07 │   2021 │ Exeter City           │ Port Vale           │               0 │               1 │
10. │ 2022-05-07 │   2021 │ Harrogate Town A.F.C. │ Sutton United       │               0 │               2 │
11. │ 2022-05-07 │   2021 │ Hartlepool United     │ Colchester United   │               0 │               2 │
12. │ 2022-05-07 │   2021 │ Leyton Orient         │ Tranmere Rovers     │               0 │               1 │
13. │ 2022-05-07 │   2021 │ Mansfield Town        │ Forest Green Rovers │               2 │               2 │
14. │ 2022-05-07 │   2021 │ Newport County        │ Rochdale            │               0 │               2 │
15. │ 2022-05-07 │   2021 │ Oldham Athletic       │ Crawley Town        │               3 │               3 │
16. │ 2022-05-07 │   2021 │ Stevenage Borough     │ Salford City        │               4 │               2 │
17. │ 2022-05-07 │   2021 │ Walsall               │ Swindon Town        │               0 │               3 │
    └────────────┴────────┴───────────────────────┴─────────────────────┴─────────────────┴─────────────────┘
```

Insert the data:

```sql
INSERT INTO football FROM INFILE 'football.bson' FORMAT BSONEachRow;
```

Reading data {#reading-data}

Read data using the BSONEachRow format:

```sql
SELECT *
FROM football INTO OUTFILE 'docs_data/bson/football.bson'
FORMAT BSONEachRow
```

:::tip
BSON is a binary format and is not human-readable when printed to the terminal. Use the INTO OUTFILE clause to write the output to a BSON file.
:::

Format settings {#format-settings}

| Setting | Description | Default |
|---------|-------------|---------|
| `output_format_bson_string_as_string` | Use BSON String type instead of Binary for String columns. | `false` |
| `input_format_bson_skip_fields_with_unsupported_types_in_schema_inference` | Allow skipping columns with unsupported types during schema inference for the BSONEachRow format. | `false` |
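The effect of `output_format_bson_string_as_string` on the bytes of a String column can be sketched like this. The two layouts follow the BSON specification; the function `encode_string_field` and its `string_as_string` parameter are hypothetical stand-ins for the setting, not ClickHouse code.

```python
import struct

def encode_string_field(name: str, value: str, string_as_string: bool = False) -> bytes:
    """Encode a String column as BSON binary (default) or BSON string."""
    key = name.encode("utf-8") + b"\x00"
    raw = value.encode("utf-8")
    if string_as_string:
        # \x02 string: int32 length that counts the trailing NUL, bytes, then NUL.
        return b"\x02" + key + struct.pack("<i", len(raw) + 1) + raw + b"\x00"
    # \x05 binary: int32 payload length, subtype byte \x00, then the payload.
    return b"\x05" + key + struct.pack("<i", len(raw)) + b"\x00" + raw

as_binary = encode_string_field("home_team", "Barrow")
as_string = encode_string_field("home_team", "Barrow", string_as_string=True)
print(as_binary)
print(as_string)
```

Both encodings carry the same six payload bytes; they differ in the type byte, in how the length is counted, and in where the extra byte (subtype versus terminator) sits.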