crates/polars-parquet/src/arrow/read/README.md
When the maximum repetition level is 0 and the maximum definition level is 1, the RLE-encoded definition levels correspond exactly to Arrow's bitmap and can be memcopied without further transformations.
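To illustrate the equivalence, here is a minimal sketch for a flat optional column (maximum definition level 1). It packs the definition levels with a bit width of 1, least-significant bit first, which is how the bit-packed runs of Parquet's hybrid RLE encoding lay out their payload, and compares the result with an Arrow-style validity bitmap. The helpers `bit_pack_width_1` and `arrow_validity` are hypothetical names used only for this illustration and are not part of this crate. Run headers and RLE runs of the hybrid encoding are ignored here.

```rust
/// Packs levels (each 0 or 1) with bit width 1, least-significant bit first.
/// Hypothetical helper: it mirrors only the payload of a bit-packed run.
fn bit_pack_width_1(levels: &[u8]) -> Vec<u8> {
    let mut out = vec![0u8; (levels.len() + 7) / 8];
    for (i, &level) in levels.iter().enumerate() {
        debug_assert!(level <= 1);
        out[i / 8] |= level << (i % 8);
    }
    out
}

/// Builds an Arrow-style validity bitmap: bit `i` (LSB first) is set when slot `i` is valid.
/// Hypothetical helper, written independently of `bit_pack_width_1`.
fn arrow_validity(valid: &[bool]) -> Vec<u8> {
    let mut out = vec![0u8; (valid.len() + 7) / 8];
    for (i, &is_valid) in valid.iter().enumerate() {
        if is_valid {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}
```

For the levels `[1, 0, 1, 1, 0, 0, 1, 1, 1, 0]`, where 1 means "value present" and 0 means "null", both helpers produce the same two bytes, which is why the bytes can be copied rather than decoded.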
Reading a nested parquet field is done by reading each primitive column sequentially and building the nested struct recursively.
Rows of nested parquet groups are encoded in the repetition and definition levels. In Arrow, they correspond to a list's offsets and validity, and to a struct's validity.
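As a simplified illustration of how levels translate into Arrow buffers, the sketch below decodes the levels of a single column of type `optional list<optional i32>` into list offsets, list validity and item validity. It assumes a maximum repetition level of 1 and a maximum definition level of 3, with the level meanings listed in the comments. `ListBuffers` and `decode_levels` are hypothetical names for this example, not this module's API.

```rust
/// Arrow-side buffers recovered for a column of type `optional list<optional i32>`.
#[derive(Debug, Default, PartialEq)]
struct ListBuffers {
    offsets: Vec<i32>,        // list offsets, starting at 0
    list_validity: Vec<bool>, // one entry per row
    item_validity: Vec<bool>, // one entry per list item
}

/// Hypothetical decoder. Assumed level meanings:
/// def 0: null list, def 1: empty list, def 2: null item, def 3: non-null item;
/// rep 0: starts a new row, rep 1: continues the current list.
fn decode_levels(rep: &[u32], def: &[u32]) -> ListBuffers {
    assert_eq!(rep.len(), def.len());
    let mut out = ListBuffers { offsets: vec![0], ..Default::default() };
    for (&r, &d) in rep.iter().zip(def) {
        if r == 0 {
            // A new row: record its validity and open a new, so far empty, list.
            out.list_validity.push(d >= 1);
            let last = *out.offsets.last().unwrap();
            out.offsets.push(last);
        }
        if d >= 2 {
            // The level describes an item: extend the current list by one slot.
            *out.offsets.last_mut().unwrap() += 1;
            out.item_validity.push(d == 3);
        }
    }
    out
}
```

For the rows `[1, null, 2]`, `null`, `[]`, `[3]`, the repetition levels are `[0, 1, 1, 0, 0, 0]` and the definition levels are `[3, 2, 3, 0, 1, 3]`. Decoding them yields the offsets `[0, 3, 3, 3, 4]`, with the second list marked null and the second item marked null.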
The implementation in this module leverages this observation:
Nested parquet fields are initially recursed over to gather whether the type is a `Struct` or `List`,
and whether it is required or optional, which we store in `nested_info: Vec<Box<dyn Nested>>`.
`Nested` is a trait object that receives definition and repetition levels depending on the type and
nullability of the nested item. We process the definition and repetition levels into `nested_info`.
When we finish a field, we recursively pop from `nested_info` as we build the `StructArray` or
`ListArray`.
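The stack-of-trait-objects idea can be sketched as follows. This is a heavily simplified illustration, not the code in this module: the `Array` enum stands in for Arrow arrays, the `Nested` trait shown here has only two hypothetical methods (`push` and `finish`), a struct is assumed to have a single field, and the build step uses a loop where the description above is recursive.

```rust
/// Simplified, hypothetical array model standing in for Arrow arrays.
#[derive(Debug, PartialEq)]
enum Array {
    Primitive(Vec<Option<i32>>),
    List { offsets: Vec<i64>, validity: Vec<bool>, values: Box<Array> },
    Struct { validity: Vec<bool>, field: Box<Array> },
}

/// Hypothetical stand-in for the `Nested` trait: one object per nesting level.
trait Nested {
    /// Records one slot of this level: how many child slots it spans and whether it is valid.
    fn push(&mut self, length: i64, is_valid: bool);
    /// Consumes the accumulated state and wraps `child` into this level's array.
    fn finish(self: Box<Self>, child: Array) -> Array;
}

struct NestedList { offsets: Vec<i64>, validity: Vec<bool> }
struct NestedStruct { validity: Vec<bool> }

impl NestedList {
    fn new() -> Self {
        Self { offsets: vec![0], validity: vec![] }
    }
}

impl Nested for NestedList {
    fn push(&mut self, length: i64, is_valid: bool) {
        // `offsets` always holds at least the leading 0.
        let last = *self.offsets.last().unwrap();
        self.offsets.push(last + length);
        self.validity.push(is_valid);
    }
    fn finish(self: Box<Self>, child: Array) -> Array {
        let NestedList { offsets, validity } = *self;
        Array::List { offsets, validity, values: Box::new(child) }
    }
}

impl Nested for NestedStruct {
    fn push(&mut self, _length: i64, is_valid: bool) {
        // A struct only tracks validity; the length is irrelevant here.
        self.validity.push(is_valid);
    }
    fn finish(self: Box<Self>, child: Array) -> Array {
        let NestedStruct { validity } = *self;
        Array::Struct { validity, field: Box::new(child) }
    }
}

/// Pops levels from innermost to outermost, wrapping the array built so far.
fn build(mut nested_info: Vec<Box<dyn Nested>>, values: Array) -> Array {
    let mut array = values;
    while let Some(level) = nested_info.pop() {
        array = level.finish(array);
    }
    array
}
```

In this sketch the outermost level sits at index 0 of `nested_info`, so popping yields the innermost level first and each popped level wraps the array built from the levels below it.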
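With this approach, the differences vs flat are that the bitmap optimization does not apply, so the repetition and definition levels must be deserialized to `i32`, and that the levels are additionally processed into `nested_info`.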