# GPUTable Roadmap
The GPUTable class is a new addition to deck.gl v8 that is envisioned to represent a central theme for the evolution of the framework.
A primary design goal of the GPUTable class is to support direct transfer of Apache Arrow formatted tables to the GPU.
Arrow support can be built gradually over the course of deck.gl 8.x releases.
## Some more advanced additions
TBD:
The GPUTable class should make it easy to do GPGPU programming. An important tool in the GPGPU programmer's arsenal is the ability to use data textures for random access. Thus, making selected columns available as readable (or writeable) data textures should be valuable, e.g.:

- `gpuTable.getAttributes()`
- `gpuTable.getTextures(...columnNames)`?
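As a rough illustration of the data texture idea, here is a hedged sketch of how a column exposed as a texture might be read with random access from a shader. The `getTextures()` call, the `textureWidth` uniform and the packing of the column into a float texture are all assumptions, not an existing API:

```js
// Hypothetical API: request the `price` column as a data texture
const {price} = gpuTable.getTextures('price');

// GLSL 3.00 ES snippet that reads any row of the column by index
const vs = `#version 300 es
uniform sampler2D price;   // the column, packed row-major into a float texture
uniform int textureWidth;  // width used when packing the column into the texture

float getPrice(int row) {
  ivec2 texel = ivec2(row % textureWidth, row / textureWidth);
  return texelFetch(price, texel, 0).r; // random access, independent of the current vertex
}
`;
```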
## GPUTable -> GPUTable transforms

Currently, we have a set of aggregating layers in deck.gl that are somewhat ad-hoc, in that each layer has special props and methods to handle the aggregations. Each layer defines props and shaders for its aggregations, which means that a learning curve can be required for a new programmer to start modifying these layers.
If we could represent the aggregations at the core of these layers as GPUTable -> GPUTable operations, then we might be able to simplify these layers and share the aggregation machinery between them.
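As a purely hypothetical sketch of what that could look like, an aggregation might become a table-to-table call; the `GPUAggregate` function, its options and the output column names below are all invented for illustration:

```js
// Hypothetical: a grid aggregation expressed as a GPUTable -> GPUTable operation
// instead of layer-specific props and shaders
const cellTable = GPUAggregate(gpuTable, {
  groupBy: {columns: ['longitude', 'latitude'], cellSizeMeters: 500},
  aggregations: {
    pointCount: {operation: 'count'},
    totalValue: {operation: 'sum', column: 'value'}
  }
});
// The result is just another GPUTable, so it could be rendered directly or
// fed into further GPUTable -> GPUTable transforms.
```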
Columnar Tables can be treated like data frames (e.g. Python Pandas). One approach is to enable GPUTables to work as efficient "GPU-side shadows" of CPU-side Arrow DataFrames, mirroring the Arrow table after CPU data frame operations:
```js
import {parse} from '@loaders.gl/core';
import {ArrowLoader} from '@loaders.gl/arrow';

let table = await parse(..., ArrowLoader, {});
const gpuTable = new GPUTable(table);

table = table.slice(1000, 3000); // can drop chunks
gpuTable.setTable(table);

table = table.removeColumn('price'); // drops chunks
gpuTable.setTable(table);
```
Depending on what operations are performed on CPU, this might benefit from a more advanced GPU memory management system (e.g. data frame operations may drop only certain chunks, so reuploading all data might not be needed).
The GPUTable class could also support a set of "data frame" style methods, and these could be implemented directly on the GPU. For instance, table.slice() can be supported by controlling the first and last index of the draw call.
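A minimal sketch of that idea, assuming a hypothetical GPU-side slice() that copies no data and simply restricts the rendered row range:

```js
// Hypothetical: slicing creates a view that copies no GPU memory
const sliced = gpuTable.slice(1000, 3000);

// Internally, rendering the sliced table could boil down to adjusting the
// draw range, e.g.:
// gl.drawArrays(gl.POINTS, /* first */ 1000, /* count */ 2000);
```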
GPUTables are perfect sources for TransformFeedback operations. New tables (new columns) can be generated from existing tables by composing or processing existing columns:
```js
const transformedTable = GPUTransform(gpuTable, {
  columns: {
    'positions': {
      type: GL.DOUBLE, // Creates a `Float64` array (i.e. two interleaved Float32Arrays with high-low values)
      size: 3, // 3 values (6 Float32s) per element
      operation: 'interleave',
      components: ['longitude', 'latitude', 'height'] // Takes components from these columns and interleaves them
    },
    // Remove the input columns
    'longitude': 'drop',
    'latitude': 'drop',
    'height': 'drop'
  },
  // Alternatively, drop the input columns via a column filter
  columnFilter: {
    excludeColumns: ['longitude', 'latitude', 'height']
  }
});
```
GPU attributes are to some degree flexible and accept a range of data types (meaning that often no transformation is necessary), and the getAttributes() method of the GPUTable sets the required flags (e.g. normalized). However, it is sometimes valuable to transform data types to achieve better memory characteristics.
For instance, colors could be stored as Uint8 rather than Float32s, and normals could also be stored as Uint8 - they could even be oct-encoded on the fly, though this seems like overkill.

```js
const transformedTable = GPUTransform(gpuTable, {
  columns: {
    'colors': {
      type: GL.UNSIGNED_BYTE, // store colors as Uint8 rather than Float32
      size: 4,
      normalized: true
    }
  }
});
```
Perhaps a library of support functions, based on transform feedback, could provide such conversions.
- Preparation of vec2, vec3 and vec4 attributes from individual columns into an interleaved column.
Interleaving of all attributes could be handled internally by the GPUTable class, which means the GPUTable would just manage a single GPU Buffer.
Interleaving means more cache locality, and anecdotally this leads to better performance, although the improvement seems to be fairly small (~5%) and is not always seen on modern GPUs.
Interleaving of all attributes does make the table harder to work with, as adding columns requires regenerating the entire buffer, although it should be possible to do this using transform feedback.
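For reference, a minimal sketch (plain WebGL; the buffer and attribute locations are assumed to already exist) of what consuming two interleaved columns from a single buffer looks like:

```js
// Two columns interleaved in one buffer: |x y z|r g b a|x y z|r g b a|...
// positions: 3 x Float32 (12 bytes), colors: 4 x Uint8 (4 bytes) => 16-byte stride
const BYTES_PER_VERTEX = 16;

gl.bindBuffer(gl.ARRAY_BUFFER, interleavedBuffer);

// positions at offset 0
gl.vertexAttribPointer(positionLocation, 3, gl.FLOAT, false, BYTES_PER_VERTEX, 0);
gl.enableVertexAttribArray(positionLocation);

// colors at offset 12, as normalized bytes
gl.vertexAttribPointer(colorLocation, 4, gl.UNSIGNED_BYTE, true, BYTES_PER_VERTEX, 12);
gl.enableVertexAttribArray(colorLocation);
```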
Columns can also be variable length, e.g. strings or binary data such as polygons.
When working with variable length columns:

TBD:

- There are potential issues with how Arrow encodes dictionary columns in record batches that may make it impractical to create dictionary columns before all RecordBatches have arrived.
Enumerations are typically represented as dictionary-encoded columns in Arrow.
When dealing with variable length columns, tesselation is often needed before rendering.
For instance, a variable length binary column can represent polygons to either be tesselated as outlines or perhaps as extruded blocks.
Tesselation generally happens on the CPU.
Possible conventions for storing polygons:
Tesselation generates one or more variable size columns.
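A minimal CPU-side sketch of such a tesselation step, assuming one possible convention for the polygon column (a flat coordinate array plus per-polygon offsets) and using the earcut tesselator; all names here are illustrative:

```js
import earcut from 'earcut';

// `positions`: flat [x0, y0, x1, y1, ...] for all polygons
// `polygonOffsets`: start vertex index of each polygon, plus a final end index
function tesselatePolygons(positions, polygonOffsets) {
  const triangleIndices = [];
  const indexOffsets = [0];
  for (let p = 0; p < polygonOffsets.length - 1; p++) {
    const start = polygonOffsets[p];
    const end = polygonOffsets[p + 1];
    // earcut expects a flat coordinate array for one polygon
    const localIndices = earcut(positions.slice(start * 2, end * 2), null, 2);
    for (const i of localIndices) {
      triangleIndices.push(start + i); // convert to global row indices
    }
    indexOffsets.push(triangleIndices.length);
  }
  // Two new variable-size columns: triangle indices and per-polygon index offsets
  return {triangleIndices: new Uint32Array(triangleIndices), indexOffsets};
}
```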
The GPUAttributes class specifies information about the attributes that are used by (i.e. defined in) a shader. It can be automatically extracted during shader compilation or generated programmatically:
```js
new GPUAttributes({
  instancePositions: {
    size: 4,
    type: GL.DOUBLE,
    accessor: ['getSourcePosition', 'getTargetPosition'],
    update: this.calculateInstancePositions
  },
  instanceSourceColors: {
    size: this.props.colorFormat.length,
    type: GL.UNSIGNED_BYTE,
    normalized: true,
    accessor: 'getSourceColor',
    defaultValue: DEFAULT_COLOR
  },
  instanceTargetColors: {
    size: this.props.colorFormat.length,
    type: GL.UNSIGNED_BYTE,
    normalized: true,
    accessor: 'getTargetColor',
    defaultValue: DEFAULT_COLOR
  },
  instanceWidths: {
    size: 1,
    accessor: 'getWidth',
    defaultValue: 1
  },
  instanceHeights: {
    size: 1,
    accessor: 'getHeight',
    defaultValue: 1
  },
  instanceTilts: {
    size: 1,
    accessor: 'getTilt',
    defaultValue: 0
  }
});
```
GPUTables can be filtered. Standard Arrow predicates are interpreted and transpiled into GLSL filter functions. Filtering can be done by adjusting the start and end indices for rendering, and by discarding fragments from filtered rows.
`gpuTable.getFilterModule()`: Returns a shadertools GLSL module that implements the filter set by the current predicates.

- `module.vs` - the vertex shader block
- `module.fs` - the fragment shader block
- `module.getUniforms(options)` - generates uniforms from column values
- `module.uniforms` - a list of uniform values in the query (can e.g. be used to autopopulate a UI with sliders etc)
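As a hedged sketch of what such a module might look like for a simple predicate like `price >= minPrice AND price < maxPrice` (the module contents, uniform names and conventions below are assumptions, not an existing deck.gl API):

```js
// Hypothetical result of gpuTable.getFilterModule() for a range predicate
const filterModule = {
  name: 'gpu-table-filter',
  vs: `
    attribute float price;
    uniform float filter_minPrice;
    uniform float filter_maxPrice;
    varying float filter_isVisible;

    void filter_setVisibility() {
      filter_isVisible =
        (price >= filter_minPrice && price < filter_maxPrice) ? 1.0 : 0.0;
    }
  `,
  fs: `
    varying float filter_isVisible;

    void filter_apply() {
      if (filter_isVisible < 0.5) {
        discard; // drop fragments belonging to filtered-out rows
      }
    }
  `,
  getUniforms: ({minPrice = 0, maxPrice = 1e6} = {}) => ({
    filter_minPrice: minPrice,
    filter_maxPrice: maxPrice
  }),
  uniforms: ['filter_minPrice', 'filter_maxPrice'] // e.g. to drive UI sliders
};
```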
On GLSL versions that do not support double and dvec* data types, double precision columns are split into two sets of attributes: one containing the high 32-bit floats, and one, with the suffix _64low, containing the low 32-bit floats.
Note that the splitting of each Float64 into two Float32 values is done on the CPU as the attribute is uploaded to the GPU. The two values for each number are interleaved in memory and exposed as two separate attributes using strides and offsets.
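A minimal sketch of that CPU-side split (the helper name is illustrative; the high part is the value rounded to Float32 precision, the low part is the rounding error):

```js
// Split a Float64 column into interleaved high/low Float32 pairs
function toFloat64Attribute(float64Array) {
  const interleaved = new Float32Array(float64Array.length * 2);
  for (let i = 0; i < float64Array.length; i++) {
    const hi = Math.fround(float64Array[i]); // nearest representable Float32
    interleaved[i * 2] = hi;                     // high part
    interleaved[i * 2 + 1] = float64Array[i] - hi; // low part (the rounding error)
  }
  return interleaved;
}
// The two attributes then share this buffer, using a stride of 8 bytes and
// offsets of 0 and 4 bytes respectively.
```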
A GPUTable will create contiguous GPU memory for its chunks.
On initial creation, this just involves calculating the total length of all the chunks in each column, and uploading these.
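A minimal sketch of that initial upload step for one fixed-width, non-null column, assuming its chunks are available as typed arrays (extracting those from the Arrow representation is omitted here):

```js
// Concatenate a column's chunks into one contiguous typed array before
// creating and uploading a single GPU Buffer for the column
function concatChunks(chunks) {
  const totalLength = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const contiguous = new Float32Array(totalLength);
  let offset = 0;
  for (const chunk of chunks) {
    contiguous.set(chunk, offset);
    offset += chunk.length;
  }
  return contiguous;
}
```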
Apache Arrow comes with built-in support for streaming Arrow data, and loaders.gl provides a growing set of streaming table loaders that convert textual tabular formats to binary columnar / Apache Arrow formats on the fly.
In both cases, ES2018 async iterators are returned that yield successive batches of rows from the table, containing equal-length chunks of the table's columns.
A GPUTable can efficiently accommodate updates resulting from a table growing through additional batches. Buffers get "topped off" (possibly being reinitialized).
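A hedged sketch of what incremental loading could look like (parseInBatches, fetchFile and CSVLoader are existing loaders.gl exports; the url and `gpuTable.addBatch()` are assumptions):

```js
import {fetchFile, parseInBatches} from '@loaders.gl/core';
import {CSVLoader} from '@loaders.gl/csv';

// `gpuTable.addBatch()` is a hypothetical method that appends a batch,
// topping off (or reallocating and refilling) the underlying GPU buffers
const batchIterator = await parseInBatches(fetchFile(url), CSVLoader);
for await (const batch of batchIterator) {
  gpuTable.addBatch(batch);
}
```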
TBD:

- Should the GPUTable create new Buffer instances when reallocating (which would require apps to rebind buffers), or is it better to keep the same buffers and reinitialize them internally (requiring two CPU copies, first into a temporary buffer and then back into the original, resized buffer)?

Columnar Tables can be treated like data frames (see Python Pandas), e.g. using Apache Arrow DataFrames, where operations can be done very efficiently on the chunked memory by just dropping or adding chunks.
To support this, a function gpuTable.setTable(table) could update the GPU buffers associated with the table.
```js
import {parse} from '@loaders.gl/core';
import {ArrowLoader} from '@loaders.gl/arrow';

let table = await parse(..., ArrowLoader, {});
const gpuTable = new GPUTable(table);

table = table.slice(1000, 3000); // can drop chunks
gpuTable.setTable(table);

table = table.removeColumn('price'); // drops chunks
gpuTable.setTable(table);
```
Since tables are chunked, and many data frame operations just drop or add a subset of chunks, only a subset of memory would need to be re-uploaded to the GPU.
Assuming we want to keep each column in a single buffer, so that we can process the table in a single draw call, this would require a more sophisticated memory management system on the GPU which can handle holes. This could be added as a refinement, depending on the strength of the JS data frame use case and the performance impact of reuploading the full table after each operation.
However, change detection to assess whether anything needs to be re-uploaded should be implemented in the first version.
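A minimal sketch of such change detection, relying on the observation above that data frame operations drop or add chunks rather than mutating them, so chunk identity can be compared; `uploadChunk` is a hypothetical helper:

```js
// Decide which chunks of a column need to be (re)uploaded by comparing
// chunk identity between the previous and the new table
function uploadChangedChunks(oldChunks, newChunks, uploadChunk) {
  newChunks.forEach((chunk, index) => {
    if (oldChunks[index] !== chunk) {
      uploadChunk(index, chunk);
    }
  });
}
```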
TBD:

- DRAW_ELEMENTS extension to draw subsets.