dev-docs/RFCs/v7.2/data-loading-rfc.md
This RFC suggests a number of improvements to loading support in deck.gl, in particular to support seamless optional integration with loaders.gl.
The vision of loaders.gl, that loaders are directly compatible with luma.gl and deck.gl and are easy and intuitive to use, is not fully realized in deck.gl v7.0.
fetch call.
No good way to specify different loaders for different props.
The ability for apps to redefine how data is loaded (or "fetched") is important.
Apps need to be able to do things like
There are many different techniques and libraries (XMLHttpRequest, company-internal libraries etc) that help users load data into the browser. We want to enable users to use the techniques that work for them.
It is desirable to be able to override fetching and loading separately, to support:
To achieve this, we redefine (extend) the semantics of the current fetch prop:
If props.fetch returns a String, ArrayBuffer or Response (or a promise resolving to these), we apply parsing to the result.
If props.fetch returns e.g. an Array or Object, deck.gl can assume that fetch did perform parsing and skip any further parsing.
Note that the fetch prop remains backwards compatible: parsing is applied only when fetch returns raw data, so an override that returns structured data is treated as having already parsed it.
Simple example that uses fetch with options:
new SomeLayer({
  data: INTERNAL_DATA_URL,
  fetch: url => fetch(url, {headers: {'Company-Access-Token': 'Secret-Value'}})
})
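The return-type rule described above could be sketched roughly as follows. This is a minimal illustration, not deck.gl's implementation: `isRawData` and `loadProp` are hypothetical helper names, and `fetchFn` and `parseFn` simply stand in for the `fetch` and `parse` props.

```javascript
// Hypothetical helper: does the value returned by props.fetch still need parsing?
function isRawData(result) {
  return (
    typeof result === 'string' ||
    result instanceof ArrayBuffer ||
    // Response is only checked where the runtime provides it
    (typeof Response !== 'undefined' && result instanceof Response)
  );
}

// Hypothetical helper: fetch a prop, then parse only if the result is raw data.
// Structured results (e.g. an Array or Object) are returned unchanged.
async function loadProp(url, fetchFn, parseFn) {
  const result = await fetchFn(url);
  return isRawData(result) ? parseFn(result) : result;
}
```

Under this sketch, an existing override that fetches and parses in one step keeps working, because its structured return value bypasses `parseFn`.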
Default fetch:
fetch...
Design Notes:
fetch was the current (7.1) fetch prop semantics (which are seeing modest use in our own code). Per the current semantics, a props.fetch override is expected to both fetch and parse.
We could support a fetch overload that just takes an object with parameters to fetch (which would presumably be the most common use case for overriding fetch):
new SomeLayer({
  data: INTERNAL_DATA_URL,
  fetch: {headers: {'Company-Access-Token': 'Secret-Value'}}
})
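One way the object overload might be normalized into a function is sketched below. `normalizeFetchProp` is a hypothetical name, and the `fetchImpl` parameter (defaulting to the global `fetch`) is an assumption added so the sketch can be exercised without a network.

```javascript
// Hypothetical normalization of the fetch prop into a (url) => result function.
// `fetchImpl` is an assumption: it defaults to the global fetch and can be stubbed.
function normalizeFetchProp(fetchProp, fetchImpl = globalThis.fetch) {
  if (typeof fetchProp === 'function') {
    // Function override: use it as-is
    return fetchProp;
  }
  if (fetchProp && typeof fetchProp === 'object') {
    // Object overload: treat the object as the options argument to fetch
    return url => fetchImpl(url, fetchProp);
  }
  // No override supplied: plain fetch
  return url => fetchImpl(url);
}
```

With this shape, `fetch: {headers: {...}}` and `fetch: url => fetch(url, {headers: {...}})` would behave the same.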
Design notes:
parse prop to specify how data should be parsed
import {parse, registerLoaders} from '@loaders.gl/core';
import {CSVLoader} from '@loaders.gl/csv';
registerLoaders(CSVLoader);
new AnyLayer({
  coordinateSystem: COORDINATE_SYSTEM.IDENTITY,
  data: CSV_URL,
  // NEW: Accept a parse method
  parse
});
There are two ways to specify a custom loader that works with the deck.gl async prop loading:
Do we expect parse functions to support different input data?
The parse prop is expected to be either a function or a (list of) loaders.gl loaders that can be passed to parse.
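A minimal sketch of how the two accepted forms of the parse prop might be reduced to a single function is shown below. `normalizeParseProp` is a hypothetical name, and `parseWithLoaders` stands in for a function with the shape `(data, loaders)`, such as the loaders.gl `parse` function; the behavior when no parse prop is given is an assumption.

```javascript
// Hypothetical normalization of the parse prop into a (data) => parsed function.
// `parseWithLoaders(data, loaders)` is a stand-in for the loaders.gl parse function.
function normalizeParseProp(parseProp, parseWithLoaders) {
  if (!parseProp) {
    // Assumption: without a parse prop, data is passed through unparsed
    return data => data;
  }
  if (typeof parseProp === 'function') {
    return parseProp;
  }
  // A single loader or a list of loaders
  const loaders = Array.isArray(parseProp) ? parseProp : [parseProp];
  return data => parseWithLoaders(data, loaders);
}
```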
Design Notes:
parse is flexible and accepts a fetch response object as its preferred (most flexible/efficient) input. It is fair to assume that not all custom loaders support a variety of objects. By registering a custom loaders
By supplying an object to fetch, apps could specify different request options for different props (different resources may be served from different servers and they may need different headers etc):
new SomeBitmapLayer({
  data: DATA_URL,
  bitmap: BITMAP_URL,
  fetch: {
    data: url => fetch(url, {headers: {...}}),
    bitmap: url => fetch(url, {headers: {...}})
  }
})
If an async prop isn't listed it will be fetched using the default fetch method.
new SomeBitmapLayer({
  data: DATA_URL, // loaded with custom fetch
  bitmap: BITMAP_URL, // loaded with default fetch
  fetch: {
    data: url => fetch(url, {headers: {...}})
  }
})
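The per-prop lookup with fallback to the default could look roughly like this. `getFetchForProp` and `defaultFetch` are hypothetical names used only for illustration.

```javascript
// Hypothetical lookup of the fetch function to use for one async prop.
function getFetchForProp(fetchProp, propName, defaultFetch) {
  if (typeof fetchProp === 'function') {
    // A single function applies to every async prop
    return fetchProp;
  }
  // An object maps prop names to fetch functions; unlisted props use the default
  const perProp = fetchProp && fetchProp[propName];
  return typeof perProp === 'function' ? perProp : defaultFetch;
}
```

Note that this sketch only recognizes function values keyed by prop name, so it does not by itself distinguish a per-prop map from an object of fetch options.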
Redefining default AND specific fetch
new SomeBitmapLayer({
  data: DATA_URL, // loaded with custom fetch
  bitmap: BITMAP_URL, // loaded with default fetch
  fetch: {
    data: url => fetch(url, {headers: {...}})
  }
})
Design notes:
fetch overrides (proposal 1b), as both represent Object overloads.
Alternative design: we could let async props be objects. But this would of course be an issue for props that already accept objects (like binary data).
new SomeBitmapLayer({
  data: {
    url: DATA_URL,
    fetch: url => fetch(url, {headers: {...}}),
    parse: ...
  },
  bitmap: BITMAP_URL
})
Sometimes, a loader called by the parse function needs contextual information. For instance, the ScenegraphLoader cannot work without a gl context being passed in.
Design Notes:
The new onData callback is called whenever the layer sees new data, either as a result of an async load completing, or just as a result of new sync data being supplied to the layer.
This allows apps to use the convenience of async data URL props even when they need to do some final processing, for example extracting a small piece of information from the data to update the view state.
For instance, the PointCloudLayer example uses the bounding box of the point cloud to initialize viewState for the OrbitController.
BEFORE:
const data = await fetch(LAZ_SAMPLE);
_setViewStateFromPointCloud(data); // Calculate View State from header
new PointCloudLayer({
  data
});
AFTER:
new PointCloudLayer({
  data: LAZ_SAMPLE,
  // NEW: Provide a callback to let application react to loaded/parsed/transformed data
  onData: data => this._setViewStateFromPointCloud(data)
});
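The intended timing of the onData callback, for both sync and async data, might be sketched like this. `updateData` is a hypothetical function, and `load` stands in for deck.gl's async prop loading (fetch plus parse); treating any string as a URL is an assumption of the sketch.

```javascript
// Hypothetical sketch of when onData fires; not deck.gl's actual update logic.
// `load(url)` stands in for the async prop loading path (fetch + parse).
function updateData(props, load) {
  const {data, onData = () => {}} = props;
  if (typeof data === 'string') {
    // Async case: assume the string is a URL, notify once loading completes
    return Promise.resolve(load(data)).then(loaded => {
      onData(loaded);
      return loaded;
    });
  }
  // Sync case: data was supplied directly, so notify immediately
  onData(data);
  return Promise.resolve(data);
}
```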
Design Notes:
onData callback should be called only once during batched streaming (when all batches have arrived). The batched loading RFC might add an onDataBatch callback that is called after each batch.
data prop only? This callback was added to let apps react to the completed loading of async data props. Like other affordances for async props, one can imagine that similar callbacks could be useful for other async props...
onDataLoaded. Reviewers pointed out that it would be useful to call this function even when pre-parsed data is supplied. The new name (onData) reflects this change. Other options: onDataAvailable, onDataUpdated.
Today custom code is required to extract the positions attribute etc and pass them in as top-level props; see the PointCloudLayer example.
BEFORE:
fetch(LAZ_SAMPLE).then(...); // Extract numInstances and positions, calculate View State
new PointCloudLayer({
  coordinateSystem: COORDINATE_SYSTEM.IDENTITY,
  numInstances: state.pointsCount,
  instancePositions: state.points
});
AFTER:
import {parse, registerLoaders} from '@loaders.gl/core';
import {LAZLoader} from '@loaders.gl/las';
registerLoaders(LAZLoader);
new PointCloudLayer({
  coordinateSystem: COORDINATE_SYSTEM.IDENTITY,
  data: LAZ_SAMPLE,
  parse // NEW: See above
});
Instead of importing d3-csv etc.
These ideas still need some work to become formal proposals.
Instead of having to pass fetch and parse to each layer, maybe just allow the user to set defaults on Deck.
This is a proposal for loaders.gl, but mentioned here for completeness:
Many apps could be slightly more elegant if loaders.gl loader modules auto-registered their loaders. With the right deck.gl integration, simply importing a loader module would make that loader available to the async props of deck.gl layers.
Comparing the point cloud example above:
import {LAZLoader} from '@loaders.gl/las';
new PointCloudLayer({
  coordinateSystem: COORDINATE_SYSTEM.IDENTITY,
  data: LAZ_SAMPLE
});
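For illustration, auto-registration could be as small as a shared array that each loader module appends to on import. Everything in this sketch is hypothetical: the registry key, the `getLoaderRegistry` and `autoRegister` helpers, and the placeholder loader object are assumptions, not loaders.gl APIs.

```javascript
// Hypothetical registry location; loaders.gl would have to define the real one.
const REGISTRY_KEY = '__hypotheticalLoaderRegistry';

// Returns the shared loader array for a scope, creating it on first use.
function getLoaderRegistry(scope = globalThis) {
  scope[REGISTRY_KEY] = scope[REGISTRY_KEY] || [];
  return scope[REGISTRY_KEY];
}

// A loader module would call this once at import time; duplicates are ignored.
function autoRegister(loader, scope = globalThis) {
  const registry = getLoaderRegistry(scope);
  if (!registry.includes(loader)) {
    registry.push(loader);
  }
  return loader;
}

// Placeholder loader object, for illustration only
const ExampleLoader = autoRegister({name: 'Example', extensions: ['ex']});
```

Layers could then read the shared array when resolving async props, which is what makes the import alone sufficient in the example above.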
While such "pre-registration" would be trivial to implement, there are some concerns:
@loaders.gl/loader-utils). If the loader modules have to import registerLoaders, that changes. This is a design simplicity/elegance in loaders.gl that matters to some people, and it would be lost for this convenience.
@loaders.gl/loader-utils in each loader module, we could just ask each loader to push their loader to a global array. But even then, the global scope must be determined, normally by a helper function in loaders.gl.
Assuming we make it easy to write custom loaders, there does not seem to be a strong use case for layers providing different defaults for fetch.
const defaultProps = {
  bitmap: {async: true},
  fetch: {
    data: url => d3.csv(url)
  }
};
Design Notes: