Aggregator Interface

The Aggregator interface describes the contract for classes that perform aggregation.

Terminology

Aggregation is a 2-step process:

1. Sort: group a collection of data points by some property into bins.
2. Aggregate: for each bin, calculate a numeric output (result) from some metrics (values) from all its members. Multiple results can be obtained independently (channels).

An implementation of the Aggregator interface takes the following inputs:

The number of data points
The group that each data point belongs to, by mapping each data point to a binId (array of integers)
The values to aggregate, by mapping each data point in each channel to one value (number)
The method (operation) to reduce a list of values to one number, such as SUM

And yields the following outputs:

A list of binIds that data points get sorted into
The aggregated values (result) as a list of numbers, consisting of one number per bin per channel
The [min, max] among all aggregated values (domain) for each channel
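
The inputs and outputs above can be sketched for a single channel on the CPU. This is a minimal illustration under stated assumptions, not the library's implementation: the names `aggregateChannel`, `Operation` and `ChannelOutput` are hypothetical, bin IDs are simplified to a single number, and only the SUM and MEAN operations named in this document are handled.

```typescript
// Hypothetical sketch: single-channel CPU aggregation with one-dimensional bin IDs.
type Operation = 'SUM' | 'MEAN';

interface ChannelOutput {
  bins: number[];           // unique bin IDs, sorted ascending
  result: number[];         // one aggregated value per bin
  domain: [number, number]; // [min, max] among all results
}

// Reduce the values of one bin to a single number.
function reduceValues(values: number[], operation: Operation): number {
  const sum = values.reduce((a, b) => a + b, 0);
  return operation === 'MEAN' ? sum / values.length : sum;
}

function aggregateChannel(
  pointCount: number,
  getBinId: (index: number) => number,
  getValue: (index: number) => number,
  operation: Operation
): ChannelOutput {
  // Step 1 (sort): group the value of every data point by its bin ID.
  const groups: {[binId: string]: number[]} = {};
  for (let i = 0; i < pointCount; i++) {
    const key = String(getBinId(i));
    if (!groups[key]) {
      groups[key] = [];
    }
    groups[key].push(getValue(i));
  }
  const bins = Object.keys(groups).map(Number).sort((a, b) => a - b);

  // Step 2 (aggregate): one number per bin, then the [min, max] across bins.
  // With no data points the domain degenerates to [Infinity, -Infinity].
  const result = bins.map(binId => reduceValues(groups[String(binId)], operation));
  const domain: [number, number] = [Math.min(...result), Math.max(...result)];
  return {bins, result, domain};
}

// Usage with made-up data: four points that fall into three bins.
const sampleBinIds = [0, 5, 5, 10];
const sampleValues = [1, 2, 4, 8];
const sumOutput = aggregateChannel(4, i => sampleBinIds[i], i => sampleValues[i], 'SUM');
// sumOutput.bins -> [0, 5, 10], sumOutput.result -> [1, 6, 8], sumOutput.domain -> [1, 8]
```

A real implementation would run this once per channel, and bin IDs may have more than one component (for example a column and row in a grid).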

Example

Consider the task of making a histogram that shows the result of a survey by age distribution.

The data points are the list of participants, and we know the age of each person.
Suppose we want to group them by 5-year intervals. A 21-year-old participant is assigned to the bin of age 20-25, with binId [20]. A 35-year-old participant is assigned to the bin of age 35-40, with binId [35], and so on.
For each bin (i.e. age group), we calculate 2 values:
- The first channel is "number of participants". Each participant in this group yields a value of 1, and the result equals all values added together (operation: SUM).
- The second channel is "average score". Each participant in this group yields a value that is their test score, and the result equals the sum of all scores divided by the number of participants (operation: MEAN).
As the outcome of the aggregation, we have:
- Bins: [15, 20, 25, 30, 35, 40]
- Channel 0 result: [1, 5, 12, 10, 8, 3]
- Channel 0 domain: [1, 12]
- Channel 1 result: [6, 8.2, 8.5, 7.9, 7.75, 8]
- Channel 1 domain: [6, 8.5]
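
The histogram example can be worked through in code on a much smaller input. The five participants below are made-up sample data, not the survey that produced the numbers listed above, and the helper names (`surveyBinId`, `domainOf`) are hypothetical. The sketch assumes each bin is identified by the lower bound of its 5-year interval.

```typescript
// Made-up sample data for illustration only.
interface Participant {
  age: number;
  score: number;
}

const participants: Participant[] = [
  {age: 21, score: 8},
  {age: 23, score: 9},
  {age: 35, score: 7},
  {age: 37, score: 6},
  {age: 38, score: 8}
];

const groupSize = 5;

// The binId is the lower bound of the participant's age interval, e.g. 21 -> [20].
function surveyBinId(p: Participant): number[] {
  return [Math.floor(p.age / groupSize) * groupSize];
}

// Sort step: accumulate per-bin totals for both channels.
const counts: {[key: string]: number} = {};
const scoreSums: {[key: string]: number} = {};
for (const p of participants) {
  const key = String(surveyBinId(p)[0]);
  counts[key] = (counts[key] || 0) + 1;             // channel 0: each participant yields 1
  scoreSums[key] = (scoreSums[key] || 0) + p.score; // channel 1: each participant yields a score
}

// Aggregate step: SUM for channel 0, MEAN for channel 1.
const surveyBins = Object.keys(counts).map(Number).sort((a, b) => a - b);
const channel0 = surveyBins.map(b => counts[String(b)]);
const channel1 = surveyBins.map(b => scoreSums[String(b)] / counts[String(b)]);

function domainOf(values: number[]): [number, number] {
  return [Math.min(...values), Math.max(...values)];
}

// surveyBins -> [20, 35]
// channel0   -> [2, 3],   domainOf(channel0) -> [2, 3]
// channel1   -> [8.5, 7], domainOf(channel1) -> [7, 8.5]
```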

Methods

An implementation of Aggregator should expose the following methods:

`setProps` {#setprops}

Set runtime properties of the aggregation.

aggregator.setProps({
  pointCount: 10000,
  attributes: {...},
  operations: ['SUM', 'MEAN'],
  binOptions: {groupSize: 5}
});

Arguments:

pointCount (number) - number of data points.
attributes (Attribute[]) - the input data.
operations (string[]) - how to aggregate the values inside a bin, defined per channel.
binOptions (object) - arbitrary settings that affect bin sorting.
onUpdate (Function) - callback when a channel has been recalculated. Receives the following arguments:
- channel (number) - the channel that just updated

`setNeedsUpdate` {#setneedsupdate}

Flags a channel as needing an update. This could be the result of a change in the input data or bin options.

aggregator.setNeedsUpdate(0);

Arguments:

channel (number, optional) - mark the given channel as dirty. If not provided, all channels are marked as dirty.
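
One way an implementation might track which channels need recalculation is a per-channel dirty flag that is cleared during the update step. The sketch below is an assumption about how this could be structured, not the library's code: `ChannelUpdateTracker` is a hypothetical helper, and the callback is assumed to receive the bare channel number.

```typescript
// Hypothetical helper, not part of the library: tracks which channels are dirty.
class ChannelUpdateTracker {
  private dirty: boolean[];

  constructor(channelCount: number) {
    // Every channel starts dirty so that the first update computes everything.
    this.dirty = [];
    for (let i = 0; i < channelCount; i++) {
      this.dirty.push(true);
    }
  }

  // Mark one channel as dirty, or all channels when no argument is given.
  setNeedsUpdate(channel?: number): void {
    if (channel === undefined) {
      for (let i = 0; i < this.dirty.length; i++) {
        this.dirty[i] = true;
      }
    } else if (channel >= 0 && channel < this.dirty.length) {
      this.dirty[channel] = true;
    } else {
      throw new RangeError('Unknown channel ' + channel);
    }
  }

  // Recalculate only the dirty channels, notify the listener, and clear the flags.
  // Returns the channels that were recalculated.
  update(
    recalculate: (channel: number) => void,
    onUpdate?: (channel: number) => void
  ): number[] {
    const updated: number[] = [];
    for (let channel = 0; channel < this.dirty.length; channel++) {
      if (this.dirty[channel]) {
        recalculate(channel);
        this.dirty[channel] = false;
        updated.push(channel);
        if (onUpdate) {
          onUpdate(channel);
        }
      }
    }
    return updated;
  }
}

// Usage: two channels; after the initial update only channel 1 is invalidated.
const tracker = new ChannelUpdateTracker(2);
const firstPass = tracker.update(() => {});  // recalculates channels 0 and 1
tracker.setNeedsUpdate(1);
const secondPass = tracker.update(() => {}); // recalculates channel 1 only
```

Keeping the flags per channel means a change that affects only one channel's values does not force the other channels to be recomputed.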

`update` {#update}

Called after all props are set and before results are accessed. The aggregator should allocate resources and redo aggregations if needed at this stage.

aggregator.update();

`preDraw` {#predraw}

Called before the result buffers are drawn to screen. Certain types of aggregation depend on render-time context, and this is an alternative opportunity to update just in time.

aggregator.preDraw();

`getBin` {#getbin}

Get the information of a given bin.

const bin = aggregator.getBin(100);

Arguments:

index (number) - index of the bin, matching its position in the output of getBins()

Returns:

id (number[]) - Unique bin ID.
value (number[]) - Aggregated values by channel.
count (number) - Number of data points in this bin.
pointIndices (number[] | undefined) - Indices of data points in this bin if available. This field may not be populated when using GPU-based implementations.
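
The shape of the returned object can be illustrated with a CPU-side sketch. The field names follow the list above; the `BinInfo` type name, the `describeBin` helper and its parameters are hypothetical, and every channel is assumed to use SUM for simplicity.

```typescript
// Field names follow the documented return value; the type name is hypothetical.
interface BinInfo {
  id: number[];
  value: number[];
  count: number;
  pointIndices?: number[];
}

// Hypothetical helper: collect the points whose binId matches `id`,
// then SUM their values in each channel.
function describeBin(
  id: number[],
  pointBinIds: number[][],   // binId of each data point
  channelValues: number[][]  // channelValues[channel][pointIndex]
): BinInfo {
  const pointIndices: number[] = [];
  for (let i = 0; i < pointBinIds.length; i++) {
    const candidate = pointBinIds[i];
    const matches =
      candidate.length === id.length && candidate.every((v, j) => v === id[j]);
    if (matches) {
      pointIndices.push(i);
    }
  }
  const value = channelValues.map(values =>
    pointIndices.reduce((sum, i) => sum + values[i], 0)
  );
  return {id, value, count: pointIndices.length, pointIndices};
}

// Usage with made-up data: three points, two of them in bin [20].
const examplePointBinIds = [[20], [35], [20]];
const exampleChannelValues = [
  [1, 1, 1], // channel 0: each point yields 1
  [8, 7, 9]  // channel 1: a score per point
];
const exampleBin = describeBin([20], examplePointBinIds, exampleChannelValues);
// exampleBin -> {id: [20], value: [2, 17], count: 2, pointIndices: [0, 2]}
```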

`getBins` {#getbins}

Get an accessor to all bin IDs.

const binIdsAttribute = aggregator.getBins();

Returns:

A binary attribute of the output bin IDs, or
null, if update has never been called

`getResult` {#getresult}

Get an accessor to the aggregated values of a given channel.

const resultAttribute = aggregator.getResult(0);

Arguments:

channel (number) - the channel to retrieve results from

Returns:

A binary attribute of the output values of the given channel, or
null, if update has never been called

`getResultDomain` {#getresultdomain}

Get the [min, max] of aggregated values of a given channel.

const [min, max] = aggregator.getResultDomain(0);

Arguments:

channel (number) - the channel to retrieve results from

Returns the domain ([number, number]) of the aggregated values of the given channel.

`destroy` {#destroy}

Dispose of all allocated resources.

aggregator.destroy();

Members

An implementation of Aggregator should expose the following members:

`binCount` (number) {#bincount}

The number of bins in the aggregated result.

Source

modules/aggregation-layers/src/common/aggregator/aggregator.ts
