docs/en/sql-reference/statements/create/dictionary/layouts/range-hashed.md
import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem';
The dictionary is stored in memory in the form of a hash table with an ordered array of ranges and their corresponding values.
This storage method works the same way as hashed and allows using date/time (arbitrary numeric type) ranges in addition to the key.
Example: The table contains discounts for each advertiser in the format:
┌─advertiser_id─┬─discount_start_date─┬─discount_end_date─┬─amount─┐
│ 123 │ 2015-01-16 │ 2015-01-31 │ 0.25 │
│ 123 │ 2015-01-01 │ 2015-01-15 │ 0.15 │
│ 456 │ 2015-01-01 │ 2015-01-15 │ 0.05 │
└───────────────┴─────────────────────┴───────────────────┴────────┘
To use a sample for date ranges, define the range_min and range_max elements in the structure. These elements must contain elements name and type (if type is not specified, the default type will be used - Date). type can be any numeric type (Date / DateTime / UInt64 / Int32 / others).
:::note
Values of range_min and range_max should fit in Int64 type.
:::
Example:
<Tabs> <TabItem value="ddl" label="DDL" default>CREATE DICTIONARY discounts_dict (
advertiser_id UInt64,
discount_start_date Date,
discount_end_date Date,
amount Float64
)
PRIMARY KEY id
SOURCE(CLICKHOUSE(TABLE 'discounts'))
LIFETIME(MIN 1 MAX 1000)
LAYOUT(RANGE_HASHED(range_lookup_strategy 'max'))
RANGE(MIN discount_start_date MAX discount_end_date)
<layout>
<range_hashed>
<!-- Strategy for overlapping ranges (min/max). Default: min (return a matching range with the min(range_min -> range_max) value) -->
<range_lookup_strategy>min</range_lookup_strategy>
</range_hashed>
</layout>
<structure>
<id>
<name>advertiser_id</name>
</id>
<range_min>
<name>discount_start_date</name>
<type>Date</type>
</range_min>
<range_max>
<name>discount_end_date</name>
<type>Date</type>
</range_max>
...
To work with these dictionaries, you need to pass an additional argument to the dictGet function, for which a range is selected:
dictGet('dict_name', 'attr_name', id, date)
Query example:
SELECT dictGet('discounts_dict', 'amount', 1, '2022-10-20'::Date);
This function returns the value for the specified ids and the date range that includes the passed date.
Details of the algorithm:
id is not found or a range is not found for the id, it returns the default value of the attribute's type.range_lookup_strategy=min, it returns a matching range with minimal range_min, if several ranges found, it returns a range with minimal range_max, if again several ranges found (several ranges had the same range_min and range_max it returns a random range of them.range_lookup_strategy=max, it returns a matching range with maximal range_min, if several ranges found, it returns a range with maximal range_max, if again several ranges found (several ranges had the same range_min and range_max it returns a random range of them.range_max is NULL, the range is open. NULL is treated as maximal possible value. For the range_min 1970-01-01 or 0 (-MAX_INT) can be used as the open value.Configuration example:
<Tabs> <TabItem value="ddl" label="DDL" default>CREATE DICTIONARY somedict(
Abcdef UInt64,
StartTimeStamp UInt64,
EndTimeStamp UInt64,
XXXType String DEFAULT ''
)
PRIMARY KEY Abcdef
RANGE(MIN StartTimeStamp MAX EndTimeStamp)
<clickhouse>
<dictionary>
...
<layout>
<range_hashed />
</layout>
<structure>
<id>
<name>Abcdef</name>
</id>
<range_min>
<name>StartTimeStamp</name>
<type>UInt64</type>
</range_min>
<range_max>
<name>EndTimeStamp</name>
<type>UInt64</type>
</range_max>
<attribute>
<name>XXXType</name>
<type>String</type>
<null_value />
</attribute>
</structure>
</dictionary>
</clickhouse>
Configuration example with overlapping ranges and open ranges:
CREATE TABLE discounts
(
advertiser_id UInt64,
discount_start_date Date,
discount_end_date Nullable(Date),
amount Float64
)
ENGINE = Memory;
INSERT INTO discounts VALUES (1, '2015-01-01', Null, 0.1);
INSERT INTO discounts VALUES (1, '2015-01-15', Null, 0.2);
INSERT INTO discounts VALUES (2, '2015-01-01', '2015-01-15', 0.3);
INSERT INTO discounts VALUES (2, '2015-01-04', '2015-01-10', 0.4);
INSERT INTO discounts VALUES (3, '1970-01-01', '2015-01-15', 0.5);
INSERT INTO discounts VALUES (3, '1970-01-01', '2015-01-10', 0.6);
SELECT * FROM discounts ORDER BY advertiser_id, discount_start_date;
┌─advertiser_id─┬─discount_start_date─┬─discount_end_date─┬─amount─┐
│ 1 │ 2015-01-01 │ ᴺᵁᴸᴸ │ 0.1 │
│ 1 │ 2015-01-15 │ ᴺᵁᴸᴸ │ 0.2 │
│ 2 │ 2015-01-01 │ 2015-01-15 │ 0.3 │
│ 2 │ 2015-01-04 │ 2015-01-10 │ 0.4 │
│ 3 │ 1970-01-01 │ 2015-01-15 │ 0.5 │
│ 3 │ 1970-01-01 │ 2015-01-10 │ 0.6 │
└───────────────┴─────────────────────┴───────────────────┴────────┘
-- RANGE_LOOKUP_STRATEGY 'max'
CREATE DICTIONARY discounts_dict
(
advertiser_id UInt64,
discount_start_date Date,
discount_end_date Nullable(Date),
amount Float64
)
PRIMARY KEY advertiser_id
SOURCE(CLICKHOUSE(TABLE discounts))
LIFETIME(MIN 600 MAX 900)
LAYOUT(RANGE_HASHED(RANGE_LOOKUP_STRATEGY 'max'))
RANGE(MIN discount_start_date MAX discount_end_date);
select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-14')) res;
┌─res─┐
│ 0.1 │ -- the only one range is matching: 2015-01-01 - Null
└─────┘
select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-16')) res;
┌─res─┐
│ 0.2 │ -- two ranges are matching, range_min 2015-01-15 (0.2) is bigger than 2015-01-01 (0.1)
└─────┘
select dictGet('discounts_dict', 'amount', 2, toDate('2015-01-06')) res;
┌─res─┐
│ 0.4 │ -- two ranges are matching, range_min 2015-01-04 (0.4) is bigger than 2015-01-01 (0.3)
└─────┘
select dictGet('discounts_dict', 'amount', 3, toDate('2015-01-01')) res;
┌─res─┐
│ 0.5 │ -- two ranges are matching, range_min are equal, 2015-01-15 (0.5) is bigger than 2015-01-10 (0.6)
└─────┘
DROP DICTIONARY discounts_dict;
-- RANGE_LOOKUP_STRATEGY 'min'
CREATE DICTIONARY discounts_dict
(
advertiser_id UInt64,
discount_start_date Date,
discount_end_date Nullable(Date),
amount Float64
)
PRIMARY KEY advertiser_id
SOURCE(CLICKHOUSE(TABLE discounts))
LIFETIME(MIN 600 MAX 900)
LAYOUT(RANGE_HASHED(RANGE_LOOKUP_STRATEGY 'min'))
RANGE(MIN discount_start_date MAX discount_end_date);
select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-14')) res;
┌─res─┐
│ 0.1 │ -- the only one range is matching: 2015-01-01 - Null
└─────┘
select dictGet('discounts_dict', 'amount', 1, toDate('2015-01-16')) res;
┌─res─┐
│ 0.1 │ -- two ranges are matching, range_min 2015-01-01 (0.1) is less than 2015-01-15 (0.2)
└─────┘
select dictGet('discounts_dict', 'amount', 2, toDate('2015-01-06')) res;
┌─res─┐
│ 0.3 │ -- two ranges are matching, range_min 2015-01-01 (0.3) is less than 2015-01-04 (0.4)
└─────┘
select dictGet('discounts_dict', 'amount', 3, toDate('2015-01-01')) res;
┌─res─┐
│ 0.6 │ -- two ranges are matching, range_min are equal, 2015-01-10 (0.6) is less than 2015-01-15 (0.5)
└─────┘
The dictionary is stored in memory in the form of a hash table with an ordered array of ranges and their corresponding values (see range_hashed). This type of storage is for use with composite keys.
Configuration example:
CREATE DICTIONARY range_dictionary
(
CountryID UInt64,
CountryKey String,
StartDate Date,
EndDate Date,
Tax Float64 DEFAULT 0.2
)
PRIMARY KEY CountryID, CountryKey
SOURCE(CLICKHOUSE(TABLE 'date_table'))
LIFETIME(MIN 1 MAX 1000)
LAYOUT(COMPLEX_KEY_RANGE_HASHED())
RANGE(MIN StartDate MAX EndDate);