Back to Datafusion

Scalar Functions

docs/source/user-guide/sql/scalar_functions.md

53.1.0140.8 KB
Original Source
<!--- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> <!--- This file was generated by the dev/update_function_docs.sh script. Do not edit it manually as changes will be overwritten. Instead, edit the ScalarUDFImpl's documentation() function to update documentation for an individual UDF or the dev/update_function_docs.sh file for updating surrounding text. -->

Scalar Functions

Math Functions

abs

Returns the absolute value of a number.

sql
abs(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT abs(-5);
+----------+
| abs(-5)  |
+----------+
| 5        |
+----------+

acos

Returns the arc cosine or inverse cosine of a number.

sql
acos(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT acos(1);
+----------+
| acos(1)  |
+----------+
| 0.0      |
+----------+

acosh

Returns the area hyperbolic cosine or inverse hyperbolic cosine of a number.

sql
acosh(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT acosh(2);
+------------+
| acosh(2)   |
+------------+
| 1.31696    |
+------------+

asin

Returns the arc sine or inverse sine of a number.

sql
asin(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT asin(0.5);
+------------+
| asin(0.5)  |
+------------+
| 0.5235988  |
+------------+

asinh

Returns the area hyperbolic sine or inverse hyperbolic sine of a number.

sql
asinh(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT asinh(1);
+------------+
| asinh(1)   |
+------------+
| 0.8813736  |
+------------+

atan

Returns the arc tangent or inverse tangent of a number.

sql
atan(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
    > SELECT atan(1);
+-----------+
| atan(1)   |
+-----------+
| 0.7853982 |
+-----------+

atan2

Returns the arc tangent or inverse tangent of expression_y / expression_x.

sql
atan2(expression_y, expression_x)

Arguments

  • expression_y: First numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
  • expression_x: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.

Example

sql
> SELECT atan2(1, 1);
+------------+
| atan2(1,1) |
+------------+
| 0.7853982  |
+------------+

atanh

Returns the area hyperbolic tangent or inverse hyperbolic tangent of a number.

sql
atanh(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
    > SELECT atanh(0.5);
+-------------+
| atanh(0.5)  |
+-------------+
| 0.5493061   |
+-------------+

cbrt

Returns the cube root of a number.

sql
cbrt(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT cbrt(27);
+-----------+
| cbrt(27)  |
+-----------+
| 3.0       |
+-----------+

ceil

Returns the nearest integer greater than or equal to a number.

sql
ceil(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT ceil(3.14);
+------------+
| ceil(3.14) |
+------------+
| 4.0        |
+------------+

cos

Returns the cosine of a number.

sql
cos(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT cos(0);
+--------+
| cos(0) |
+--------+
| 1.0    |
+--------+

cosh

Returns the hyperbolic cosine of a number.

sql
cosh(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT cosh(1);
+-----------+
| cosh(1)   |
+-----------+
| 1.5430806 |
+-----------+

cot

Returns the cotangent of a number.

sql
cot(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT cot(1);
+---------+
| cot(1)  |
+---------+
| 0.64209 |
+---------+

degrees

Converts radians to degrees.

sql
degrees(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
    > SELECT degrees(pi());
+------------+
| degrees(0) |
+------------+
| 180.0      |
+------------+

exp

Returns the base-e exponential of a number.

sql
exp(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT exp(1);
+---------+
| exp(1)  |
+---------+
| 2.71828 |
+---------+

factorial

Factorial. Returns 1 if value is less than 2.

sql
factorial(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT factorial(5);
+---------------+
| factorial(5)  |
+---------------+
| 120           |
+---------------+

floor

Returns the nearest integer less than or equal to a number.

sql
floor(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT floor(3.14);
+-------------+
| floor(3.14) |
+-------------+
| 3.0         |
+-------------+

gcd

Returns the greatest common divisor of expression_x and expression_y. Returns 0 if both inputs are zero.

sql
gcd(expression_x, expression_y)

Arguments

  • expression_x: First numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • expression_y: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT gcd(48, 18);
+------------+
| gcd(48,18) |
+------------+
| 6          |
+------------+

isnan

Returns true if a given number is +NaN or -NaN otherwise returns false.

sql
isnan(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT isnan(1);
+----------+
| isnan(1) |
+----------+
| false    |
+----------+

iszero

Returns true if a given number is +0.0 or -0.0 otherwise returns false.

sql
iszero(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT iszero(0);
+------------+
| iszero(0)  |
+------------+
| true       |
+------------+

lcm

Returns the least common multiple of expression_x and expression_y. Returns 0 if either input is zero.

sql
lcm(expression_x, expression_y)

Arguments

  • expression_x: First numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • expression_y: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT lcm(4, 5);
+----------+
| lcm(4,5) |
+----------+
| 20       |
+----------+

ln

Returns the natural logarithm of a number.

sql
ln(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT ln(2.71828);
+-------------+
| ln(2.71828) |
+-------------+
| 1.0         |
+-------------+

log

Returns the base-x logarithm of a number. Can either provide a specified base, or if omitted then takes the base-10 of a number.

sql
log(base, numeric_expression)
log(numeric_expression)

Arguments

  • base: Base numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT log(10);
+---------+
| log(10) |
+---------+
| 1.0     |
+---------+

log10

Returns the base-10 logarithm of a number.

sql
log10(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT log10(100);
+-------------+
| log10(100)  |
+-------------+
| 2.0         |
+-------------+

log2

Returns the base-2 logarithm of a number.

sql
log2(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT log2(8);
+-----------+
| log2(8)   |
+-----------+
| 3.0       |
+-----------+

nanvl

Returns the first argument if it's not NaN. Returns the second argument otherwise.

sql
nanvl(expression_x, expression_y)

Arguments

  • expression_x: Numeric expression to return if it's not NaN. Can be a constant, column, or function, and any combination of arithmetic operators.
  • expression_y: Numeric expression to return if the first expression is NaN. Can be a constant, column, or function, and any combination of arithmetic operators.

Example

sql
> SELECT nanvl(0, 5);
+------------+
| nanvl(0,5) |
+------------+
| 0          |
+------------+

pi

Returns an approximate value of π.

sql
pi()

pow

Alias of power.

power

Returns a base expression raised to the power of an exponent.

sql
power(base, exponent)

Arguments

  • base: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • exponent: Exponent numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT power(2, 3);
+-------------+
| power(2,3)  |
+-------------+
| 8           |
+-------------+

Aliases

  • pow

radians

Converts degrees to radians.

sql
radians(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT radians(180);
+----------------+
| radians(180)   |
+----------------+
| 3.14159265359  |
+----------------+

random

Returns a random float value in the range [0, 1). The random seed is unique to each row.

sql
random()

Example

sql
> SELECT random();
+------------------+
| random()         |
+------------------+
| 0.7389238902938  |
+------------------+

round

Rounds a number to the nearest integer.

sql
round(numeric_expression[, decimal_places])

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • decimal_places: Optional. The number of decimal places to round to. Defaults to 0.

Example

sql
> SELECT round(3.14159);
+--------------+
| round(3.14159)|
+--------------+
| 3.0          |
+--------------+

signum

Returns the sign of a number. Negative numbers return -1. Zero and positive numbers return 1.

sql
signum(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT signum(-42);
+-------------+
| signum(-42) |
+-------------+
| -1          |
+-------------+

sin

Returns the sine of a number.

sql
sin(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT sin(0);
+----------+
| sin(0)   |
+----------+
| 0.0      |
+----------+

sinh

Returns the hyperbolic sine of a number.

sql
sinh(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT sinh(1);
+-----------+
| sinh(1)   |
+-----------+
| 1.1752012 |
+-----------+

sqrt

Returns the square root of a number.

sql
sqrt(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

tan

Returns the tangent of a number.

sql
tan(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> SELECT tan(pi()/4);
+--------------+
| tan(PI()/4)  |
+--------------+
| 1.0          |
+--------------+

tanh

Returns the hyperbolic tangent of a number.

sql
tanh(numeric_expression)

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
  > SELECT tanh(20);
  +----------+
  | tanh(20) |
  +----------+
  | 1.0      |
  +----------+

trunc

Truncates a number to a whole number or truncated to the specified decimal places.

sql
trunc(numeric_expression[, decimal_places])

Arguments

  • numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • decimal_places: Optional. The number of decimal places to truncate to. Defaults to 0 (truncate to a whole number). If decimal_places is a positive integer, truncates digits to the right of the decimal point. If decimal_places is a negative integer, replaces digits to the left of the decimal point with 0.

Example

sql
> SELECT trunc(42.738);
+----------------+
| trunc(42.738)  |
+----------------+
| 42             |
+----------------+

Conditional Functions

coalesce

Returns the first of its arguments that is not null. Returns null if all arguments are null. This function is often used to substitute a default value for null values.

sql
coalesce(expression1[, ..., expression_n])

Arguments

  • expression1, expression_n: Expression to use if previous expressions are null. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.

Example

sql
> select coalesce(null, null, 'datafusion');
+----------------------------------------+
| coalesce(NULL,NULL,Utf8("datafusion")) |
+----------------------------------------+
| datafusion                             |
+----------------------------------------+

greatest

Returns the greatest value in a list of expressions. Returns null if all expressions are null.

sql
greatest(expression1[, ..., expression_n])

Arguments

  • expression1, expression_n: Expressions to compare and return the greatest value.. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.

Example

sql
> select greatest(4, 7, 5);
+---------------------------+
| greatest(4,7,5)           |
+---------------------------+
| 7                         |
+---------------------------+

ifnull

Alias of nvl.

least

Returns the smallest value in a list of expressions. Returns null if all expressions are null.

sql
least(expression1[, ..., expression_n])

Arguments

  • expression1, expression_n: Expressions to compare and return the smallest value. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.

Example

sql
> select least(4, 7, 5);
+---------------------------+
| least(4,7,5)              |
+---------------------------+
| 4                         |
+---------------------------+

nullif

Returns null if expression1 equals expression2; otherwise it returns expression1. This can be used to perform the inverse operation of coalesce.

sql
nullif(expression1, expression2)

Arguments

  • expression1: Expression to compare and return if equal to expression2. Can be a constant, column, or function, and any combination of operators.
  • expression2: Expression to compare to expression1. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select nullif('datafusion', 'data');
+-----------------------------------------+
| nullif(Utf8("datafusion"),Utf8("data")) |
+-----------------------------------------+
| datafusion                              |
+-----------------------------------------+
> select nullif('datafusion', 'datafusion');
+-----------------------------------------------+
| nullif(Utf8("datafusion"),Utf8("datafusion")) |
+-----------------------------------------------+
|                                               |
+-----------------------------------------------+

nvl

Returns expression2 if expression1 is NULL otherwise it returns expression1 and expression2 is not evaluated. This function can be used to substitute a default value for NULL values.

sql
nvl(expression1, expression2)

Arguments

  • expression1: Expression to return if not null. Can be a constant, column, or function, and any combination of operators.
  • expression2: Expression to return if expr1 is null. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select nvl(null, 'a');
+---------------------+
| nvl(NULL,Utf8("a")) |
+---------------------+
| a                   |
+---------------------+\
> select nvl('b', 'a');
+--------------------------+
| nvl(Utf8("b"),Utf8("a")) |
+--------------------------+
| b                        |
+--------------------------+

Aliases

  • ifnull

nvl2

Returns expression2 if expression1 is not NULL; otherwise it returns expression3.

sql
nvl2(expression1, expression2, expression3)

Arguments

  • expression1: Expression to test for null. Can be a constant, column, or function, and any combination of operators.
  • expression2: Expression to return if expr1 is not null. Can be a constant, column, or function, and any combination of operators.
  • expression3: Expression to return if expr1 is null. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select nvl2(null, 'a', 'b');
+--------------------------------+
| nvl2(NULL,Utf8("a"),Utf8("b")) |
+--------------------------------+
| b                              |
+--------------------------------+
> select nvl2('data', 'a', 'b');
+----------------------------------------+
| nvl2(Utf8("data"),Utf8("a"),Utf8("b")) |
+----------------------------------------+
| a                                      |
+----------------------------------------+

String Functions

ascii

Returns the first Unicode scalar value of a string.

sql
ascii(str)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select ascii('abc');
+--------------------+
| ascii(Utf8("abc")) |
+--------------------+
| 97                 |
+--------------------+
> select ascii('🚀');
+-------------------+
| ascii(Utf8("🚀")) |
+-------------------+
| 128640            |
+-------------------+

Related functions:

bit_length

Returns the bit length of a string.

sql
bit_length(str)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select bit_length('datafusion');
+--------------------------------+
| bit_length(Utf8("datafusion")) |
+--------------------------------+
| 80                             |
+--------------------------------+

Related functions:

btrim

Trims the specified trim string from the start and end of a string. If no trim string is provided, all spaces are removed from the start and end of the input string.

sql
btrim(str[, trim_str])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • trim_str: String expression to operate on. Can be a constant, column, or function, and any combination of operators. Default is a space.

Example

sql
> select btrim('__datafusion____', '_');
+-------------------------------------------+
| btrim(Utf8("__datafusion____"),Utf8("_")) |
+-------------------------------------------+
| datafusion                                |
+-------------------------------------------+

Alternative Syntax

sql
trim(BOTH trim_str FROM str)
sql
trim(trim_str FROM str)

Aliases

  • trim

Related functions:

char_length

Alias of character_length.

character_length

Returns the number of characters in a string.

sql
character_length(str)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select character_length('Ångström');
+------------------------------------+
| character_length(Utf8("Ångström")) |
+------------------------------------+
| 8                                  |
+------------------------------------+

Aliases

  • length
  • char_length

Related functions:

chr

Returns a string containing the character with the specified Unicode scalar value.

sql
chr(expression)

Arguments

  • expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select chr(128640);
+--------------------+
| chr(Int64(128640)) |
+--------------------+
| 🚀                 |
+--------------------+

Related functions:

concat

Concatenates multiple strings together.

sql
concat(str[, ..., str_n])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • str_n: Subsequent string expressions to concatenate.

Example

sql
> select concat('data', 'f', 'us', 'ion');
+-------------------------------------------------------+
| concat(Utf8("data"),Utf8("f"),Utf8("us"),Utf8("ion")) |
+-------------------------------------------------------+
| datafusion                                            |
+-------------------------------------------------------+

Related functions:

concat_ws

Concatenates multiple strings together with a specified separator.

sql
concat_ws(separator, str[, ..., str_n])

Arguments

  • separator: Separator to insert between concatenated strings.
  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • str_n: Subsequent string expressions to concatenate.

Example

sql
> select concat_ws('_', 'data', 'fusion');
+--------------------------------------------------+
| concat_ws(Utf8("_"),Utf8("data"),Utf8("fusion")) |
+--------------------------------------------------+
| data_fusion                                      |
+--------------------------------------------------+

Related functions:

contains

Return true if search_str is found within string (case-sensitive).

sql
contains(str, search_str)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • search_str: The string to search for in str.

Example

sql
> select contains('the quick brown fox', 'row');
+---------------------------------------------------+
| contains(Utf8("the quick brown fox"),Utf8("row")) |
+---------------------------------------------------+
| true                                              |
+---------------------------------------------------+

ends_with

Tests if a string ends with a substring.

sql
ends_with(str, substr)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • substr: Substring to test for.

Example

sql
>  select ends_with('datafusion', 'soin');
+--------------------------------------------+
| ends_with(Utf8("datafusion"),Utf8("soin")) |
+--------------------------------------------+
| false                                      |
+--------------------------------------------+
> select ends_with('datafusion', 'sion');
+--------------------------------------------+
| ends_with(Utf8("datafusion"),Utf8("sion")) |
+--------------------------------------------+
| true                                       |
+--------------------------------------------+

find_in_set

Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings.

sql
find_in_set(str, strlist)

Arguments

  • str: String expression to find in strlist.
  • strlist: A string list is a string composed of substrings separated by , characters.

Example

sql
> select find_in_set('b', 'a,b,c,d');
+----------------------------------------+
| find_in_set(Utf8("b"),Utf8("a,b,c,d")) |
+----------------------------------------+
| 2                                      |
+----------------------------------------+

initcap

Capitalizes the first character in each word in the input string. Words are delimited by non-alphanumeric characters.

sql
initcap(str)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select initcap('apache datafusion');
+------------------------------------+
| initcap(Utf8("apache datafusion")) |
+------------------------------------+
| Apache Datafusion                  |
+------------------------------------+

Related functions:

instr

Alias of strpos.

left

Returns a specified number of characters from the left side of a string.

sql
left(str, n)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • n: Number of characters to return.

Example

sql
> select left('datafusion', 4);
+-----------------------------------+
| left(Utf8("datafusion"),Int64(4)) |
+-----------------------------------+
| data                              |
+-----------------------------------+

Related functions:

length

Alias of character_length.

levenshtein

Returns the Levenshtein distance between the two given strings.

sql
levenshtein(str1, str2)

Arguments

  • str1: String expression to compute Levenshtein distance with str2.
  • str2: String expression to compute Levenshtein distance with str1.

Example

sql
> select levenshtein('kitten', 'sitting');
+---------------------------------------------+
| levenshtein(Utf8("kitten"),Utf8("sitting")) |
+---------------------------------------------+
| 3                                           |
+---------------------------------------------+

lower

Converts a string to lower-case.

sql
lower(str)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select lower('Ångström');
+-------------------------+
| lower(Utf8("Ångström")) |
+-------------------------+
| ångström                |
+-------------------------+

Related functions:

lpad

Pads the left side of a string with another string to a specified string length.

sql
lpad(str, n[, padding_str])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • n: String length to pad to. If the input string is longer than this length, it is truncated (on the right).
  • padding_str: Optional string expression to pad with. Can be a constant, column, or function, and any combination of string operators. Default is a space.

Example

sql
> select lpad('Dolly', 10, 'hello');
+---------------------------------------------+
| lpad(Utf8("Dolly"),Int64(10),Utf8("hello")) |
+---------------------------------------------+
| helloDolly                                  |
+---------------------------------------------+

Related functions:

ltrim

Trims the specified trim string from the beginning of a string. If no trim string is provided, spaces are removed from the start of the input string.

sql
ltrim(str[, trim_str])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • trim_str: String expression to trim from the beginning of the input string. Can be a constant, column, or function, and any combination of arithmetic operators. Default is a space.

Example

sql
> select ltrim('  datafusion  ');
+-------------------------------+
| ltrim(Utf8("  datafusion  ")) |
+-------------------------------+
| datafusion                    |
+-------------------------------+
> select ltrim('___datafusion___', '_');
+-------------------------------------------+
| ltrim(Utf8("___datafusion___"),Utf8("_")) |
+-------------------------------------------+
| datafusion___                             |
+-------------------------------------------+

Alternative Syntax

sql
trim(LEADING trim_str FROM str)

Related functions:

octet_length

Returns the length of a string in bytes.

sql
octet_length(str)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select octet_length('Ångström');
+--------------------------------+
| octet_length(Utf8("Ångström")) |
+--------------------------------+
| 10                             |
+--------------------------------+

Related functions:

overlay

Returns the string which is replaced by another string from the specified position and specified count length.

sql
overlay(str PLACING substr FROM pos [FOR count])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • substr: Substring to replace in str.
  • pos: The start position to start the replace in str.
  • count: The count of characters to be replaced from start position of str. If not specified, will use substr length instead.

Example

sql
> select overlay('Txxxxas' placing 'hom' from 2 for 4);
+--------------------------------------------------------+
| overlay(Utf8("Txxxxas"),Utf8("hom"),Int64(2),Int64(4)) |
+--------------------------------------------------------+
| Thomas                                                 |
+--------------------------------------------------------+

position

Alias of strpos.

repeat

Returns a string with an input string repeated a specified number.

sql
repeat(str, n)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • n: Number of times to repeat the input string.

Example

sql
> select repeat('data', 3);
+-------------------------------+
| repeat(Utf8("data"),Int64(3)) |
+-------------------------------+
| datadatadata                  |
+-------------------------------+

replace

Replaces all occurrences of a specified substring in a string with a new substring.

sql
replace(str, substr, replacement)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • substr: Substring expression to replace in the input string. Substring expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • replacement: Replacement substring expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select replace('ABabbaBA', 'ab', 'cd');
+-------------------------------------------------+
| replace(Utf8("ABabbaBA"),Utf8("ab"),Utf8("cd")) |
+-------------------------------------------------+
| ABcdbaBA                                        |
+-------------------------------------------------+

reverse

Reverses the character order of a string.

sql
reverse(str)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select reverse('datafusion');
+-----------------------------+
| reverse(Utf8("datafusion")) |
+-----------------------------+
| noisufatad                  |
+-----------------------------+

Returns a specified number of characters from the right side of a string.

sql
right(str, n)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • n: Number of characters to return.

Example

sql
> select right('datafusion', 6);
+------------------------------------+
| right(Utf8("datafusion"),Int64(6)) |
+------------------------------------+
| fusion                             |
+------------------------------------+

Related functions:

rpad

Pads the right side of a string with another string to a specified string length.

sql
rpad(str, n[, padding_str])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • n: String length to pad to. If the input string is longer than this length, it is truncated.
  • padding_str: String expression to pad with. Can be a constant, column, or function, and any combination of string operators. Default is a space.

Example

sql
>  select rpad('datafusion', 20, '_-');
+-----------------------------------------------+
| rpad(Utf8("datafusion"),Int64(20),Utf8("_-")) |
+-----------------------------------------------+
| datafusion_-_-_-_-_-                          |
+-----------------------------------------------+

Related functions:

rtrim

Trims the specified trim string from the end of a string. If no trim string is provided, all spaces are removed from the end of the input string.

sql
rtrim(str[, trim_str])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • trim_str: String expression to trim from the end of the input string. Can be a constant, column, or function, and any combination of arithmetic operators. Default is a space.

Example

sql
> select rtrim('  datafusion  ');
+-------------------------------+
| rtrim(Utf8("  datafusion  ")) |
+-------------------------------+
|   datafusion                  |
+-------------------------------+
> select rtrim('___datafusion___', '_');
+-------------------------------------------+
| rtrim(Utf8("___datafusion___"),Utf8("_")) |
+-------------------------------------------+
| ___datafusion                             |
+-------------------------------------------+

Alternative Syntax

sql
trim(TRAILING trim_str FROM str)

Related functions:

split_part

Splits a string based on a specified delimiter and returns the substring in the specified position.

sql
split_part(str, delimiter, pos)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • delimiter: String or character to split on.
  • pos: Position of the part to return (counting from 1). Negative values count backward from the end of the string.

Example

sql
> select split_part('1.2.3.4.5', '.', 3);
+--------------------------------------------------+
| split_part(Utf8("1.2.3.4.5"),Utf8("."),Int64(3)) |
+--------------------------------------------------+
| 3                                                |
+--------------------------------------------------+

starts_with

Tests if a string starts with a substring.

sql
starts_with(str, substr)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • substr: Substring to test for.

Example

sql
> select starts_with('datafusion','data');
+----------------------------------------------+
| starts_with(Utf8("datafusion"),Utf8("data")) |
+----------------------------------------------+
| true                                         |
+----------------------------------------------+

strpos

Returns the starting position of a specified substring in a string. Positions begin at 1. If the substring does not exist in the string, the function returns 0.

sql
strpos(str, substr)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • substr: Substring expression to search for.

Example

sql
> select strpos('datafusion', 'fus');
+----------------------------------------+
| strpos(Utf8("datafusion"),Utf8("fus")) |
+----------------------------------------+
| 5                                      |
+----------------------------------------+

Alternative Syntax

sql
position(substr in origstr)

Aliases

  • instr
  • position

substr

Extracts a substring of a specified number of characters from a specific starting position in a string.

sql
substr(str, start_pos[, length])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • start_pos: Character position to start the substring at. The first character in the string has a position of 1.
  • length: Number of characters to extract. If not specified, returns the rest of the string after the start position.

Example

sql
> select substr('datafusion', 5, 3);
+----------------------------------------------+
| substr(Utf8("datafusion"),Int64(5),Int64(3)) |
+----------------------------------------------+
| fus                                          |
+----------------------------------------------+

Alternative Syntax

sql
substring(str from start_pos for length)

Aliases

  • substring

substr_index

Returns the substring from str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned.

sql
substr_index(str, delim, count)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • delim: The string to find in str to split str.
  • count: The number of times to search for the delimiter. Can be either a positive or negative number.

Example

sql
> select substr_index('www.apache.org', '.', 1);
+---------------------------------------------------------+
| substr_index(Utf8("www.apache.org"),Utf8("."),Int64(1)) |
+---------------------------------------------------------+
| www                                                     |
+---------------------------------------------------------+
> select substr_index('www.apache.org', '.', -1);
+----------------------------------------------------------+
| substr_index(Utf8("www.apache.org"),Utf8("."),Int64(-1)) |
+----------------------------------------------------------+
| org                                                      |
+----------------------------------------------------------+

Aliases

  • substring_index

substring

Alias of substr.

substring_index

Alias of substr_index.

to_hex

Converts an integer to a hexadecimal string.

sql
to_hex(int)

Arguments

  • int: Integer expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select to_hex(12345689);
+-------------------------+
| to_hex(Int64(12345689)) |
+-------------------------+
| bc6159                  |
+-------------------------+

translate

Performs character-wise substitution based on a mapping.

sql
translate(str, from, to)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • from: The characters to be replaced.
  • to: The characters to replace them with. Each character in from that is found in str is replaced by the character at the same index in to. Any characters in from that don't have a corresponding character in to are removed. If a character appears more than once in from, the first occurrence determines the mapping.

Example

sql
> select translate('twice', 'wic', 'her');
+--------------------------------------------------+
| translate(Utf8("twice"),Utf8("wic"),Utf8("her")) |
+--------------------------------------------------+
| there                                            |
+--------------------------------------------------+

trim

Alias of btrim.

upper

Converts a string to upper-case.

sql
upper(str)

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select upper('dataFusion');
+---------------------------+
| upper(Utf8("dataFusion")) |
+---------------------------+
| DATAFUSION                |
+---------------------------+

Related functions:

uuid

Returns UUID v4 string value which is unique per row.

sql
uuid()

Example

sql
> select uuid();
+--------------------------------------+
| uuid()                               |
+--------------------------------------+
| 6ec17ef8-1934-41cc-8d59-d0c8f9eea1f0 |
+--------------------------------------+

Binary String Functions

decode

Decode binary data from textual representation in string.

sql
decode(expression, format)

Arguments

  • expression: Expression containing encoded string data
  • format: Same arguments as encode

Related functions:

encode

Encode binary data into a textual representation.

sql
encode(expression, format)

Arguments

  • expression: Expression containing string or binary data
  • format: Supported formats are: base64, base64pad, hex

Related functions:

Regular Expression Functions

Apache DataFusion uses a PCRE-like regular expression syntax (minus support for several features including look-around and backreferences). The following regular expression functions are supported:

regexp_count

Returns the number of matches that a regular expression has in a string.

sql
regexp_count(str, regexp[, start, flags])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • start: - start: Optional start position (the first position is 1) to search for the regular expression. Can be a constant, column, or function.
  • flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
    • i: case-insensitive: letters match both upper and lower case
    • m: multi-line mode: ^ and $ match begin/end of line
    • s: allow . to match \n
    • R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
    • U: swap the meaning of x* and x*?

Example

sql
> select regexp_count('abcAbAbc', 'abc', 2, 'i');
+---------------------------------------------------------------+
| regexp_count(Utf8("abcAbAbc"),Utf8("abc"),Int64(2),Utf8("i")) |
+---------------------------------------------------------------+
| 1                                                             |
+---------------------------------------------------------------+

regexp_instr

Returns the position in a string where the specified occurrence of a POSIX regular expression is located.

sql
regexp_instr(str, regexp[, start[, N[, flags[, subexpr]]]])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • start: - start: Optional start position (the first position is 1) to search for the regular expression. Can be a constant, column, or function. Defaults to 1
  • N: - N: Optional The N-th occurrence of pattern to find. Defaults to 1 (first match). Can be a constant, column, or function.
  • flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
    • i: case-insensitive: letters match both upper and lower case
    • m: multi-line mode: ^ and $ match begin/end of line
    • s: allow . to match \n
    • R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
    • U: swap the meaning of x* and x*?
  • subexpr: Optional Specifies which capture group (subexpression) to return the position for. Defaults to 0, which returns the position of the entire match.

Example

sql
> SELECT regexp_instr('ABCDEF', 'C(.)(..)');
+---------------------------------------------------------------+
| regexp_instr(Utf8("ABCDEF"),Utf8("C(.)(..)"))                 |
+---------------------------------------------------------------+
| 3                                                             |
+---------------------------------------------------------------+

regexp_like

Returns true if a regular expression has at least one match in a string, false otherwise.

sql
regexp_like(str, regexp[, flags])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
    • i: case-insensitive: letters match both upper and lower case
    • m: multi-line mode: ^ and $ match begin/end of line
    • s: allow . to match \n
    • R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
    • U: swap the meaning of x* and x*?

Example

sql
select regexp_like('Köln', '[a-zA-Z]ö[a-zA-Z]{2}');
+--------------------------------------------------------+
| regexp_like(Utf8("Köln"),Utf8("[a-zA-Z]ö[a-zA-Z]{2}")) |
+--------------------------------------------------------+
| true                                                   |
+--------------------------------------------------------+
SELECT regexp_like('aBc', '(b|d)', 'i');
+--------------------------------------------------+
| regexp_like(Utf8("aBc"),Utf8("(b|d)"),Utf8("i")) |
+--------------------------------------------------+
| true                                             |
+--------------------------------------------------+

Additional examples can be found here

regexp_match

Returns the first regular expression matches in a string.

sql
regexp_match(str, regexp[, flags])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • regexp: Regular expression to match against. Can be a constant, column, or function.
  • flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
    • i: case-insensitive: letters match both upper and lower case
    • m: multi-line mode: ^ and $ match begin/end of line
    • s: allow . to match \n
    • R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
    • U: swap the meaning of x* and x*?

Example

sql
            > select regexp_match('Köln', '[a-zA-Z]ö[a-zA-Z]{2}');
            +---------------------------------------------------------+
            | regexp_match(Utf8("Köln"),Utf8("[a-zA-Z]ö[a-zA-Z]{2}")) |
            +---------------------------------------------------------+
            | [Köln]                                                  |
            +---------------------------------------------------------+
            SELECT regexp_match('aBc', '(b|d)', 'i');
            +---------------------------------------------------+
            | regexp_match(Utf8("aBc"),Utf8("(b|d)"),Utf8("i")) |
            +---------------------------------------------------+
            | [B]                                               |
            +---------------------------------------------------+

Additional examples can be found here

regexp_replace

Replaces substrings in a string that match a regular expression.

sql
regexp_replace(str, regexp, replacement[, flags])

Arguments

  • str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • regexp: Regular expression to match against. Can be a constant, column, or function.
  • replacement: Replacement string expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
  • g: (global) Search globally and don't return after the first match
  • i: case-insensitive: letters match both upper and lower case
  • m: multi-line mode: ^ and $ match begin/end of line
  • s: allow . to match \n
  • R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
  • U: swap the meaning of x* and x*?

Example

sql
> select regexp_replace('foobarbaz', 'b(..)', 'X\\1Y', 'g');
+------------------------------------------------------------------------+
| regexp_replace(Utf8("foobarbaz"),Utf8("b(..)"),Utf8("X\1Y"),Utf8("g")) |
+------------------------------------------------------------------------+
| fooXarYXazY                                                            |
+------------------------------------------------------------------------+
SELECT regexp_replace('aBc', '(b|d)', 'Ab\\1a', 'i');
+-------------------------------------------------------------------+
| regexp_replace(Utf8("aBc"),Utf8("(b|d)"),Utf8("Ab\1a"),Utf8("i")) |
+-------------------------------------------------------------------+
| aAbBac                                                            |
+-------------------------------------------------------------------+

Additional examples can be found here

Time and Date Functions

current_date

Returns the current date in the session time zone.

The current_date() return value is determined at query time and will return the same date, no matter when in the query plan the function executes.

sql
current_date()
    (optional) SET datafusion.execution.time_zone = '+00:00';
    SELECT current_date();

Aliases

  • today

current_time

Returns the current time in the session time zone.

The current_time() return value is determined at query time and will return the same time, no matter when in the query plan the function executes.

The session time zone can be set using the statement 'SET datafusion.execution.time_zone = desired time zone'. The time zone can be a value like +00:00, 'Europe/London' etc.

sql
current_time()
    (optional) SET datafusion.execution.time_zone = '+00:00';
    SELECT current_time();

current_timestamp

Alias of now.

date_bin

Calculates time intervals and returns the start of the interval nearest to the specified timestamp. Use date_bin to downsample time series data by grouping rows into time-based "bins" or "windows" and applying an aggregate or selector function to each window.

For example, if you "bin" or "window" data into 15 minute intervals, an input timestamp of 2023-01-01T18:18:18Z will be updated to the start time of the 15 minute bin it is in: 2023-01-01T18:15:00Z.

sql
date_bin(interval, expression, origin-timestamp)

Arguments

  • interval: Bin interval.

  • expression: Time expression to operate on. Can be a constant, column, or function.

  • origin-timestamp: Optional. Starting point used to determine bin boundaries. If not specified defaults 1970-01-01T00:00:00Z (the UNIX epoch in UTC). The following intervals are supported:

    • nanoseconds
    • microseconds
    • milliseconds
    • seconds
    • minutes
    • hours
    • days
    • weeks
    • months
    • years
    • century

Example

sql
-- Bin the timestamp into 1 day intervals
> SELECT date_bin(interval '1 day', time) as bin
FROM VALUES ('2023-01-01T18:18:18Z'), ('2023-01-03T19:00:03Z')  t(time);
+---------------------+
| bin                 |
+---------------------+
| 2023-01-01T00:00:00 |
| 2023-01-03T00:00:00 |
+---------------------+
2 row(s) fetched.

-- Bin the timestamp into 1 day intervals starting at 3AM on  2023-01-01
> SELECT date_bin(interval '1 day', time,  '2023-01-01T03:00:00') as bin
FROM VALUES ('2023-01-01T18:18:18Z'), ('2023-01-03T19:00:03Z')  t(time);
+---------------------+
| bin                 |
+---------------------+
| 2023-01-01T03:00:00 |
| 2023-01-03T03:00:00 |
+---------------------+
2 row(s) fetched.

-- Bin the time into 15 minute intervals starting at 1 min
>  SELECT date_bin(interval '15 minutes', time, TIME '00:01:00') as bin
FROM VALUES (TIME '02:18:18'), (TIME '19:00:03')  t(time);
+----------+
| bin      |
+----------+
| 02:16:00 |
| 18:46:00 |
+----------+
2 row(s) fetched.

date_format

Alias of to_char.

date_part

Returns the specified part of the date as an integer.

sql
date_part(part, expression)

Arguments

  • part: Part of the date to return. The following date parts are supported:

    • year
    • isoyear (ISO 8601 week-numbering year)
    • quarter (emits value in inclusive range [1, 4] based on which quartile of the year the date is in)
    • month
    • week (week of the year)
    • day (day of the month)
    • hour
    • minute
    • second
    • millisecond
    • microsecond
    • nanosecond
    • dow (day of the week where Sunday is 0)
    • doy (day of the year)
    • epoch (seconds since Unix epoch for timestamps/dates, total seconds for intervals)
    • isodow (day of the week where Monday is 0)
  • expression: Time expression to operate on. Can be a constant, column, or function.

Alternative Syntax

sql
extract(field FROM source)

Aliases

  • datepart

date_trunc

Truncates a timestamp or time value to a specified precision.

sql
date_trunc(precision, expression)

Arguments

  • precision: Time precision to truncate to. The following precisions are supported:

    For Timestamp types:

    • year / YEAR
    • quarter / QUARTER
    • month / MONTH
    • week / WEEK
    • day / DAY
    • hour / HOUR
    • minute / MINUTE
    • second / SECOND
    • millisecond / MILLISECOND
    • microsecond / MICROSECOND

    For Time types (hour, minute, second, millisecond, microsecond only):

    • hour / HOUR
    • minute / MINUTE
    • second / SECOND
    • millisecond / MILLISECOND
    • microsecond / MICROSECOND
  • expression: Timestamp or time expression to operate on. Can be a constant, column, or function.

Aliases

  • datetrunc

datepart

Alias of date_part.

datetrunc

Alias of date_trunc.

from_unixtime

Converts an integer to RFC3339 timestamp format (YYYY-MM-DDT00:00:00.000000000Z). Integers and unsigned integers are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z) return the corresponding timestamp.

sql
from_unixtime(expression[, timezone])

Arguments

  • expression: The expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • timezone: Optional timezone to use when converting the integer to a timestamp. If not provided, the default timezone is UTC.

Example

sql
> select from_unixtime(1599572549, 'America/New_York');
+-----------------------------------------------------------+
| from_unixtime(Int64(1599572549),Utf8("America/New_York")) |
+-----------------------------------------------------------+
| 2020-09-08T09:42:29-04:00                                 |
+-----------------------------------------------------------+

make_date

Make a date from year/month/day component parts.

sql
make_date(year, month, day)

Arguments

  • year: Year to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
  • month: Month to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
  • day: Day to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.

Example

sql
> select make_date(2023, 1, 31);
+-------------------------------------------+
| make_date(Int64(2023),Int64(1),Int64(31)) |
+-------------------------------------------+
| 2023-01-31                                |
+-------------------------------------------+
> select make_date('2023', '01', '31');
+-----------------------------------------------+
| make_date(Utf8("2023"),Utf8("01"),Utf8("31")) |
+-----------------------------------------------+
| 2023-01-31                                    |
+-----------------------------------------------+

Additional examples can be found here

make_time

Make a time from hour/minute/second component parts.

sql
make_time(hour, minute, second)

Arguments

  • hour: Hour to use when making the time. Can be a constant, column or function, and any combination of arithmetic operators.
  • minute: Minute to use when making the time. Can be a constant, column or function, and any combination of arithmetic operators.
  • second: Second to use when making the time. Can be a constant, column or function, and any combination of arithmetic operators.

Example

sql
> select make_time(13, 23, 1);
+-------------------------------------------+
| make_time(Int64(13),Int64(23),Int64(1))   |
+-------------------------------------------+
| 13:23:01                                  |
+-------------------------------------------+
> select make_time('23', '01', '31');
+-----------------------------------------------+
| make_time(Utf8("23"),Utf8("01"),Utf8("31"))   |
+-----------------------------------------------+
| 23:01:31                                      |
+-----------------------------------------------+

Additional examples can be found here

now

Returns the current timestamp in the system configured timezone (None by default).

The now() return value is determined at query time and will return the same timestamp, no matter when in the query plan the function executes.

sql
now()

Aliases

  • current_timestamp

to_char

Returns a string representation of a date, time, timestamp or duration based on a Chrono format. Unlike the PostgreSQL equivalent of this function numerical formatting is not supported.

sql
to_char(expression, format)

Arguments

  • expression: Expression to operate on. Can be a constant, column, or function that results in a date, time, timestamp or duration.
  • format: A Chrono format string to use to convert the expression.
  • day: Day to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.

Example

sql
> select to_char('2023-03-01'::date, '%d-%m-%Y');
+----------------------------------------------+
| to_char(Utf8("2023-03-01"),Utf8("%d-%m-%Y")) |
+----------------------------------------------+
| 01-03-2023                                   |
+----------------------------------------------+

Additional examples can be found here

Aliases

  • date_format

to_date

Converts a value to a date (YYYY-MM-DD). Supports strings, numeric and timestamp types as input. Strings are parsed as YYYY-MM-DD (e.g. '2023-07-20') if no Chrono formats are provided. Integers and doubles are interpreted as days since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding date.

Note: to_date returns Date32, which represents its values as the number of days since unix epoch(1970-01-01) stored as signed 32 bit value. The largest supported date value is 9999-12-31.

sql
to_date('2017-05-31', '%Y-%m-%d')

Arguments

  • expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

sql
> select to_date('2023-01-31');
+-------------------------------+
| to_date(Utf8("2023-01-31")) |
+-------------------------------+
| 2023-01-31                    |
+-------------------------------+
> select to_date('2023/01/31', '%Y-%m-%d', '%Y/%m/%d');
+---------------------------------------------------------------------+
| to_date(Utf8("2023/01/31"),Utf8("%Y-%m-%d"),Utf8("%Y/%m/%d")) |
+---------------------------------------------------------------------+
| 2023-01-31                                                          |
+---------------------------------------------------------------------+

Additional examples can be found here

to_local_time

Converts a timestamp with a timezone to a timestamp without a timezone (with no offset or timezone information). This function handles daylight saving time changes.

sql
to_local_time(expression)

Arguments

  • expression: Time expression to operate on. Can be a constant, column, or function.

Example

sql
> SELECT to_local_time('2024-04-01T00:00:20Z'::timestamp);
+---------------------------------------------+
| to_local_time(Utf8("2024-04-01T00:00:20Z")) |
+---------------------------------------------+
| 2024-04-01T00:00:20                         |
+---------------------------------------------+

> SELECT to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels');
+---------------------------------------------+
| to_local_time(Utf8("2024-04-01T00:00:20Z")) |
+---------------------------------------------+
| 2024-04-01T00:00:20                         |
+---------------------------------------------+

> SELECT
  time,
  arrow_typeof(time) as type,
  to_local_time(time) as to_local_time,
  arrow_typeof(to_local_time(time)) as to_local_time_type
FROM (
  SELECT '2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels' AS time
);
+---------------------------+----------------------------------+---------------------+--------------------+
| time                      | type                             | to_local_time       | to_local_time_type |
+---------------------------+----------------------------------+---------------------+--------------------+
| 2024-04-01T00:00:20+02:00 | Timestamp(ns, "Europe/Brussels") | 2024-04-01T00:00:20 | Timestamp(ns)      |
+---------------------------+----------------------------------+---------------------+--------------------+

# combine `to_local_time()` with `date_bin()` to bin on boundaries in the timezone rather
# than UTC boundaries

> SELECT date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AS date_bin;
+---------------------+
| date_bin            |
+---------------------+
| 2024-04-01T00:00:00 |
+---------------------+

> SELECT date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AT TIME ZONE 'Europe/Brussels' AS date_bin_with_timezone;
+---------------------------+
| date_bin_with_timezone    |
+---------------------------+
| 2024-04-01T00:00:00+02:00 |
+---------------------------+

to_time

Converts a value to a time (HH:MM:SS.nnnnnnnnn). Supports strings and timestamps as input. Strings are parsed as HH:MM:SS, HH:MM:SS.nnnnnnnnn, or HH:MM if no Chrono formats are provided. Timestamps will have the time portion extracted. Returns the corresponding time.

Note: to_time returns Time64(Nanosecond), which represents the time of day in nanoseconds since midnight.

sql
to_time('12:30:45', '%H:%M:%S')

Arguments

  • expression: String or Timestamp expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

sql
> select to_time('12:30:45');
+---------------------------+
| to_time(Utf8("12:30:45")) |
+---------------------------+
| 12:30:45                  |
+---------------------------+
> select to_time('12-30-45', '%H-%M-%S');
+--------------------------------------------+
| to_time(Utf8("12-30-45"),Utf8("%H-%M-%S")) |
+--------------------------------------------+
| 12:30:45                                   |
+--------------------------------------------+
> select to_time('2024-01-15 14:30:45'::timestamp);
+--------------------------------------------------+
| to_time(Utf8("2024-01-15 14:30:45"))             |
+--------------------------------------------------+
| 14:30:45                                         |
+--------------------------------------------------+

Additional examples can be found here

to_timestamp

Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000<TZ>) in the session time zone. Supports strings, integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Strings that parse without a time zone are treated as if they are in the session time zone, or UTC if no session time zone is set. Integers, unsigned integers, and doubles are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z).

Note: to_timestamp returns Timestamp(ns, TimeZone) where the time zone is the session time zone. The supported range for integer input is between-9223372037 and 9223372036. Supported range for string input is between 1677-09-21T00:12:44.0 and 2262-04-11T23:47:16.0. Please use to_timestamp_seconds for the input outside of supported bounds.

The session time zone can be set using the statement SET TIMEZONE = 'desired time zone'. The time zone can be a value like +00:00, 'Europe/London' etc.

sql
to_timestamp(expression[, ..., format_n])

Arguments

  • expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
  • format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned. Note: parsing of named timezones (e.g. 'America/New_York') using %Z is only supported at the end of the string preceded by a space.

Example

sql
> select to_timestamp('2023-01-31T09:26:56.123456789-05:00');
+-----------------------------------------------------------+
| to_timestamp(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-----------------------------------------------------------+
| 2023-01-31T14:26:56.123456789                             |
+-----------------------------------------------------------+
> select to_timestamp('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+--------------------------------------------------------------------------------------------------------+
| to_timestamp(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+--------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456789                                                                          |
+--------------------------------------------------------------------------------------------------------+

Additional examples can be found here

to_timestamp_micros

Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000<TZ>) in the session time zone. Supports strings, integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Strings that parse without a time zone are treated as if they are in the session time zone, or UTC if no session time zone is set. Integers, unsigned integers, and doubles are interpreted as microseconds since the unix epoch (1970-01-01T00:00:00Z).

The session time zone can be set using the statement SET TIMEZONE = 'desired time zone'. The time zone can be a value like +00:00, 'Europe/London' etc.

sql
to_timestamp_micros(expression[, ..., format_n])

Arguments

  • expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
  • format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned. Note: parsing of named timezones (e.g. 'America/New_York') using %Z is only supported at the end of the string preceded by a space.

Example

sql
> select to_timestamp_micros('2023-01-31T09:26:56.123456789-05:00');
+------------------------------------------------------------------+
| to_timestamp_micros(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+------------------------------------------------------------------+
| 2023-01-31T14:26:56.123456                                       |
+------------------------------------------------------------------+
> select to_timestamp_micros('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+---------------------------------------------------------------------------------------------------------------+
| to_timestamp_micros(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+---------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456                                                                                    |
+---------------------------------------------------------------------------------------------------------------+

Additional examples can be found here

to_timestamp_millis

Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000<TZ>) in the session time zone. Supports strings, integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Strings that parse without a time zone are treated as if they are in the session time zone, or UTC if no session time zone is set. Integers, unsigned integers, and doubles are interpreted as milliseconds since the unix epoch (1970-01-01T00:00:00Z).

The session time zone can be set using the statement SET TIMEZONE = 'desired time zone'. The time zone can be a value like +00:00, 'Europe/London' etc.

sql
to_timestamp_millis(expression[, ..., format_n])

Arguments

  • expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
  • format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned. Note: parsing of named timezones (e.g. 'America/New_York') using %Z is only supported at the end of the string preceded by a space.

Example

sql
> select to_timestamp_millis('2023-01-31T09:26:56.123456789-05:00');
+------------------------------------------------------------------+
| to_timestamp_millis(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+------------------------------------------------------------------+
| 2023-01-31T14:26:56.123                                          |
+------------------------------------------------------------------+
> select to_timestamp_millis('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+---------------------------------------------------------------------------------------------------------------+
| to_timestamp_millis(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+---------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123                                                                                       |
+---------------------------------------------------------------------------------------------------------------+

Additional examples can be found here

to_timestamp_nanos

Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000000<TZ>) in the session time zone. Supports strings, integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Strings that parse without a time zone are treated as if they are in the session time zone. Integers, unsigned integers, and doubles are interpreted as nanoseconds since the unix epoch (1970-01-01T00:00:00Z).

The session time zone can be set using the statement SET TIMEZONE = 'desired time zone'. The time zone can be a value like +00:00, 'Europe/London' etc.

sql
to_timestamp_nanos(expression[, ..., format_n])

Arguments

  • expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
  • format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned. Note: parsing of named timezones (e.g. 'America/New_York') using %Z is only supported at the end of the string preceded by a space.

Example

sql
> select to_timestamp_nanos('2023-01-31T09:26:56.123456789-05:00');
+-----------------------------------------------------------------+
| to_timestamp_nanos(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-----------------------------------------------------------------+
| 2023-01-31T14:26:56.123456789                                   |
+-----------------------------------------------------------------+
> select to_timestamp_nanos('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+--------------------------------------------------------------------------------------------------------------+
| to_timestamp_nanos(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+--------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456789                                                                                |
+---------------------------------------------------------------------------------------------------------------+

Additional examples can be found here

to_timestamp_seconds

Converts a value to a timestamp (YYYY-MM-DDT00:00:00<TZ>) in the session time zone. Supports strings, integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Strings that parse without a time zone are treated as if they are in the session time zone, or UTC if no session time zone is set. Integers, unsigned integers, and doubles are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z).

The session time zone can be set using the statement SET TIMEZONE = 'desired time zone'. The time zone can be a value like +00:00, 'Europe/London' etc.

sql
to_timestamp_seconds(expression[, ..., format_n])

Arguments

  • expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
  • format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned. Note: parsing of named timezones (e.g. 'America/New_York') using %Z is only supported at the end of the string preceded by a space.

Example

sql
> select to_timestamp_seconds('2023-01-31T09:26:56.123456789-05:00');
+-------------------------------------------------------------------+
| to_timestamp_seconds(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-------------------------------------------------------------------+
| 2023-01-31T14:26:56                                               |
+-------------------------------------------------------------------+
> select to_timestamp_seconds('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+----------------------------------------------------------------------------------------------------------------+
| to_timestamp_seconds(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+----------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00                                                                                            |
+----------------------------------------------------------------------------------------------------------------+

Additional examples can be found here

to_unixtime

Converts a value to seconds since the unix epoch (1970-01-01T00:00:00). Supports strings, dates, timestamps, integer, unsigned integer, and float types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers, unsigned integers, and floats are interpreted as seconds since the unix epoch (1970-01-01T00:00:00).

sql
to_unixtime(expression[, ..., format_n])

Arguments

  • expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
  • format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.

Example

sql
> select to_unixtime('2020-09-08T12:00:00+00:00');
+------------------------------------------------+
| to_unixtime(Utf8("2020-09-08T12:00:00+00:00")) |
+------------------------------------------------+
| 1599566400                                     |
+------------------------------------------------+
> select to_unixtime('01-14-2023 01:01:30+05:30', '%q', '%d-%m-%Y %H/%M/%S', '%+', '%m-%d-%Y %H:%M:%S%#z');
+-----------------------------------------------------------------------------------------------------------------------------+
| to_unixtime(Utf8("01-14-2023 01:01:30+05:30"),Utf8("%q"),Utf8("%d-%m-%Y %H/%M/%S"),Utf8("%+"),Utf8("%m-%d-%Y %H:%M:%S%#z")) |
+-----------------------------------------------------------------------------------------------------------------------------+
| 1673638290                                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------------+

today

Alias of current_date.

Array Functions

array_any_value

Returns the first non-null element in the array.

sql
array_any_value(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_any_value([NULL, 1, 2, 3]);
+-------------------------------+
| array_any_value(List([NULL,1,2,3])) |
+-------------------------------------+
| 1                                   |
+-------------------------------------+

Aliases

  • list_any_value

array_append

Appends an element to the end of an array.

sql
array_append(array, element)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • element: Element to append to the array.

Example

sql
> select array_append([1, 2, 3], 4);
+--------------------------------------+
| array_append(List([1,2,3]),Int64(4)) |
+--------------------------------------+
| [1, 2, 3, 4]                         |
+--------------------------------------+

Aliases

  • list_append
  • array_push_back
  • list_push_back

array_cat

Alias of array_concat.

array_concat

Concatenates arrays.

sql
array_concat(array[, ..., array_n])

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • array_n: Subsequent array column or literal array to concatenate.

Example

sql
> select array_concat([1, 2], [3, 4], [5, 6]);
+---------------------------------------------------+
| array_concat(List([1,2]),List([3,4]),List([5,6])) |
+---------------------------------------------------+
| [1, 2, 3, 4, 5, 6]                                |
+---------------------------------------------------+

Aliases

  • array_cat
  • list_concat
  • list_cat

array_contains

Alias of array_has.

array_dims

Returns an array of the array's dimensions.

sql
array_dims(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_dims([[1, 2, 3], [4, 5, 6]]);
+---------------------------------+
| array_dims(List([1,2,3,4,5,6])) |
+---------------------------------+
| [2, 3]                          |
+---------------------------------+

Aliases

  • list_dims

array_distance

Returns the Euclidean distance between two input arrays of equal length.

sql
array_distance(array1, array2)

Arguments

  • array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • array2: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_distance([1, 2], [1, 4]);
+------------------------------------+
| array_distance(List([1,2], [1,4])) |
+------------------------------------+
| 2.0                                |
+------------------------------------+

Aliases

  • list_distance

array_distinct

Returns distinct values from the array after removing duplicates.

sql
array_distinct(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_distinct([1, 3, 2, 3, 1, 2, 4]);
+---------------------------------+
| array_distinct(List([1,2,3,4])) |
+---------------------------------+
| [1, 2, 3, 4]                    |
+---------------------------------+

Aliases

  • list_distinct

array_element

Extracts the element with the index n from the array.

sql
array_element(array, index)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • index: Index to extract the element from the array.

Example

sql
> select array_element([1, 2, 3, 4], 3);
+-----------------------------------------+
| array_element(List([1,2,3,4]),Int64(3)) |
+-----------------------------------------+
| 3                                       |
+-----------------------------------------+

Aliases

  • array_extract
  • list_element
  • list_extract

array_empty

Alias of empty.

array_except

Returns an array of the elements that appear in the first array but not in the second.

sql
array_except(array1, array2)

Arguments

  • array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • array2: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_except([1, 2, 3, 4], [5, 6, 3, 4]);
+----------------------------------------------------+
| array_except([1, 2, 3, 4], [5, 6, 3, 4]);           |
+----------------------------------------------------+
| [1, 2]                                              |
+----------------------------------------------------+
> select array_except([1, 2, 3, 4], [3, 4, 5, 6]);
+----------------------------------------------------+
| array_except([1, 2, 3, 4], [3, 4, 5, 6]);           |
+----------------------------------------------------+
| [1, 2]                                              |
+----------------------------------------------------+

Aliases

  • list_except

array_extract

Alias of array_element.

array_has

Returns true if the array contains the element.

sql
array_has(array, element)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • element: Scalar or Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_has([1, 2, 3], 2);
+-----------------------------+
| array_has(List([1,2,3]), 2) |
+-----------------------------+
| true                        |
+-----------------------------+

Aliases

  • list_has
  • array_contains
  • list_contains

array_has_all

Returns true if all elements of sub-array exist in array.

sql
array_has_all(array, sub-array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • sub-array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_has_all([1, 2, 3, 4], [2, 3]);
+--------------------------------------------+
| array_has_all(List([1,2,3,4]), List([2,3])) |
+--------------------------------------------+
| true                                       |
+--------------------------------------------+

Aliases

  • list_has_all

array_has_any

Returns true if the arrays have any elements in common.

sql
array_has_any(array1, array2)

Arguments

  • array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • array2: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_has_any([1, 2, 3], [3, 4]);
+------------------------------------------+
| array_has_any(List([1,2,3]), List([3,4])) |
+------------------------------------------+
| true                                     |
+------------------------------------------+

Aliases

  • list_has_any
  • arrays_overlap

array_indexof

Alias of array_position.

array_intersect

Returns an array of elements in the intersection of array1 and array2.

sql
array_intersect(array1, array2)

Arguments

  • array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • array2: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_intersect([1, 2, 3, 4], [5, 6, 3, 4]);
+----------------------------------------------------+
| array_intersect([1, 2, 3, 4], [5, 6, 3, 4]);       |
+----------------------------------------------------+
| [3, 4]                                             |
+----------------------------------------------------+
> select array_intersect([1, 2, 3, 4], [5, 6, 7, 8]);
+----------------------------------------------------+
| array_intersect([1, 2, 3, 4], [5, 6, 7, 8]);       |
+----------------------------------------------------+
| []                                                 |
+----------------------------------------------------+

Aliases

  • list_intersect

array_join

Alias of array_to_string.

array_length

Returns the length of the array dimension.

sql
array_length(array, dimension)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • dimension: Array dimension.

Example

sql
> select array_length([1, 2, 3, 4, 5], 1);
+-------------------------------------------+
| array_length(List([1,2,3,4,5]), 1)        |
+-------------------------------------------+
| 5                                         |
+-------------------------------------------+

Aliases

  • list_length

array_max

Returns the maximum value in the array.

sql
array_max(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_max([3,1,4,2]);
+-----------------------------------------+
| array_max(List([3,1,4,2]))              |
+-----------------------------------------+
| 4                                       |
+-----------------------------------------+

Aliases

  • list_max

array_min

Returns the minimum value in the array.

sql
array_min(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_min([3,1,4,2]);
+-----------------------------------------+
| array_min(List([3,1,4,2]))              |
+-----------------------------------------+
| 1                                       |
+-----------------------------------------+

array_ndims

Returns the number of dimensions of the array.

sql
array_ndims(array, element)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • element: Array element.

Example

sql
> select array_ndims([[1, 2, 3], [4, 5, 6]]);
+----------------------------------+
| array_ndims(List([1,2,3,4,5,6])) |
+----------------------------------+
| 2                                |
+----------------------------------+

Aliases

  • list_ndims

array_pop_back

Returns the array without the last element.

sql
array_pop_back(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_pop_back([1, 2, 3]);
+-------------------------------+
| array_pop_back(List([1,2,3])) |
+-------------------------------+
| [1, 2]                        |
+-------------------------------+

Aliases

  • list_pop_back

array_pop_front

Returns the array without the first element.

sql
array_pop_front(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_pop_front([1, 2, 3]);
+-------------------------------+
| array_pop_front(List([1,2,3])) |
+-------------------------------+
| [2, 3]                        |
+-------------------------------+

Aliases

  • list_pop_front

array_position

Returns the position of the first occurrence of the specified element in the array, or NULL if not found. Comparisons are done using IS DISTINCT FROM semantics, so NULL is considered to match NULL.

sql
array_position(array, element)
array_position(array, element, index)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • element: Element to search for in the array.
  • index: Index at which to start searching (1-indexed).

Example

sql
> select array_position([1, 2, 2, 3, 1, 4], 2);
+----------------------------------------------+
| array_position(List([1,2,2,3,1,4]),Int64(2)) |
+----------------------------------------------+
| 2                                            |
+----------------------------------------------+
> select array_position([1, 2, 2, 3, 1, 4], 2, 3);
+----------------------------------------------------+
| array_position(List([1,2,2,3,1,4]),Int64(2), Int64(3)) |
+----------------------------------------------------+
| 3                                                  |
+----------------------------------------------------+

Aliases

  • list_position
  • array_indexof
  • list_indexof

array_positions

Searches for an element in the array, returns all occurrences.

sql
array_positions(array, element)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • element: Element to search for position in the array.

Example

sql
> select array_positions([1, 2, 2, 3, 1, 4], 2);
+-----------------------------------------------+
| array_positions(List([1,2,2,3,1,4]),Int64(2)) |
+-----------------------------------------------+
| [2, 3]                                        |
+-----------------------------------------------+

Aliases

  • list_positions

array_prepend

Prepends an element to the beginning of an array.

sql
array_prepend(element, array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • element: Element to prepend to the array.

Example

sql
> select array_prepend(1, [2, 3, 4]);
+---------------------------------------+
| array_prepend(Int64(1),List([2,3,4])) |
+---------------------------------------+
| [1, 2, 3, 4]                          |
+---------------------------------------+

Aliases

  • list_prepend
  • array_push_front
  • list_push_front

array_push_back

Alias of array_append.

array_push_front

Alias of array_prepend.

array_remove

Removes the first element from the array equal to the given value.

sql
array_remove(array, element)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • element: Element to be removed from the array.

Example

sql
> select array_remove([1, 2, 2, 3, 2, 1, 4], 2);
+----------------------------------------------+
| array_remove(List([1,2,2,3,2,1,4]),Int64(2)) |
+----------------------------------------------+
| [1, 2, 3, 2, 1, 4]                           |
+----------------------------------------------+

Aliases

  • list_remove

array_remove_all

Removes all elements from the array equal to the given value.

sql
array_remove_all(array, element)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • element: Element to be removed from the array.

Example

sql
> select array_remove_all([1, 2, 2, 3, 2, 1, 4], 2);
+--------------------------------------------------+
| array_remove_all(List([1,2,2,3,2,1,4]),Int64(2)) |
+--------------------------------------------------+
| [1, 3, 1, 4]                                     |
+--------------------------------------------------+

Aliases

  • list_remove_all

array_remove_n

Removes the first max elements from the array equal to the given value.

sql
array_remove_n(array, element, max))

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • element: Element to be removed from the array.
  • max: Number of first occurrences to remove.

Example

sql
> select array_remove_n([1, 2, 2, 3, 2, 1, 4], 2, 2);
+---------------------------------------------------------+
| array_remove_n(List([1,2,2,3,2,1,4]),Int64(2),Int64(2)) |
+---------------------------------------------------------+
| [1, 3, 2, 1, 4]                                         |
+---------------------------------------------------------+

Aliases

  • list_remove_n

array_repeat

Returns an array containing element count times.

sql
array_repeat(element, count)

Arguments

  • element: Element expression. Can be a constant, column, or function, and any combination of array operators.
  • count: Value of how many times to repeat the element.

Example

sql
> select array_repeat(1, 3);
+---------------------------------+
| array_repeat(Int64(1),Int64(3)) |
+---------------------------------+
| [1, 1, 1]                       |
+---------------------------------+
> select array_repeat([1, 2], 2);
+------------------------------------+
| array_repeat(List([1,2]),Int64(2)) |
+------------------------------------+
| [[1, 2], [1, 2]]                   |
+------------------------------------+

Aliases

  • list_repeat

array_replace

Replaces the first occurrence of the specified element with another specified element.

sql
array_replace(array, from, to)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • from: Initial element.
  • to: Final element.

Example

sql
> select array_replace([1, 2, 2, 3, 2, 1, 4], 2, 5);
+--------------------------------------------------------+
| array_replace(List([1,2,2,3,2,1,4]),Int64(2),Int64(5)) |
+--------------------------------------------------------+
| [1, 5, 2, 3, 2, 1, 4]                                  |
+--------------------------------------------------------+

Aliases

  • list_replace

array_replace_all

Replaces all occurrences of the specified element with another specified element.

sql
array_replace_all(array, from, to)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • from: Initial element.
  • to: Final element.

Example

sql
> select array_replace_all([1, 2, 2, 3, 2, 1, 4], 2, 5);
+------------------------------------------------------------+
| array_replace_all(List([1,2,2,3,2,1,4]),Int64(2),Int64(5)) |
+------------------------------------------------------------+
| [1, 5, 5, 3, 5, 1, 4]                                      |
+------------------------------------------------------------+

Aliases

  • list_replace_all

array_replace_n

Replaces the first max occurrences of the specified element with another specified element.

sql
array_replace_n(array, from, to, max)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • from: Initial element.
  • to: Final element.
  • max: Number of first occurrences to replace.

Example

sql
> select array_replace_n([1, 2, 2, 3, 2, 1, 4], 2, 5, 2);
+-------------------------------------------------------------------+
| array_replace_n(List([1,2,2,3,2,1,4]),Int64(2),Int64(5),Int64(2)) |
+-------------------------------------------------------------------+
| [1, 5, 5, 3, 2, 1, 4]                                             |
+-------------------------------------------------------------------+

Aliases

  • list_replace_n

array_resize

Resizes the list to contain size elements. Initializes new elements with value or empty if value is not set.

sql
array_resize(array, size, value)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • size: New size of given array.
  • value: Defines new elements' value or empty if value is not set.

Example

sql
> select array_resize([1, 2, 3], 5, 0);
+-------------------------------------+
| array_resize(List([1,2,3],5,0))     |
+-------------------------------------+
| [1, 2, 3, 0, 0]                     |
+-------------------------------------+

Aliases

  • list_resize

array_reverse

Returns the array with the order of the elements reversed.

sql
array_reverse(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_reverse([1, 2, 3, 4]);
+------------------------------------------------------------+
| array_reverse(List([1, 2, 3, 4]))                          |
+------------------------------------------------------------+
| [4, 3, 2, 1]                                               |
+------------------------------------------------------------+

Aliases

  • list_reverse

array_slice

Returns a slice of the array based on 1-indexed start and end positions.

sql
array_slice(array, begin, end)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • begin: Index of the first element. If negative, it counts backward from the end of the array.
  • end: Index of the last element. If negative, it counts backward from the end of the array.
  • stride: Stride of the array slice. The default is 1.

Example

sql
> select array_slice([1, 2, 3, 4, 5, 6, 7, 8], 3, 6);
+--------------------------------------------------------+
| array_slice(List([1,2,3,4,5,6,7,8]),Int64(3),Int64(6)) |
+--------------------------------------------------------+
| [3, 4, 5, 6]                                           |
+--------------------------------------------------------+

Aliases

  • list_slice

array_sort

Sort array.

sql
array_sort(array, desc, nulls_first)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • desc: Whether to sort in descending order(ASC or DESC).
  • nulls_first: Whether to sort nulls first(NULLS FIRST or NULLS LAST).

Example

sql
> select array_sort([3, 1, 2]);
+-----------------------------+
| array_sort(List([3,1,2]))   |
+-----------------------------+
| [1, 2, 3]                   |
+-----------------------------+

Aliases

  • list_sort

array_to_string

Converts each element to its text representation.

sql
array_to_string(array, delimiter[, null_string])

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • delimiter: Array element separator.
  • null_string: Optional. String to use for null values in the output. If not provided, nulls will be omitted.

Example

sql
> select array_to_string([[1, 2, 3, 4], [5, 6, 7, 8]], ',');
+----------------------------------------------------+
| array_to_string(List([1,2,3,4,5,6,7,8]),Utf8(",")) |
+----------------------------------------------------+
| 1,2,3,4,5,6,7,8                                    |
+----------------------------------------------------+

Aliases

  • list_to_string
  • array_join
  • list_join

array_union

Returns an array of elements that are present in both arrays (all elements from both arrays) without duplicates.

sql
array_union(array1, array2)

Arguments

  • array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
  • array2: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select array_union([1, 2, 3, 4], [5, 6, 3, 4]);
+----------------------------------------------------+
| array_union([1, 2, 3, 4], [5, 6, 3, 4]);           |
+----------------------------------------------------+
| [1, 2, 3, 4, 5, 6]                                 |
+----------------------------------------------------+
> select array_union([1, 2, 3, 4], [5, 6, 7, 8]);
+----------------------------------------------------+
| array_union([1, 2, 3, 4], [5, 6, 7, 8]);           |
+----------------------------------------------------+
| [1, 2, 3, 4, 5, 6, 7, 8]                           |
+----------------------------------------------------+

Aliases

  • list_union

arrays_overlap

Alias of array_has_any.

arrays_zip

Returns an array of structs created by combining the elements of each input array at the same index. If the arrays have different lengths, shorter arrays are padded with NULLs.

sql
arrays_zip(array1, array2[, ..., array_n])

Arguments

  • array1: First array expression.
  • array2: Second array expression.
  • array_n: Subsequent array expressions.

Example

sql
> select arrays_zip([1, 2, 3], ['a', 'b', 'c']);
+---------------------------------------------------+
| arrays_zip([1, 2, 3], ['a', 'b', 'c'])             |
+---------------------------------------------------+
| [{c0: 1, c1: a}, {c0: 2, c1: b}, {c0: 3, c1: c}] |
+---------------------------------------------------+
> select arrays_zip([1, 2], [3, 4, 5]);
+---------------------------------------------------+
| arrays_zip([1, 2], [3, 4, 5])                       |
+---------------------------------------------------+
| [{c0: 1, c1: 3}, {c0: 2, c1: 4}, {c0: , c1: 5}]  |
+---------------------------------------------------+

Aliases

  • list_zip

cardinality

Returns the total number of elements in the array.

sql
cardinality(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select cardinality([[1, 2, 3, 4], [5, 6, 7, 8]]);
+--------------------------------------+
| cardinality(List([1,2,3,4,5,6,7,8])) |
+--------------------------------------+
| 8                                    |
+--------------------------------------+

empty

Returns 1 for an empty array or 0 for a non-empty array.

sql
empty(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select empty([1]);
+------------------+
| empty(List([1])) |
+------------------+
| 0                |
+------------------+

Aliases

  • array_empty
  • list_empty

flatten

Converts an array of arrays to a flat array.

  • Applies to any depth of nested arrays
  • Does not change arrays that are already flat

The flattened array contains all the elements from all source arrays.

sql
flatten(array)

Arguments

  • array: Array expression. Can be a constant, column, or function, and any combination of array operators.

Example

sql
> select flatten([[1, 2], [3, 4]]);
+------------------------------+
| flatten(List([1,2], [3,4]))  |
+------------------------------+
| [1, 2, 3, 4]                 |
+------------------------------+

generate_series

Similar to the range function, but it includes the upper bound.

sql
generate_series(stop)
generate_series(start, stop[, step])

Arguments

  • start: Start of the series. Ints, timestamps, dates or string types that can be coerced to Date32 are supported.
  • end: End of the series (included). Type must be the same as start.
  • step: Increase by step (can not be 0). Steps less than a day are supported only for timestamp ranges.

Example

sql
> select generate_series(1,3);
+------------------------------------+
| generate_series(Int64(1),Int64(3)) |
+------------------------------------+
| [1, 2, 3]                          |
+------------------------------------+

list_any_value

Alias of array_any_value.

list_append

Alias of array_append.

list_cat

Alias of array_concat.

list_concat

Alias of array_concat.

list_contains

Alias of array_has.

list_dims

Alias of array_dims.

list_distance

Alias of array_distance.

list_distinct

Alias of array_distinct.

list_element

Alias of array_element.

list_empty

Alias of empty.

list_except

Alias of array_except.

list_extract

Alias of array_element.

list_has

Alias of array_has.

list_has_all

Alias of array_has_all.

list_has_any

Alias of array_has_any.

list_indexof

Alias of array_position.

list_intersect

Alias of array_intersect.

list_join

Alias of array_to_string.

list_length

Alias of array_length.

list_max

Alias of array_max.

list_ndims

Alias of array_ndims.

list_pop_back

Alias of array_pop_back.

list_pop_front

Alias of array_pop_front.

list_position

Alias of array_position.

list_positions

Alias of array_positions.

list_prepend

Alias of array_prepend.

list_push_back

Alias of array_append.

list_push_front

Alias of array_prepend.

list_remove

Alias of array_remove.

list_remove_all

Alias of array_remove_all.

list_remove_n

Alias of array_remove_n.

list_repeat

Alias of array_repeat.

list_replace

Alias of array_replace.

list_replace_all

Alias of array_replace_all.

list_replace_n

Alias of array_replace_n.

list_resize

Alias of array_resize.

list_reverse

Alias of array_reverse.

list_slice

Alias of array_slice.

list_sort

Alias of array_sort.

list_to_string

Alias of array_to_string.

list_union

Alias of array_union.

list_zip

Alias of arrays_zip.

make_array

Returns an array using the specified input expressions.

sql
make_array(expression1[, ..., expression_n])

Arguments

  • expression_n: Expression to include in the output array. Can be a constant, column, or function, and any combination of arithmetic or string operators.

Example

sql
> select make_array(1, 2, 3, 4, 5);
+----------------------------------------------------------+
| make_array(Int64(1),Int64(2),Int64(3),Int64(4),Int64(5)) |
+----------------------------------------------------------+
| [1, 2, 3, 4, 5]                                          |
+----------------------------------------------------------+

Aliases

  • make_list

make_list

Alias of make_array.

range

Returns an Arrow array between start and stop with step. The range start..end contains all values with start <= x < end. It is empty if start >= end. Step cannot be 0.

sql
range(stop)
range(start, stop[, step])

Arguments

  • start: Start of the range. Ints, timestamps, dates or string types that can be coerced to Date32 are supported.
  • end: End of the range (not included). Type must be the same as start.
  • step: Increase by step (cannot be 0). Steps less than a day are supported only for timestamp ranges.

Example

sql
> select range(2, 10, 3);
+-----------------------------------+
| range(Int64(2),Int64(10),Int64(3))|
+-----------------------------------+
| [2, 5, 8]                         |
+-----------------------------------+

> select range(DATE '1992-09-01', DATE '1993-03-01', INTERVAL '1' MONTH);
+--------------------------------------------------------------------------+
| range(DATE '1992-09-01', DATE '1993-03-01', INTERVAL '1' MONTH)          |
+--------------------------------------------------------------------------+
| [1992-09-01, 1992-10-01, 1992-11-01, 1992-12-01, 1993-01-01, 1993-02-01] |
+--------------------------------------------------------------------------+

string_to_array

Splits a string into an array of substrings based on a delimiter. Any substrings matching the optional null_str argument are replaced with NULL.

sql
string_to_array(str, delimiter[, null_str])

Arguments

  • str: String expression to split.
  • delimiter: Delimiter string to split on.
  • null_str: Substring values to be replaced with NULL.

Example

sql
> select string_to_array('abc##def', '##');
+-----------------------------------+
| string_to_array(Utf8('abc##def'))  |
+-----------------------------------+
| ['abc', 'def']                    |
+-----------------------------------+
> select string_to_array('abc def', ' ', 'def');
+---------------------------------------------+
| string_to_array(Utf8('abc def'), Utf8(' '), Utf8('def')) |
+---------------------------------------------+
| ['abc', NULL]                               |
+---------------------------------------------+

Aliases

  • string_to_list

string_to_list

Alias of string_to_array.

Struct Functions

named_struct

Returns an Arrow struct using the specified name and input expressions pairs.

sql
named_struct(expression1_name, expression1_input[, ..., expression_n_name, expression_n_input])

Arguments

  • expression_n_name: Name of the column field. Must be a constant string.
  • expression_n_input: Expression to include in the output struct. Can be a constant, column, or function, and any combination of arithmetic or string operators.

Example

For example, this query converts two columns a and b to a single column with a struct type of fields field_a and field_b:

sql
> select * from t;
+---+---+
| a | b |
+---+---+
| 1 | 2 |
| 3 | 4 |
+---+---+
> select named_struct('field_a', a, 'field_b', b) from t;
+-------------------------------------------------------+
| named_struct(Utf8("field_a"),t.a,Utf8("field_b"),t.b) |
+-------------------------------------------------------+
| {field_a: 1, field_b: 2}                              |
| {field_a: 3, field_b: 4}                              |
+-------------------------------------------------------+

row

Alias of struct.

struct

Returns an Arrow struct using the specified input expressions optionally named. Fields in the returned struct use the optional name or the cN naming convention. For example: c0, c1, c2, etc.

sql
struct(expression1[, ..., expression_n])

Arguments

  • expression1, expression_n: Expression to include in the output struct. Can be a constant, column, or function, any combination of arithmetic or string operators.

Example

For example, this query converts two columns a and b to a single column with a struct type of fields field_a and c1:

sql
> select * from t;
+---+---+
| a | b |
+---+---+
| 1 | 2 |
| 3 | 4 |
+---+---+

-- use default names `c0`, `c1`
> select struct(a, b) from t;
+-----------------+
| struct(t.a,t.b) |
+-----------------+
| {c0: 1, c1: 2}  |
| {c0: 3, c1: 4}  |
+-----------------+

-- name the first field `field_a`
select struct(a as field_a, b) from t;
+--------------------------------------------------+
| named_struct(Utf8("field_a"),t.a,Utf8("c1"),t.b) |
+--------------------------------------------------+
| {field_a: 1, c1: 2}                              |
| {field_a: 3, c1: 4}                              |
+--------------------------------------------------+

Aliases

  • row

Map Functions

element_at

Alias of map_extract.

map

Returns an Arrow map with the specified key-value pairs.

The make_map function creates a map from two lists: one for keys and one for values. Each key must be unique and non-null.

sql
map(key, value)
map(key: value)
make_map(['key1', 'key2'], ['value1', 'value2'])

Arguments

  • key: For map: Expression to be used for key. Can be a constant, column, function, or any combination of arithmetic or string operators. For make_map: The list of keys to be used in the map. Each key must be unique and non-null.
  • value: For map: Expression to be used for value. Can be a constant, column, function, or any combination of arithmetic or string operators. For make_map: The list of values to be mapped to the corresponding keys.

Example

sql
-- Using map function
SELECT MAP('type', 'test');
----
{type: test}

SELECT MAP(['POST', 'HEAD', 'PATCH'], [41, 33, null]);
----
{POST: 41, HEAD: 33, PATCH: NULL}

SELECT MAP([[1,2], [3,4]], ['a', 'b']);
----
{[1, 2]: a, [3, 4]: b}

SELECT MAP { 'a': 1, 'b': 2 };
----
{a: 1, b: 2}

-- Using make_map function
SELECT MAKE_MAP(['POST', 'HEAD'], [41, 33]);
----
{POST: 41, HEAD: 33}

SELECT MAKE_MAP(['key1', 'key2'], ['value1', null]);
----
{key1: value1, key2: }

map_entries

Returns a list of all entries in the map.

sql
map_entries(map)

Arguments

  • map: Map expression. Can be a constant, column, or function, and any combination of map operators.

Example

sql
SELECT map_entries(MAP {'a': 1, 'b': NULL, 'c': 3});
----
[{'key': a, 'value': 1}, {'key': b, 'value': NULL}, {'key': c, 'value': 3}]

SELECT map_entries(map([100, 5], [42, 43]));
----
[{'key': 100, 'value': 42}, {'key': 5, 'value': 43}]

map_extract

Returns a list containing the value for the given key or an empty list if the key is not present in the map.

sql
map_extract(map, key)

Arguments

  • map: Map expression. Can be a constant, column, or function, and any combination of map operators.
  • key: Key to extract from the map. Can be a constant, column, or function, any combination of arithmetic or string operators, or a named expression of the previously listed.

Example

sql
SELECT map_extract(MAP {'a': 1, 'b': NULL, 'c': 3}, 'a');
----
[1]

SELECT map_extract(MAP {1: 'one', 2: 'two'}, 2);
----
['two']

SELECT map_extract(MAP {'x': 10, 'y': NULL, 'z': 30}, 'y');
----
[]

Aliases

  • element_at

map_keys

Returns a list of all keys in the map.

sql
map_keys(map)

Arguments

  • map: Map expression. Can be a constant, column, or function, and any combination of map operators.

Example

sql
SELECT map_keys(MAP {'a': 1, 'b': NULL, 'c': 3});
----
[a, b, c]

SELECT map_keys(map([100, 5], [42, 43]));
----
[100, 5]

map_values

Returns a list of all values in the map.

sql
map_values(map)

Arguments

  • map: Map expression. Can be a constant, column, or function, and any combination of map operators.

Example

sql
SELECT map_values(MAP {'a': 1, 'b': NULL, 'c': 3});
----
[1, , 3]

SELECT map_values(map([100, 5], [42, 43]));
----
[42, 43]

Hashing Functions

digest

Computes the binary hash of an expression using the specified algorithm.

sql
digest(expression, algorithm)

Arguments

  • expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • algorithm: String expression specifying algorithm to use. Must be one of:
    • md5
    • sha224
    • sha256
    • sha384
    • sha512
    • blake2s
    • blake2b
    • blake3

Example

sql
> select digest('foo', 'sha256');
+------------------------------------------------------------------+
| digest(Utf8("foo"),Utf8("sha256"))                               |
+------------------------------------------------------------------+
| 2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae |
+------------------------------------------------------------------+

md5

Computes an MD5 128-bit checksum for a string expression.

sql
md5(expression)

Arguments

  • expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select md5('foo');
+----------------------------------+
| md5(Utf8("foo"))                 |
+----------------------------------+
| acbd18db4cc2f85cedef654fccc4a4d8 |
+----------------------------------+

sha224

Computes the SHA-224 hash of a binary string.

sql
sha224(expression)

Arguments

  • expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select sha224('foo');
+----------------------------------------------------------+
| sha224(Utf8("foo"))                                      |
+----------------------------------------------------------+
| 0808f64e60d58979fcb676c96ec938270dea42445aeefcd3a4e6f8db |
+----------------------------------------------------------+

sha256

Computes the SHA-256 hash of a binary string.

sql
sha256(expression)

Arguments

  • expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select sha256('foo');
+------------------------------------------------------------------+
| sha256(Utf8("foo"))                                              |
+------------------------------------------------------------------+
| 2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae |
+------------------------------------------------------------------+

sha384

Computes the SHA-384 hash of a binary string.

sql
sha384(expression)

Arguments

  • expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select sha384('foo');
+--------------------------------------------------------------------------------------------------+
| sha384(Utf8("foo"))                                                                              |
+--------------------------------------------------------------------------------------------------+
| 98c11ffdfdd540676b1a137cb1a22b2a70350c9a44171d6b1180c6be5cbb2ee3f79d532c8a1dd9ef2e8e08e752a3babb |
+--------------------------------------------------------------------------------------------------+

sha512

Computes the SHA-512 hash of a binary string.

sql
sha512(expression)

Arguments

  • expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
> select sha512('foo');
+----------------------------------------------------------------------------------------------------------------------------------+
| sha512(Utf8("foo"))                                                                                                              |
+----------------------------------------------------------------------------------------------------------------------------------+
| f7fbba6e0636f890e56fbbf3283e524c6fa3204ae298382d624741d0dc6638326e282c41be5e4254d8820772c5518a2c5a8c0c7f7eda19594a7eb539453e1ed7 |
+----------------------------------------------------------------------------------------------------------------------------------+

Union Functions

Functions to work with the union data type, also know as tagged unions, variant types, enums or sum types. Note: Not related to the SQL UNION operator

union_extract

Returns the value of the given field in the union when selected, or NULL otherwise.

sql
union_extract(union, field_name)

Arguments

  • union: Union expression to operate on. Can be a constant, column, or function, and any combination of operators.
  • field_name: String expression to operate on. Must be a constant.

Example

sql
select union_column, union_extract(union_column, 'a'), union_extract(union_column, 'b') from table_with_union;
+--------------+----------------------------------+----------------------------------+
| union_column | union_extract(union_column, 'a') | union_extract(union_column, 'b') |
+--------------+----------------------------------+----------------------------------+
| {a=1}        | 1                                |                                  |
| {b=3.0}      |                                  | 3.0                              |
| {a=4}        | 4                                |                                  |
| {b=}         |                                  |                                  |
| {a=}         |                                  |                                  |
+--------------+----------------------------------+----------------------------------+

union_tag

Returns the name of the currently selected field in the union

sql
union_tag(union_expression)

Arguments

  • union: Union expression to operate on. Can be a constant, column, or function, and any combination of operators.

Example

sql
select union_column, union_tag(union_column) from table_with_union;
+--------------+-------------------------+
| union_column | union_tag(union_column) |
+--------------+-------------------------+
| {a=1}        | a                       |
| {b=3.0}      | b                       |
| {a=4}        | a                       |
| {b=}         | b                       |
| {a=}         | a                       |
+--------------+-------------------------+

Other Functions

arrow_cast

Casts a value to a specific Arrow data type.

sql
arrow_cast(expression, datatype)

Arguments

  • expression: Expression to cast. The expression can be a constant, column, or function, and any combination of operators.
  • datatype: Arrow data type name to cast to, as a string. The format is the same as that returned by [arrow_typeof]

Example

sql
> select
  arrow_cast(-5,    'Int8') as a,
  arrow_cast('foo', 'Dictionary(Int32, Utf8)') as b,
  arrow_cast('bar', 'LargeUtf8') as c;

+----+-----+-----+
| a  | b   | c   |
+----+-----+-----+
| -5 | foo | bar |
+----+-----+-----+

> select
  arrow_cast('2023-01-02T12:53:02', 'Timestamp(µs, "+08:00")') as d,
  arrow_cast('2023-01-02T12:53:02', 'Timestamp(µs)') as e;

+---------------------------+---------------------+
| d                         | e                   |
+---------------------------+---------------------+
| 2023-01-02T12:53:02+08:00 | 2023-01-02T12:53:02 |
+---------------------------+---------------------+

arrow_metadata

Returns the metadata of the input expression. If a key is provided, returns the value for that key. If no key is provided, returns a Map of all metadata.

sql
arrow_metadata(expression[, key])

Arguments

  • expression: The expression to retrieve metadata from. Can be a column or other expression.
  • key: Optional. The specific metadata key to retrieve.

Example

sql
> select arrow_metadata(col) from table;
+----------------------------+
| arrow_metadata(table.col)  |
+----------------------------+
| {k: v}                     |
+----------------------------+
> select arrow_metadata(col, 'k') from table;
+-------------------------------+
| arrow_metadata(table.col, 'k')|
+-------------------------------+
| v                             |
+-------------------------------+

arrow_typeof

Returns the name of the underlying Arrow data type of the expression.

sql
arrow_typeof(expression)

Arguments

  • expression: Expression to evaluate. The expression can be a constant, column, or function, and any combination of operators.

Example

sql
> select arrow_typeof('foo'), arrow_typeof(1);
+---------------------------+------------------------+
| arrow_typeof(Utf8("foo")) | arrow_typeof(Int64(1)) |
+---------------------------+------------------------+
| Utf8                      | Int64                  |
+---------------------------+------------------------+

get_field

Returns a field within a map or a struct with the given key. Supports nested field access by providing multiple field names. Note: most users invoke get_field indirectly via field access syntax such as my_struct_col['field_name'] which results in a call to get_field(my_struct_col, 'field_name'). Nested access like my_struct['a']['b'] is optimized to a single call: get_field(my_struct, 'a', 'b').

sql
get_field(expression, field_name[, field_name2, ...])

Arguments

  • expression: The map or struct to retrieve a field from.
  • field_name: The field name(s) to access, in order for nested access. Must evaluate to strings.

Example

sql
> -- Access a field from a struct column
> create table test( struct_col) as values
    ({name: 'Alice', age: 30}),
    ({name: 'Bob', age: 25});
> select struct_col from test;
+-----------------------------+
| struct_col                  |
+-----------------------------+
| {name: Alice, age: 30}      |
| {name: Bob, age: 25}        |
+-----------------------------+
> select struct_col['name'] as name from test;
+-------+
| name  |
+-------+
| Alice |
| Bob   |
+-------+

> -- Nested field access with multiple arguments
> create table test(struct_col) as values
    ({outer: {inner_val: 42}});
> select struct_col['outer']['inner_val'] as result from test;
+--------+
| result |
+--------+
| 42     |
+--------+

version

Returns the version of DataFusion.

sql
version()

Example

sql
> select version();
+--------------------------------------------+
| version()                                  |
+--------------------------------------------+
| Apache DataFusion 42.0.0, aarch64 on macos |
+--------------------------------------------+