Functions for working with URLs


Overview {#overview}

:::note
The functions mentioned in this section are optimized for maximum performance and, for the most part, do not follow the RFC 3986 standard. Functions that implement RFC 3986 have RFC appended to their function name and are generally slower.
:::

You can generally use the non-RFC function variants when working with publicly registered domains that contain neither user strings nor @ symbols. The table below details which symbols in a URL can (✔) or cannot (✗) be parsed by the respective RFC and non-RFC variants:

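| Symbol | non-RFC | RFC |
|--------|---------|-----|
| ' '    | ✗       | ✗   |
| \t     | ✗       | ✗   |
| <      | ✗       | ✗   |
| >      | ✗       | ✗   |
| %      | ✗       | ✔*  |
| {      | ✗       | ✗   |
| }      | ✗       | ✗   |
| \|     | ✗       | ✗   |
| \\     | ✗       | ✗   |
| ^      | ✗       | ✗   |
| ~      | ✗       | ✔*  |
| [      | ✗       | ✗   |
| ]      | ✗       | ✗   |
| ;      | ✗       | ✔*  |
| =      | ✗       | ✔*  |
| &      | ✗       | ✔*  |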

Symbols marked * are sub-delimiters in RFC 3986 and are allowed in the user info component, which precedes the @ symbol.
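As a rough illustration of the difference between the two variants, the query below runs `domain` and its RFC counterpart `domainRFC` on the same URL. The URL is a made-up example that contains user info and an @ symbol, which is the case the non-RFC variants are not designed for. This is a sketch for comparing the two results side by side, not a statement of exact output on every version.

```sql
-- Hypothetical example URL containing user info (user:password@).
-- The non-RFC variant is optimized for speed and may not handle the user info;
-- the RFC variant follows RFC 3986 and is generally slower.
SELECT
    domain('http://user:password@example.com:8080/path?query=value#fragment') AS non_rfc,
    domainRFC('http://user:password@example.com:8080/path?query=value#fragment') AS rfc;
```

If the two columns differ for your data, the RFC variant is likely the safer choice. If they agree, the non-RFC variant should give the same answer at lower cost.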

There are two types of URL functions:

  • Functions that extract parts of a URL. If the relevant part isn't present in a URL, an empty string is returned.
  • Functions that remove part of a URL. If the URL does not contain that part, it is returned unchanged.
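The two kinds of function can be sketched with `queryString` (extracts a part) and `cutQueryString` (removes a part). The URLs are illustrative placeholders. Going by the description above, the first column should be an empty string and the second should be the input URL unchanged, because the first URL has no query string.

```sql
SELECT
    -- URL without a query string: nothing to extract, nothing to remove.
    queryString('https://clickhouse.com/docs')           AS extracted_missing,
    cutQueryString('https://clickhouse.com/docs')        AS cut_missing,
    -- URL with a query string: the part is extracted or removed.
    queryString('https://clickhouse.com/docs?page=2')    AS extracted_present,
    cutQueryString('https://clickhouse.com/docs?page=2') AS cut_present;
```

Because a missing part yields an empty string rather than NULL, a filter such as `WHERE queryString(url) != ''` is one way to select only rows that have the part. Here `url` is an assumed column name.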

:::note
The functions below are generated from the system.functions system table.
:::

<!-- The inner content of the tags below are replaced at doc framework build time with docs generated from system.functions. Please do not modify or remove the tags. See: https://github.com/ClickHouse/clickhouse-docs/blob/main/contribute/autogenerated-documentation-from-source.md --> <!--AUTOGENERATED_START--> <!--AUTOGENERATED_END-->