Using ingest processors in Painless [painless-ingest]

Painless scripts in ingest pipelines can access certain ingest processor functionality through the Processors namespace, which lets you write custom logic while reusing Elasticsearch's built-in transformations. Scripts read and modify the document through the ctx object during ingestion.

Only a subset of ingest processors expose methods in the Processors namespace for use in Painless scripts. The following ingest processors expose methods in Painless:

  • Bytes
  • Lowercase
  • Uppercase
  • Json
  • URL decode
  • URI parts
  • Community ID

When to choose each approach

| Method | Use for | Pros | Cons |
| --- | --- | --- | --- |
| script processor (Painless) | complex logic; conditional operations; multi-field validation | full control; custom business logic; cross-field operations | performance overhead; complexity |
| ingest processor | common transformations; standard operations | optimized performance; built-in validation; simple configuration | limited logic; single-field focus |
| runtime fields | query-time calculations; schema flexibility | no reindexing required; dynamic computation during queries | query-time performance cost; read-only operations; not used in ingest pipelines |

Performance considerations: Script processors can impact pipeline performance. Prefer ingest processors for simple transformations.
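
As a rough sketch of the kind of logic that favors a script processor, the following script body combines a conversion from the Processors namespace with a conditional that sets a second field. The field names unit_size, size_in_bytes, and size_class, and the 1 MiB threshold, are hypothetical and chosen only for illustration.

```painless
// Hypothetical input field: unit_size holds a human-readable size such as "2mb".
if (ctx.unit_size != null) {
  long bytes = Processors.bytes(ctx.unit_size);
  ctx.size_in_bytes = bytes;
  // Cross-field logic: derive a second field from the converted value.
  // The 1 MiB (1048576 bytes) cutoff is an arbitrary example threshold.
  ctx.size_class = bytes >= 1048576L ? 'large' : 'small';
}
```

If only the conversion were needed, the built-in bytes processor on its own would likely be the simpler choice.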

Method usage [_method_usage]

All ingest methods available in Painless are scoped to the Processors namespace. For example:

console
POST /_ingest/pipeline/_simulate?verbose
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "lang": "painless",
          "source": """
            long bytes = Processors.bytes(ctx.size);
            ctx.size_in_bytes = bytes;
          """
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "size": "1kb"
      }
    }
  ]
}

Ingest methods reference [_ingest_methods_reference]

Byte conversion [_byte_conversion]

Use the bytes processor to return the number of bytes in the human-readable byte value supplied in the value parameter.

painless
long bytes(String value);

Lowercase conversion [_lowercase_conversion]

Use the lowercase processor to convert the supplied string in the value parameter to its lowercase equivalent.

painless
String lowercase(String value);
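
A minimal usage sketch, assuming a document with a string field named email (a hypothetical field name). The null check is there because the method expects a string value.

```painless
// Hypothetical fields: email (input) and email_normalized (output).
if (ctx.email != null) {
  ctx.email_normalized = Processors.lowercase(ctx.email);
}
```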

Uppercase conversion [_uppercase_conversion]

Use the uppercase processor to convert the supplied string in the value parameter to its uppercase equivalent.

painless
String uppercase(String value);
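
The method takes a single string, so one way to apply it to several values is to loop over them in the script. The sketch below assumes a hypothetical tags field that holds a list of strings. A non-string element would cause the script to fail.

```painless
// Hypothetical field: tags is assumed to be a list of strings.
if (ctx.tags != null) {
  List upper = new ArrayList();
  for (def tag : ctx.tags) {
    // Each element is converted individually.
    upper.add(Processors.uppercase(tag));
  }
  ctx.tags = upper;
}
```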

JSON parsing [_json_parsing]

Use the JSON processor to parse a string containing JSON data into a structured object, string, or other value. There are two json methods:

painless
void json(Map<String, Object> map, String key);
Object json(Object value);

The first json method accepts a map and a key. The processor parses the JSON string in the given map at the given key to a structured object. The entries in that object are added directly to the given map.

For example, if the input document looks like this:

js
{
  "foo": {
    "inputJsonString": "{\"bar\": 999}"
  }
}

then executing this script:

painless
Processors.json(ctx.foo, 'inputJsonString');

will result in this document:

js
{
  "foo": {
    "inputJsonString": "{\"bar\": 999}",
    "bar" : 999
  }
}

The second json method accepts a JSON string in the value parameter and returns a structured object or other value.

You can then add this object to the document through the context object:

painless
ctx.parsedJson = Processors.json(ctx.inputJsonString);

URL decoding [_url_decoding]

Use the URL decode processor to URL-decode the string supplied in the value parameter.

painless
String urlDecode(String value);
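
A short sketch of how this might be used, assuming hypothetical fields raw_query (input) and decoded_query (output):

```painless
// Hypothetical input field: raw_query holds an encoded string such as "a%20b".
if (ctx.raw_query != null) {
  // "%20" decodes to a space, so "a%20b" would become "a b".
  ctx.decoded_query = Processors.urlDecode(ctx.raw_query);
}
```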

URI decomposition [_uri_decomposition]

Use the URI parts processor to decompose the URI string supplied in the value parameter. The method returns a map of key-value pairs in which each key is the name of a URI component, such as domain or path, and each value is the corresponding value for that component.

painless
Map<String, Object> uriParts(String value);
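
Because the result is a map, individual components can be copied into their own fields. The sketch below assumes a hypothetical url input field and hypothetical url_domain and url_path output fields. The domain and path keys are the component names mentioned above.

```painless
// Hypothetical fields: url (input), url_domain and url_path (outputs).
if (ctx.url != null) {
  Map parts = Processors.uriParts(ctx.url);
  // get() returns null for any component that is not present in the map.
  ctx.url_domain = parts.get('domain');
  ctx.url_path = parts.get('path');
}
```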

Network community ID [_network_community_id]

Use the community ID processor to compute the network community ID for network flow data.

painless
String communityId(String sourceIpAddrString, String destIpAddrString, Object ianaNumber, Object transport, Object sourcePort, Object destinationPort, Object icmpType, Object icmpCode, int seed);
String communityId(String sourceIpAddrString, String destIpAddrString, Object ianaNumber, Object transport, Object sourcePort, Object destinationPort, Object icmpType, Object icmpCode);
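
The following sketch calls the eight-argument form for TCP traffic. It assumes hypothetical flat fields source_ip, destination_ip, source_port, and destination_port. The protocol is passed by name in the transport argument, so ianaNumber is left null. The ICMP arguments are also null because they are not expected to apply to TCP flows.

```painless
// Hypothetical fields: source_ip, destination_ip, source_port, destination_port.
if (ctx.source_ip != null && ctx.destination_ip != null) {
  ctx.network_community_id = Processors.communityId(
    ctx.source_ip,         // sourceIpAddrString
    ctx.destination_ip,    // destIpAddrString
    null,                  // ianaNumber (unused here; transport is given by name)
    'tcp',                 // transport
    ctx.source_port,       // sourcePort
    ctx.destination_port,  // destinationPort
    null,                  // icmpType
    null                   // icmpCode
  );
}
```

The nine-argument form takes an additional int seed as its last argument.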