pkg/parser/README.md
![gopherbadger-tag-do-not-edit]
The parser turns raw log lines into objects that can be manipulated by heuristics. Parsing is split into several stages, each represented by a directory under config/stage. Stages, and the parsers within them, are processed in alphabetical order.
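
Because the ordering is alphabetical, numeric prefixes are compared as text. The sketch below assumes a plain lexicographic sort; the stage directory names are hypothetical and only serve the illustration.

```python
def stage_order(stage_dirs):
    """Return stage directory names in processing order,
    assuming a plain alphabetical (lexicographic) sort."""
    return sorted(stage_dirs)


# Hypothetical stage names, for illustration only.
stages = ["s02-enrich", "s00-raw", "s10-extra", "s01-parse"]
print(stage_order(stages))

# Without zero padding, "s10" sorts before "s2".
print(stage_order(["s2", "s10"]))
```

Under this assumption, zero-padded prefixes keep the intended order once there are ten or more stages.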
The runtime representation of a line being parsed (or of an overflow) is an Event. Its fields can be manipulated by the user; the examples in this document use Line, Parsed, Meta and Enriched.
The Event structure goes through the stages, being altered with each parsing step. It's the same object that will be later poured into buckets.
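
For orientation, here is a rough Python model of an Event. It is an assumption inferred from the field names used in this document's examples (Line.Raw, Line.Src, Line.Labels, Parsed, Meta, Enriched), not the actual definition.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    raw: str = ""                               # the raw log line (Line.Raw)
    src: str = ""                               # origin of the line (Line.Src)
    labels: dict = field(default_factory=dict)  # e.g. {"type": "testlog"} (Line.Labels)


@dataclass
class Event:
    line: Line = field(default_factory=Line)
    parsed: dict = field(default_factory=dict)    # assumed: values captured while parsing
    meta: dict = field(default_factory=dict)      # assumed: values set by statics
    enriched: dict = field(default_factory=dict)  # assumed: output of enrichment methods


# The same object is altered by each parsing step in turn.
evt = Event(line=Line(raw="xxheader hello trailing stuff", labels={"type": "testlog"}))
evt.parsed["extracted_value"] = "hello"  # what a grok step might add
evt.meta["log_type"] = "parsed_testlog"  # what a static might add
```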
A parser configuration is a Node object, which can contain grok patterns and enrichment instructions.
For example :

    filter: "evt.Line.Labels.type == 'testlog'"
    debug: true
    onsuccess: next_stage
    name: tests/base-grok
    pattern_syntax:
      MYCAP: ".*"
    nodes:
      - grok:
          pattern: ^xxheader %{MYCAP:extracted_value} trailing stuff$
          apply_on: Line.Raw
    statics:
      - meta: log_type
        value: parsed_testlog

name : optional; if present, and if prometheus or profiling is activated, stats will be generated for this node.

    filter: "Line.Src endsWith '/foobar'"

filter : an expression that is evaluated against the runtime representation of a line (Event)
- if filter is present and returns false, the node is not evaluated
- if filter is absent, or present and returns true, the node is evaluated
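
The filter rule above can be sketched as a small decision function. This is a simplified model: the node is a plain dict, and the filter is a Python callable standing in for the real expression (both are assumptions of this sketch).

```python
def should_evaluate(node, event):
    """Decide whether a node is evaluated for an event."""
    flt = node.get("filter")
    if flt is None:
        return True          # filter absent: the node is evaluated
    return bool(flt(event))  # filter present: evaluated only if it returns true


# Stand-in for the expression "Line.Src endsWith '/foobar'".
ends_with_foobar = lambda evt: evt["Line"]["Src"].endswith("/foobar")

event = {"Line": {"Src": "/var/log/foobar"}}
print(should_evaluate({}, event))
print(should_evaluate({"filter": ends_with_foobar}, event))
```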

    debug: true

debug : a bool that sets debug of the node to true (applies at runtime and at configuration parsing)

    onsuccess: next_stage|continue

onsuccess : next_stage makes the line go to the next stage, while continue keeps processing the current stage.

    statics:
      - meta: service
        value: tcp
      - meta: source_ip
        expression: "Event['source_ip']"
      - parsed: "new_connection"
        expression: "Event['tcpflags'] contains 'S' ? 'true' : 'false'"
      - target: Parsed.this_is_a_test
        value: foobar

Statics apply when a node is considered successful, and are used to alter the Event structure.
An empty node, a node with a grok pattern that succeeded or an enrichment directive that worked are successful nodes.
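
As a rough illustration of how statics alter the Event, here is a simplified Python model. The event is a plain dict, expressions are Python callables standing in for the real expression language, and reading `Event['...']` from the Parsed dict is an assumption of this sketch.

```python
def apply_statics(statics, event):
    """Apply statics to `event` (a dict holding "Meta" and "Parsed" dicts).

    Each static sets one destination (meta, parsed or target) from either
    a literal `value` or the result of an `expression`.
    """
    for static in statics:
        if "expression" in static:
            value = static["expression"](event)
        else:
            value = static["value"]

        if "meta" in static:
            event["Meta"][static["meta"]] = value
        elif "parsed" in static:
            event["Parsed"][static["parsed"]] = value
        elif "target" in static:
            # Simplified: only "Section.key" targets are handled.
            section, key = static["target"].split(".", 1)
            event[section][key] = value
    return event


event = {"Meta": {}, "Parsed": {"source_ip": "1.2.3.4", "tcpflags": "S"}}
statics = [
    {"meta": "service", "value": "tcp"},
    {"meta": "source_ip", "expression": lambda e: e["Parsed"]["source_ip"]},
    {"parsed": "new_connection",
     "expression": lambda e: "true" if "S" in e["Parsed"]["tcpflags"] else "false"},
    {"target": "Parsed.this_is_a_test", "value": "foobar"},
]
apply_statics(statics, event)
```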
Statics can :
- set entries of the Meta dict
- set entries of the Parsed dict

Grok patterns are used to parse one field of Event into one or several others :

    grok:
      name: "TCPDUMP_OUTPUT"
      apply_on: message

name is the name of a pattern loaded from patterns/.
Base patterns can be seen on the repo : https://github.com/crowdsecurity/grokky/blob/master/base.go

    grok:
      pattern: "^%{GREEDYDATA:request}\\?%{GREEDYDATA:http_args}$"
      apply_on: request

pattern is a valid grok pattern written inline, optionally with an apply_on that indicates which field it should be applied to.
Declared at the top level of a node (as in the first example), pattern_syntax is a list of named subgroks that the node's grok patterns can then reference.

    pattern_syntax:
      DIR: "^.*/"
      FILE: "[^/].*$"

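
To make the link between pattern_syntax and grok patterns concrete, here is a minimal sketch that expands `%{NAME:field}` references into named regex groups. It uses Python's `re` module as a stand-in for the real grok engine, so the regex dialect and edge cases may differ. The combined `%{DIR:dir}%{FILE:file}` pattern is a hypothetical example.

```python
import re

GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def expand(pattern, pattern_syntax):
    """Turn %{NAME:field} references into named regex groups."""
    def replace(match):
        name, field = match.group(1), match.group(2)
        sub = expand(pattern_syntax[name], pattern_syntax)  # sub-patterns may nest
        return f"(?P<{field}>{sub})" if field else f"(?:{sub})"
    return GROK_REF.sub(replace, pattern)


def grok(pattern, pattern_syntax, text):
    """Return the captured fields, or None if the pattern does not match."""
    match = re.search(expand(pattern, pattern_syntax), text)
    return match.groupdict() if match else None


# Sub-patterns taken from the two pattern_syntax examples in this document.
syntax = {"MYCAP": ".*", "DIR": "^.*/", "FILE": "[^/].*$"}

print(grok("^xxheader %{MYCAP:extracted_value} trailing stuff$", syntax,
           "xxheader hello world trailing stuff"))
# {'extracted_value': 'hello world'}

print(grok("%{DIR:dir}%{FILE:file}", syntax, "/var/log/nginx.log"))
# {'dir': '/var/log/', 'file': 'nginx.log'}
```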
The Enrichment mechanism is exposed via statics :

    statics:
      - method: GeoIpCity
        expression: Meta.source_ip
      - meta: IsoCode
        expression: Enriched.IsoCode
      - meta: IsInEU
        expression: Enriched.IsInEU

The GeoIpCity method is called with the value of Meta.source_ip.
Enrichment plugins can output one or more key:values in the Enriched map,
and it's up to the user to copy the relevant values to Meta or such.
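
The flow described above (call the method, collect its output in Enriched, then copy what is needed to Meta) can be sketched as follows. The lookup table and its values are made up, and `geo_ip_city`, `resolve` and `apply_enrichment_statics` are hypothetical helpers, not part of the real code.

```python
# Hypothetical stand-in for a GeoIP database; the data is made up.
FAKE_GEOIP = {"192.0.2.10": {"IsoCode": "FR", "IsInEU": "true"}}


def geo_ip_city(ip):
    """Stand-in enrichment method: returns key/values for the Enriched map."""
    return FAKE_GEOIP.get(ip, {})


ENRICH_METHODS = {"GeoIpCity": geo_ip_city}


def resolve(path, event):
    """Resolve a 'Section.key' expression such as 'Meta.source_ip'."""
    section, key = path.split(".", 1)
    return event[section].get(key, "")


def apply_enrichment_statics(statics, event):
    for static in statics:
        value = resolve(static["expression"], event)
        if "method" in static:
            # Call the enrichment method; its output lands in Enriched.
            event["Enriched"].update(ENRICH_METHODS[static["method"]](value))
        elif "meta" in static:
            # Copy the value the user cares about into Meta.
            event["Meta"][static["meta"]] = value
    return event


event = {"Meta": {"source_ip": "192.0.2.10"}, "Enriched": {}}
statics = [
    {"method": "GeoIpCity", "expression": "Meta.source_ip"},
    {"meta": "IsoCode", "expression": "Enriched.IsoCode"},
    {"meta": "IsInEU", "expression": "Enriched.IsInEU"},
]
apply_enrichment_statics(statics, event)
```

In this model the order of the statics matters: the method entry has to run before the entries that read from Enriched.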
A Node can also have a nodes entry, which is a list of Node entries, allowing you to build trees.

    filter: "Event['program'] == 'nginx'" #A
    nodes: #A'
      - grok: #B
          name: "NGINXACCESS"
        # this statics will apply only if the above grok pattern matched
        statics: #B'
          - meta: log_type
            value: "http_access-log"
      - grok: #C
          name: "NGINXERROR"
        statics:
          - meta: log_type
            value: "http_error-log"
    statics: #D
      - meta: service
        value: http

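
As a rough illustration of how a tree like this one could be walked (filter first, then grok, then child nodes, then statics), here is a simplified Python model. Filters and groks are Python callables standing in for the real engines, `fake_grok` is a hypothetical matcher, only meta statics are modelled, and every child is tried. Whether the real parser stops at the first successful child is not covered here.

```python
def evaluate(node, event):
    """Evaluate `node` against `event`; return True if the node succeeded."""
    flt = node.get("filter")
    if flt is not None and not flt(event):
        return False                      # (A) filter did not match: exit

    has_grok = "grok" in node
    has_children = bool(node.get("nodes"))
    success = not has_grok and not has_children  # an empty node is successful

    if has_grok:
        captured = node["grok"](event)    # dict of captures, or None
        if captured is not None:
            event["Parsed"].update(captured)
            success = True

    for child in node.get("nodes", []):   # (A') child nodes
        if evaluate(child, event):
            success = True

    if success:
        for key, value in node.get("statics", {}).items():
            event["Meta"][key] = value
    return success


def fake_grok(marker, field):
    """Hypothetical matcher: 'captures' the line if it contains `marker`."""
    def run(event):
        line = event["Line"]
        return {field: line} if marker in line else None
    return run


tree = {
    "filter": lambda e: e["program"] == "nginx",          # A
    "nodes": [
        {"grok": fake_grok("GET", "request"),             # B
         "statics": {"log_type": "http_access-log"}},     # B'
        {"grok": fake_grok("[error]", "message"),         # C
         "statics": {"log_type": "http_error-log"}},
    ],
    "statics": {"service": "http"},                       # D
}

event = {"program": "nginx", "Line": "GET /index.html", "Parsed": {}, "Meta": {}}
evaluate(tree, event)
print(event["Meta"])
# {'log_type': 'http_access-log', 'service': 'http'}
```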
The evaluation process of a node is as follows:
- evaluate filter (A); if it doesn't match, exit
- if a grok entry is present, process it
- if the grok entry returned data, apply the local statics of the node (if the grok 'B' was successful, apply B' statics)
- if any of the nodes or the grok was successful, apply the statics (D)

Main structs :
Main funcs :