website/content/en/highlights/2022-03-22-0-21-0-upgrade-guide.md
Vector's 0.21.0 release includes breaking changes:
patterns changed to outputsPatternsEventNotification type changesvector vrl timezone flag -tz is now -zvector top human_metrics flag -h is now -Hevent_discarded_total removedbuffer_discarded_events_total now includes received eventskubernetes_logs source rewritten to use kube-rsAnd deprecations:
receivedEventsTotal, sentEventsTotal, sentEventsThroughput, receivedEventsThroughput have been deprecatedWe cover them below to help you upgrade quickly:
Previously, there were two different ways to describe paths. VRL uses a newer syntax, while everything else in Vector still used an older syntax. This was a constant pain point for users, and we have taken some steps to migrate towards the VRL syntax. This is a breaking change that may require migration.
The old syntax was very lenient in the characters that were allowed in a field name. It also supported single character escapes.
The new syntax only allows A-Z a-z 0-9 _ @. Any other character will require the field name to be quoted.
Quotes around a field name replace single character escaping. This brings the old syntax in line with the newer (VRL) syntax. Note that
VRL makes a distinction between a field starting with a "." (event query) and without (variable query). Outside a VRL context,
the "." is optional and ignored.
Migration will be required for any paths used outside a VRL context. That is any transform (except remap and conditions), templating, and any source or sink referencing field names. There are no changes to the VRL syntax.
Here are some examples that require migrating
| old syntax | new syntax | comment |
|---|---|---|
| foo\.bar.baz | "foo.bar".baz | The . field separator needs to be escaped if used as part of a field name. The old syntax allowed individual character escaping. The new syntax requires quotes around the field name. |
| headers.User-Agent | headers."User-Agent" | - requires quotes with the new syntax |
| foo with spaces | "foo with spaces" | Spaces also need to be quoted |
| foo\"bar | "foo\"bar" | Double quotes and backlashes must be escaped inside quotes |
Old syntax
transforms:
dedupe:
type: "dedupe"
inputs: ["input"]
fields:
match: ["message.user-identifier"]
New syntax (the dash requires the field name to be quoted)
transforms:
dedupe:
type: "dedupe"
inputs: ["input"]
fields:
match: ['message."user-identifier"']
For more information on the new syntax, you can review the VRL path expressions documentation
outputEventsByComponentIdPatterns subscription argument patterns changed to outputsPatterns {#api-patterns-to-outputspatterns}To avoid confusion and align with the new inputsPatterns argument, we've
renamed the original patterns argument to outputsPatterns. outputsPatterns
allows you to specify patterns that will match against components (sources,
transforms) and display their outflowing events. inputsPatterns allows you
to specify patterns that match against components (transforms, sinks) and
display their incoming events.
Note that using an input pattern to match a component is effectively a shorthand. It's the same as using one or more output patterns to match against all the outputs flowing into a component.
Updating your subscriptions is as simple as renaming patterns to
outputsPatterns.
- subscription {
- outputEventsByComponentIdPatterns(patterns: [...])
+ subscription {
+ outputEventsByComponentIdPatterns(outputsPatterns: [...])
EventNotification type changes {#api-event-notification-changes}As part of adding a new notification (InvalidMatch) to warn users against
attempting invalid matches, we've reshaped the EventNotification type for
easier querying and future extensibility.
Previously, the EventNotification type consisted simply of a pattern and
plain enum describing the notification.
type EventNotification {
pattern: String!
notification: EventNotificationType!
}
While this worked well for simple notifications like Matched and NotMatched,
it was awkward to extend to new notifications, like InvalidMatch, which may
want to include more information than pattern. Thus, the new
EventNotification type takes the following form:
type EventNotification {
notification: Notification!
message: String!
}
where Notification is a union of specific kinds of notifications:
union Notification = Matched | NotMatched | InvalidMatch
message is a new human-readable description of the notification while
notification contains additional details specific to the kind of notification.
All the same information is still available, and the following example shows how
you might convert an existing query to the new schema.
subscription {
- outputEventsByComponentIdPatterns(patterns: [...]) {
+ outputEventsByComponentIdPatterns(outputsPatterns: [...]) {
__typename
... on EventNotification {
- pattern
- notification
+ message
+ notification {
+ __typename
+ ... on Matched {
+ pattern
+ }
+ ... on NotMatched {
+ pattern
+ }
+ ... on InvalidMatch {
+ pattern
+ invalidMatches
+ }
+ }
}
}
}
}
The following deprecated subscriptions have been removed in this release. Please use the listed alternatives.
eventsInTotal: use componentReceivedEventsTotalseventsOutTotal: use componentSentEventsTotalscomponentEventsInThroughputs: use componentReceivedEventsThroughputscomponentEventsInTotals: use componentReceivedEventsTotalscomponentEventsOutThroughputs: use componentSentEventsThroughputscomponentEventsOutTotals: use componentSentEventsTotalseventsInThroughput: use componentReceivedEventsThroughputseventsOutThroughput: use componentSentEventsThroughputsvector vrl timezone flag -tz is now -z {#vector-vrl-timezone}We upgraded the Vector CLI to use Clap v3, a popular Rust crate.
A breaking change in Clap v3 is that shortened CLI flags now use the char
type, meaning they are restricted to single characters.
As a consequence, the shortened form of our vector vrl --timezone flag
(previously --tz) has been updated to the more succinct -z.
vector top human_metrics short flag -h is now -H {#vector-top-human-metrics}To avoid clashing and issues with our upgrade to Clap
v3, the short -h from --help and -h from
--human_metrics in the vector top command have been disambiguated. The
shortened form for --human_metrics is now -H and -h is reserved for
--help.
The remainder operator in VRL has become a fallible operation. This is because finding the remainder with a divisor of zero can raise an error that needs to be handled.
Before this VRL would compile:
.remainder = 50 % .value
If .value was zero, Vector would panic. This can be fixed by handling the error:
.remainder = 50 % .value ?? 0
We have migrated sources and sinks that utilize AWS to the official AWS SDK (from Rusoto). This comes with some benefits such as support for IMDSv2. This requires us to deprecate some config options.
The new AWS SDK lacks support for certain authentication configuration:
auth.credential_file option was removed as the new SDK does not yet support it
it. It is still possible to use a credentials file, but it should
be placed in the default location (~/.aws/credentials on Linux, OS X, and Unix; %userprofile%\.aws\credentials on
Microsoft Windows), or the location should be set with an environment variable (AWS_CONFIG_FILE or
AWS_SHARED_CREDENTIALS_FILE).credential_process in an AWS profile was dropped as it is not yet supported by the new
SDK.Specifying a region is now also required. Make sure a region is specified in either the AWS config file, or the Vector config.
The assume_role config option was deprecated and moved to auth.assume_role previously. This deprecated option has
now been removed.
This affects the following components:
For more details on configuring auth, you can visit these links:
event_discarded_total removed {#transform-route-metric}Until now, when using the route transform, if an event didn't match any configured route, this event would be
discarded and lost for the following transforms and sinks.
A new _unmatched route has now been introduced and the events are no longer discarded, making the event_discarded_total metric irrelevant so it has been dropped.
You can still get the total number of events that match no routes via component_events_sent_total with a tag of output=_unmatched.
buffer_discarded_events_total now includes received events {#buffer-discarded-events}The buffer_discarded_events_total now includes all events flowing into
a buffer, even those discarded when the buffer is full and drop_newest is
configured as the when_full behavior.
This brings the metric inline with the component-level received and discarded
metrics where events are counted as received before being discarded as well as leaving
the door open for additional discard strategies like drop_oldest where events
would live in the buffer before possibly being discarded.
kubernetes_logs source rewritten to use kube-rs {#kubernetes-logs}The kubernetes_logs source has had two breaking changes:
list verb for Vector's ClusterRole resource. If you
are using the Helm chart, version 0.7.0 includes this change. Otherwise,
make sure to add it to your manifest.proxy configuration was dropped. Instead, configure any needed proxy
configuration in your kubeconfig.See the highlight for more information.
Previously /var/lib/vector was defined as a volume in the Dockerfiles for
the published Vector images. This led to the creation of the volume each time
you ran a Vector container from these images whether you wanted it or not.
Instead, if you need a volume for the data directory, you should provide one when launching the container.
When migrating from an earlier version to 0.21.0 or later using Docker compose and implicit volumes, you need to use docker inspect to find out which volumes your container is mapped to so that you can map them to the upgraded container as well.
See vectordotdev/vector#11982 for additional rationale.
In preparation for VRL iteration support landing in the next release, this release of Vector includes a breaking change to the way variable scoping works.
Specifically, variables defined in nested blocks cannot be accessed by parent blocks.
This is best explained with an example:
# top-level scope
count1 = 1
# nested block
{
count2 = 1
count1 = count1 + 1
# nested block
{
count2 = count2 + 1
count1 = count1 + 1
}
}
count1 # returns ”3”
count2 # returns a compile-time error, because ”count2” was defined in a nested block
When using CLI options that can take multiple values, the provided values must be comma separated. For example:
vector --config foo.toml,bar.toml
Additionally when passing values that contain wildcards (*), these values
must be quoted. For example:
vector --config "*.toml"
The --watch-config option previously required a boolean value, which is no
longer needed. For example, in earlier releases:
vector --watch-config=true
This should become:
vector --watch-config
The introduction of lexical scoping is important for when iteration support lands, which (as a sneak preview) allows you to do the following:
data = { ”foo”: 1, ”bar”: 2 }
map(data) -> |key, value| {
new_key = upcase(key)
[new_key, value + 1]
}
data # returns { ”FOO”: 2, ”BAR”: 3 }
new_key # returns a compile-time error, because ”new_key” is a variable scoped
# to the enumeration closure block
Without lexical scoping, it would be ambiguous what new_key should return in
the last example, but now it’s clear that the variable remains undefined outside
of the closure block.
receivedEventsTotal, sentEventsTotal, sentEventsThroughput, receivedEventsThroughput have been deprecated {#deprecate-aggregate-subscriptions}While these subscriptions were intended to display aggregate metrics across all components, they currently only show a per-component metric and are made redundant by more informative subscriptions that include specific component information. To avoid misuse and confusion, we are deprecating them in favor of the following alternatives.
receivedEventsTotal: use componentReceivedEventsTotalssentEventsTotal: use componentSentEventsTotalssentEventsThroughput: use componentSentEventsThroughputsreceivedEventsThroughput: use componentReceivedEventsThroughputsCurrently, end-to-end acknowledgements are opt-in at the source-level
via the acknowledgements.enabled setting. This made sense initially
since sources are the ones that are acknowledging back to clients, but
makes it difficult to achieve durability. Durability, which is the
primary goal of acknowledgements, is sink-dependent instead of
source-dependent. That is, it is important to assert that all data
going to a system of record is fully acknowledged, for example, for
all the sources that it came from, this guaranteeing delivery to that
destination.
To achieve this, an acknowledgements option has been added to
sinks. When the configuration is loaded, all sources that are
connected to a sink that has this option enabled will automatically be
configured to wait for sinks to acknowledge before issuing their own
acknowledgements (where possible).
The source configuration acknowledgements option will remain in this
version, but is deprecated and will be removed in a future version.
See the documentation for end-to-end acknowledgements for more details on the acknowledgement process.