rfcs/2020-04-15-2341-wasm-plugins.md
Introduce plugin support to Vector, providing a way for users to add custom functionality to Vector. This RFC introduces WASM plugins specifically, but we envision the possibility of other implementations that may reuse ideas from this RFC. This RFC introduces Transforms and will be later amended with Source and Sink functionality.
There is a plurality of reasons for us to wish for flexible, practical plugin support to Vector. These may also be regarded as 'Foreign Modules' or 'Extensions.' WASM is the most versatile and obvious choice for an implementation as it can be targeted by a number of languages and is intended to be OS, arch, and platform independent.
Since the concept of a plugin is a abstract, this section will instead discuss a number of seemingly unrelated topics, all asking for different features. While plugin support does not resolve all these problems, it gives us or our users a path to a solution.
We noted in discussions with several current and prospective users that custom protobuf support in the form of an
encoding option or a transform.
Protobufs in their proto form aren't usable by the prevailing Rust Protobuf libraries in a dynamic way. Both
prost and rust-protobuf focus almost solely on compiling support for specific Protobufs into a Rust program.
While a serde-protobuf crate exists and can do dynamic decoding, we found it to be an order of magnitude slower than
the other two options, we also noted it lacked serialization support.
While evaluating the possibility of adding serialization we noted that any implementation that operated dynamically would be quite slow. Notably, it would it would seriously impact overall event processing infrastructure if Vector was responsible for a slowdown. These organizations are choosing to use protobufs usually for speed and/or compatibility guarantees.
We should enable and encourage these goals, our users should be empowered to capture their events, and not worried about processing load. So the question became: How do we let users opt-in to custom protobuf support efficiently and without great pain?
Recently, we've been discussing the trade offs associated with different code modularization techniques. We currently make fairly heavy use of the Rust feature flag system.
This is great because it enables us to build in multi-platform support as well as specialized (eg only 1 source) builds. We also noted builds with only 1 or two feature flags are significantly faster than builds with all features supported.
One downside is our current strategy can require tests (See the make check-features task) to ensure their correctness.
The other is that we sometimes need to spend a lot of time weaving through cfg(feature = "..") flags.
Having some solution to cleanly package independent modules of functionality would allow us to have a core Vector which builds fast while keeping modularity features.
Vector currently bundles a lua transform, and there is ongoing
work on a javascript transform.
Currently, each language requires as separate runtime, and possibly separate integration subtleties like needing to call a GC. It would be highly convenient if we could focus our efforts on one consistent API for all languages, and spend this time saved providing a better, faster, safer experience.
As Vector grows and supports more integrations, our binary size and build complexity will inevitable grow. In informal discussions we've previously noted the module system of Terraform. This kind of architecture can help us simplify the codebase, reduce binary sizes, formalize our APIs, and open Vector up to a third-party module ecosystem.
Terraform typically runs on CIs and developer machines, making portability valuable. For Terraform, users can follow the guide to write a Go-based provider, which they can then distribute as a portable binary. Notably: This means either distributing an unoptimized binary, distributing a lot of optimized binaries (per architecture), or requiring folks build their own optimized binaries. Terraform doesn't care much about speed, so unoptimized binaries are acceptable.
Vector is in a slightly different position than Terraform though! Vector runs primarily in servers and even end user machines. We can't expect users to have build tooling installed on their servers (it's a security risk!) and we definitely can't expect it on end-user machines. Most people just aren't that interested in computers. Vector has different performance needs, too. While folks aren't generally wanting to execute terraform providers hundreds of thousands (or millions) of times per second, they are absolutely doing that with Vector.
When we're processing the firehose of events originating from a modern infrastructure every millisecond counts. Vector needs a way to ship portable, optimizable plugins if we ever hope of making this a reality.
We would like to avoid asking users to chain together a number of transforms to accomplish simple tasks. This is evidenced by issues like #750, #1653, and #1926. Having to, for example, use two transforms just to add 1 field and remove another is very verbose.
We noted that the existing lua runtime was able to accomplish these tasks quite elegantly, however it was an order of magnitude slower than a native transform.
Users shouldn't pay a high price just for a few lines saved in a configuration file. They shouldn't feel frustration when building these kinds of pipelines either.
We've discussed various refinements to our pipelines such as Pipelines Proposal V2 and Proposal: Compose Transform. Some of these discussions have been about improving our TOML syntax, others about new syntax entirely, and others about how we could let users write in the language of their choice.
The path forward is unclear, but if we choose to adopt something new we must focus on providing a good user experience as well as performance. Having a fast, simple, portable compile target could make this an easier effort.
We exist in an ecosystem, Tremor exists and we'd really love to be able to work in harmony somehow. While both tools process events, Vector focuses on acting as a host-level or container-level last/first mile router, Tremor focuses on servicing demanding workloads from it's position as an infrastructure-wide service.
What if we could provide users of Tremor and Vector with some sort of familiar shared experience? What if we could share functionality? What kind of framework could satisfy both our needs? There are so many questions!
We are in communication with them and this RFC will amended in the future to include cross compatibility details if any emerge.
We have an existing lua runtime existing as transform, and there is currently an
issue regarding an eventual javascript transform. Neither of these
features reach the scope of this proposal, still, it is valuable to learn.
While evaluating the lua transform to solve the Protobuf problem described in Motivations, we benchmarked an
implementation of add_fields that was already performing at approximately the same performance as the serde-protobuf
solution.
We also noted that the existing lua v1 transform was quite simple. While this makes for nice usability, it means
things like on-start initializations aren't possible. In v2 lua has
learnt many new tricks, and this WASM support shall match them.
We also noted the capabilities and speed of WASM (particularly WASI) mean we could eventually support things like sockets and files.
Indeed, it is possible we will be able to let Vector build and run Lua code as a WASM module in the future.
Vector contains a WebAssembly (WASM) engine which allows you to write custom functionality for Vector. You might already be familiar with the Lua transform, WebAssembly is a bit different.
While it enables more functionality and possibly better speeds, it also introduces complication to your system. You'll need to learn a bit about Foreign Function Interfaces (FFI), a bit about memory management, and build your WASM artifact. Don't worry: We'll walk you through it.
It's a new frontier, and you've got a friend in Vector.
Using the WASM engine you can write your own transforms, sinks, codecs in any language that compiles down to a
.wasm or .wat file. Vector can then compile this file down to an optimized, platform specific module which it can
use. This allows Vector to work with internal or uncommon protocols or services, support new language transforms, or
just write a special transform to do exactly what you want. With the WASM engine, Vector puts you in command.
There is a trade-off though! You'll need to embolden yourself, as WASM is a fledgling technology. When it first loads a WASM file, Vector needs to spend some time building an optimized module before it can load it. You'll also be responsible for the safety and correctness of your code, if your module crashes on an event, Vector will re-initialize the module and try again on the next event. We try to follow the ideas of Erlang's OTP.
Some examples of when a WASM plugin is a good fit:
Some examples of where WASM plugin probably isn't right for you:
Vector plugins are .wasm files (WASI or not) exposing a predefined interface. We provide a vector_wasm
crate for Rust projects located in lib/vector-wasm of the source tree, and we will be providing bindings for
other languages as we see user demand.
Plugins are sandboxed and can only interact with Vector over the Foreign Function Interface (FFI) which Vector exposes. For each instance of a module, Vector holds some context. For example, when a transform module processes an event, the context of the module contains an event, as well as the configuration used to create the transform.
A guest module is not able to access the memory of the Vector instance, but Vector is free to modify the guest memory.
For example, behind the scenes of a get call, the Guest must allocate some memory for Vector to write the result into.
Due to the current semantics of the vector::Event type, passing data between WASM plugin and the Vector host involves
serialization. This may change in the future, resulting in an increase major version of the vector-wasm crate.
Regardless of whether of not the plugin uses WASI, or has access to a language specific binding, it can interact with
Vector. Vector works by invoking the plugins public interface. First, as soon as the module loads, register is
called once per module. In this call, the module can configure how Vector sees it and talks to it. During
this time, the module can do one time setup it needs. (Eg. Opening a socket, a file).
If using the Vector guest API this can be done via:
use vector_wasm::{Registration, roles::Sink};
#[no_mangle]
// TODO: This is unsafe -- Needs a fix.
pub extern "C" fn init() {
Registration::transform()
.register()
}
What happens next depends on which type of module it is.
Transform: Modules of this role can be used as a transform type.
use vector_wasm::hostcall;
#[no_mangle]
pub extern "C" fn process(data: u64, length: u64) -> usize {
let data = unsafe {
std::ptr::slice_from_raw_parts_mut(data as *mut u8, length as usize)
.as_mut()
.unwrap()
};
let mut event: BTreeMap<String, Value> = serde_json::from_slice(data).unwrap();
event.insert("new_field".into(), "new_value".into());
event.insert("new_field_2".into(), "new_value_2".into());
hostcall::emit(serde_json::to_vec(&event).unwrap());
1
}
TODO: Sources, sinks, and codecs have no POC or formal specification.
To create your first module, start with a working Rust toolchain add the wasm32-wasi
toolchain:
rustup target add wasm32-wasi
Then, create a module:
cargo init --lib echo
In your Cargo.toml fill in:
[lib]
crate-type = ["cdylib"]
[dependencies]
# This will be published at a later date.
vector-wasm = { path = "/path/to/your/clone/of/vector/lib/vector-wasm"}
serde = { version = "1", features = ["derive"] }
Next, in your lib.rs file:
use serde_json::Value;
use std::collections::BTreeMap;
use vector_wasm::{hostcall, Registration};
#[no_mangle]
pub extern "C" fn init() -> *mut Registration {
&mut Registration::transform().set_wasi(true) as *mut Registration
}
#[no_mangle]
pub extern "C" fn process(data: u64, length: u64) -> usize {
let data = unsafe {
std::ptr::slice_from_raw_parts_mut(data as *mut u8, length as usize)
.as_mut()
.unwrap()
};
let mut event: BTreeMap<String, Value> = serde_json::from_slice(data).unwrap();
event.insert("new_field".into(), "new_value".into());
event.insert("new_field_2".into(), "new_value_2".into());
hostcall::emit(serde_json::to_vec(&event).unwrap());
1
}
#[no_mangle]
pub extern "C" fn shutdown() {
();
}
Now to build it:
cargo +nightly build --target wasm32-wasi --release
In your target/wasm-wasi/release/ verify that the echo.wasm file exists. Next edit your Vector config.toml:
data_dir = "/var/lib/vector/"
[sources.source0]
max_length = 102400
type = "stdin"
[transforms.demo]
inputs = ["source0"]
type = "wasm"
module = "target/wasm32-wasi/release/echo.wasm"
[sinks.sink0]
healthcheck = true
inputs = ["demo"]
type = "console"
encoding.codec = "json"
buffer.type = "memory"
buffer.max_events = 500
buffer.when_full = "block"
[[tests]]
name = "demo-tester"
[tests.input]
insert_at = "demo"
type = "log"
[tests.input.log_fields]
"message" = "foo"
[[tests.outputs]]
extract_from = "demo"
[[tests.outputs.conditions]]
"echo.equals" = "foo"
Now try vector test config.toml and Vector will go ahead and build an optimized cache/echo.so artifact then run it
for a unit test.
$ vector test config.toml
Running config.toml tests
test config.toml: demo-tester ... passed
Congratulations, you're ready for a new frontier!
Integrating a flexible, practical, simple WASM plugin support means we can:
WASM is still very young. We will be early adopters and will pay a high adoption price. Additionally, we may find
ourselves using unstable functionality of some crates, including a non-trivial amount of unsafe, which may lead to
unforeseen consequences.
Our binary size will bloom. We currently (in unoptimized POC implementations) see our binary size grow by over 60
MB. The Lucet team is aware of this issue, and in some preliminary discussions this binary size increase is unexpected and
we expect it to improve. If this does not improve, we can consider adopting wasmtime, lucet's sister JIT implementation
which has a smaller binary footprint.
Our team will need to diagnose, support, and evolve an arguably rather complex API for WASM modules to work correctly. This will mean we'll have to think of our system very abstractly, and opens many questions around API stability and compatibility.
This feature is advanced and may invoke user confusion. Novice or casual users may not fully grasp the subtleties and limitations of WASM. We must practice caution around the user experience of this feature to ensure novices and advanced users can understand what is happening in a Vector pipeline utilizing WASM modules.
Since this is a broad reaching feature with a number of green sky ideas, we should discuss these questions:
We should support plugins so broadly?
Our original discussions included the idea of a blessed compiler and a UX similar to our existing lua or javascript
transform. Indeed, our initial implementation includes just a transform, as it's by far the easiest place to
introduce this feature.
During investigation of the initial POC we determined that our most desirable "blessed" language would be AssemblyScript, unfortunately, it's currently not able to run inside Lucet. Work is ongoing and we expect it to be possible soon.
Temporary Conclusion: AssemblyScript and other languages we evaluated did not provide a way to package themselves as WASM modules, making it hard to integrate with them. We are planning to revisit this at a later date when the ecosystem is more mature.
We should consider if we are satisfied with the idea of hostcalls being used for get and other API. We could also let
the host call the Guest allocate function and then pass it a C String pointer to let it work on. This, however, requires
serializing and deserializing the entire Event each time, which is a huge performance bottleneck.
We could also consider adopting a new strategy for our event type. However, that work is outside the scope of this
RFC. The current structure of the Event type is already discussed in
#1891.
We should also consider if we want to change how we handle codecs, since it is likely that WASM module use cases will include wanting to add codec support to already existing sources.
Temporary Conclusion: We will adopt hostcalls for the time being to allow for a minimal POC. We would like to investigate these choices later, and we will make a decision before stabilizing WASM modules.
We will need to expose some Vector guest API to users. This will require ABI compatibility with Vector, so it makes sense to keep it in the same repository (and maybe the same crate?) as Vector. We can consider either making Vector able to be built by WASI targets (more feature flags) or using a workspace and having the guest API as a workspace member we publish.
A couple good questions to consider:
cargo install vector option for installing a Vector binary?vector::... to use the Vector guest API in their Rust module?Conclusion: We are including a workspace member vector-wasm in the vector-wasm directory.
This may be renamed and/or published in the future.
How can we let users see what's happening in WASM plugins? Can we use tracing somehow? Lucet supports tracing, perhaps we could hook in somehow?
Conclusion: Preliminary results show we may be able to allow guests to write with the current tracing subscriber.
There is ongoing work to support more platforms in Lucet, here are how things stand today:
Platform support:
Temporary Conclusion: The Lucet team desires to support other platforms in the near future. Since the lion's share
of Vector usage is on X86_64 Linux, and this platform is already supported, we decided to adopt Lucet for now. We could
consider adopting wasmtime in the future if lucet does not eventually reach it in terms of platform parity.
Incremental steps that execute this change.
wasm transform test to pass.Note: This can be filled out during the review process.