docs/design/02-wasm-worker-pthread-compat.md
Wasm Workers in Emscripten are a lightweight alternative to pthreads. They use
the same memory and can use the same synchronization primitives, but they do not
have a full struct pthread and thus many pthread-based APIs (like
pthread_self()) currently do not work when called from a Wasm Worker.

This is not an issue in pure Wasm Workers programs, but we also support hybrid
programs that run both pthreads and Wasm Workers. In these cases the pthread API
is available, but will fail in undefined ways if called from a Wasm Worker.

This document describes an implementation that improves hybrid mode by adding
the pthread metadata (struct pthread) to each Wasm Worker, allowing the pthread
API (or at least some subset of it) to be used from Wasm Workers.

Normally, Wasm Workers allocate space for only TLS and stack:

[TLS data] [Stack]

For hybrid mode (when pthreads are enabled as well as Wasm Workers) we changed
this to also include pthread-specific data:

[struct pthread] [TSD pointers] [TLS data] [Stack]

The struct pthread is located at the very beginning of the allocated
memory block for each Wasm Worker.
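
The hybrid layout described above can be sketched as a small offset calculation. This is only an illustration, not the code in library_wasm_worker.c: `mock_pthread`, `MOCK_TSD_SLOTS`, `worker_layout`, `align_up`, and `compute_hybrid_layout` are hypothetical names, the struct is a tiny stand-in for the real struct pthread, and the 16-byte stack alignment is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real struct pthread, which is much larger. */
struct mock_pthread {
  struct mock_pthread *self;
  int tid;
  void **tsd;
};

/* Assumed number of TSD slots; the real value comes from the libc headers. */
#define MOCK_TSD_SLOTS 128

/* Offsets of each region, measured from the start of the worker's block. */
struct worker_layout {
  size_t pthread_off; /* always 0: struct pthread comes first */
  size_t tsd_off;     /* TSD pointers */
  size_t tls_off;     /* TLS data */
  size_t stack_off;   /* stack region */
  size_t total_size;  /* bytes to allocate for the whole block */
};

/* Round value up to a multiple of align, which must be a power of two. */
static size_t align_up(size_t value, size_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

/* Lay out [struct pthread] [TSD pointers] [TLS data] [Stack] in that order. */
struct worker_layout compute_hybrid_layout(size_t tls_size, size_t tls_align,
                                           size_t stack_size) {
  struct worker_layout l;
  l.pthread_off = 0;
  l.tsd_off = align_up(sizeof(struct mock_pthread), _Alignof(void *));
  l.tls_off = align_up(l.tsd_off + MOCK_TSD_SLOTS * sizeof(void *), tls_align);
  l.stack_off = align_up(l.tls_off + tls_size, 16); /* assumed stack alignment */
  l.total_size = l.stack_off + stack_size;
  return l;
}
```

Because the struct pthread sits at offset 0, the pointer to the worker's memory block and the pointer to its struct pthread are the same address in this sketch.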

struct pthread Initialization

The struct pthread is initialized by the creator thread in
emscripten_create_wasm_worker (or emscripten_malloc_wasm_worker).
This includes:

- Setting the self pointer to the start of the struct pthread.
- Setting the tid.

On the worker thread side, initialization is completed by calling
__set_thread_state (via JS ___set_thread_state in libwasm_worker.js) to
set the thread pointer, making it available to __get_tp.

__get_tp Support

We will modify system/lib/pthread/emscripten_thread_state.S to provide a
__get_tp implementation for Wasm Workers that returns the address of the
struct pthread. This will allow __pthread_self() and other related functions
to work correctly.
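
The two-sided initialization and the __get_tp lookup can be modelled in portable C as follows. This is a sketch under stated assumptions, not the Emscripten implementation: `worker_pthread`, `init_worker_pthread`, `mock_set_thread_state`, and `mock_get_tp` are hypothetical stand-ins, the real __set_thread_state takes different arguments, and a `_Thread_local` variable plays the role of whatever per-thread storage the real __get_tp reads.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in holding only the two fields named in the text. */
struct worker_pthread {
  struct worker_pthread *self;
  int tid;
};

/* Per-thread thread pointer; stands in for the storage read by __get_tp. */
static _Thread_local struct worker_pthread *thread_ptr;

/* Creator side: prepare the struct at the start of the worker's block
 * before the worker begins running. */
struct worker_pthread *init_worker_pthread(void *block, int tid) {
  struct worker_pthread *pt = (struct worker_pthread *)block;
  memset(pt, 0, sizeof *pt);
  pt->self = pt; /* self points at the start of the struct */
  pt->tid = tid;
  return pt;
}

/* Worker side: stand-in for __set_thread_state, publishing the pointer. */
void mock_set_thread_state(struct worker_pthread *pt) {
  assert(pt != NULL && pt->self == pt);
  thread_ptr = pt;
}

/* Stand-in for __get_tp: returns the address of the calling thread's struct. */
struct worker_pthread *mock_get_tp(void) {
  return thread_ptr;
}
```

In this model a pthread_self()-style lookup is just a call to `mock_get_tp()`, which is why the thread pointer has to be set on the worker before any pthread API is used there.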

We intend to support a subset of the pthread API within Wasm Workers:

- pthread_self(): Returns the worker's struct pthread pointer.
- pthread_equal(): Works normally.
- pthread_getspecific() / pthread_setspecific(): TSD (Thread Specific Data) should work if the tsd field in struct pthread is initialized.
- pthread_mutex_*: Mutexes will work as they rely on struct pthread for owner tracking.
- pthread_cond_*: Condition variables will work as they rely on struct pthread for waiter tracking.
- Other functionality that depends only on struct pthread (e.g., some internal locks).

APIs that will NOT be supported (or will have limited support):

- pthread_create() / pthread_join() / pthread_detach(): Wasm Workers have their own creation and lifecycle management.
- pthread_cancel(): Not supported in Wasm Workers.
- pthread_kill(): Not supported in Wasm Workers.

The implementation requires the following changes:

- Modify emscripten_create_wasm_worker in system/lib/wasm_worker/library_wasm_worker.c to account for sizeof(struct pthread) in memory allocation and initialize the structure.
- Modify $_wasmWorkerInitializeRuntime in src/lib/libwasm_worker.js to call ___set_thread_state to set the thread pointer.

The expected outcome is that:

- pthread APIs (like pthread_self()) work in Wasm Workers in hybrid mode.
- pthread_self() and low level synchronization APIs work when called from a Wasm Worker.
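
One way to exercise the supported subset is a small self-check that uses only standard POSIX calls. The function below, `check_thread_identity_apis`, is a hypothetical helper and not part of Emscripten; it can be run on any thread, and the assumption is that in a hybrid build it would be invoked from a function posted to a Wasm Worker (that wiring is omitted here). The error-checking mutex is used because it depends on owner tracking: relocking from the owning thread must report EDEADLK.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Returns 0 when the calling thread has a usable pthread identity,
 * or a small non-zero code naming the first check that failed. */
int check_thread_identity_apis(void) {
  /* pthread_self() / pthread_equal() */
  pthread_t me = pthread_self();
  if (!pthread_equal(me, pthread_self())) return 1;

  /* Thread-specific data round trip */
  pthread_key_t key;
  int value = 7;
  if (pthread_key_create(&key, NULL) != 0) return 2;
  if (pthread_setspecific(key, &value) != 0) return 3;
  if (pthread_getspecific(key) != &value) return 4;
  pthread_key_delete(key);

  /* Error-checking mutex: relies on the mutex recording its owner */
  pthread_mutexattr_t attr;
  pthread_mutex_t mutex;
  int rc = pthread_mutexattr_init(&attr);
  assert(rc == 0);
  rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
  assert(rc == 0);
  rc = pthread_mutex_init(&mutex, &attr);
  assert(rc == 0);
  pthread_mutexattr_destroy(&attr);

  if (pthread_mutex_lock(&mutex) != 0) return 5;
  int relock = pthread_mutex_lock(&mutex); /* same owner locks again */
  pthread_mutex_unlock(&mutex);
  pthread_mutex_destroy(&mutex);
  return relock == EDEADLK ? 0 : 6;
}
```

A non-zero return value identifies which API misbehaved, which is more useful than a crash when the calling thread turns out to lack a properly initialized struct pthread.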