docs/features/filesystem-user-isolation.md
Generated on: 2026-05-07 Status: Draft Owner: Platform / Tools
FileSystemToolComponent exposes five sandboxed file operations (read_file,
write_file, edit_file, glob_search, grep_search) to LLM agents. This
feature scopes those operations to a per-user workspace whenever Langflow
runs in multi-user mode, and to a shared workspace when Langflow runs in
single-user (auto-login) mode. The dispatch is implicit, driven by the
existing AUTO_LOGIN setting; the only operator knob is the on-disk root
directory.
Before this feature, the FileSystem tool had a single sandbox configured by the flow author. In a shared Langflow deployment two distinct authenticated users running the same flow saw each other's files — a cross-tenant data leak that blocked the component from being enabled in any multi-user environment.
Earlier hardening attempts (PR #12901 + the LANGFLOW_FS_TOOL_USER_ISOLATION
follow-up) introduced three modes (off/auto/on) and four environment
variables. That model never reached production, was hard for operators to
reason about, and required documentation to disambiguate. This feature
collapses all of it to a single binary decision: shared if AUTO_LOGIN is on,
isolated otherwise.
Context: Tools — sandboxed integrations exposed to LLM agents.
This context owns:
| Context | Relationship | Notes |
|---|---|---|
| Auth | Conformist | Reads settings_service.auth_settings.AUTO_LOGIN as authoritative. The Tools context never decides what AUTO_LOGIN means; it only adapts to it. |
| Components | Customer-Supplier | FileSystemToolComponent extends the Component base class and consumes self._user_id populated by the runtime during instantiate_class(). |
| Agent | Customer-Supplier | The Agent invokes the StructuredTool closures returned by _get_tools(). The contract is "JSON-serializable result envelope, never raise". |
| Term | Definition | Code Reference |
|---|---|---|
| Sandbox | The on-disk directory subtree all FileSystem tool calls are confined to. | FileSystemToolComponent._validate_root |
| Base Directory (BASE) | The operator-controlled root under which every workspace lives. | IsolationConfig.base_dir, env LANGFLOW_FS_TOOL_BASE_DIR |
| Workspace Sub-path | The flow-author-controlled sub-directory inside the resolved namespace. Optional; empty means "namespace root". | FileSystemToolComponent.root_path (UI: "Workspace Sub-path") |
| Shared Mode | Layout used when AUTO_LOGIN=True. Single workspace under <BASE>/shared/. | FileSystemToolComponent._shared_root, constant SHARED_NAMESPACE = "shared" |
| Isolated Mode | Layout used when AUTO_LOGIN=False with an authenticated user. Per-user workspace under <BASE>/users/<hash>/. | FileSystemToolComponent._isolated_user_root |
| Refused Mode | Conceptual third state: AUTO_LOGIN=False without a user_id. No filesystem access; structured error returned. | FileSystemToolComponent._validate_root (raise branch) |
| Namespace | The opaque per-user directory name in isolated mode. Derived from user_id via HMAC-SHA256 with a server-side pepper, truncated to 32 hex chars (128 bits). | compute_user_namespace, file _filesystem_namespace.py |
| Pepper | Per-instance secret used to key the user_id → namespace HMAC. Auto-generated 32 random bytes on first call, persisted at <BASE>/.fs_pepper with mode 0600. | load_or_create_pepper |
| Reserved Segment | The directory name .lfsig, blocked at _validate_path in every mode. Holds the integrity hook for a future content-signing layer; agents and users must never see or write to it. | RESERVED_SEGMENT = ".lfsig" |
| Tool Binding (L2) | The check that captures user_id at _get_tools() time and refuses tool calls where the live user_id has shifted. Active only in isolated mode. | _user_binding_error |
| Boundary Check | The is_relative_to test that rejects any candidate path resolving outside the namespace root after ../symlink expansion. | _validate_path, _isolated_user_root, _shared_root |
| Resolution Error | Structured error string surfaced in build_metadata.resolution_error when _validate_root cannot resolve (e.g., unwritable BASE). Never leaks to a raw exception. | FileSystemToolComponent.build_metadata |
- FileSystemToolComponent
- IsolationConfig (frozen dataclass: base_dir, pepper_path)
- Namespace (Path("users/<hash>") or Path("shared"))
- is_relative_to(namespace_root).
- .lfsig segment MUST never appear in any path the agent supplies.
- AUTO_LOGIN=False and user_id is empty, no I/O may occur.
- IsolationConfig is immutable; re-read from env on every call.
- PEPPER_SIZE_BYTES (32) on every read.
- (pepper, user_id) tuple in isolated mode; constant "shared" in shared mode.
- (pepper, user_id) always maps to the same hash. Different pepper or different user_id → different hash.
- user_id collapses to Path(""), used by callers as the "no namespace" marker (legacy compatibility within the module, not surfaced).

This feature is synchronous and request-scoped. It does not publish or subscribe to any domain events. Each tool call is a self-contained operation whose outcome is a JSON envelope returned to the caller.
(If a future audit/event layer is added, it would emit events like
fs.tool.invoked with (user_id, action, path, ok). That layer was
explicitly removed from scope — see ADR-005.)
As an LLM Agent running on a Langflow instance, I want sandboxed filesystem operations scoped to my caller's identity, so that files created by one user are never readable by another user on the same instance, and single-user installs don't pay any namespacing overhead.
- LANGFLOW_FS_TOOL_BASE_DIR set (or unset, falling back to default).
- LANGFLOW_AUTO_LOGIN configured to True or False at the platform level.

- ""
- write_file("dog.md", "...")
- <BASE>/shared/dog.md
- users/ directory is created under <BASE>
- mode: "shared", auto_login: true, and the resolved effective_root.

- _user_id = "alice"
- write_file("notes.md", "...")
- <BASE>/users/<hash(alice)>/notes.md
- <hash(alice)> is a 32-character hex string derived from HMAC-SHA256(pepper, "alice")
- mode: "isolated".

- alice has previously written <BASE>/users/<hash(alice)>/secret.txt
- bob (different user_id) invokes read_file("secret.txt")
- {"error": "File not found: secret.txt", "path": "secret.txt"}
- <BASE>/users/<hash(bob)>/ but does NOT contain secret.txt.

- _user_id set (anonymous)
- {"error": "FileSystemTool requires an authenticated user when AUTO_LOGIN=False", ...}
- <BASE>.

- read_file("../../etc/passwd")
- <BASE>/shared/ is read.

- bob
- alice has written <BASE>/users/<hash(alice)>/secret.txt
- read_file("../../<hash(alice)>/secret.txt")

- .lfsig segment is blocked in every mode
- write_file(".lfsig/poison.json", "{}") or read_file(".lfsig/anything")
- .lfsig/ directory is never created.

- LANGFLOW_FS_TOOL_BASE_DIR=/var/lfx/fs on a system where /var/lfx is not writable by the Langflow process user
- Cannot create user namespace at /var/lfx/fs/users/...: Permission denied. Check that LANGFLOW_FS_TOOL_BASE_DIR (/var/lfx/fs) is writable by the Langflow process user.
- metadata.resolution_error carries the same string
- ComponentBuildError.

- alice
- read_file StructuredTool via _get_tools()
- _user_id is reassigned to bob and the captured tool is invoked
- {"error": "tool/user-id mismatch: ..."}

- _get_tools()
- _user_id is changed before invocation
- user_id is not part of the security boundary in shared mode.

- leak.txt
- ""
- read_file("leak.txt")
- <BASE>/shared/, not under CWD).

Status: Accepted
The previous iteration introduced LANGFLOW_FS_TOOL_USER_ISOLATION with three values (off/auto/on). Operators had to understand what each meant, when to flip them, and how they composed with the platform's AUTO_LOGIN setting. In practice the right answer was always "match AUTO_LOGIN":
- AUTO_LOGIN=True → there is one administrative user, isolation is meaningless.
- AUTO_LOGIN=False → there are multiple authenticated users, isolation is required.

The dedicated env var added configuration surface without expressing any behavior the operator could not derive from existing settings.
Remove LANGFLOW_FS_TOOL_USER_ISOLATION entirely. Read settings_service.auth_settings.AUTO_LOGIN at every call and dispatch:
- AUTO_LOGIN=True → shared layout under <BASE>/shared/.
- AUTO_LOGIN=False + authenticated user → isolated layout under <BASE>/users/<hash>/.
- AUTO_LOGIN=False + anonymous → refuse with structured error.

Benefits:
Trade-offs:
- AUTO_LOGIN. Considered a corner case.
- get_settings_service(); a defensive try/except returns True (safer default) when the service registry is not yet initialized.

Impact on Product:
Status: Accepted
Per-user directories must be:
- user_id always maps to the same directory across process restarts.
- <BASE>/users/ should not reveal the set of users who ever used the tool.

Compute the namespace as HMAC-SHA256(pepper, user_id).hexdigest()[:32]. The pepper is 32 random bytes generated and persisted on first boot at <BASE>/.fs_pepper (mode 0600 on POSIX).
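The derivation above can be sketched with the standard library alone. This is a minimal illustration, not the module's actual code: `namespace_for` is a hypothetical name (the real function is `compute_user_namespace`, whose exact signature is not shown here), and UTF-8 encoding of `user_id` is an assumption.

```python
import hashlib
import hmac

NAMESPACE_HEX_CHARS = 32  # 32 hex chars = 128 bits, as described above


def namespace_for(pepper: bytes, user_id: str) -> str:
    """Hypothetical helper: keyed hash of user_id, truncated to 32 hex chars."""
    # HMAC-SHA256 keyed by the server-side pepper; UTF-8 encoding is assumed.
    digest = hmac.new(pepper, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:NAMESPACE_HEX_CHARS]


if __name__ == "__main__":
    demo_pepper = b"\x00" * 32  # fixed bytes for demonstration only
    print(namespace_for(demo_pepper, "alice"))
```

Because the pepper keys the hash, the same `user_id` maps to the same directory for as long as the pepper file is preserved, and to an unrelated directory once it changes.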
Benefits:
- users/ leaks zero information.

Trade-offs:
- <BASE>/.fs_pepper) lets the attacker enumerate the user → directory mapping. Mitigated by 0600 mode on POSIX. On Windows we inherit the parent directory's NTFS DACL; operators using Windows in security-sensitive deployments must verify ACLs separately.

Impact on Product:
- ls users/ | wc -l) without exposing identities.

.lfsig segment in path validation, even though no signing layer ships yet

Status: Accepted
A future iteration may add per-file HMAC sidecars for content integrity. If we add that capability later but don't reserve the directory now, an agent that pre-creates .lfsig/ files today will collide with the integrity layer or, worse, poison it before it ships.
Reject any path containing .lfsig as a segment in _validate_path, in every mode, today. The reservation is enforced from day one even though no consumer of the directory exists yet.
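A minimal sketch of how the reserved-segment rule and the `is_relative_to` boundary check might fit together. `validate_path_sketch` is a hypothetical stand-in for `_validate_path`; the error strings follow the error table in this document, and the real method may order or word its checks differently. It requires Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path

RESERVED_SEGMENT = ".lfsig"


def validate_path_sketch(namespace_root: Path, candidate: str) -> Path:
    """Hypothetical helper: resolve candidate inside namespace_root or raise."""
    # Reject the reserved segment before touching the filesystem.
    if RESERVED_SEGMENT in Path(candidate).parts:
        raise PermissionError(f"Path component '{RESERVED_SEGMENT}' is reserved")

    root = namespace_root.resolve()
    # resolve() collapses ".." and follows symlinks, so the check below
    # sees the real target rather than the spelling the agent supplied.
    resolved = (root / candidate).resolve()
    if not resolved.is_relative_to(root):
        raise PermissionError(f"Path escapes workspace boundary: {candidate}")

    # Re-check after resolution in case a symlink led into the reserved dir.
    if RESERVED_SEGMENT in resolved.relative_to(root).parts:
        raise PermissionError(f"Path component '{RESERVED_SEGMENT}' is reserved")
    return resolved
```

An absolute candidate replaces the root when joined with `/`, so it fails the boundary check unless it already lies inside the namespace.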
Benefits:
Trade-offs:
- .lfsig for unrelated reasons cannot create one. Documented as a known limitation; the segment name was chosen to be unlikely to collide with real-world content.

Impact on Product:

off mode and the LANGFLOW_FS_TOOL_ALLOWED_ROOTS allowlist

Status: Accepted (supersedes the corresponding portions of the prior design)
The off mode allowed flow authors to point root_path at any absolute path; the operator could then constrain that to a list via LANGFLOW_FS_TOOL_ALLOWED_ROOTS. The model offered flexibility but at the cost of a defensible default. The mode never reached production.
Delete _legacy_validate_root, _allowed_roots, the LANGFLOW_FS_TOOL_ALLOWED_ROOTS env var, and the IsolationMode.OFF enum value. The component is always sandboxed under <BASE> — there is no mode in which a flow author can choose an absolute path.
Benefits:
Trade-offs:
- LANGFLOW_FS_TOOL_BASE_DIR directly at that path.

Impact on Product:
- <BASE>, period.

Status: Accepted
The earlier design specified LANGFLOW_FS_TOOL_AUDIT_LOG writing one NDJSON line per tool call. The audit was useful for forensic queries ("who read X last Tuesday?") but the product team confirmed there is no compliance or operational requirement that depends on it today. The audit module was 80+ lines of code with its own test suite (12 tests).
Delete _filesystem_audit.py and its test file. Remove the _audit_sink, _audit, and _resolve_flow_id methods from the component. Drop all self._audit(...) call sites.
Benefits:
Trade-offs:
- _read_file/_write_file/etc. would just gain a single self._audit(...) line each).

Impact on Product:
Status: Accepted
The previous design exposed LANGFLOW_FS_TOOL_PEPPER_PATH as a separate knob. Operators almost always wanted the pepper alongside the sandbox; pointing them to different paths was a footgun (move BASE without moving pepper → existing user_id hashes drift, every user appears as new).
The pepper path is always <BASE>/.fs_pepper. No env var to override it.
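The fixed pepper location pairs naturally with an exclusive-create pattern. The following is a sketch under stated assumptions, not the shipped `load_or_create_pepper`: the function name, the `ValueError` on a wrong-sized file, and the use of `os.open` on every platform are illustrative choices (this document notes that the Windows path uses `Path.write_bytes`).

```python
import os
import secrets
from pathlib import Path

PEPPER_SIZE_BYTES = 32
PEPPER_FILENAME = ".fs_pepper"


def load_or_create_pepper_sketch(base_dir: Path) -> bytes:
    """Hypothetical helper: return the pepper, creating it on first call."""
    base_dir.mkdir(parents=True, exist_ok=True)
    pepper_path = base_dir / PEPPER_FILENAME
    # O_EXCL makes creation atomic: exactly one caller wins the race.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL | getattr(os, "O_BINARY", 0)
    try:
        fd = os.open(pepper_path, flags, 0o600)
    except FileExistsError:
        pepper = pepper_path.read_bytes()
    else:
        pepper = secrets.token_bytes(PEPPER_SIZE_BYTES)
        with os.fdopen(fd, "wb") as handle:
            handle.write(pepper)
    # Validate the length on every read, including the freshly written value.
    if len(pepper) != PEPPER_SIZE_BYTES:
        raise ValueError(f"Pepper at {pepper_path} has unexpected length {len(pepper)}")
    return pepper
```

One caveat of this simple form: a second process that reads the file between creation and write would see an empty file and fail the length check, so a production version would need a retry or a write-then-rename step.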
Benefits:
Trade-offs:
Impact on Product:
- <BASE> tree as a unit.

| Type | Name | Purpose |
|---|---|---|
| Module | lfx.services.deps.get_settings_service | Read AUTO_LOGIN from auth_settings. |
| Module | lfx.components.tools._filesystem_isolation | Resolve BASE_DIR + pepper path from env into immutable IsolationConfig. |
| Module | lfx.components.tools._filesystem_namespace | Pepper persistence + user_id → namespace HMAC. |
| Library | langchain_core.tools.StructuredTool | Wrap each operation as an Agent-callable tool. |
| Library | pydantic.BaseModel | Per-tool Pydantic args schemas. |
| Stdlib | hashlib.sha256 + hmac | Namespace derivation. |
| Stdlib | secrets.token_bytes | Pepper generation on first boot. |
| Stdlib | os (`O_CREAT \| O_EXCL`) | Pepper file creation. |
| Stdlib | pathlib.Path (is_relative_to, resolve) | Boundary enforcement. |
| Stdlib | pathlib.PureWindowsPath | Cross-platform reserved-name / forbidden-char detection. |
This feature does NOT depend on:
The feature does not expose new HTTP endpoints. Its public surface is the
five tools returned from FileSystemToolComponent._get_tools(). Each tool
is invoked by the Agent and returns a JSON-encoded string.
read_file

Purpose: Read a text file from the resolved sandbox.
Args (validated by _ReadFileArgs):
{
"path": "string — relative to the sandbox root",
"offset": "int? — 1-based start line",
"limit": "int? — max lines"
}
Response (Success):
{
"status": "ok",
"path": "...",
"content": " 1→...\n 2→...",
"total_lines": 42,
"start_line": 1,
"num_lines": 10
}
Response (Error):
{"error": "<reason>", "path": "..."}
write_file

Args: { "path": "...", "content": "..." }
Response (Success):
{"status": "created" | "updated", "path": "...", "bytes_written": 42}
edit_file

Args: { "path": "...", "old_string": "...", "new_string": "...", "replace_all": false }
Response (Success):
{"status": "ok", "path": "...", "replacements": 1, "old_string": "...", "new_string": "..."}
glob_search

Args: { "pattern": "**/*.md", "path": "optional sub-dir" }
Response (Success):
{
"status": "ok",
"pattern": "...",
"matches": ["a.md", "nested/b.md"],
"truncated": false,
"truncated_branches": []
}
grep_search

Args:
{
"pattern": "...",
"path": "optional",
"glob": "optional *.py",
"case_insensitive": false,
"output_mode": "files_with_matches" | "content" | "count",
"is_regex": false
}
All errors return the structured envelope {"error": "<message>", "path": "..."}. The component never raises out of public tool methods.
| Error Code (substring) | Condition | User Message | Recovery Action |
|---|---|---|---|
| requires an authenticated user | AUTO_LOGIN=False with no _user_id. | FileSystemTool requires an authenticated user when AUTO_LOGIN=False | Inject a user_id upstream, or run with AUTO_LOGIN=True. |
| escapes / boundary | .., absolute path, or symlink resolves outside the namespace. | Path escapes workspace boundary: <path> | Use a path inside the sub-tree the tool resolves. |
| reserved | Path contains .lfsig segment. | Path component '.lfsig' is reserved | Choose a different sub-directory name. |
| Permission denied (wrapped) | BASE_DIR is not writable. | Cannot create user namespace at <path>: Permission denied. Check that LANGFLOW_FS_TOOL_BASE_DIR (<value>) is writable by the Langflow process user. | Point BASE_DIR at a writable directory and restart. |
| File not found | Read/edit on a non-existent path inside the namespace. | File not found: <path> | Verify the path; it may belong to a different user (isolated mode). |
| exceeds limit | File or projected content > 10 MB. | Content size <n> exceeds limit of 10485760 bytes | Split the file or use a different tool. |
| binary file | read_file invoked on a file containing NUL bytes in the first 8 KB. | Refusing to read binary file: <path> | None; binary read is not supported. |
| Invalid regex / catastrophic-backtracking | grep_search with is_regex=True and a pathological pattern. | Regex pattern rejected: nested unbounded quantifier ... | Rewrite pattern without nested quantifiers, or use literal mode. |
| tool/user-id mismatch | Captured tool invoked after _user_id change in isolated mode. | tool/user-id mismatch: this tool was bound to a different user session and cannot be reused | Re-build the tool list (re-run _get_tools()); fix the upstream pool that reused the component instance. |
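The "never raise" contract behind this table can be illustrated with a small decorator that turns any exception into the structured envelope. This is a sketch only: `as_envelope` and `read_file_demo` are hypothetical names, and the real component may build its envelopes inline rather than through a wrapper.

```python
import json
from functools import wraps
from typing import Any, Callable


def as_envelope(func: Callable[..., dict]) -> Callable[..., str]:
    """Hypothetical decorator: always return a JSON string, never raise."""

    @wraps(func)
    def wrapper(path: str, *args: Any, **kwargs: Any) -> str:
        try:
            return json.dumps(func(path, *args, **kwargs))
        except Exception as exc:  # broad by design: the Agent must get JSON back
            return json.dumps({"error": str(exc), "path": path})

    return wrapper


@as_envelope
def read_file_demo(path: str) -> dict:
    # Stand-in for a real tool body; it always fails so the envelope is visible.
    raise FileNotFoundError(f"File not found: {path}")


if __name__ == "__main__":
    print(read_file_demo("secret.txt"))
```

Catching `Exception` broadly is deliberate here, since the Agent expects a JSON string from every call; a narrower handler would let unexpected errors escape the tool boundary.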
| Variable | Type | Default | Effect |
|---|---|---|---|
| LANGFLOW_FS_TOOL_BASE_DIR | absolute path | <config_dir>/fs_sandbox (~/.langflow/fs_tool/fs_sandbox on a default install) | Sandbox root on disk. |
| LANGFLOW_AUTO_LOGIN | bool | True | (Existing platform var; not owned by this feature.) Drives shared/isolated dispatch. |
Component-level inputs:
| Field | Type | Default | Purpose |
|---|---|---|---|
| root_path (UI label "Workspace Sub-path") | str | "" | Sub-directory inside the resolved namespace. Empty means the namespace root. |
| read_only | bool | False | Disables write_file and edit_file. |
This feature does not emit metrics in its current form. Standard application logging captures errors; per-call forensic data was deliberately not implemented (see ADR-005).
If observability is added later, the recommended shape is:
| Metric | Type | Description | Alert Threshold |
|---|---|---|---|
| filesystem_tool.calls_total | Counter (labels: action, outcome) | Tool invocations. | None; used for capacity planning. |
| filesystem_tool.boundary_violations_total | Counter (labels: mode) | Path-traversal / .lfsig rejections. | > 10/min sustained: investigate possibly malicious agent prompt. |
| filesystem_tool.refused_total | Counter | AUTO_LOGIN=False + anonymous attempts. | > 5/min: investigate misconfigured caller (cron / MCP without auth). |
The component does not emit structured logs of its own — the Agent layer already logs tool invocations and their outcomes. Operator-facing failures appear as application logs from the Agent runtime, with the structured error string from this feature included verbatim.
No dedicated dashboard. Operational state is observable via the node's
metadata output JSON — the values of auto_login, mode,
effective_root, and resolution_error are sufficient for ad-hoc
diagnostics.
This feature does NOT ship behind a runtime flag. The behavior is always
active — the only knob is the existing AUTO_LOGIN setting which the
operator already configures for the entire Langflow instance.
None. The feature is fully filesystem-resident.
Reverting the code restores the prior FileSystem tool behavior (which had no per-user isolation). Steps:
1. git revert <merge-commit> and redeploy.
2. ~/.langflow/fs_tool/fs_sandbox/: the directory layout from before the revert (shared/ and/or users/<hash>/) is left intact; the prior code does not read from it but also does not delete it.
3. rm -rf the sandbox; no other Langflow component references it.

Data considerations:
- <BASE>/shared/ or <BASE>/users/<hash>/ are owned by the deployment, not by Langflow. A revert does not move or delete them; it just stops the feature from being aware of them.
- <BASE>/.fs_pepper should be preserved across a revert if you intend to re-roll-forward later; losing it changes every user's hash, effectively orphaning all per-user files.

Dependencies to roll back first: none. The feature has no upstream consumers of its data; reverting it does not require coordinating with other rollbacks.
Post-deploy verification (manual or scripted; see CZL/FILESYSTEM_USER_ISOLATION_QA_GUIDE.md for the full QA procedure):
- "create dog.md with a short story". Confirm <BASE>/shared/dog.md exists (in single-user / AUTO_LOGIN=True deployments).
- <BASE>/users/<hash>/ directories and cannot read each other's files.
- ../etc and confirm a structured error envelope with no on-disk side effects.
- LANGFLOW_FS_TOOL_BASE_DIR pointing at an unwritable path; confirm the agent receives a friendly error mentioning the env var.
- metadata output: confirm auto_login, mode, and effective_root reflect the deploy.

C4Context
title FileSystem Tool — System Context
Person(author, "Flow Author", "Designs flows in Langflow UI")
Person(end_user, "Authenticated User", "Triggers flows; identity scopes the sandbox")
System(langflow, "Langflow", "AI flow runtime hosting Agent + Tools")
System_Ext(disk, "Host Filesystem", "POSIX/NTFS volume hosting <BASE>")
Rel(author, langflow, "Configures Workspace Sub-path on the FileSystem node")
Rel(end_user, langflow, "Authenticates and runs flows")
Rel(langflow, disk, "Sandboxed read/write under <BASE>")
C4Container
title FileSystem Tool — Containers
Container(agent, "Agent", "LangChain", "Plans tool calls")
Container(fs_tool, "FileSystemToolComponent", "Python (lfx)", "5 sandboxed tools")
Container(auth, "Settings Service", "Python", "Owns AUTO_LOGIN, SUPERUSER, etc.")
ContainerDb(sandbox, "Sandbox Tree", "Filesystem", "<BASE>/shared/ or <BASE>/users/<hash>/")
ContainerDb(pepper, "Pepper File", "Filesystem", "<BASE>/.fs_pepper (mode 0600)")
Rel(agent, fs_tool, "Invokes tool with JSON args")
Rel(fs_tool, auth, "Reads AUTO_LOGIN at call time")
Rel(fs_tool, sandbox, "Read/write inside resolved namespace")
Rel(fs_tool, pepper, "HMAC keying for isolated mode")
FileSystemToolComponent

flowchart TD
A[Agent calls tool e.g. read_file] --> B[StructuredTool closure _run]
B --> C{_user_binding_error}
C -- mismatch --> R1[Return error envelope]
C -- ok --> D[_validate_path]
D --> E[_validate_root]
E --> F{_resolve_auto_login}
F -- True --> G[_shared_root]
F -- False --> H{user_id present?}
H -- no --> R2[raise PermissionError]
H -- yes --> I[_isolated_user_root]
I --> J[load_or_create_pepper]
I --> K[compute_user_namespace]
G --> L[mkdir under shared/, return path]
K --> L2[mkdir under users hash, return path]
L --> M[is_relative_to boundary check]
L2 --> M
M -- escape --> R3[raise PermissionError]
M -- ok --> N[return resolved Path]
N --> O[Tool implementation: read/write/edit/glob/grep]
O --> P[Return JSON envelope]
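The root-resolution branch of the flowchart above (shared, isolated, or refused) can be sketched as a pure function. The name `resolve_root_sketch` and its parameters are hypothetical; the real `_validate_root` also creates directories and reads AUTO_LOGIN from the settings service, which this sketch leaves to the caller.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Optional

SHARED_NAMESPACE = "shared"


def resolve_root_sketch(
    base: Path, auto_login: bool, user_id: Optional[str], pepper: bytes
) -> Path:
    """Hypothetical helper: pick the namespace root without touching disk."""
    if auto_login:
        # Shared mode: user_id is not part of the security boundary.
        return base / SHARED_NAMESPACE
    if not user_id:
        # Refused mode: no I/O may occur for an anonymous caller.
        raise PermissionError(
            "FileSystemTool requires an authenticated user when AUTO_LOGIN=False"
        )
    # Isolated mode: opaque per-user directory keyed by the pepper.
    digest = hmac.new(pepper, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return base / "users" / digest[:32]
```

Keeping this step free of side effects makes the three-way decision easy to test in isolation from the `mkdir` and boundary-check steps that follow it in the flowchart.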
Filesystem semantics are the most platform-divergent surface in the entire feature. Every test in the suite runs unmodified on POSIX, NTFS, and APFS; every path is built with pathlib; reserved-name detection uses PureWindowsPath so a flow authored on macOS does not silently break on Windows.
| Platform | Versions | Architecture | Status |
|---|---|---|---|
| Linux | Ubuntu 22.04+, Debian 12+ | x86_64, arm64 | Supported |
| macOS | 13+ | x86_64, arm64 | Supported |
| Windows | 10 22H2, 11 | x86_64 | Supported |
| Docker | linux/amd64, linux/arm64 | — | Supported (uses the Linux base image) |
| Capability | Linux | macOS | Windows | Notes |
|---|---|---|---|---|
| Pepper file create | os.O_CREAT \| O_EXCL + mode=0600 | same | Path.write_bytes (NTFS DACL inherited) | Operators on Windows must verify ACL on <BASE>/.fs_pepper separately. |
| Path resolution | pathlib.Path.resolve() | same | same | Symlinks resolved cross-platform. |
| Reserved-name guard | n/a (Linux accepts most names) | n/a | CON, PRN, AUX, NUL, COM1-9, LPT1-9 rejected | Applied on every host so flows authored on macOS don't break on Windows. |
| Forbidden-char guard | n/a | n/a | <>"\|?* rejected | Same: applied universally. |
| Trailing dot/space guard | n/a | n/a | Stripped silently by NTFS, so rejected up front | Applied universally. |
| is_relative_to boundary | Available since Python 3.9 | same | same | Single implementation across OSes. |
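The three Windows-oriented guards in the table can be approximated on any host by parsing the path with `PureWindowsPath` and checking each segment by hand. This is a sketch, not the component's code: `portable_name_error` is a hypothetical name, the messages are illustrative, and treating `NUL.txt` as reserved (name before the first dot) is an assumption about the intended behavior.

```python
from pathlib import PureWindowsPath
from typing import Optional

RESERVED_NAMES = {"CON", "PRN", "AUX", "NUL"} | {
    f"{device}{n}" for device in ("COM", "LPT") for n in range(1, 10)
}
FORBIDDEN_CHARS = set('<>"|?*')


def portable_name_error(relative_path: str) -> Optional[str]:
    """Hypothetical helper: return a reason string, or None if portable."""
    # PureWindowsPath splits on both "/" and "\\" without touching the disk.
    for part in PureWindowsPath(relative_path).parts:
        if part in (".", ".."):
            continue  # traversal is handled by the boundary check, not here
        if part.split(".")[0].upper() in RESERVED_NAMES:
            return f"Reserved device name: {part}"
        if any(ch in FORBIDDEN_CHARS for ch in part):
            return f"Forbidden character in: {part}"
        if part != part.rstrip(". "):
            return f"Trailing dot or space in: {part}"
    return None
```

Running the same check on Linux and macOS is what keeps a flow authored there from failing only once it is deployed on Windows.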
- O_NOINHERIT-equivalent would require pywin32/ntsecurity APIs and is out of scope. Operators on Windows in security-sensitive deployments should set the ACL explicitly during provisioning of <BASE>.
- <BASE> that point outside the namespace are caught by the is_relative_to boundary check.
- Foo.txt and reads foo.txt will succeed on macOS/Windows and fail on Linux. This is filesystem-inherent, not feature-specific. The component does not normalize case.

The feature is shipped as part of the lfx package and ships with Langflow itself; there is no separate installation step. The only setup is the optional environment variable.
# Default (recommended): unset, falls back to ~/.langflow/fs_tool/fs_sandbox.
unset LANGFLOW_FS_TOOL_BASE_DIR
# Or pin to an explicit writable directory:
export LANGFLOW_FS_TOOL_BASE_DIR="$HOME/lfx-data/fs_sandbox"
# Default: do not set, falls back to %USERPROFILE%\.langflow\fs_tool\fs_sandbox
Remove-Item Env:\LANGFLOW_FS_TOOL_BASE_DIR -ErrorAction SilentlyContinue
# Or pin to an explicit writable directory:
$env:LANGFLOW_FS_TOOL_BASE_DIR = "C:\Users\<you>\lfx-data\fs_sandbox"
# docker-compose.yml fragment
services:
  langflow:
    environment:
      - LANGFLOW_FS_TOOL_BASE_DIR=/data/fs_sandbox
    volumes:
      - lfx-fs-data:/data/fs_sandbox
volumes:
  lfx-fs-data:
A persistent volume is required so the pepper file (and per-user namespaces) survive container restarts.
| OS | Unit Tests | Integration | E2E | Smoke (Docker) |
|---|---|---|---|---|
| Ubuntu (latest) | ✅ | ✅ | ➖ | ✅ |
| macOS (latest) | ✅ | ✅ | ➖ | ➖ |
| Windows (latest) | ✅ | ✅ | ➖ | ➖ |
➖ End-to-end browser tests are not part of this feature's scope — the component has no UI behavior beyond standard input rendering, and the contract is fully exercisable from the Python test suite. Docker smoke runs only on Linux because that is the only OS we ship a Langflow image for.