Back to Weknora

WeKnora — retrieval & RAG queries

cli/skills/weknora-rag-search/SKILL.md

0.6.23.5 KB
Original Source

WeKnora — retrieval & RAG queries

REQUIRED BACKGROUND: read the weknora-shared skill first (auth, --kb resolution, the JSON envelope, exit codes, streaming/NDJSON output).

WeKnora gives you several ways to "ask about a knowledge base." Picking the wrong one wastes turns or returns the wrong shape. Use the decision table.

Pick the command by your goal

Your goalCommandLLM synthesis?Returns
Natural-language answer grounded in a KBchat "<q>" --kb <kb>yesstreaming answer + references
Answer via a custom agent (its own KB scope, tools, web search)session ask --agent <id> "<q>"yes (+ tools)streaming answer + tool events
Raw context chunks to reason over yourself (no answer)search chunks "<q>" --kb <kb>noranked chunk list
Which documents match a keyword (title/filename)search docs "<q>" --kb <kb>nodocument list
Find a knowledge base by namesearch kb "<q>"noKB list
Find a past session by titlesearch sessions "<q>"nosession list

The three decisions that matter

  1. Answer vs raw context. Want a written answer → chat / session ask. Want chunks to feed into your own reasoning (e.g. you'll synthesize across sources) → search chunks. Don't call chat just to read source text.
  2. chat vs session ask. chat = plain KB RAG Q&A. session ask --agent <id> = invoke a configured custom agent (it may scope its own KBs, call tools, do web search). If the user set up an agent for this, prefer it (weknora agent list to find ids); otherwise chat.
  3. One-shot vs multi-turn. Both chat and session ask print an init event with a session_id. Pass --session <id> on the next call to continue the conversation. See references/chat.md.

Safety / Gotchas

  • chat, search chunks, search docs need a KB: pass --kb <id-or-name>, or set WEKNORA_KB_ID, or weknora link the directory (resolved in that order). If none resolves it's exit 1 (local.kb_id_required); a bad name is exit 1 (local.kb_not_found). Resolve names with weknora kb list / search kb. (search kb / search sessions are tenant-wide and take no --kb.)
  • chat / session ask stream NDJSON by default. Parse line-by-line; keep the init event's session_id and message_id. Use --format text only for a human transcript.
  • A stalled stream is not stopped by Ctrl-C (that just drops your local connection; the server keeps generating + billing). Stop it server-side: weknora session stop <session-id> --message <message-id> (ids from init). Re-attach to a stream with weknora session continue-stream <session-id> --message <message-id>.
  • search chunks --limit defaults to 8 (tuned for an LLM context window); the search docs/kb/sessions lists default to 30. Tune retrieval with --vector-threshold / --keyword-threshold, or --no-vector/--no-keyword to disable a channel. Details: references/search-chunks.md.

Quick examples

bash
# raw retrieval to reason over
weknora search chunks "retry backoff policy" --kb engineering --limit 12

# grounded answer (human transcript)
weknora chat "How do we handle retries?" --kb engineering --format text

# continue the conversation (session id from the prior init event)
weknora chat "And the max attempts?" --kb engineering --session sess_abc

# answer via a custom agent
weknora session ask --agent ag_123 "Summarize this quarter's incidents"