Testing
Automated tests
117 tests across three layers. All pass on a Max plan workstation.
npm test # unit + integration (~7s)
RUN_LIVE_TESTS=1 npm run test:e2e # live CLI + OpenClaw e2e (~30s)
RUN_LIVE_TESTS=1 npm test # everything (~37s)
| Layer | What it covers |
|---|---|
| Unit | scrubPrompt, unscrubResponse, buildUsage, buildMsg, resolveSessionKey, resolveAgentId, deriveTurnSessionKey, MCP config/bootstrap, model catalog, healthcheck |
| Integration | Mock CLI NDJSON scenarios, timeout, stderr capture, MCP agent identity, system prompt resume, concurrency, stale-resume recovery, workspaceDir routing, prompt extraction |
| E2E: OpenClaw | Plugin registration, provider exposure, agent smoke test |
| E2E: live CLI | Real Claude CLI response, scrubbed prompt acceptance |
| E2E: stream live | createClaudeCliStreamFn with real CLI + Max plan OAuth, session resume |
Integration test breakdown
| Test | What it validates |
|---|---|
| simple, streaming, assistant | Basic NDJSON event parsing and stream lifecycle |
| malformed | Malformed NDJSON lines skipped without crash |
| scrub, scrub-streaming | Detection token unscrubbing in result and delta paths |
| double-unscrub guard | endStream doesn’t re-unscrub already-processed streaming text |
| empty | Empty CLI output produces “(no response)” fallback |
| hang + timeout | Request timeout kills hung process, emits error (3s test) |
| stderr capture | CLI stderr included in error events for diagnostics |
| 4 concurrency tests | Parallel streams, session map integrity, same-key safety, no cross-contamination |
| MCP agent identity (3 tests) | OPENCLAW_MCP_AGENT_ID propagation, throws (not silent fallback) when agentId is unresolved |
| Stale --resume recovery | Error event surfaces real claude error text; bogus session_id from error is not persisted; cached id is dropped on resume failure |
| Auth-error surfacing | When errors[] is empty, the user-visible error uses data.result text and api_error_status (HTTP code) |
| workspaceDir migration | Claude is spawned with cwd = ctx.workspaceDir; sessions persist under <workspaceDir>/.glueclaw/sessions.json |
| Prompt extraction (Telegram) | Trailing user-role messages that begin with “Conversation info (untrusted metadata):” are skipped so the actual user text wins |
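The malformed-line and empty-output rows above can be illustrated with a minimal sketch. This is not the plugin's actual parser: the names parseNdjson and finalText are hypothetical, and the event shape (a type field, plus a result field on the final event) is an assumption made for illustration.

```typescript
// Hypothetical event shape; the real CLI events may carry more fields.
type CliEvent = { type: string; [key: string]: unknown };

// Parse NDJSON output line by line, skipping blank and malformed lines
// instead of throwing.
function parseNdjson(output: string): CliEvent[] {
  const parsedEvents: CliEvent[] = [];
  for (const line of output.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (
        typeof parsed === "object" &&
        parsed !== null &&
        typeof (parsed as { type?: unknown }).type === "string"
      ) {
        parsedEvents.push(parsed as CliEvent);
      }
    } catch {
      // Malformed line: ignore it and keep reading the stream.
    }
  }
  return parsedEvents;
}

// Pick the final text, falling back to "(no response)" when nothing usable arrived.
function finalText(cliEvents: CliEvent[]): string {
  const resultEvent = cliEvents.find((e) => e.type === "result");
  const raw = resultEvent ? resultEvent["result"] : undefined;
  const text = typeof raw === "string" ? raw : "";
  return text.trim() === "" ? "(no response)" : text;
}

const sampleOutput = [
  '{"type":"assistant","text":"po"}',
  "not json at all",
  '{"type":"result","result":"pong"}',
].join("\n");

const sampleEvents = parseNdjson(sampleOutput);
console.log(sampleEvents.length, finalText(sampleEvents));
console.log(finalText(parseNdjson("")));
```

The design point is that a bad line costs one skipped event rather than the whole stream, and an empty stream still yields a visible placeholder.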
What the tests prove
- Auth path validated: createClaudeCliStreamFn deletes ANTHROPIC_API_KEY, forcing Max plan OAuth. Tested end-to-end on a real Max plan workstation.
- Session resume works: Two sequential calls with the same session key, second call retrieves context from first via --resume.
- Stale-resume recovers automatically: When claude rejects a cached session id (e.g. project storage cleared), the next turn surfaces the real error and clears the cache; the turn after that succeeds against a fresh session. No silent “(no response)” masking.
- Per-agent isolation: With ctx.workspaceDir, each agent’s claude project is anchored at a distinct cwd hash and its session-id cache lives at <workspaceDir>/.glueclaw/sessions.json. Cross-agent bleed is structurally impossible.
- Channel inbound stays readable: Telegram-style “Conversation info (untrusted metadata):” wraps are filtered out of prompt extraction; the user’s actual text wins. Session-key derivation (which needs chat_id) keeps a separate filter that only skips Sender … blocks.
- Concurrency safe: 3 parallel streams complete independently. Session file is valid JSON after concurrent writes. Same session key doesn’t crash under concurrent access.
- Crash resilient: Hung CLI killed after timeout. SIGKILL fallback if SIGTERM ignored. Atomic session writes prevent corruption.
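One common way to get atomic session writes is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that pattern under stated assumptions: writeSessionsAtomic and the SessionMap shape are hypothetical names for illustration, and the plugin's real implementation may differ. Only the <workspaceDir>/.glueclaw/sessions.json location comes from the text above.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, readdirSync, renameSync, writeFileSync } from "node:fs";
import { randomUUID } from "node:crypto";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical shape: session key -> cached claude session id.
type SessionMap = Record<string, string>;

// Write the full map to a unique temp file, then rename it into place.
// A rename within one directory replaces the target in a single step, so
// readers see either the old file or the new one, never a partial write.
function writeSessionsAtomic(workspaceDir: string, sessions: SessionMap): string {
  const dir = join(workspaceDir, ".glueclaw");
  mkdirSync(dir, { recursive: true });
  const target = join(dir, "sessions.json");
  const tmp = join(dir, `sessions.json.${randomUUID()}.tmp`);
  writeFileSync(tmp, JSON.stringify(sessions, null, 2));
  renameSync(tmp, target);
  return target;
}

// Demo against a throwaway workspace directory.
const demoWorkspace = mkdtempSync(join(tmpdir(), "glueclaw-demo-"));
const sessionsFile = writeSessionsAtomic(demoWorkspace, { "agent:main": "session-a" });
writeSessionsAtomic(demoWorkspace, { "agent:main": "session-b" });

const persisted = JSON.parse(readFileSync(sessionsFile, "utf8")) as SessionMap;
const leftovers = readdirSync(join(demoWorkspace, ".glueclaw")).filter((f) => f.endsWith(".tmp"));
console.log(persisted, leftovers);
```

Note that this pattern only guarantees the file is always valid JSON. When two writers race, the last rename wins, so merging concurrent updates would need separate handling.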
Manual procedures
Smoke test
export GLUECLAW_KEY=local
openclaw agent --agent main --message "say pong" 2>&1 | tail -n 1
# Expected: pong
Multi-turn memory
In openclaw tui:
- “remember the word: mango” — expect acknowledgment
- “what word did I ask you to remember?” — expect “mango”
Detection check
# Scrubbed prompt — should pass
openclaw agent --agent main --message "say hi" 2>&1 | tail -n 1
# Raw OpenClaw trigger — should fail with 400
claude --append-system-prompt \
"You are a personal assistant running inside OpenClaw." \
-p "say hi" 2>&1
MCP bridge
In openclaw tui, ask “what MCP tools do you have access to from openclaw?” — expect a list including message, sessions_list, memory_search, web_search.
Multi-agent isolation
See the multi-agent guide for the full three-round bleed test. Short version: store a unique secret per agent, ask each to recall, ask each to self-identify via agents_list. Each agent must return its own secret and its own id — pre-#36 builds had every non-main agent reporting as main.