ClawGuard

An independent LLM-based safety reviewer that validates destructive tool calls against the user's original intent before execution. It prevents autonomous AI actions that contradict user instructions, such as the incident where an AI agent deleted emails despite the user saying "don't action until I tell you to".

What it does

ClawGuard intercepts destructive tool calls (delete, modify, exec, etc.) and uses a fast, small LLM to compare each one with the user's original message. The reviewer returns one of three decisions:

pass — action aligns with user intent → execute normally
block — action clearly violates user intent → reject
escalate — uncertain → ask user for confirmation
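
The three decisions map naturally onto a discriminated union. The sketch below is illustrative only: `GuardianDecision`, `ToolOutcome`, and `applyDecision` are assumed names, not the identifiers used in `src/security/guardian-types.ts`, and the `reason` field is an assumption about what a reviewer might return.

```typescript
// Hypothetical shapes for illustration; the real definitions live in
// src/security/guardian-types.ts and may differ.
type GuardianDecision =
  | { verdict: "pass" }
  | { verdict: "block"; reason: string }
  | { verdict: "escalate"; reason: string };

type ToolOutcome =
  | { status: "executed" }
  | { status: "rejected"; reason: string }
  | { status: "awaiting_confirmation"; prompt: string };

// Translate a reviewer decision into what happens to the pending tool call.
function applyDecision(decision: GuardianDecision, toolName: string): ToolOutcome {
  switch (decision.verdict) {
    case "pass":
      // Action aligns with user intent: run the tool normally.
      return { status: "executed" };
    case "block":
      // Action clearly violates user intent: reject it.
      return { status: "rejected", reason: decision.reason };
    case "escalate":
      // Uncertain: hand the choice back to the user.
      return {
        status: "awaiting_confirmation",
        prompt: `Confirm ${toolName}? (${decision.reason})`,
      };
  }
}
```

Modelling the decision as a union lets the compiler check that every outcome is handled, which matters when a missed branch could let a destructive call through.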

Risk Classification

Level	Tools	ClawGuard Review
Red (destructive)	`exec`, `fs_delete`, `fs_move`, `sessions_spawn`, `sessions_send`, `gateway`, `message/deleteMessage`, `message/editMessage`, `cron/create,update,remove`, `nodes/run,invoke`	Yes
Yellow (mutating)	`fs_write`, `message/send`, `browser/act,navigate`	No
Green (safe)	`read`, `browser/snapshot,status`, `web_search`	No

Plugins can declare their own risk level via riskLevel: "safe" | "mutating" | "destructive".
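
As a rough illustration of how the table and plugin-declared levels could combine, here is a minimal sketch. It is not the code in `guardian-risk.ts`: the names `STATIC_RISK`, `registerPluginTool`, `classifyRisk`, and `needsReview` are hypothetical, the lookup order (sub-action, then built-in tool, then plugin) is an assumption, and so is treating unknown tools as destructive.

```typescript
type RiskLevel = "safe" | "mutating" | "destructive";

// A few rows from the table above; "tool/action" keys model sub-actions.
const STATIC_RISK = new Map<string, RiskLevel>([
  ["exec", "destructive"],
  ["fs_delete", "destructive"],
  ["message/deleteMessage", "destructive"],
  ["fs_write", "mutating"],
  ["message/send", "mutating"],
  ["read", "safe"],
  ["web_search", "safe"],
]);

// Hypothetical registry for levels declared by plugins via riskLevel.
const pluginRisk = new Map<string, RiskLevel>();

function registerPluginTool(name: string, riskLevel: RiskLevel): void {
  pluginRisk.set(name, riskLevel);
}

// Resolve a tool (and optional sub-action) to a risk level.
function classifyRisk(tool: string, action?: string): RiskLevel {
  const subActionKey = action ? `${tool}/${action}` : undefined;
  return (
    (subActionKey ? STATIC_RISK.get(subActionKey) : undefined) ??
    STATIC_RISK.get(tool) ??
    pluginRisk.get(tool) ??
    "destructive" // assumption: unknown tools get the strictest level
  );
}

// Only destructive calls are sent to the reviewer.
function needsReview(tool: string, action?: string): boolean {
  return classifyRisk(tool, action) === "destructive";
}
```

With this sketch, `needsReview("message", "deleteMessage")` is true while `needsReview("message", "send")` is false, mirroring the Red and Yellow rows.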

Install

Prerequisites

An OpenClaw source checkout (git clone)
Git CLI available in PATH

Linux / macOS

chmod +x install.sh
./install.sh /path/to/openclaw

Windows (PowerShell)

.\install.ps1 -OpenClawRoot C:\path\to\openclaw

Configure

Add to your openclaw.json:

{
  "safety": {
    "guardian": {
      "enabled": true,
      "model": "openrouter/qwen/qwen3-32b"
    }
  }
}

enabled (boolean) — Enable/disable ClawGuard. Default: false.
model (string, optional) — LLM model reference. Default: openrouter/qwen/qwen3-32b.
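
The defaults above could be applied along these lines. This is a sketch under assumptions, not the project's config loader: `GuardianConfig`, `OpenClawConfigSketch`, and `resolveGuardianConfig` are hypothetical names; only `SafetyConfig` and the `safety` key are taken from this README.

```typescript
// Hypothetical config shapes mirroring the JSON snippet above.
interface GuardianConfig {
  enabled?: boolean;
  model?: string;
}

interface SafetyConfig {
  guardian?: GuardianConfig;
}

interface OpenClawConfigSketch {
  safety?: SafetyConfig;
}

const DEFAULT_MODEL = "openrouter/qwen/qwen3-32b";

// Fill in the documented defaults: disabled unless enabled explicitly,
// and the default model when none is given.
function resolveGuardianConfig(config: OpenClawConfigSketch): Required<GuardianConfig> {
  const guardian = config.safety?.guardian ?? {};
  return {
    enabled: guardian.enabled ?? false,
    model: guardian.model ?? DEFAULT_MODEL,
  };
}
```

Because `enabled` defaults to false, an `openclaw.json` without a `safety` key leaves ClawGuard off.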

Verify

Run the ClawGuard test suite:

cd /path/to/openclaw
npx vitest run src/security/guardian-risk.test.ts src/security/guardian-agent.test.ts src/security/guardian-hook.test.ts src/security/guardian-audit.test.ts src/security/guardian-config.test.ts src/security/guardian-scenario.test.ts

Expected: 120 tests passed.

Uninstall

Linux / macOS

./uninstall.sh /path/to/openclaw

Windows (PowerShell)

.\uninstall.ps1 -OpenClawRoot C:\path\to\openclaw

Then remove the "safety" key from your openclaw.json.

Architecture

User message ──┐
               ├──▶ Agent LLM ──▶ Tool Call ──▶ [before_tool_call hook]
               │                                        │
               │                            ┌───────────▼───────────┐
               │                            │  Risk Classification  │
               │                            │  (guardian-risk.ts)   │
               │                            └───────────┬───────────┘
               │                                 safe?  │  destructive?
               │                                  │     │
               │                               pass     ▼
               │                            ┌───────────────────────┐
               └───────────────────────────▶│     ClawGuard LLM     │
                  (original user message)   │      (qwen3-32b)      │
                                            └───────────┬───────────┘
                                                        │
                                             pass / block / escalate
                                                        │
                                                        ▼
                                            ┌───────────────────────┐
                                            │       Audit Log       │
                                            │   (guardian-audit)    │
                                            └───────────────────────┘
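
The flow in the diagram can be sketched as a single hook function. This is a simplified illustration, not the contents of `pi-tools.before-tool-call.ts`: the `HookDeps` interface, the `beforeToolCall` signature, and the `AuditEntry` fields are assumptions, and the classifier, reviewer, and audit sink are injected so the sketch stays self-contained.

```typescript
type RiskLevel = "safe" | "mutating" | "destructive";
type Verdict = "pass" | "block" | "escalate";

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Hypothetical audit record; the real format may carry more fields.
interface AuditEntry {
  tool: string;
  risk: RiskLevel;
  verdict: Verdict;
}

// Hypothetical dependencies standing in for the risk classifier,
// the ClawGuard LLM, and the audit log.
interface HookDeps {
  classify: (call: ToolCall) => RiskLevel;
  review: (call: ToolCall, originalUserMessage: string) => Promise<Verdict>;
  audit: (entry: AuditEntry) => void;
}

async function beforeToolCall(
  call: ToolCall,
  originalUserMessage: string,
  deps: HookDeps,
): Promise<Verdict> {
  const risk = deps.classify(call);
  // Safe and mutating calls skip the reviewer entirely (no LLM overhead).
  if (risk !== "destructive") return "pass";
  // Destructive calls are checked against the user's original message.
  const verdict = await deps.review(call, originalUserMessage);
  // Every reviewer decision is recorded.
  deps.audit({ tool: call.name, risk, verdict });
  return verdict;
}
```

Passing the original user message alongside the tool call is the key point: the reviewer judges the action against what the user actually asked for, not against the agent's own reasoning.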

Key Design Decisions

Fail-closed: If the ClawGuard LLM is unavailable, times out, or returns unparseable output, the tool call is blocked (never silently allowed).
Audit trail: Every ClawGuard decision is persisted to ~/.openclaw/guardian/audit.jsonl.
Minimal latency: Only destructive tool calls are reviewed; safe/mutating operations pass through without LLM overhead.
Plugin support: Plugins declare riskLevel at registration; ClawGuard respects plugin-declared risk levels.
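
The fail-closed and audit-trail decisions can be illustrated with a short sketch. Everything here is an assumption made for the example: the reviewer is assumed to answer with JSON of the form `{"verdict": "..."}`, and `parseVerdict`, `withTimeout`, `reviewFailClosed`, and `formatAuditLine` are hypothetical helpers rather than functions from `guardian-agent.ts` or `guardian-audit.ts`.

```typescript
type Verdict = "pass" | "block" | "escalate";

// Parse raw model output; anything unrecognized yields null.
// Assumption: the model replies with JSON like {"verdict": "pass"}.
function parseVerdict(raw: string): Verdict | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null && "verdict" in parsed) {
      const v = (parsed as { verdict: unknown }).verdict;
      if (v === "pass" || v === "block" || v === "escalate") return v;
    }
  } catch {
    // Not valid JSON: fall through and report it as unparseable.
  }
  return null;
}

// Reject if the work does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("guardian timeout")), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Fail-closed: errors, timeouts, and unparseable output all become "block".
async function reviewFailClosed(
  callModel: () => Promise<string>,
  timeoutMs: number,
): Promise<Verdict> {
  try {
    const raw = await withTimeout(callModel(), timeoutMs);
    return parseVerdict(raw) ?? "block";
  } catch {
    return "block";
  }
}

// One JSON object per line, as in a JSONL audit file (field names assumed).
function formatAuditLine(entry: { tool: string; verdict: Verdict; timestamp: string }): string {
  return JSON.stringify(entry) + "\n";
}
```

The important property is that there is no code path where a reviewer failure produces "pass": the only way to get "pass" is a well-formed response that says so.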

File Manifest

New Files

File	Description
`src/security/guardian-types.ts`	Type definitions (config, risk levels, decisions, audit)
`src/security/guardian-agent.ts`	LLM safety reviewer (prompt, model resolution, response parsing)
`src/security/guardian-audit.ts`	JSONL audit logging
`src/security/guardian-risk.ts`	Tool risk classification (static + plugin + sub-action)
`src/security/guardian-*.test.ts`	Test suite (120 tests)

Modified Files (via patch)

File	Change
`src/agents/pi-tools.before-tool-call.ts`	Guardian hook integration + `runGuardianCheck`
`src/agents/pi-tools.ts`	Pass `originalUserMessage`, `config`, `agentDir` to hook context
`src/agents/pi-embedded-runner/run/attempt.ts`	Pass user prompt as `originalUserMessage`
`src/config/types.openclaw.ts`	Add `safety?: SafetyConfig` to `OpenClawConfig`
`src/plugins/registry.ts`	Add `riskLevel` to `PluginToolRegistration`
`src/plugins/tools.ts`	Register plugin risk levels with guardian
`src/plugins/types.ts`	Add `riskLevel` to `OpenClawPluginToolOptions`
