Threat Detection component for GitHub Agentic Workflows. Analyzes AI agent output for security threats including prompt injection, secret leaks, and malicious patches.
- Quick Start
- Overview
- Guardrails and Security Considerations
- Usage
- Stage Status and Decisions
- Private Container Release Setup
- Development
- Architecture
- Integration with gh-aw
- Specification
- Contributing
- Support
- Code of Conduct
- Security
- License
Build the CLI and run it against an artifacts directory:
make build
./bin/threat-detect /path/to/artifacts
Run tests locally:
make test
This tool runs as a standalone binary or container that analyzes artifacts produced by AI agents before safe outputs are permitted. It supports multiple AI engines (Copilot, Claude, Codex) for detection analysis.
This project is designed to help reduce risk when running AI agent workflows by inspecting generated artifacts before they are accepted as safe output. Detection is advisory and should be combined with defense-in-depth controls such as least-privilege permissions, human review, and repository protections.
Do not treat a "safe" result as a security guarantee. Use the output as one signal in a broader security review process.
threat-detect [flags] <artifacts-dir>
Flags:
- `--engine`: AI engine to use (`copilot`, `claude`, `codex`). Default: `copilot`
- `--model`: Model override for the engine
- `--prompt-template`: Path to custom prompt template
- `--output`: Path to write JSON result (defaults to stdout)
- `--version`: Print version and exit
Exit codes:
- `0`: Safe (no threats detected)
- `1`: Threat detected
- `2`: Infrastructure/configuration error
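A caller that wraps the CLI can branch on these exit codes. The following is a minimal sketch, not code from this repository: the `Verdict` type and the `classify` and `exitCodeOf` helpers are hypothetical names, and treating any undocumented exit code as a failure is an assumption made so the wrapper fails closed. Only the three exit codes themselves come from the list above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// Verdict is a hypothetical caller-side label for a detector run.
type Verdict string

const (
	Safe    Verdict = "safe"
	Threat  Verdict = "threat"
	Failure Verdict = "error"
)

// classify maps the documented exit codes to a verdict.
// Any other code is treated as a failure (an assumption of this sketch).
func classify(code int) Verdict {
	switch code {
	case 0:
		return Safe
	case 1:
		return Threat
	default:
		return Failure
	}
}

// exitCodeOf extracts the process exit code from the error returned by
// (*exec.Cmd).Run. It returns -1 when the process could not be started.
func exitCodeOf(err error) int {
	if err == nil {
		return 0
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode()
	}
	return -1
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: wrapper <artifacts-dir>")
		return
	}
	// Assumes the binary was built to ./bin/threat-detect with `make build`.
	cmd := exec.Command("./bin/threat-detect", os.Args[1])
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	fmt.Println("verdict:", classify(exitCodeOf(cmd.Run())))
}
```

Keeping `classify` separate from process execution makes the mapping easy to unit test without invoking the detector.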
docker run --rm \
-v /path/to/artifacts:/workspace/artifacts \
ghcr.io/github/gh-aw-threat-detection:latest \
/workspace/artifacts
The current image is suitable for CLI packaging and smoke testing. Production AI-backed detection requires the selected engine CLI and its authentication to be available in the runtime image or installed by a future image variant.
<artifacts-dir>/
├── aw-prompts/
│ └── prompt.txt # Workflow prompt file
├── agent_output.json # Agent structured output
├── aw-*.patch # Git format-patch files (optional)
├── aw-*.bundle # Git bundle files (optional)
└── comment-memory/ # Agent comment memory (optional)
└── *.md
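A pre-flight check for this layout could look like the sketch below. Which entries are mandatory is an assumption: the tree marks patches, bundles, and comment memory as optional, so the sketch treats only `aw-prompts/prompt.txt` and `agent_output.json` as required. `missingRequired` is a hypothetical helper and does not represent the repository's artifact reader in `pkg/artifacts/`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// requiredArtifacts lists the entries the layout does not mark as optional.
// Treating them as mandatory is an assumption of this sketch.
var requiredArtifacts = []string{
	filepath.Join("aw-prompts", "prompt.txt"),
	"agent_output.json",
}

// missingRequired returns the required artifacts that cannot be found under dir.
func missingRequired(dir string) []string {
	var missing []string
	for _, rel := range requiredArtifacts {
		if _, err := os.Stat(filepath.Join(dir, rel)); err != nil {
			missing = append(missing, rel)
		}
	}
	return missing
}

func main() {
	dir := "."
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	if missing := missingRequired(dir); len(missing) > 0 {
		fmt.Println("missing artifacts:", missing)
		return
	}
	// Optional patch files follow the aw-*.patch naming shown in the layout.
	patches, _ := filepath.Glob(filepath.Join(dir, "aw-*.patch"))
	fmt.Printf("layout ok, %d patch file(s) found\n", len(patches))
}
```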
{
"prompt_injection": false,
"secret_leak": false,
"malicious_patch": false,
"reasons": []
}
The extraction staging model is:
- Stage 1: standalone repository
- Stage 2: containerization
- Stage 3: `github/gh-aw` integration
Stage 1 is functionally represented in this repository.
The standalone Go CLI, artifact reader, prompt builder, result parser, engine abstraction, W3C-style specification, unit tests, CI, Dockerfile, and release workflow are present.
Remaining work involves integration with github/gh-aw and production hardening of the container runtime in Stage 2/3, not additional JavaScript porting in this repository.
Decisions for the unresolved extraction questions:
- JavaScript scripts: detection setup and result parsing are implemented in Go here; the old GitHub Actions JavaScript scripts should not be needed once `gh-aw` switches to the container contract.
- Engine CLIs: the detector invokes the selected engine CLI from `PATH` and forwards the `--model` value. Stage 2 should either bundle the supported CLIs into the image or publish an image variant that installs them at runtime before invoking `threat-detect`.
- Custom steps: custom `threat-detection.steps` remain orchestrator-owned. They should run before or after the container in the `gh-aw` job rather than being passed into this container as arbitrary scripts.
- Backward compatibility: `gh-aw` should pin a specific image tag and may temporarily keep inline detection behind a compatibility flag until the container path is validated.
- Ollama/LlamaGuard: keep this as a custom-step pattern unless a dedicated image variant is explicitly required.
- Version coupling: use strict, semver-compatible image tags and have `gh-aw` pin a specific `DefaultThreatDetectionVersion`, matching the firewall pattern.
- Isolation: the detector should run in the standard detection job initially. Running the detector itself inside an additional firewall/isolation layer can be evaluated later.
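The "Engine CLIs" decision means a run can only succeed when the selected CLI is actually on `PATH`. A runtime image or wrapper could verify that up front, as in the sketch below. The executable names in `engineBinaries` are assumptions (they simply reuse the engine names), and `resolveEngine` is a hypothetical helper, not the repository's engine abstraction.

```go
package main

import (
	"fmt"
	"os/exec"
)

// engineBinaries maps each documented engine to the executable name this
// sketch looks for. The binary names are assumptions, not confirmed values.
var engineBinaries = map[string]string{
	"copilot": "copilot",
	"claude":  "claude",
	"codex":   "codex",
}

// resolveEngine returns the path of the CLI for the given engine, or an
// error if the engine is unsupported or its CLI is not found on PATH.
func resolveEngine(engine string) (string, error) {
	bin, ok := engineBinaries[engine]
	if !ok {
		return "", fmt.Errorf("unsupported engine %q", engine)
	}
	path, err := exec.LookPath(bin)
	if err != nil {
		return "", fmt.Errorf("engine %q: CLI %q not found on PATH: %w", engine, bin, err)
	}
	return path, nil
}

func main() {
	for _, engine := range []string{"copilot", "claude", "codex"} {
		if path, err := resolveEngine(engine); err != nil {
			fmt.Println("unavailable:", err)
		} else {
			fmt.Println(engine, "->", path)
		}
	}
}
```

Failing early here would surface a missing CLI as a configuration problem instead of a detection failure partway through a run.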
The repository can move through Stage 2 while remaining private. GitHub Container Registry supports private packages, and the release workflow publishes ghcr.io/github/gh-aw-threat-detection:<tag> using the automatic GITHUB_TOKEN with packages: write.
Maintainers need to configure the following before the image is consumed by gh-aw:
- Keep Actions enabled for this private repository.
- Ensure the package created under `ghcr.io/github/gh-aw-threat-detection` inherits repository visibility or is explicitly private.
- Grant the consuming `github/gh-aw` repository access to the private package, or configure the organization package settings so `GITHUB_TOKEN` from `gh-aw` can pull it with `packages: read`.
- Keep the `release-publish` and `release-promote` environments if manual approval is desired; otherwise update the environment protection rules in repository settings.
- Tag releases with semantic versions such as `v1.0.0`. The release workflow publishes the version tag; the promote workflow tags the verified digest as `latest`.
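A release script could reject malformed tags before the workflow runs. The sketch below assumes that a valid release tag is exactly `vMAJOR.MINOR.PATCH` with no leading zeros and no pre-release or build suffix; that rule is an assumption for illustration, since the accepted tag format is not spelled out here. `isStrictTag` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// strictTag matches vMAJOR.MINOR.PATCH with no leading zeros and no
// pre-release or build suffix. Rejecting suffixes is an assumption of
// this sketch, not a documented rule.
var strictTag = regexp.MustCompile(`^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$`)

// isStrictTag reports whether tag has the assumed release-tag shape.
func isStrictTag(tag string) bool {
	return strictTag.MatchString(tag)
}

func main() {
	for _, tag := range []string{"v1.0.0", "v1.0", "latest", "v01.2.3", "v1.2.3-rc.1"} {
		fmt.Printf("%-12s %v\n", tag, isStrictTag(tag))
	}
}
```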
No additional secrets are required for unit tests, make build, make test, or the container smoke test. Engine authentication is only needed when running real AI-backed detection:
| Variable | Required when | Notes |
|---|---|---|
| `GH_AW_COPILOT_TOKEN` | Running `--engine copilot` in an environment that needs explicit token-based Copilot authentication | If the Copilot CLI uses device, browser, or host-provided authentication instead, configure that mechanism before running the container. |
| `ANTHROPIC_API_KEY` | Running `--engine claude` with the Claude CLI | Not used by unit tests. |
| `OPENAI_API_KEY` | Running `--engine codex` with the Codex CLI | Not used by unit tests. |
| `WORKFLOW_NAME` | Optional local/container runs | Included in the generated prompt. |
| `WORKFLOW_DESCRIPTION` | Optional local/container runs | Included in the generated prompt. |
| `CUSTOM_PROMPT` | Optional local/container runs | Appended to the default detection prompt. |
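Before starting a real AI-backed run, a wrapper could check whether the variable associated with the selected engine is set. This is a rough sketch: `missingCredential` is a hypothetical helper, and because the Copilot CLI may authenticate through other mechanisms, a missing `GH_AW_COPILOT_TOKEN` is better treated as a warning than as a hard failure.

```go
package main

import (
	"fmt"
	"os"
)

// engineEnv maps each engine to the variable the table associates with it.
var engineEnv = map[string]string{
	"copilot": "GH_AW_COPILOT_TOKEN",
	"claude":  "ANTHROPIC_API_KEY",
	"codex":   "OPENAI_API_KEY",
}

// missingCredential returns the name of the variable that is unset for the
// engine, or "" when it is set. lookup is injected so the check is easy to
// test; pass os.Getenv in real use.
func missingCredential(engine string, lookup func(string) string) (string, error) {
	name, ok := engineEnv[engine]
	if !ok {
		return "", fmt.Errorf("unsupported engine %q", engine)
	}
	if lookup(name) == "" {
		return name, nil
	}
	return "", nil
}

func main() {
	for _, engine := range []string{"copilot", "claude", "codex"} {
		name, err := missingCredential(engine, os.Getenv)
		switch {
		case err != nil:
			fmt.Println(err)
		case name != "":
			fmt.Printf("%s: %s is not set\n", engine, name)
		default:
			fmt.Printf("%s: credential variable present\n", engine)
		}
	}
}
```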
- Go 1.23+
- Docker (for container builds)
This repository includes three Agentic Workflows smoke tests:
- `.github/workflows/smoke-copilot.md`
- `.github/workflows/smoke-claude.md`
- `.github/workflows/smoke-codex.md`
Each runs daily and by workflow_dispatch. The top-level Smoke workflow can be dispatched manually to start all three compiled smoke workflows and their three containerized siblings. The matching .lock.yml files are the compiled AW workflows. The *-container.lock.yml siblings are generated from those lock files by scripts/create-threat-detection-sibling-workflows.py; they pull the ghcr.io/github/gh-aw-threat-detection container, extract its detector binary, and execute it under the same AWF wrapper used by the generated detection job.
After recompiling the smoke workflows with gh aw compile, regenerate and verify the sibling workflows:
scripts/create-threat-detection-sibling-workflows.py
scripts/create-threat-detection-sibling-workflows.py --check
Configure these Actions secrets to enable all smoke workflows:
| Secret | Required for | Notes |
|---|---|---|
| `COPILOT_GITHUB_TOKEN` | Copilot smoke workflow and base Copilot detection | See Copilot fallback note below. |
| `GH_AW_COPILOT_TOKEN` | Optional Copilot token fallback for the container-detection sibling | Used only if `COPILOT_GITHUB_TOKEN` is not configured. |
| `ANTHROPIC_API_KEY` | Claude smoke workflow and Claude detection | Used by the Claude CLI. |
| `OPENAI_API_KEY` or `CODEX_API_KEY` | Codex smoke workflow and Codex detection | Configure whichever token your Codex CLI setup expects. |
| `GH_AW_GITHUB_TOKEN` | Recommended for GitHub MCP access, safe outputs, and private GHCR pulls | The generated workflows fall back to `GITHUB_TOKEN` where possible. |
| `GH_AW_GITHUB_MCP_SERVER_TOKEN` | Optional GitHub MCP override | Falls back to `GITHUB_TOKEN` in the compiled workflows. |
Copilot fallback note: the base workflow uses only secrets.COPILOT_GITHUB_TOKEN. The container-detection Copilot sibling checks secrets.COPILOT_GITHUB_TOKEN, then secrets.GH_AW_COPILOT_TOKEN, then secrets.GH_AW_GITHUB_TOKEN.
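The lookup order used by the container-detection sibling can be modeled as "first non-empty value wins". The Go sketch below only illustrates that order; the workflows themselves resolve secrets in their generated YAML, and `firstConfigured` is a hypothetical helper with made-up token values.

```go
package main

import "fmt"

// copilotTokenOrder is the lookup order described for the
// container-detection Copilot sibling.
var copilotTokenOrder = []string{
	"COPILOT_GITHUB_TOKEN",
	"GH_AW_COPILOT_TOKEN",
	"GH_AW_GITHUB_TOKEN",
}

// firstConfigured returns the name and value of the first non-empty secret
// in copilotTokenOrder, or ok=false when none is configured.
func firstConfigured(secrets map[string]string) (name, value string, ok bool) {
	for _, n := range copilotTokenOrder {
		if v := secrets[n]; v != "" {
			return n, v, true
		}
	}
	return "", "", false
}

func main() {
	// Placeholder values for illustration only.
	secrets := map[string]string{
		"GH_AW_COPILOT_TOKEN": "token-b",
		"GH_AW_GITHUB_TOKEN":  "token-c",
	}
	if name, _, ok := firstConfigured(secrets); ok {
		fmt.Println("using", name) // prints "using GH_AW_COPILOT_TOKEN"
	} else {
		fmt.Println("no Copilot token configured")
	}
}
```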
Optional Actions variables:
| Variable | Purpose |
|---|---|
| `GH_AW_MODEL_AGENT_COPILOT`, `GH_AW_MODEL_AGENT_CLAUDE`, `GH_AW_MODEL_AGENT_CODEX` | Override the agent model for each smoke workflow. |
| `GH_AW_MODEL_DETECTION_COPILOT`, `GH_AW_MODEL_DETECTION_CLAUDE`, `GH_AW_MODEL_DETECTION_CODEX` | Override the detection model for each engine. |
| `GH_AW_THREAT_DETECTION_IMAGE` | Override the detector image used by the `*-container.lock.yml` siblings. Defaults to `ghcr.io/github/gh-aw-threat-detection:latest`. |
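The image override follows a simple "use the variable if set, otherwise the default" rule. The sketch below models that rule in Go for clarity; in the sibling workflows the value comes from an Actions variable, and `detectorImage` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
)

// defaultDetectorImage is the default named in the variables table.
const defaultDetectorImage = "ghcr.io/github/gh-aw-threat-detection:latest"

// detectorImage returns the override when it is set and non-empty,
// otherwise the documented default. lookup is injected for testability.
func detectorImage(lookup func(string) string) string {
	if img := lookup("GH_AW_THREAT_DETECTION_IMAGE"); img != "" {
		return img
	}
	return defaultDetectorImage
}

func main() {
	fmt.Println(detectorImage(os.Getenv))
}
```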
make build
make test
make lint
make docker-build
Run the container smoke test:
make docker-smoke
cmd/threat-detect/ CLI entry point
pkg/detector/ Core detection logic (prompt building, result parsing)
pkg/engine/ AI engine abstraction (copilot, claude, codex)
pkg/artifacts/ Artifact reading and validation
pkg/detector/prompts/ Embedded AI prompt template
specs/ W3C-style specification
After containerization, gh-aw references this component via:
const DefaultThreatDetectionRegistry = "ghcr.io/github/gh-aw-threat-detection"
const DefaultThreatDetectionVersion = "v1.0.0"
The detection job in compiled workflows uses this container instead of inline AI engine invocation.
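A consumer of the container contract needs two things: the pinned image reference and a way to read the detector's JSON result. The following sketch shows one possible shape. The two constants are the ones quoted above and the `Result` fields mirror the result JSON shown earlier in this README; `imageRef`, `parseResult`, and `Flagged` are hypothetical helpers, and the sample payload in `main` is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	DefaultThreatDetectionRegistry = "ghcr.io/github/gh-aw-threat-detection"
	DefaultThreatDetectionVersion  = "v1.0.0"
)

// Result mirrors the fields of the detector's JSON result.
type Result struct {
	PromptInjection bool     `json:"prompt_injection"`
	SecretLeak      bool     `json:"secret_leak"`
	MaliciousPatch  bool     `json:"malicious_patch"`
	Reasons         []string `json:"reasons"`
}

// Flagged reports whether any threat category is set.
func (r Result) Flagged() bool {
	return r.PromptInjection || r.SecretLeak || r.MaliciousPatch
}

// imageRef joins registry and version into a pullable image reference.
func imageRef(registry, version string) string {
	return registry + ":" + version
}

// parseResult decodes the detector's JSON result.
func parseResult(raw []byte) (Result, error) {
	var r Result
	if err := json.Unmarshal(raw, &r); err != nil {
		return Result{}, fmt.Errorf("decode result: %w", err)
	}
	return r, nil
}

func main() {
	fmt.Println(imageRef(DefaultThreatDetectionRegistry, DefaultThreatDetectionVersion))

	// Illustrative payload only; not output from a real detection run.
	raw := []byte(`{"prompt_injection":false,"secret_leak":true,"malicious_patch":false,"reasons":["token in patch"]}`)
	r, err := parseResult(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("flagged:", r.Flagged(), r.Reasons)
}
```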
gh-aw should also fetch or vendor
releases/threat-detection-lifecycle.json
and evaluate the pinned DefaultThreatDetectionVersion before pulling or
running the detector container. Active versions run normally. Deprecated versions
should emit a GitHub Actions warning annotation and job summary text that include
the reason, replacement version, dates, advisory URL, urgency, and upgrade
instructions, then continue. Obsolete versions should fail closed before the
detector runs and print the same remediation guidance. Unknown versions should
follow the registry policy, currently fail-closed.
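The lifecycle rules above reduce to a small decision table. The sketch below encodes it under stated assumptions: the lowercase status strings and the `Action` type are hypothetical, since the schema of `releases/threat-detection-lifecycle.json` is not reproduced here, and unknown statuses are mapped to the fail-closed behavior described as the current registry policy.

```go
package main

import "fmt"

// Action is a hypothetical label for what the orchestrator does next.
type Action string

const (
	Run        Action = "run"          // run the detector normally
	WarnAndRun Action = "warn-and-run" // emit warning and summary, then continue
	Block      Action = "block"        // fail closed before the detector runs
)

// decide maps a lifecycle status to an action. The status strings are
// assumptions; anything unrecognized fails closed.
func decide(status string) Action {
	switch status {
	case "active":
		return Run
	case "deprecated":
		return WarnAndRun
	case "obsolete":
		return Block
	default:
		return Block
	}
}

func main() {
	for _, s := range []string{"active", "deprecated", "obsolete", "mystery"} {
		fmt.Printf("%-10s -> %s\n", s, decide(s))
	}
}
```

Keeping the unknown case in the `default` branch means a newly introduced status cannot silently run the detector until the policy is updated.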
See specs/threat-detection-spec.md for the full W3C-style specification.
See CONTRIBUTING.md for development setup and contribution guidelines.
See CODEOWNERS for maintainers.
See SUPPORT.md for help, issue reporting, and support scope.
See CODE_OF_CONDUCT.md.
See SECURITY.md for vulnerability reporting instructions.
See LICENSE for details.