gh-aw-threat-detection

Threat Detection component for GitHub Agentic Workflows. Analyzes AI agent output for security threats including prompt injection, secret leaks, and malicious patches.

Quick Start

Build the CLI and run it against an artifacts directory:

make build
./bin/threat-detect /path/to/artifacts

Run tests locally:

make test

Overview

This tool runs as a standalone binary or container that analyzes artifacts produced by AI agents before safe outputs are permitted. It supports multiple AI engines (Copilot, Claude, Codex) for detection analysis.

Guardrails and Security Considerations

This project is designed to help reduce risk when running AI agent workflows by inspecting generated artifacts before they are accepted as safe output. Detection is advisory and should be combined with defense-in-depth controls such as least-privilege permissions, human review, and repository protections.

Do not treat a "safe" result as a security guarantee. Use the output as one signal in a broader security review process.

Usage

CLI

threat-detect [flags] <artifacts-dir>

Flags:

--engine — AI engine to use (copilot, claude, codex). Default: copilot
--model — Model override for the engine
--prompt-template — Path to custom prompt template
--output — Path to write JSON result (defaults to stdout)
--version — Print version and exit

Exit codes:

0 — Safe (no threats detected)
1 — Threat detected
2 — Infrastructure/configuration error
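A caller that wraps the binary can turn these exit codes into a typed verdict. The following is a minimal sketch, not part of this repository: the `Verdict` type and the helper names are hypothetical, and treating any undocumented exit code as an infrastructure error is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// Verdict mirrors the three documented exit codes of threat-detect.
type Verdict int

const (
	Safe Verdict = iota
	ThreatDetected
	InfraError
)

func (v Verdict) String() string {
	switch v {
	case Safe:
		return "safe"
	case ThreatDetected:
		return "threat detected"
	default:
		return "infrastructure/configuration error"
	}
}

// verdictFromExitCode maps an exit code to a Verdict.
// Assumption: any code other than 0 or 1 is handled like code 2.
func verdictFromExitCode(code int) Verdict {
	switch code {
	case 0:
		return Safe
	case 1:
		return ThreatDetected
	default:
		return InfraError
	}
}

// runDetector executes the binary against an artifacts directory
// and classifies its exit status.
func runDetector(bin, artifactsDir string) (Verdict, error) {
	cmd := exec.Command(bin, artifactsDir)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	err := cmd.Run()
	if err == nil {
		return Safe, nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return verdictFromExitCode(exitErr.ExitCode()), nil
	}
	// The binary could not be started at all.
	return InfraError, fmt.Errorf("starting %s: %w", bin, err)
}

func main() {
	if len(os.Args) < 3 {
		// Without arguments, only show the mapping.
		for _, code := range []int{0, 1, 2} {
			fmt.Printf("%d -> %s\n", code, verdictFromExitCode(code))
		}
		return
	}
	v, err := runDetector(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
	fmt.Println("verdict:", v)
}
```

Run with the binary path and an artifacts directory as arguments, for example `./bin/threat-detect /path/to/artifacts`, to classify a real run.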

Container

docker run --rm \
  -v /path/to/artifacts:/workspace/artifacts \
  ghcr.io/github/gh-aw-threat-detection:latest \
  /workspace/artifacts

The current image is suitable for CLI packaging and smoke testing. Production AI-backed detection requires the selected engine CLI and its authentication to be available in the runtime image or installed by a future image variant.

Input (Artifacts Directory)

<artifacts-dir>/
├── aw-prompts/
│   └── prompt.txt          # Workflow prompt file
├── agent_output.json       # Agent structured output
├── aw-*.patch              # Git format-patch files (optional)
├── aw-*.bundle             # Git bundle files (optional)
└── comment-memory/         # Agent comment memory (optional)
    └── *.md
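A pre-flight check of this layout could look like the sketch below. It is an illustration only: the `Layout` type and function names are hypothetical, and it assumes that the entries not marked optional (`aw-prompts/prompt.txt` and `agent_output.json`) are required. The actual validation lives in pkg/artifacts and may differ.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path"
	"sort"
)

// required lists entries assumed mandatory because the tree
// above does not mark them "(optional)".
var required = []string{"aw-prompts/prompt.txt", "agent_output.json"}

// Layout summarizes what was found in an artifacts directory.
type Layout struct {
	Missing []string // required entries that are absent
	Patches []string // top-level aw-*.patch files
	Bundles []string // top-level aw-*.bundle files
}

// summarize classifies slash-separated paths relative to the
// artifacts directory.
func summarize(names []string) Layout {
	present := make(map[string]bool, len(names))
	var l Layout
	for _, n := range names {
		present[n] = true
		if ok, _ := path.Match("aw-*.patch", n); ok {
			l.Patches = append(l.Patches, n)
		}
		if ok, _ := path.Match("aw-*.bundle", n); ok {
			l.Bundles = append(l.Bundles, n)
		}
	}
	for _, r := range required {
		if !present[r] {
			l.Missing = append(l.Missing, r)
		}
	}
	sort.Strings(l.Patches)
	sort.Strings(l.Bundles)
	return l
}

// listFiles returns every regular file under dir as a
// slash-separated relative path.
func listFiles(dir string) ([]string, error) {
	var names []string
	err := fs.WalkDir(os.DirFS(dir), ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			names = append(names, p)
		}
		return nil
	})
	if err != nil {
		return nil, fmt.Errorf("walking %s: %w", dir, err)
	}
	return names, nil
}

func main() {
	// Sample listing used when no directory is given.
	names := []string{"agent_output.json", "aw-1.patch"}
	if len(os.Args) > 1 {
		var err error
		if names, err = listFiles(os.Args[1]); err != nil {
			fmt.Fprintln(os.Stderr, err)
			return
		}
	}
	l := summarize(names)
	fmt.Printf("missing=%v patches=%v bundles=%v\n", l.Missing, l.Patches, l.Bundles)
}
```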

Output (JSON)

{
  "prompt_injection": false,
  "secret_leak": false,
  "malicious_patch": false,
  "reasons": []
}
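A consumer can decode this document into a small struct. The sketch below is one possible shape under stated assumptions: the `Result` type and `parseResult` helper are hypothetical names, rejecting unknown fields is a strictness choice made here rather than documented behavior, and the reason string in `main` is invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Result mirrors the four documented JSON fields.
type Result struct {
	PromptInjection bool     `json:"prompt_injection"`
	SecretLeak      bool     `json:"secret_leak"`
	MaliciousPatch  bool     `json:"malicious_patch"`
	Reasons         []string `json:"reasons"`
}

// HasThreat reports whether any of the three threat flags is set.
func (r Result) HasThreat() bool {
	return r.PromptInjection || r.SecretLeak || r.MaliciousPatch
}

// parseResult decodes a result document. Unknown fields are
// rejected, which is an assumption of this sketch.
func parseResult(data string) (Result, error) {
	dec := json.NewDecoder(strings.NewReader(data))
	dec.DisallowUnknownFields()
	var r Result
	if err := dec.Decode(&r); err != nil {
		return Result{}, fmt.Errorf("decoding result: %w", err)
	}
	return r, nil
}

func main() {
	// Illustrative input; the reason text is made up.
	doc := `{"prompt_injection": false, "secret_leak": true,
	         "malicious_patch": false, "reasons": ["example reason"]}`
	r, err := parseResult(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("threat:", r.HasThreat(), "reasons:", r.Reasons)
}
```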

Stage Status and Decisions

The extraction staging model is:

Stage 1: standalone repository
Stage 2: containerization
Stage 3: github/gh-aw integration

Stage 1 is functionally complete in this repository: the standalone Go CLI, artifact reader, prompt builder, result parser, engine abstraction, W3C-style specification, unit tests, CI, Dockerfile, and release workflow are all present. The remaining work belongs to Stages 2 and 3: hardening the container runtime for production and integrating with github/gh-aw. No further JavaScript porting is needed in this repository.

Decisions for the unresolved extraction questions:

JavaScript scripts: detection setup and result parsing are implemented in Go here; the old GitHub Actions JavaScript scripts should not be needed once gh-aw switches to the container contract.
Engine CLIs: the detector invokes the selected engine CLI from PATH and forwards the --model value. Stage 2 should either bundle the supported CLIs into the image or publish an image variant that installs them at runtime before invoking threat-detect.
Custom steps: custom threat-detection.steps remain orchestrator-owned. They should run before or after the container in the gh-aw job rather than being passed into this container as arbitrary scripts.
Backward compatibility: gh-aw should pin a specific image tag and may temporarily keep inline detection behind a compatibility flag until the container path is validated.
Ollama/LlamaGuard: keep this as a custom-step pattern unless a dedicated image variant is explicitly required.
Version coupling: use strict, semver-compatible image tags and have gh-aw pin a specific DefaultThreatDetectionVersion, matching the firewall pattern.
Isolation: the detector should run in the standard detection job initially. Running the detector itself inside an additional firewall/isolation layer can be evaluated later.
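The "Engine CLIs" decision above (resolve the engine CLI from PATH and forward the model) can be sketched as follows. This is not the code in pkg/engine. The executable names in `engineBinaries` and the use of a `--model` flag on the engine side are assumptions; the README only names the engines and says the value is forwarded.

```go
package main

import (
	"fmt"
	"os/exec"
)

// engineBinaries maps --engine values to executable names.
// Assumption: each engine CLI is installed under its engine name.
var engineBinaries = map[string]string{
	"copilot": "copilot",
	"claude":  "claude",
	"codex":   "codex",
}

// engineArgs builds the engine argument list, forwarding the model
// only when one was given. The flag name is an assumption.
func engineArgs(model string, rest ...string) []string {
	var args []string
	if model != "" {
		args = append(args, "--model", model)
	}
	return append(args, rest...)
}

// resolveEngine looks the engine CLI up on PATH.
func resolveEngine(engine string) (string, error) {
	bin, ok := engineBinaries[engine]
	if !ok {
		return "", fmt.Errorf("unsupported engine %q", engine)
	}
	p, err := exec.LookPath(bin)
	if err != nil {
		return "", fmt.Errorf("engine CLI %q not found on PATH: %w", bin, err)
	}
	return p, nil
}

func main() {
	p, err := resolveEngine("copilot")
	if err != nil {
		// A missing CLI is a configuration problem, not a threat verdict.
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println(p, engineArgs("example-model"))
}
```

A missing engine CLI surfaces as a configuration error here, which lines up with exit code 2 rather than a threat verdict.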

Private Container Release Setup

The repository can move through Stage 2 while remaining private. GitHub Container Registry supports private packages, and the release workflow publishes ghcr.io/github/gh-aw-threat-detection:<tag> using the automatic GITHUB_TOKEN with the packages: write permission.

Maintainers need to configure the following before the image is consumed by gh-aw:

Keep Actions enabled for this private repository.
Ensure the package created under ghcr.io/github/gh-aw-threat-detection inherits repository visibility or is explicitly private.
Grant the consuming github/gh-aw repository access to the private package, or configure the organization package settings so GITHUB_TOKEN from gh-aw can pull it with packages: read.
Keep the release-publish and release-promote environments if manual approval is desired; otherwise update the environment protection rules in repository settings.
Tag releases with semantic versions such as v1.0.0. The release workflow publishes the version tag; the promote workflow tags the verified digest as latest.

No additional secrets are required for unit tests, make build, make test, or the container smoke test. Engine authentication is only needed when running real AI-backed detection:

| Variable | Required when | Notes |
| --- | --- | --- |
| `GH_AW_COPILOT_TOKEN` | Running `--engine copilot` in an environment that needs explicit token-based Copilot authentication | If the Copilot CLI uses device, browser, or host-provided authentication instead, configure that mechanism before running the container. |
| `ANTHROPIC_API_KEY` | Running `--engine claude` with the Claude CLI | Not used by unit tests. |
| `OPENAI_API_KEY` | Running `--engine codex` with the Codex CLI | Not used by unit tests. |
| `WORKFLOW_NAME` | Optional local/container runs | Included in the generated prompt. |
| `WORKFLOW_DESCRIPTION` | Optional local/container runs | Included in the generated prompt. |
| `CUSTOM_PROMPT` | Optional local/container runs | Appended to the default detection prompt. |
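The three optional prompt variables could be folded into a prompt roughly as sketched below. This is an illustration, not the prompt builder in pkg/detector: the function names, the "Workflow:" and "Description:" labels, and the position of each value are assumptions. The only documented behavior is that the first two are included in the prompt and `CUSTOM_PROMPT` is appended.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// buildPrompt combines a base prompt with the optional variables.
// lookup is injected so the function can be exercised without
// touching the process environment.
func buildPrompt(base string, lookup func(string) string) string {
	var b strings.Builder
	if name := lookup("WORKFLOW_NAME"); name != "" {
		fmt.Fprintf(&b, "Workflow: %s\n", name) // label is hypothetical
	}
	if desc := lookup("WORKFLOW_DESCRIPTION"); desc != "" {
		fmt.Fprintf(&b, "Description: %s\n", desc) // label is hypothetical
	}
	b.WriteString(base)
	if custom := lookup("CUSTOM_PROMPT"); custom != "" {
		// Appended after the default prompt, as the table states.
		b.WriteString("\n\n")
		b.WriteString(custom)
	}
	return b.String()
}

// promptFromEnv reads the variables from the real environment.
func promptFromEnv(base string) string {
	return buildPrompt(base, os.Getenv)
}

func main() {
	fmt.Println(promptFromEnv("Analyze the artifacts for threats."))
}
```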

Development

Prerequisites

Go 1.23+
Docker (for container builds)

AW Smoke Workflows

This repository includes three Agentic Workflows smoke tests:

.github/workflows/smoke-copilot.md
.github/workflows/smoke-claude.md
.github/workflows/smoke-codex.md

Each runs daily and can also be started with workflow_dispatch. The top-level Smoke workflow can be dispatched manually to start all three compiled smoke workflows and their three containerized siblings.

The matching .lock.yml files are the compiled AW workflows. The *-container.lock.yml siblings are generated from those lock files by scripts/create-threat-detection-sibling-workflows.py. Each sibling pulls the ghcr.io/github/gh-aw-threat-detection container, extracts its detector binary, and executes it under the same AWF wrapper used by the generated detection job.

After recompiling the smoke workflows with gh aw compile, regenerate and verify the sibling workflows:

scripts/create-threat-detection-sibling-workflows.py
scripts/create-threat-detection-sibling-workflows.py --check

Configure these Actions secrets to enable all smoke workflows:

| Secret | Required for | Notes |
| --- | --- | --- |
| `COPILOT_GITHUB_TOKEN` | Copilot smoke workflow and base Copilot detection | See Copilot fallback note below. |
| `GH_AW_COPILOT_TOKEN` | Optional Copilot token fallback for the container-detection sibling | Used only if `COPILOT_GITHUB_TOKEN` is not configured. |
| `ANTHROPIC_API_KEY` | Claude smoke workflow and Claude detection | Used by the Claude CLI. |
| `OPENAI_API_KEY` or `CODEX_API_KEY` | Codex smoke workflow and Codex detection | Configure whichever token your Codex CLI setup expects. |
| `GH_AW_GITHUB_TOKEN` | Recommended for GitHub MCP access, safe outputs, and private GHCR pulls | The generated workflows fall back to `GITHUB_TOKEN` where possible. |
| `GH_AW_GITHUB_MCP_SERVER_TOKEN` | Optional GitHub MCP override | Falls back to `GITHUB_TOKEN` in the compiled workflows. |

Copilot fallback note: the base workflow uses only secrets.COPILOT_GITHUB_TOKEN. The container-detection Copilot sibling checks secrets.COPILOT_GITHUB_TOKEN, then secrets.GH_AW_COPILOT_TOKEN, then secrets.GH_AW_GITHUB_TOKEN.
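The sibling workflow's precedence can be expressed as a first-non-empty lookup. The sketch below only illustrates the ordering: the real workflows resolve this from Actions secrets, whereas this example assumes the secrets have been exposed as same-named environment variables, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// copilotTokenOrder is the documented precedence for the
// container-detection Copilot sibling.
var copilotTokenOrder = []string{
	"COPILOT_GITHUB_TOKEN",
	"GH_AW_COPILOT_TOKEN",
	"GH_AW_GITHUB_TOKEN",
}

// resolveCopilotToken returns the name of the first variable in
// copilotTokenOrder with a non-empty value. It returns the name,
// not the value, so callers can log the source without leaking it.
func resolveCopilotToken(lookup func(string) string) (string, error) {
	for _, name := range copilotTokenOrder {
		if lookup(name) != "" {
			return name, nil
		}
	}
	return "", fmt.Errorf("none of %v is set", copilotTokenOrder)
}

// copilotTokenFromEnv applies the lookup to the process environment.
func copilotTokenFromEnv() (string, error) {
	return resolveCopilotToken(os.Getenv)
}

func main() {
	name, err := copilotTokenFromEnv()
	if err != nil {
		fmt.Println("no Copilot token configured:", err)
		return
	}
	fmt.Println("using token from", name)
}
```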

Optional Actions variables:

| Variable | Purpose |
| --- | --- |
| `GH_AW_MODEL_AGENT_COPILOT`, `GH_AW_MODEL_AGENT_CLAUDE`, `GH_AW_MODEL_AGENT_CODEX` | Override the agent model for each smoke workflow. |
| `GH_AW_MODEL_DETECTION_COPILOT`, `GH_AW_MODEL_DETECTION_CLAUDE`, `GH_AW_MODEL_DETECTION_CODEX` | Override the detection model for each engine. |
| `GH_AW_THREAT_DETECTION_IMAGE` | Override the detector image used by the `*-container.lock.yml` siblings. Defaults to `ghcr.io/github/gh-aw-threat-detection:latest`. |

Build

make build

Test

make test

Lint

make lint

Docker

make docker-build

Run the container smoke test:

make docker-smoke

Architecture

cmd/threat-detect/     CLI entry point
pkg/detector/          Core detection logic (prompt building, result parsing)
pkg/engine/            AI engine abstraction (copilot, claude, codex)
pkg/artifacts/         Artifact reading and validation
pkg/detector/prompts/  Embedded AI prompt template
specs/                 W3C-style specification

Integration with gh-aw

After containerization, gh-aw references this component via:

const DefaultThreatDetectionRegistry = "ghcr.io/github/gh-aw-threat-detection"
const DefaultThreatDetectionVersion  = "v1.0.0"

The detection job in compiled workflows uses this container instead of inline AI engine invocation.
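Building the pinned image reference from those two constants might look like the sketch below. The `imageRef` helper and the `strictTag` pattern are hypothetical. Reading "strict" as exactly vMAJOR.MINOR.PATCH, with no pre-release or build suffix, is an assumption of this example.

```go
package main

import (
	"fmt"
	"regexp"
)

const DefaultThreatDetectionRegistry = "ghcr.io/github/gh-aw-threat-detection"
const DefaultThreatDetectionVersion = "v1.0.0"

// strictTag accepts vMAJOR.MINOR.PATCH without leading zeros.
// Assumption: pre-release and build metadata are not allowed.
var strictTag = regexp.MustCompile(`^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$`)

// imageRef joins registry and version into a pullable reference,
// refusing floating tags such as "latest".
func imageRef(registry, version string) (string, error) {
	if !strictTag.MatchString(version) {
		return "", fmt.Errorf("version %q is not a strict vMAJOR.MINOR.PATCH tag", version)
	}
	return registry + ":" + version, nil
}

func main() {
	ref, err := imageRef(DefaultThreatDetectionRegistry, DefaultThreatDetectionVersion)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ref)
}
```

Rejecting `latest` at this point keeps the consumer pinned to a specific version, which is the intent of the version-coupling decision.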

gh-aw should also fetch or vendor releases/threat-detection-lifecycle.json and evaluate the pinned DefaultThreatDetectionVersion before pulling or running the detector container. The outcome depends on the version's lifecycle status:

Active: run normally.
Deprecated: emit a GitHub Actions warning annotation and job summary text that include the reason, replacement version, dates, advisory URL, urgency, and upgrade instructions, then continue.
Obsolete: fail closed before the detector runs and print the same remediation guidance.
Unknown: follow the registry policy, which is currently fail-closed.
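The lifecycle policy described above reduces to a small decision function. The sketch below is hypothetical throughout: the schema of releases/threat-detection-lifecycle.json is not shown in this README, so the `Entry` fields, the status strings, and the version numbers in `main` are assumptions chosen for illustration. Only the `::warning::` workflow command is standard GitHub Actions syntax.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a hypothetical encoding of the lifecycle states.
type Status string

const (
	Active     Status = "active"
	Deprecated Status = "deprecated"
	Obsolete   Status = "obsolete"
)

// Entry is an assumed shape for one lifecycle record.
type Entry struct {
	Status      Status
	Reason      string
	Replacement string
}

// ErrBlocked marks every outcome where the detector must not run.
var ErrBlocked = errors.New("detector version blocked by lifecycle policy")

// evaluate returns a warning annotation for deprecated versions and
// an error for obsolete or unknown ones (fail-closed).
func evaluate(registry map[string]Entry, version string) (warning string, err error) {
	e, known := registry[version]
	if !known {
		return "", fmt.Errorf("%w: unknown version %s", ErrBlocked, version)
	}
	switch e.Status {
	case Active:
		return "", nil
	case Deprecated:
		return fmt.Sprintf("::warning::threat detection %s is deprecated: %s; upgrade to %s",
			version, e.Reason, e.Replacement), nil
	case Obsolete:
		return "", fmt.Errorf("%w: %s is obsolete: %s; upgrade to %s",
			ErrBlocked, version, e.Reason, e.Replacement)
	default:
		return "", fmt.Errorf("%w: unrecognized status %q for %s", ErrBlocked, e.Status, version)
	}
}

func main() {
	// Illustrative registry; versions and reasons are invented.
	registry := map[string]Entry{
		"v1.0.0": {Status: Deprecated, Reason: "example reason", Replacement: "v1.1.0"},
		"v1.1.0": {Status: Active},
	}
	for _, v := range []string{"v1.1.0", "v1.0.0", "v0.9.0"} {
		warning, err := evaluate(registry, v)
		fmt.Printf("%s: warning=%q err=%v\n", v, warning, err)
	}
}
```

A fuller version would also carry the dates, advisory URL, urgency, and upgrade instructions into both the annotation and the job summary.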

Specification

See specs/threat-detection-spec.md for the full W3C-style specification.

Contributing

See CONTRIBUTING.md for development setup and contribution guidelines.

Maintainers

See CODEOWNERS for maintainers.

Support

See SUPPORT.md for help, issue reporting, and support scope.

Code of Conduct

See CODE_OF_CONDUCT.md.

Security

See SECURITY.md for vulnerability reporting instructions.

License

See LICENSE for details.
