After this chapter you can operate APM across a whole organization — not one repo, but a
fleet. You will wire apm audit --ci into branch protection as a
required, unbypassable pull-request check (the same integrity gate from
Chapter 7 and policy gate from
Chapter 9, re-run in CI as defence in depth), stand it up
with microsoft/apm-action, route dependency traffic through an approved
registry proxy (or build for a fully air-gapped network), publish and own an
org apm-policy.yml baseline, and roll it all out with a measured adoption playbook. This is
the capstone: it introduces no new property. Instead it asks whether the four you already know —
Portability,
Reproducibility,
Provenance / security, and above all
Governance — hold for every repo,
enforced and operated at scale. Governance is the star of the chapter: the org’s
third promise is that “every AI package your developers install is governed by org policy before it
touches disk” (Enterprise
overview). One honesty note up front, at apm v0.23.1: the
apm audit --ci baseline is stable, but three surfaces you will meet here are
preview — the --policy flag is experimental, the dedicated
registry (distinct from the proxy) is experimental behind a feature gate, and policy enforcement
is early preview — so pin the CLI (and microsoft/apm-action, latest
v1.10.0) before you lean on any of them as a production gate.
Objective
Concept/Theory
From “can this repo install?” to “can every repo install?”
A single team gets almost all of APM’s value from three files and one habit:
apm.yml, apm.lock.yaml, and apm install. Everything in
Chapters 5–9
answered a repo-sized question: does my project install, reproduce, stay current, stay safe, and
stay within policy? An enterprise asks a different, larger question — the shift this whole
chapter turns on: not “can this repo install?” but “can
every repo install safely and predictably, without the platform team in the loop?”
That reframing surfaces five things the single-team story never forces you to build (Making the case):
- Repeatable rollout: a runbook any team follows, not a hero who hand-holds each repo.
- Policy ownership: a named, protected owner of the org’s apm-policy.yml.
- Audit gates: a required CI check on every pull request, authoritative regardless of what a developer does locally.
- Registry strategy: controlled, mirrored, and (when needed) offline dependency traffic.
- Exception handling: a visible, reviewable way to grant a waiver.
The docs’ worked figure is a mid-to-large org: 50 repositories, 200 developers, five AI coding tools. Without central management a predictable failure set emerges — manual config drifting per repo, no audit trail (“what agent configuration was active at release 4.2.1?” has no answer), version drift between developers and CI, onboarding friction, and ungoverned dependencies: “the same problem regulated industries spent a decade solving for application code, now back in a new form” (Making the case). These are the Chapter 1 pains, multiplied by the repo count.
Two boundaries keep the rest of the chapter honest. First, consuming policy is not owning
policy: a fleet team (including the original pilot) consumes the org policy discovered
from its git remote — authorship lives in <org>/.github behind CODEOWNERS and
branch protection, exactly the org-remote model from
Chapter 1 and
Chapter 9. Second, nothing here is a new
capability: fleet scale is Chapters 6–10 made repeatable, owned, gated, and
measured. The common trap is to think “fleet scale is just the single-repo setup, copied
N times” — but N copies without central ownership simply reproduce the drift problem at
higher cost. The missing pieces are ownership, gates, a registry strategy, and exceptions, none of which
a single repo ever needs.
In APM
The fleet gate: apm audit --ci
The authoritative enforcer at fleet scale is one command run on every pull request:
apm audit --ci. It is not a new engine — it is the same gate you already
know, re-run in CI as defence in depth. On a PR it runs the eight baseline
lockfile checks (lockfile-exists, ref-consistency,
deployed-files-present, no-orphaned-packages,
skill-subset-consistency, config-consistency, content-integrity,
includes-consent), the install-replay drift check, and — if
an apm-policy.yml is discovered from the git remote — the org
Governance checks. The exit code is 0 when the run is
clean and 1 on any violation
(Enforce in CI).
Why re-run a gate the developer already passed on their machine? Because the install-time gate from
Chapter 8 and
Chapter 9 protects only the developer’s own disk, and
it is locally bypassable: a developer can pass --no-policy,
--force, or set APM_POLICY_DISABLE=1. CI re-runs the identical checks on the
pull request itself — and “--no-policy does not work here — CI ignores the
local bypass flag” (Enforce
in CI). Wired into GitHub Rulesets as a required status check, a violating PR simply
cannot merge (GitHub
rulesets). That is what makes an org rule actually authoritative on every merge across every
repo.
| Flag | Job | Fleet note |
|---|---|---|
| --policy <src> | Explicit policy ref: org, owner/repo, https://…, or local path | [experimental]. Omit it and APM auto-discovers from the git remote, like apm install. |
| --no-cache | Force a fresh policy fetch | Recommended in CI: a cached policy file must not mask a same-day org update. |
| --no-policy | Skip policy discovery (baseline + drift only) | Not a bypass of the org gate: CI wires the unflagged command; baseline is never bypassable. |
| --no-fail-fast | Run every check even after one fails | Use for full reports and drift sweeps; the default stops at the first failure. |
| -f sarif\|json, -o <path> | Structured output; write to a file (format inferred from extension) | SARIF feeds Code Scanning. Markdown is not supported in --ci mode. |
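As a rough sketch of how the flags in the table combine, the function below wraps two invocations: the required gate and a full-report sweep. The function name apm_audit_mode and its gate/report mode names are hypothetical; only the apm audit flags themselves come from the table.

```bash
#!/usr/bin/env bash
# Hypothetical helper: the function and mode names are illustrative assumptions.
# Only the `apm audit` flags (--ci, --no-cache, --no-fail-fast, -o) come from the table.

apm_audit_mode() {
  local mode="${1:-gate}"
  local report="${2:-apm-audit.sarif}"   # format is inferred from the .sarif extension

  case "$mode" in
    gate)
      # Required PR check: fresh policy fetch, default fail-fast behaviour.
      apm audit --ci --no-cache
      ;;
    report)
      # Drift sweep: run every check even after a failure and write SARIF to a file.
      apm audit --ci --no-cache --no-fail-fast -o "$report"
      ;;
    *)
      echo "usage: apm_audit_mode gate|report [report-path]" >&2
      return 2
      ;;
  esac
}
```

A PR job would call apm_audit_mode gate, while a scheduled sweep could call apm_audit_mode report sweep.sarif to collect every finding in one file.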
Two properties of the gate are worth holding onto. The eight baseline checks plus drift always
run and are never bypassable — that is the tie-back to
Chapter 6:
Reproducibility is enforced in CI even when
Governance is off. And there is no per-PR
override flag, by design. Exceptions are visible or they do not exist: you either amend
<org>/.github/apm-policy.yml through normal review (allow-list the package, raise a
cap) or lower enforcement from block to warn for that scope
— findings still appear in SARIF, they just stop failing the job. “Bypass must be visible in
the policy file’s history”
(Enforce in CI).
That is the exception handling the concept promised, made concrete.
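One way to review that history, assuming a local clone of the <org>/.github repository, is a plain git log scoped to the policy file. The function name and the chosen output format are illustrative; the git options used are standard.

```bash
#!/usr/bin/env bash
# Hypothetical helper: lists every commit that touched the org policy file,
# so each exception (an allow-list entry, a block -> warn change) has a visible author and date.

policy_history() {
  local checkout="${1:-.}"   # path to a clone of <org>/.github (assumed layout)
  git -C "$checkout" log --date=short --format='%h %ad %an %s' -- apm-policy.yml
}
```

Each output line is one reviewed change to the baseline, which is the audit trail a per-PR override flag would otherwise erase.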
microsoft/apm-action: the turnkey CI path
microsoft/apm-action is the convenience wrapper that installs the CLI, runs
apm install, and can emit SARIF for Code Scanning
(microsoft/apm-action). It is
not the enforcer — the gate is apm audit --ci; the action just stands it up.
Pin the action to the major tag @v1 and pin the CLI with apm-version: for
reproducible runs. A GitHub Action cannot be executed locally, so every workflow below is
SKIPPED-needs-network — documented, not run — while the
underlying apm audit --ci gate it invokes is verified (see the worked example).
backend/examples/ch11/workflows/apm-audit.yml — the minimal required-check gate.
Made a required status check via GitHub Rulesets, a violating PR cannot merge.
needs network · apm v0.23.1
# .github/workflows/apm-audit.yml -- SKIPPED-needs-network (the `apm audit --ci` it runs IS verified)
name: APM audit
on:
  pull_request:
    paths: ['apm.yml', 'apm.lock.yaml', '.apm/**', '.github/**', '.claude/**', '.cursor/**']
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: microsoft/apm-action@v1   # pin to the major tag; installs CLI + runs `apm install`
        with:
          apm-version: '0.23.1'         # pin the CLI for reproducible CI
      - run: apm audit --ci --no-cache  # verified gate: exit 0 clean / 1 on violation
        env:
          GITHUB_APM_PAT: ${{ secrets.APM_PAT }}   # same-org private repos work with zero config
For findings to appear inline on the PR diff and in the repo’s Security tab, emit SARIF and upload
it. The if: always() step is load-bearing — SARIF must upload even when the audit
exits 1, or the failing run produces no Code Scanning entry:
backend/examples/ch11/workflows/apm-audit-sarif.yml — SARIF for Code Scanning.
needs network · apm v0.23.1
# .github/workflows/apm-audit-sarif.yml -- SKIPPED-needs-network (documented)
jobs:
  audit:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write   # required to upload SARIF to Code Scanning
    steps:
      - uses: actions/checkout@v4
      - uses: microsoft/apm-action@v1
        with: { apm-version: '0.23.1' }
      - name: Audit
        run: apm audit --ci --no-cache -o apm-audit.sarif   # format inferred from the .sarif extension
        env: { GITHUB_APM_PAT: '${{ secrets.APM_PAT }}' }
      - name: Upload SARIF
        if: always()             # upload even on exit 1, or a failing run has no alert
        uses: github/codeql-action/upload-sarif@v3
        with: { sarif_file: apm-audit.sarif, category: apm-audit }
Two refinements are worth knowing. The default action runs apm install first, which
overwrites managed files and can hide bytes that were hand-edited after the last install;
setup-only: true puts the CLI on PATH only, so the committed bytes are audited as ground
truth (content-integrity still verifies each file’s SHA-256 against the lockfile). And
because apm audit --ci is a plain CLI call, the same gate is vendor-neutral
— it runs in Azure Pipelines, GitLab CI, and Jenkins, not only GitHub Actions
(CI/CD integration);
microsoft/apm-action is just the GitHub-shaped convenience.
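Outside GitHub Actions, the same two ideas (a plain CLI gate, and a report that is published even when the gate fails) might be sketched as one portable shell function. It assumes apm is already on PATH in the runner image; publish_report is a hypothetical hook standing in for whatever upload step your CI system provides.

```bash
#!/usr/bin/env bash
# Sketch of a CI-agnostic gate. Assumptions (not from the APM docs): the function
# names, the publish_report hook, and that an earlier step put `apm` on PATH.

publish_report() {
  # Hypothetical placeholder: replace with your CI system's artifact/upload step.
  echo "would publish $1"
}

run_apm_gate() {
  local report="${1:-apm-audit.sarif}"
  local status=0

  if ! command -v apm >/dev/null 2>&1; then
    echo "apm CLI not found on PATH" >&2
    return 127
  fi

  # Record the exit status instead of aborting, so the report step still runs.
  apm audit --ci --no-cache -o "$report" || status=$?

  # Shell analogue of `if: always()`: publish whenever a report file exists.
  if [ -f "$report" ]; then
    publish_report "$report"
  fi

  return "$status"   # 0 clean, 1 on violation: the CI job fails on non-zero
}
```

Returning the original status last is the design point: the report is published first, and the job still fails when the audit does.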
Registry strategy: proxy and air-gapped
Because “enterprise networks rarely allow agents to reach github.com directly,”
APM routes dependency traffic through composable, environment-variable controls
(Registry
proxy). The single most important verified fact: these are environment variables, not
apm config keys. At v0.23.1 apm config persists only
auto-integrate, temp-dir, and mcp-registry-url — so at fleet
scale you pin the proxy settings in CI secrets, dev-container env, and shell profiles, “in the same
place you pin Python and APM versions,” not via apm config set.
| Need | Knob | Note |
|---|---|---|
| Allow outbound traffic at the firewall | HTTPS_PROXY / HTTP_PROXY / NO_PROXY | Standard forward-proxy vars: “if git clone works through the proxy, apm install works too.” |
| Mirror every archive for audit + replay | PROXY_REGISTRY_URL | Rewrites every GitHub-hosted download to fetch via the mirror (e.g. an Artifactory GitHub remote). |
| Authenticate to the mirror | PROXY_REGISTRY_TOKEN | Bearer token on proxy requests; independent of GITHUB_APM_PAT. |
| Refuse direct-VCS fallback (mandatory + auditable) | PROXY_REGISTRY_ONLY=1 | APM refuses to fall back to github.com; the lockfile records a registry_prefix, and replay aborts on a directly-pinned entry until re-resolved. |
| Silence the plaintext-token warning on http:// | PROXY_REGISTRY_ALLOW_HTTP=1 | Use only inside an isolated network. (Deprecated ARTIFACTORY_* aliases still work with a warning.) |
| Fully air-gapped CI (no egress at all) | apm pack on a connected host → restore offline | Chapter 10’s bundle, reused as an air-gapped delivery mechanism. |
The reproducibility tie-back matters here more than anywhere: the proxy is not the trust anchor
— the lockfile is. “Every install verifies the content_hash recorded in
apm.lock.yaml regardless of where the bytes came from. A tampered proxy that rewrites archive
contents is caught by the lockfile guard, not the cache”
(Registry proxy).
So routing through Artifactory does not weaken
Reproducibility; if anything
PROXY_REGISTRY_ONLY=1 makes the proxy path mandatory and auditable. Be honest about
coverage, though: the proxy covers apm install for GitHub-hosted deps and marketplace
fetches, but not Azure DevOps deps, not MCP servers (a separate
registry), and not the apm-policy.yml fetch (which uses the GitHub API
directly). And do not confuse the proxy with the dedicated registry: the proxy fronts an upstream
git host and is stable; the dedicated registry is a separate, additive package source with no git upstream
and is experimental (v0.2 API). APM’s distribution is git-based
today — there is no npm-style central registry.
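Because these knobs live in the environment and not in apm config, a misconfigured runner is easy to miss. A preflight along the following lines could catch the obvious cases before apm install runs. The variable names come from the table; the function name and its two rules are assumptions, and the second rule is deliberately stricter than APM's own warning.

```bash
#!/usr/bin/env bash
# Hypothetical preflight for the proxy environment. The variable names come from
# the table above; the function name and its checks are illustrative assumptions.

check_proxy_env() {
  local problems=0

  # Proxy-only mode makes no sense without a mirror to route through.
  if [ "${PROXY_REGISTRY_ONLY:-}" = "1" ] && [ -z "${PROXY_REGISTRY_URL:-}" ]; then
    echo "error: PROXY_REGISTRY_ONLY=1 is set but PROXY_REGISTRY_URL is empty" >&2
    problems=$((problems + 1))
  fi

  # A bearer token sent to an http:// mirror travels in plaintext.
  case "${PROXY_REGISTRY_URL:-}" in
    http://*)
      if [ -n "${PROXY_REGISTRY_TOKEN:-}" ] && [ "${PROXY_REGISTRY_ALLOW_HTTP:-}" != "1" ]; then
        echo "error: token would be sent over http:// without PROXY_REGISTRY_ALLOW_HTTP=1" >&2
        problems=$((problems + 1))
      fi
      ;;
  esac

  # Succeed only when no problem was recorded.
  [ "$problems" -eq 0 ]
}
```

Run as check_proxy_env && apm install, it stops a job whose proxy settings are incomplete before any dependency traffic leaves the runner.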
Org policy at fleet scale
The mechanism of org policy is already fully covered in
Chapter 9 — the schema, the warn
→ block dial, tighten-only inheritance. Fleet scale changes only where it lives
and who owns it: the authoritative policy sits in <org>/.github/apm-policy.yml
behind CODEOWNERS and branch protection, and every consuming repo (the pilot included) only
consumes it, discovered from its git remote. Past the pilot, the org file is
enforcement: block with fetch_failure: block, so a repo whose policy cannot be
fetched fails closed rather than silently failing open. Two parts of this need
infrastructure a reader’s sandbox will not have: org-remote
discovery and the tighten-only extends: merge. Both are
SKIPPED-needs-network here (recall from Chapter 9 that a local-file
extends: does not merge a parent; real inheritance needs an org /
owner/repo / https:// ref). You author and test the file locally, then land it in
<org>/.github.
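Before landing the file, a crude local check can confirm that the baseline really is set to fail closed. The sketch below is a plain text match, not a YAML parser, and the function name is hypothetical; it only looks for the two top-level settings named above.

```bash
#!/usr/bin/env bash
# Hypothetical lint: text-matches the two fail-closed settings in a policy file.
# It does not parse YAML, so treat it as a quick sanity check only.

policy_fails_closed() {
  local file="$1"

  if [ ! -f "$file" ]; then
    echo "no such policy file: $file" >&2
    return 2
  fi

  # Match `key: block` at the start of a line, allowing a trailing comment.
  grep -Eq '^enforcement:[[:space:]]*block([[:space:]]|$)' "$file" \
    && grep -Eq '^fetch_failure:[[:space:]]*block([[:space:]]|$)' "$file"
}
```

It returns 0 only when both settings are block, 1 when either is missing or set to another value, and 2 when the file does not exist.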
When to use / pitfalls
Roll out in phases, measure by leading indicators
Rolling APM out to a fleet is a staged program, not a switch. The official adoption playbook is five phases, each with a single owner, a deliverable, and a gate to clear before advancing (Adoption playbook). The outline’s four-word mnemonic — pilot → measure → standardize → gate — maps directly onto them:
| Phase | Owner | Gate to advance |
|---|---|---|
| Discover | Platform team | Shadow apm install --dry-run + apm audit on representative repos; answer “what breaks if we turn this on tomorrow, and for whom?” |
| Pilot | One product team + platform | Manifest, lockfile, CI audit, and policy in warn; two consecutive weeks of clean pilot CI with every warning triaged. |
| Harden | Security + platform | Flip warn → block, add the registry proxy, stand up marketplaces; a fresh repo installs against org policy + proxy with no manual help. |
| Scale | Product teams (self-service) | Platform is no longer in the critical path; new repos onboard from a checklist. |
| Sustain | A named on-call | Steady state: weekly drift triage, monthly lockfile review, quarterly marketplace refresh. |
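The Discover-phase shadow run could be scripted roughly as follows: visit each representative repo, run the two commands from the table, and record the outcome without stopping at the first failure. The function name, the directory arguments, and the "repo pass|fail" output format are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical Discover-phase sweep. Only `apm install --dry-run` and `apm audit`
# come from the playbook table; the loop and output format are assumptions.

shadow_sweep() {
  local repo result
  for repo in "$@"; do
    if [ ! -d "$repo" ]; then
      echo "$repo skipped"
      continue
    fi
    # Run in a subshell so the `cd` never leaks into the caller's directory.
    if (cd "$repo" && apm install --dry-run >/dev/null 2>&1 && apm audit >/dev/null 2>&1); then
      result=pass
    else
      result=fail
    fi
    echo "$repo $result"
  done
}
```

The summary answers "for whom does this break?" at a glance; dropping the /dev/null redirects shows the individual findings for a failing repo.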
Success is measured by leading indicators, not package counts. The docs are explicit
that “measuring apm.yml count and nothing else” is a vanity
metric — a repo with a manifest but a failing audit or rising drift is not adopted, it is
at risk. Watch audit pass rate, the drift trend (findings closed vs. opened),
and marketplace uptake instead
(Adoption
playbook).
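To make the first indicator concrete, audit pass rate can be computed from any per-repo results list. This sketch assumes a hypothetical two-column format ("repo pass" or "repo fail", one per line); neither the format nor the function name comes from APM.

```bash
#!/usr/bin/env bash
# Hypothetical metric helper: reads "repo pass|fail" lines from files or stdin.
# Lines with any other status (for example "skipped") are ignored.

audit_pass_rate() {
  awk '
    NF >= 2 && ($2 == "pass" || $2 == "fail") {
      total++
      if ($2 == "pass") pass++
    }
    END {
      if (total == 0) { print "no results"; exit 1 }
      printf "%d/%d repos passing (%d%%)\n", pass, total, int(pass * 100 / total)
    }
  ' "$@"
}
```

Recorded week over week, the same figure becomes a trend instead of a snapshot, which is what separates a leading indicator from a manifest count.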
Worked example
Meridian’s four moves, in order. Only the first is executable in a reader’s sandbox —
the CI gate itself — so it is shown running against a local project (offline, one public package,
no tokens). The proxy, the workflow, and the org-policy merge each need infrastructure and are marked
SKIPPED-needs-network.
Move 1 — the required gate (RUNNABLE offline)
A clean project — one pinned public dependency plus its committed lockfile — passes
the gate. The clean run is the eight baseline checks plus the drift check: nine checks,
exit 0. (The Could not determine org… line is benign: a scratch
directory has no git remote, so org auto-discovery is skipped.)
apm audit --ci on the clean project — nine checks pass, exit 0. This is the
gate the required PR check runs. Replays from cache and scans locally, so it runs offline. Transcript
abbreviated for space.
apm v0.23.1
$ apm audit --ci
[!] Could not determine org from git remote; enforcement skipped (set policy.fetch_failure_default=block in apm.yml to fail closed)
[>] Replaying install (cache-only)... [+] Replayed 2 package(s) [+] No drift detected
[>] APM Policy Compliance
│ [+] │ lockfile-exists │ Lockfile present │
│ [+] │ ref-consistency │ All dependency refs match lockfile │
│ [+] │ deployed-files-present │ All deployed files present on disk │
│ [+] │ no-orphaned-packages │ No orphaned packages in lockfile │
│ [+] │ skill-subset-consistency │ Skill subset selections match lockfile │
│ [+] │ config-consistency │ No MCP configs to check │
│ [+] │ content-integrity │ No critical hidden Unicode or hash drift detected │
│ [+] │ includes-consent │ No local content deployed -- includes … skipped │
│ [+] │ drift │ no drift detected against lockfile │
[*] All 9 check(s) passed # exit 0 -- 8 baseline checks + drift
Now the case that makes the required check authoritative: a block policy
(enforcement: block, require_pinned_constraint: true) against an
unpinned direct dependency on a bare #main branch. The
dependency-pinned-constraint check fails and the gate exits 1 — a
required PR check would block the merge, and no local --no-policy can rescue it in CI:
apm audit --ci --policy ./pol-block.yml when a direct dependency tracks a bare branch
— the check fails, exit 1. Runs offline against a local project + policy file. Transcript
abbreviated for space.
apm v0.23.1
$ apm audit --ci --policy ./pol-block.yml # enforcement: block, require_pinned_constraint: true
[>] Replaying install (cache-only)... [+] No drift detected
... # baseline + policy checks run
│ │ dependency-pinned-constraint │ 1 dependency(ies) use unbounded constraints
│ │ │ (hint: pin to a semver range, literal tag, or SHA) │
│ │ │ - microsoft/apm-sample-package: bare branch 'main' tracks a moving tip │
[x] 1 of 18 check(s) failed # exit 1 -- the required check blocks the PR
The same block policy against a pinned direct dep — even one that pulls an unpinned
transitive — passes, because require_pinned_constraint is direct-only. Note
the higher check count: with no failure, fail-fast never trips, so every check is enumerated:
apm audit --ci --policy ./pol-block.yml with the direct dependency pinned (#v1.0.0): the unpinned transitive does
not trip the direct-only rule; all checks pass, exit 0. Runs offline.
apm v0.23.1
$ apm audit --ci --policy ./pol-block.yml # DIRECT dep pinned to #v1.0.0; a transitive dep is unpinned
│ [+] │ dependency-pinned-constraint │ All dependencies use pinned constraints │
...
[*] All 29 check(s) passed # exit 0 -- direct-only rule; count higher because nothing fails
# (fail-fast never trips; use --no-fail-fast for a full report)
Move 2 — make it required in CI (SKIPPED-needs-network)
Meridian stands the verified gate up on every PR in all three product groups with
microsoft/apm-action, then makes the job a required status check via GitHub
Rulesets so a violating PR cannot merge. The workflow is the one shown earlier
(backend/examples/ch11/workflows/apm-audit.yml); the SARIF variant surfaces each finding on
the PR diff. Both are documented, not run — but the apm audit --ci they invoke is the
command proven in Move 1.
Move 3 — route traffic through the approved proxy (SKIPPED-needs-network)
Meridian’s security org already mandates Artifactory for npm and PyPI, so APM joins the same
operating model. The platform team pins these environment variables in CI secrets and the shared dev
container — not in apm config — so the whole fleet resolves identically:
Registry proxy settings: environment variables, not apm config keys. Needs an Artifactory / private host to run.
needs network · apm v0.23.1
# SKIPPED-needs-network: pin in CI secrets / dev-container env, not `apm config`.
export HTTPS_PROXY="http://proxy.meridian.example:8080"
export PROXY_REGISTRY_URL="https://artifactory.meridian.example/artifactory/github-remote"
export PROXY_REGISTRY_TOKEN="$ARTIFACTORY_TOKEN" # bearer token; independent of GITHUB_APM_PAT
export PROXY_REGISTRY_ONLY=1 # refuse direct github.com fallback (mandatory + auditable)
apm install # every archive fetched via the mirror;
# content_hash from apm.lock.yaml still verified (Ch6)
A fully air-gapped group would instead receive a pre-built bundle: apm pack on a connected
host (Chapter 10), restored offline. Either way, integrity is
anchored to the lockfile’s content_hash, so the mirror is a routing and audit
convenience, never a new trust anchor.
Move 4 — publish and own the org baseline (SKIPPED-needs-network)
Finally the platform team lands the org apm-policy.yml in meridian-finance/.github,
behind branch protection. This is the Chapter 9 schema,
relocated and set to fail closed; the three product-group repos consume it from their git remote.
Authoring and testing happen locally; the org-remote discovery and tighten-only extends:
merge that make it fleet-wide need infrastructure, so they are SKIPPED-needs-network:
meridian-finance/.github/apm-policy.yml — the fleet baseline, owned behind CODEOWNERS +
branch protection; consuming repos only consume it. Org-remote discovery + extends: merge
need infrastructure.
needs network · apm v0.23.1
# meridian-finance/.github/apm-policy.yml -- SKIPPED-needs-network (org-remote discovery + extends: merge)
name: meridian-org-baseline
version: "1.0.0"
enforcement: block                  # past the pilot: fail closed on violations
fetch_failure: block                # if the policy can't be fetched, fail closed (do not fail-open)
dependencies:
  require_pinned_constraint: true   # fires on a bare-branch DIRECT dep (Move 1) -- direct-only
  allow:                            # only these sources (deny still wins)
    - meridian-finance/**
    - microsoft/**
    - github/awesome-copilot/**
  deny:
    - sketchy-org/**
compilation:
  target:
    allow: [copilot, claude, cursor]   # target rules live HERE, not a top-level `targets:` (Ch9)
With the baseline in warn first, Meridian watches Code Scanning across the three groups for
two sprints, remediates the top offenders, then flips to block — the same
measured rollout as Chapter 9, now fleet-wide, with the
pilot repo as the canary. Their status report to leadership is audit pass rate and drift
trend, not “repos with a manifest.”
Recap & next
Recap
- The question changes at scale. From “can this repo install?” to “can every repo install safely and predictably?” A fleet needs five things one team never forces: repeatable rollout, policy ownership, audit gates, a registry strategy, and exception handling. This is the capstone for Governance — enforced and operated across the org.
- The fleet gate is apm audit --ci. The same integrity gate (Chapter 7) and policy gate (Chapter 9), re-run in CI as defence in depth: eight baseline checks + drift (+ discovered policy), exit 0/1. Made required and unbypassable via GitHub Rulesets; local --no-policy does not work in CI. Stand it up with microsoft/apm-action@v1; the gate itself is vendor-neutral.
- Registry strategy is environment-variable driven. HTTPS_PROXY + PROXY_REGISTRY_URL + PROXY_REGISTRY_ONLY=1 route and mirror traffic; apm pack serves air-gapped networks. Integrity stays anchored to the lockfile content_hash (Reproducibility), not the proxy: the proxy is never the trust anchor.
- Adoption is change management. Discover → Pilot → Harden → Scale → Sustain, each with an owner, a deliverable, and a gate. Measure by leading indicators (audit pass rate, drift trend, marketplace uptake, setup time), not by manifest count, which is a vanity metric. Carrot before stick; a named on-call, never “the platform team.”
- Mind the preview edges at apm v0.23.1. require_pinned_constraint is direct-only; the baseline is stable and unbypassable but --policy is experimental; the proxy knobs are env-vars, not apm config; and never claim a locally-run Action result.
Next
You have now consumed, locked, maintained, secured, governed, produced, and operated agent context across a fleet — the full arc from one repo to an organization. Chapter 12 — The Landscape & What’s Next steps back to place APM among the standards it builds on (AGENTS.md, Agent Skills, MCP, OpenAPM v0.1) and the roadmap ahead — including the dedicated registry API this chapter flagged as experimental — so you can decide what to adopt, watch, or build around.