Velocity scorecard.
Move from "we shipped the cadence" to "the cadence shipped value." Measure decision latency, cycle compression, AI workflow share, retro yield.
Sample output — Velocity scorecard · quarterly.
How it actually goes in.
Pull decision latency baseline.
From the decision log: average time-to-decide per matrix-tracked category. Aggregate. Report average and longest-tail outliers per category.
Capture cycle compression baseline.
Sample of three to five recent operating cycles. Track elapsed time from decision-made to outcome-verified. Aggregate as median cycle time.
Compute AI workflow share baseline.
For each deployed AI workflow, volume handled by AI vs. manual. Express as ratio. Zero if no workflows deployed yet — scorecard tracks growth from there.
Score retro yield baseline.
From the retro decision logs: fraction of surfaced issues producing tactical or structural change. Compute yield as (changes / total surfaced).
Build scorecard + integrate review cadence.
Four-dimension template. Monthly light check (5 min in monthly recalibration). Quarterly deep review (30 min). One-page board version in the quarterly board package.
What good looks like, ninety days in.
Decision latency: year-over-year drop in average time-to-decide for matrix-tracked decision categories in compounding operations.
Cycle compression: year-over-year drop in elapsed time from decision to outcome verification across primary operating cycles.
Retro yield: fraction of surfaced issues producing tactical or structural change. Climbs from install baseline to maturity across the first year.
Five days to capture baselines and build scorecard template. Quarterly review thereafter.
Why this kit is worth installing.
The Diagnostic Most Operations Stop Running
There is a recognizable pattern in operations that have done the foundation work well. They install the cadence. They install the matrix. They install the dashboards. They run the OKR tree. They survive their first crisis with the system intact. They cross the 12-month mark with a healthier operation than they had before.
Then they stop measuring whether the system is still improving.
The composite Ops Check score plateaus at the post-install level. The category breakdowns hold. The team is operating well. The leadership team has absorbed the system as the new normal. From inside the operation, everything looks fine.
What's happening underneath the plateau is invisible without the Velocity Scorecard. The operating system is holding but not compounding. Decision latency stopped improving 6 months ago. Cycle compression flattened in Q2. AI workflow share is the same as it was at install. Retro yield hit 50% and stopped climbing. The plateau is real and structurally costly because the operations that compound past the plateau are pulling away from the ones that don't.
The Velocity Scorecard is the long-cycle measurement that distinguishes operating-system maintenance from operating-system compounding. This essay covers why the distinction matters, what the four dimensions measure, and what to do when the scorecard reveals plateau. The kit guide covers the structural mechanics; this is the operator narrative.
Why Composite Scores Aren't Enough
The Ops Check composite score is the right diagnostic for operations in install or recovery mode. It surfaces the structural gaps and ranks them by cost of delay. It produces the install sequence that closes the gaps.
What the composite score doesn't surface is whether the operation is improving once the gaps are closed. Operations at +1.0 composite can be improving (moving toward +1.5) or plateauing (holding at +1.0 indefinitely) or drifting (heading back toward +0.5). The composite doesn't distinguish.
The Velocity Scorecard distinguishes by measuring the operating system's compounding rate rather than its absolute level. Decision latency dropping quarter-over-quarter means the system is compounding. Decision latency flat means the system is holding. Decision latency rising means the system is drifting.
The same logic applies to the other three dimensions. The scorecard's value is in surfacing the rate of change, not the absolute level. Operations that don't run this measurement are managing to a static snapshot rather than to a trajectory.
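As a rough illustration of reading rate of change rather than level, the sketch below classifies one quarter-over-quarter move for a lower-is-better dimension such as decision latency. The function name `classify_trend` and the 5% noise band are assumptions made for the example, not figures from the kit.

```python
def classify_trend(previous: float, current: float, tolerance: float = 0.05) -> str:
    """Classify quarter-over-quarter movement for a lower-is-better metric.

    `tolerance` is an assumed noise band, not a value taken from the kit.
    """
    if previous <= 0:
        raise ValueError("previous must be positive")
    change = (current - previous) / previous  # negative means the metric dropped
    if change < -tolerance:
        return "compounding"
    if change > tolerance:
        return "drifting"
    return "holding"


print(classify_trend(14, 10))  # compounding
```

For a higher-is-better dimension such as retro yield, the same comparison applies with the sign flipped.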
The Four Dimensions
The scorecard measures four specific things, each on a quarterly cycle.
Decision latency. Average time-to-decide for decision classes named in the Decision Rights Matrix. Measured by category. The target is reduction quarter-over-quarter — decisions that took 14 days last quarter should take 10 days this quarter as the matrix and cadence muscle compound.
The measurement: for each major decision category, capture the time between "decision was first surfaced" and "decision was committed and documented in the decision record." Aggregate by category. Report the average and the longest-tail outliers.
Operations with healthy compounding see decision latency drop 25-40% over the first year post-install. Operations at plateau see latency hold within 10% of the original baseline. Operations drifting backward see latency rise 15-30% as the cadence loosens.
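The latency measurement can be sketched in a few lines. This assumes each decision-log entry can be reduced to a `(category, surfaced, committed)` tuple, and it treats "longest-tail outliers" as the `top_n` longest decisions per category; both the record shape and that reading of "outliers" are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def latency_by_category(records, top_n=2):
    """Return average and longest time-to-decide (in days) per category.

    records: iterable of (category, surfaced: date, committed: date).
    """
    days = defaultdict(list)
    for category, surfaced, committed in records:
        days[category].append((committed - surfaced).days)
    return {
        category: {
            "average_days": mean(values),
            "longest_days": sorted(values, reverse=True)[:top_n],
        }
        for category, values in days.items()
    }


# Hypothetical decision-log entries.
log = [
    ("pricing", date(2024, 1, 2), date(2024, 1, 12)),  # 10 days
    ("pricing", date(2024, 2, 1), date(2024, 2, 21)),  # 20 days
    ("hiring", date(2024, 1, 8), date(2024, 1, 15)),   # 7 days
]
report = latency_by_category(log)
```

Running the same function on each quarter's log gives the per-category series that the scorecard trends.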
Cycle compression. Time required to run a complete operating cycle — a working session that produces decisions, has them executed during the week, and tracks outcomes by the next session. The target is compression: cycles that took two weeks to complete in Q1 should complete in 10 days by Q3.
The measurement: for a sample of three to five operating cycles per quarter, track the elapsed time from decision-made to outcome-verified. Aggregate. Trend over time.
Healthy operations compress 20-30% over the first year. The compression compounds because shorter cycles enable more decisions per quarter, which compounds the rate of operating learning.
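A minimal sketch of the compression calculation, assuming the sampled cycles are aggregated as a median (as in the baseline step) and compression is expressed as the fractional drop against an earlier quarter. The function name and sample values are hypothetical.

```python
from statistics import median


def compression(baseline_days, current_days):
    """Fractional reduction in median cycle time (positive means faster)."""
    baseline = median(baseline_days)
    current = median(current_days)
    if baseline <= 0:
        raise ValueError("baseline median must be positive")
    return (baseline - current) / baseline


# Hypothetical samples: elapsed days from decision-made to outcome-verified.
q1_cycles = [14, 15, 13, 14]
q3_cycles = [10, 11, 9, 10]
q1_to_q3 = compression(q1_cycles, q3_cycles)  # 4 / 14, roughly 0.29
```

The median keeps one unusually slow cycle in a three-to-five-cycle sample from dominating the quarter's number.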
AI workflow share. The ratio of operating work routed through AI workflows vs. operating work routed manually. The target depends on which workflows have been installed; the right ratio for one operation is wrong for another. What matters is the trend — the share should grow as the AI install matures.
The measurement: for the workflows where AI is deployed, capture the volume of work the AI handled vs. the volume that required human-only handling. Express as a ratio. Trend over time.
Operations that have run the Three-Workflow AI Install typically see this share grow from 0% pre-install to 15-25% by year one post-install. The growth indicates the AI workflows are absorbing the routine work they were designed for; flat AI share post-install indicates the workflows aren't being trusted or aren't being maintained.
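The share calculation is simple enough to sketch. This version pools volume across all deployed workflows into one ratio, which is an assumption; the kit may intend a per-workflow ratio instead. The function name and volumes are hypothetical.

```python
def ai_workflow_share(workflows):
    """Share of work handled by AI across deployed workflows.

    workflows: list of (ai_volume, manual_volume) pairs, one per workflow.
    Returns 0.0 when no workflows are deployed yet.
    """
    ai_total = sum(ai for ai, _ in workflows)
    total = sum(ai + manual for ai, manual in workflows)
    return ai_total / total if total else 0.0


# Hypothetical quarter: two deployed workflows.
share = ai_workflow_share([(120, 480), (30, 370)])  # 150 of 1000 items
```

Returning zero for an empty list matches the baseline rule that an operation with no AI deployed starts at zero and tracks growth from there.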
Retro yield. Fraction of issues surfaced in weekly retros that produced tactical or structural change, as opposed to issues that remained unresolved. The target is rising yield: early in the install, retro yield may be 30-40% (many issues are surfaced but few produce change). Mature operations push retro yield above 65%.
The measurement: at the end of each quarter, review the retro decision logs. Count the issues surfaced, the issues that produced tactical change in the next week, and the issues that produced structural change in the quarter. Compute yield as (tactical + structural changes) / total issues surfaced.
Yield rising over multiple quarters indicates the operating team is internalizing what kinds of issues are worth escalating and leadership is internalizing the response discipline. Yield flat indicates the retro is producing the same set of complaints without converting them into structural improvement.
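The yield formula above can be written out directly. The sketch assumes each surfaced issue is counted at most once as either tactical or structural, and it returns zero for a quarter with no surfaced issues; both are assumptions, and the counts are hypothetical.

```python
def retro_yield(total_surfaced, tactical_changes, structural_changes):
    """(tactical + structural changes) / total issues surfaced."""
    if total_surfaced == 0:
        return 0.0  # assumed convention for an empty quarter
    return (tactical_changes + structural_changes) / total_surfaced


# Hypothetical quarter: 40 issues surfaced, 10 tactical and 6 structural changes.
quarter_yield = retro_yield(40, 10, 6)  # 16 / 40
```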
Why Install the Scorecard at Month Six
The kit is calibrated to install six months after the foundational kits land. The timing matters.
Earlier than month six, the scorecard produces too much noise. The cadence is still settling; decision rights are still being internalized; dashboards are still being refined. Movement on the scorecard dimensions reflects install noise rather than operating-system maturity.
Later than month nine, the scorecard misses the early-trend visibility that matters most for intervention. Operations that defer the scorecard install for too long find that the plateau has already set in by the time the scorecard surfaces it; the intervention required is larger than it would have been if the plateau had been caught earlier.
Six months is the right timing because the foundation is stable enough for meaningful measurement, and the operating-system trajectory is still in the early-compounding phase where intervention to maintain the compounding rate is most effective.
The Three Patterns the Scorecard Surfaces
Operations that run the scorecard for multiple quarters see three patterns emerge.
Healthy compounding. All four dimensions improve quarter over quarter. Decision latency drops. Cycle compression deepens. AI workflow share grows. Retro yield climbs. The operating system is doing what it was designed to do; intervention is not required beyond maintenance of the existing disciplines.
Plateau. One or more dimensions flatten over multiple quarters. The operating system is holding but not improving. The intervention depends on which dimension flattened: latency plateau usually traces to cadence loosening; cycle plateau to dashboard refresh discipline degrading; AI share plateau to workflow maintenance gaps; retro yield plateau to leadership escalation response degrading. Each has a specific corrective.
Selective drift. Three dimensions improve and one regresses. Usually retro yield drops as decision latency improves — the team gets faster at executing what they already know how to do but stops surfacing harder, structural issues for change. The pattern is structurally similar to plateau but more pointed; the regression usually indicates that the disciplines on the regressing dimension are being deprioritized in favor of the others. The fix is targeted intervention on the regressing dimension.
The scorecard's value is in distinguishing these patterns early. Operations that don't measure trajectory cannot distinguish plateau from drift from compounding; they manage to a static snapshot that may or may not reflect what's actually happening.
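One way to make the three patterns concrete is a small classifier over the four per-dimension trends. The labels "improving", "flat", and "regressing", the dimension keys, and the "unclassified" fallback for combinations the essay does not name are all assumptions for this sketch.

```python
def classify_scorecard(trends):
    """Map per-dimension trends to one of the three patterns.

    trends: dict of dimension name -> "improving" | "flat" | "regressing".
    """
    values = list(trends.values())
    improving = values.count("improving")
    flat = values.count("flat")
    regressing = values.count("regressing")
    if improving == len(values):
        return "healthy compounding"
    if regressing == 1 and improving == len(values) - 1:
        return "selective drift"
    if flat >= 1 and regressing == 0:
        return "plateau"
    return "unclassified"  # combinations the essay does not name


# Hypothetical multi-quarter reading.
pattern = classify_scorecard({
    "decision_latency": "improving",
    "cycle_compression": "improving",
    "ai_workflow_share": "improving",
    "retro_yield": "regressing",
})
```

The per-dimension labels should come from multiple quarters of data, since a single flat quarter is not yet a plateau.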
What "Bad" Looks Like
The most expensive Velocity Scorecard pattern is a scorecard that is flat across all four dimensions: every dimension stable, quarter over quarter. The operating system is running, but it has stopped improving.
This is the most common pattern in operations that have done the foundation work, declared victory, and stopped attending to the operating system. The leadership team experiences the operation as "working well" — which it is — and doesn't recognize that the operating-system investment has plateaued.
The fix is structural review. The scorecard surfaces flatness; leadership has to decide whether to invest in the next layer (A3 cycles, AI workflows, portfolio rollout) or whether the operation is at the operational ceiling it can sustain with current resources.
Operations that recognize the plateau and respond with the next layer compound past it. Operations that don't recognize the plateau allow the operating system to drift backward over the subsequent quarters as the discipline gradually erodes.
How to Read the Scorecard With the Board
The Velocity Scorecard is one of the most useful artifacts to include in board or sponsor reporting. It gives the board a structured way to see whether the operating discipline being installed is producing actual operating gain.
The recommended board treatment: include the scorecard as a single page in the quarterly board package. Show the four dimensions, the current quarter's number, the trend over the prior four quarters, and one sentence of context per dimension. The board can absorb the page in two minutes. The page makes operating-system investment visible in the language boards understand — measurement and trend.
Boards that see this page reliably gain confidence in the operating discipline. Boards that don't see it tend to ask softer, more anxious questions about operating health. The scorecard preempts those questions with data.
The compound effect over multiple board cycles is substantial. By the third or fourth quarterly review with the scorecard visible, the board has internalized the trajectory and the questions move from "is the operation healthy" to "how do we accelerate the next dimension." The conversation maturity matches the operating-system maturity.
What to Do at Month Six
If your operating system has been running for six months post-foundation-install, the scorecard install is the next move.
Pull the baseline data per dimension. Decision latency from the decision log. Cycle compression from a sample of three to five recent cycles. AI workflow share from the deployed workflow metrics (zero if no AI yet). Retro yield from the retro decision logs.
Build the scorecard template. Four dimensions. Current baseline. Target for the next quarter. Trend chart for subsequent quarters.
Schedule the monthly check. 5-10 minutes in the monthly recalibration. Reference the running data on each dimension. Surface any dimension that appears to be regressing.
Schedule the quarterly deep review. 30 minutes. Full scorecard presented. Trend over the last four quarters shown. Each dimension reviewed in context.
Install the one-page board version. Single page in the quarterly board package. Four dimensions, current number, four-quarter trend, one sentence of context per.
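The template and board page described in the steps above could be held in a structure like the one below. This is a sketch only: the `Dimension` class, its field names, and the sample numbers are hypothetical and not taken from the kit guide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dimension:
    name: str
    unit: str
    baseline: float
    target: float            # target for the next quarter
    context: str = ""        # one sentence of context for the board page
    history: List[float] = field(default_factory=list)  # one value per quarter

    def record(self, value: float) -> None:
        self.history.append(value)

    def current(self) -> float:
        # Fall back to the baseline until the first quarter is recorded.
        return self.history[-1] if self.history else self.baseline


def board_line(dim: Dimension) -> str:
    """One line per dimension: current number, up to four quarters, context."""
    trend = " -> ".join(str(v) for v in dim.history[-4:])
    return f"{dim.name}: {dim.current()} {dim.unit} | trend: {trend} | {dim.context}"


latency = Dimension("Decision latency", "days", baseline=14, target=10,
                    context="Pricing decisions still lag.")
latency.record(13)
latency.record(12)
line = board_line(latency)
```

Four such lines, one per dimension, make up the single board page.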
The kit guide at /playbooks/velocity-scorecard covers the structural detail. This essay is the operator narrative for why the scorecard is the diagnostic that distinguishes operating-system maintenance from operating-system compounding. If you've done the foundation work and want to know whether it's compounding, this is the install.
One week to install. Quarterly review. Four dimensions. The velocity gauge that turns the operating system from a project that was completed into an asset that keeps appreciating.