Under GitHub AI Credits the cost levers stop being “how many premium requests did we burn?” and become “which seat is using which model in which mode?”. This page combines your uploaded usage with the published per-token rates and pooling rules to produce a concrete optimisation plan: an optimal Business/Enterprise mix, model substitutions ranked by saving, a workflow-tier strategy, and a 12-month forecast with an adoption ramp. Everything below recomputes live as you move the sliders.
Three numbers that frame the rest of the page: today’s baseline AI-Credit run-rate, the run-rate after applying every lever on this page at 100% adoption, and the cumulative 12-month delta if you ramp adoption from 0% → 100% over six months. Numbers refresh as you change scenario inputs below.
Heavy-tail usage is the single biggest reason pooling exists. We classify every active seat in your upload (or the synthetic demo fleet) into four bands by interactions per active day. Each band has a very different optimisation profile — and a very different argument for Business vs Enterprise plan placement.
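The classification step can be sketched in a few lines. This is a hypothetical illustration; the band names and thresholds below are assumptions, since the real bands are derived from your upload's distribution:

```python
# Illustrative bands keyed by interactions per active day; each entry is
# (band name, exclusive upper bound). Thresholds are assumed, not official.
BANDS = [("dormant", 1), ("light", 10), ("core", 40), ("power", float("inf"))]

def band(interactions_per_active_day: float) -> str:
    """Return the first band whose upper bound exceeds the seat's rate."""
    for name, upper in BANDS:
        if interactions_per_active_day < upper:
            return name
    return BANDS[-1][0]

print(band(0.5), band(25), band(120))  # → dormant core power
```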
Under pooling, the per-seat allowance ratio (Business 1,900 c · Enterprise 3,900 c at standard rates; 3,000 c · 7,000 c during the Jun–Aug 2026 promo) and the price differential ($19 vs $39) create a sweet spot that depends on fleet shape. We sweep every Business share from 0% to 100% and pick the configuration that covers your projected token cost with zero overage at the lowest total monthly bill.
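The sweep itself is a simple exhaustive search. A minimal sketch, assuming the standard per-seat rates and an illustrative demand figure (the real inputs come from your upload):

```python
# Standard-rate assumptions from the page: $/seat and pooled credits/seat.
BUSINESS_PRICE, ENTERPRISE_PRICE = 19, 39
BUSINESS_CREDITS, ENTERPRISE_CREDITS = 1_900, 3_900

def best_mix(total_seats: int, projected_credits: float):
    """Cheapest zero-overage mix: returns (business_seats, monthly_bill)."""
    best = None
    for business in range(total_seats + 1):
        enterprise = total_seats - business
        pool = business * BUSINESS_CREDITS + enterprise * ENTERPRISE_CREDITS
        if pool < projected_credits:
            continue  # this mix would incur overage, so it is excluded
        bill = business * BUSINESS_PRICE + enterprise * ENTERPRISE_PRICE
        if best is None or bill < best[1]:
            best = (business, bill)
    return best

# e.g. a 100-seat fleet projected to need 250,000 credits/month
print(best_mix(100, 250_000))  # → (70, 2500)
```

Because the bill falls as the Business share rises, the optimum is always the largest Business share whose pool still covers demand.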
Plan mix sets the size of the entitlement pool. Two more levers decide how that pool actually performs in production: per-user spending limits (so a single runaway agent loop can't drain everyone's allowance) and the org-level overage budget (the hard ceiling the enterprise admin sets above the pool; overage is billed only up to that ceiling). Both are derived directly from your usage distribution, not from generic percentages.
Three policies, each balancing cap-hit risk (users blocked when they exceed their monthly cap) against headroom (slack the org pays for). Conservative errs toward zero friction and accepts a larger overage budget. Aggressive compresses everyone into the pool with tight caps and a small budget, accepting more cap-hit incidents in exchange for the lowest total cost. Recommended sits in the middle. Pick one to drive the per-class detail and the budget recommendation below.
For each class we compute the actual P50, P90, P95 of monthly spend per user and recommend a per-user cap derived from the selected strategy. The grid below also shows the resulting capped pool demand — the worst-case month at these caps — vs the optimum pool from section 3, so you can see headroom at a glance.
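The cap derivation reduces to picking a percentile of each class's spend distribution. A minimal sketch, assuming an illustrative strategy→percentile mapping (the real mapping follows the selected policy above):

```python
def percentile(values, p):
    """Nearest-rank percentile (p in 0..100) of a non-empty list."""
    s = sorted(values)
    k = max(0, min(len(s) - 1, round(p / 100 * len(s)) - 1))
    return s[k]

# Assumed mapping: looser strategies cap at a higher spend percentile.
STRATEGY_PERCENTILE = {"conservative": 95, "recommended": 90, "aggressive": 50}

def recommend_cap(monthly_spend, strategy="recommended"):
    return percentile(monthly_spend, STRATEGY_PERCENTILE[strategy])

spend = [120, 340, 90, 800, 150, 210, 95, 400, 60, 1_200]  # credits/user/month
print(recommend_cap(spend, "recommended"))  # → 800
```

Summing the chosen caps across all active seats gives the “capped pool demand” shown in the grid: the worst possible month if every user hit their cap.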
The enterprise budget limit is the maximum overage spend GitHub will charge in any month; once it is reached, premium requests stop being billed (they revert to free models or are blocked). Setting it correctly is a stochastic problem: month-to-month spend swings ±20–30% around its trend. We model that as a log-normal multiplier on the projected monthly need, then for each candidate budget compute (a) the probability that a month's spend exceeds the budget, i.e. you hit the cap and lose access, and (b) the expected paid overage. The recommendation is the smallest budget whose cap-hit probability is at or below the strategy's target risk.
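The search can be sketched with a small Monte Carlo simulation. Everything numeric here is an assumption for illustration: the σ of 0.25 approximates the ±20–30% swing, and the 5%-of-projection candidate grid is arbitrary:

```python
import math
import random

def smallest_safe_budget(projected, target_risk, sigma=0.25,
                         trials=20_000, seed=7):
    """Smallest candidate budget whose simulated cap-hit rate <= target_risk."""
    rng = random.Random(seed)
    # Log-normal multiplier with mean 1 (mu = -sigma^2/2 corrects the mean).
    months = [projected * math.exp(rng.gauss(-sigma**2 / 2, sigma))
              for _ in range(trials)]
    step = max(1, int(projected * 0.05))
    for budget in range(int(projected), int(projected * 3), step):
        cap_hit_rate = sum(m > budget for m in months) / trials
        if cap_hit_rate <= target_risk:
            return budget, cap_hit_rate
    return None

budget, risk = smallest_safe_budget(10_000, target_risk=0.05)
```

With a 5% target risk the recommended budget lands noticeably above the raw projection, which is exactly the headroom the strategy pays for.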
Each row below is a model in your usage with a credit-equivalent recommendation drawn from the analysis (Performance-per-dollar leaders). The credits-saved column assumes 100% substitution; the waterfall chart caps each swap at 70% adoption (typical pace at which teams actually retrain their reflex model picks).
| Current model | Tier | Interactions | Credits / mo | SWE-bench / $1 | → Recommended | Δ Credits / chat | Saving / mo (70% adopt) |
|---|---|---|---|---|---|---|---|
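The saving column's arithmetic is straightforward. A minimal sketch with illustrative numbers (the real interaction counts and credit deltas come from your upload):

```python
# The waterfall chart caps each substitution at 70% adoption.
ADOPTION_CAP = 0.70

def monthly_saving(interactions: int, delta_credits_per_chat: float) -> float:
    """Credits saved per month from one model swap at the adoption cap."""
    return interactions * delta_credits_per_chat * ADOPTION_CAP

# e.g. 12,000 chats/month moved to a model costing 0.6 fewer credits each
print(monthly_saving(12_000, 0.6))  # ≈ 5,040 credits/month
```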
Token cost scales linearly with model price but multiplies with workflow size: agent mode sends 15× the tokens of a chat, the CLI sends 50×, and the cloud agent up to 200×. The optimisation rule from the analysis is simple: cheap models for cheap modes, with premium models reserved for synthesis turns only. Below: each Copilot feature in your file, today's average cost, and the “routed” cost if every interaction used the workflow-appropriate model.
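The routing rule can be made concrete with the multipliers from the text. The credit rates (1.0 for a premium model, 0.1 for a cheap one) are illustrative assumptions:

```python
# Token multipliers per workflow, relative to a single chat (from the text).
WORKFLOW_MULTIPLIER = {"chat": 1, "agent": 15, "cli": 50, "cloud_agent": 200}

def interaction_cost(base_credits: float, workflow: str) -> float:
    """Credit cost of one interaction: model rate times workflow multiplier."""
    return base_credits * WORKFLOW_MULTIPLIER[workflow]

# Unrouted: premium model (assumed 1.0 credits/chat) drives the cloud agent.
unrouted = interaction_cost(1.0, "cloud_agent")
# Routed: cheap model (assumed 0.1) runs the agent loop; the premium model
# handles only the final synthesis turn as a plain chat.
routed = interaction_cost(0.1, "cloud_agent") + interaction_cost(1.0, "chat")
print(unrouted, routed)  # roughly 200 vs 21 credits
```

The premium model's contribution shrinks to a single 1× turn, which is why routing dominates model choice in high-multiplier modes.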
Each slider below is an optimisation lever applied as a percentage of the addressable population. Dragging any slider recomputes the headline KPIs above, the model-mix spotlight chart on the right, and the 12-month forecast in §7.
Projects monthly Copilot spend across the next year under three trajectories: baseline (no optimisation), optimised (scenario above with linear adoption ramp), and aggressive (scenario at +30 percentage points adoption on every lever). The shaded period is the introductory pool-bonus window (Jun 1 – Sep 1 2026); after Sep 1 the per-seat allowance steps down, which is what makes the Sep 2026 cliff so prominent on the baseline curve.
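The three trajectories reduce to one ramp function. A minimal sketch, assuming a flat baseline and illustrative spend and saving figures (the real curves use your scenario inputs):

```python
def forecast(baseline: float, full_saving: float, months: int = 12,
             ramp_months: int = 6, adoption_boost: float = 0.0):
    """Monthly spend under a linear adoption ramp toward the full saving."""
    series = []
    for m in range(1, months + 1):
        adoption = min(1.0, m / ramp_months + adoption_boost)
        series.append(baseline - full_saving * adoption)
    return series

baseline = [10_000.0] * 12                                   # no optimisation
optimised = forecast(10_000, 3_000)                          # 6-month linear ramp
aggressive = forecast(10_000, 3_000, adoption_boost=0.30)    # +30 pp on every lever
```

Note this sketch omits the Sep 2026 allowance step-down; in the real chart that step is applied to all three curves, which is what produces the baseline cliff.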
A prioritised playbook ranked by absolute monthly saving. Each action is concrete enough to assign an owner.