Optimisation playbook. From June 1, 2026, every token has a price tag. Tune your plan mix, model selection, and workflow strategy for usage-based billing. Live scenario modelling and a 12-month forecast below. Pooling docs.
Strategy · post-June 1, 2026 · usage-based billing

Maximise value. Minimise tokens. Forecast 12 months out.

Under GitHub AI Credits the cost levers stop being “how many premium requests did we burn?” and become “which seat is using which model in which mode?”. This page combines your uploaded usage with the published per-token rates and pooling rules to output a concrete optimisation plan: an optimal Business / Enterprise mix, model substitutions ranked by saving, workflow-tier strategy, and a 12-month forecast with adoption ramp. Everything below recomputes live as you move sliders.

No upload yet — using demo fleet defaults; extras count as dormant in the pool. Upload yours →

1 · Headline — what optimisation is worth

Three numbers that frame the rest of the page: today’s baseline AI-Credit run-rate, the run-rate after applying every lever on this page at 100% adoption, and the cumulative 12-month delta if you ramp adoption from 0% → 100% over six months. Numbers refresh as you change scenario inputs below.

Fleet seats
Monthly interactions
user-initiated prompts
Baseline AI Credits / mo
at current model + workflow mix
Optimised AI Credits / mo
scenario sliders applied
Monthly savings
vs baseline
12-month savings
at full adoption ramp

2 · User-class segmentation — who actually drives the bill

Heavy-tail usage is the single biggest reason pooling exists. We classify every active seat in your upload (or the synthetic demo fleet) into four bands by interactions per active day. Each band has a very different optimisation profile — and a very different argument for Business vs Enterprise plan placement.
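The banding rule described above can be sketched as a simple threshold classifier. The cut-offs below are hypothetical placeholders — the page's actual band boundaries are not published here — but they show the shape of the rule: dormant seats have zero activity, and the remaining seats split by interactions per active day.

```python
# Hypothetical band thresholds — illustrative only, not the page's real cut-offs.
def classify_seat(interactions_per_active_day: float) -> str:
    """Bucket a seat into one of the four usage bands."""
    if interactions_per_active_day == 0:
        return "dormant"
    if interactions_per_active_day < 5:
        return "light"
    if interactions_per_active_day < 25:
        return "regular"
    return "power"
```

Each band then gets its own optimisation treatment: power users argue for Enterprise placement, dormant seats for removal.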

Usage distribution — $ / month per seat (sorted descending)
The horizontal red line marks the per-seat Business pool contribution; the green line marks Enterprise. Seats above the line are net-takers from the pool; seats below are net-givers.
Power users Regular Light Dormant
Where each persona spends — feature mix by class
Stacked share of interactions by Copilot feature, per persona. The further right a stack tilts toward agent / CLI, the more tokens that persona burns per interaction.

3 · Optimal plan mix — Business vs Enterprise

Under pooling, the per-seat allowance ratio (Business 1,900 c · Enterprise 3,900 c at standard rates; 3,000 c · 7,000 c during the Jun–Aug 2026 promo) and the price differential ($19 vs $39) create a sweet spot for each fleet shape. We sweep every Business share from 0% to 100% and pick the configuration that covers your projected token cost with zero overage at the lowest total monthly bill.
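The sweep is a one-dimensional search, so a brute-force sketch is enough to convey the mechanics. The overage rate below ($ per credit above the pool) is an assumed placeholder; the seat prices and standard per-seat allowances come from the text above.

```python
def optimal_business_share(seats, token_cost, biz_price=19, ent_price=39,
                           biz_credits=1900, ent_credits=3900, overage_rate=1.0):
    """Sweep Business share 0..100% and return (best_share_pct, best_total_$).
    token_cost is the fleet's projected monthly consumption in credits;
    overage_rate is an assumed $-per-credit price above the pool."""
    best = None
    for pct in range(0, 101):
        biz = round(seats * pct / 100)
        ent = seats - biz
        subscription = biz * biz_price + ent * ent_price
        pool = biz * biz_credits + ent * ent_credits
        overage = max(0.0, token_cost - pool) * overage_rate
        total = subscription + overage
        if best is None or total < best[1]:
            best = (pct, total)
    return best
```

A light fleet collapses to 100% Business (cheapest subscription, pool never binds); a heavy fleet pays for Enterprise seats because the larger pool displaces overage.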

20% of seats consume 2.5× per-seat allowance
Total monthly cost across the 0% → 100% Business mix sweep
Y-axis = subscription + projected overage. The vertical green dashed line marks the optimum (lowest total). Hover any point to see the breakdown.
Subscription cost Projected overage Total monthly cost

4 · Per-user quotas & overage budget — protect the pool, protect finance

Plan mix sets the size of the entitlement pool. Two more levers decide how that pool actually performs in production: per-user spending limits (so a single runaway agent loop can't drain everyone's allowance) and the org-level overage budget (the hard ceiling the enterprise admin sets above the pool — overage is billed only up to it). Both are derived directly from your usage distribution, not from generic percentages.

Per-class spending limits — pick a quota strategy

Three policies — each balances cap-hit risk (users blocked when they exceed their monthly cap) against headroom (slack the org pays for). Conservative errs toward zero friction and accepts a larger overage budget. Aggressive compresses everyone into the pool with tight caps and a small budget — accepting more cap-hit incidents in exchange for the lowest total cost. Recommended sits in the middle. Pick one to drive the per-class detail and budget recommendation below.

Per-class detail · recommended strategy

For each class we compute the actual P50, P90, P95 of monthly spend per user and recommend a per-user cap derived from the selected strategy. The grid below also shows the resulting capped pool demand — the worst-case month at these caps — vs the optimum pool from section 3, so you can see headroom at a glance.
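The percentile-driven cap can be sketched as below. The mapping from strategy to cap percentile is a hypothetical assumption (the page's exact policy parameters are not published here); the real engine also folds in the cap-hit-risk trade-off described above.

```python
# Assumed strategy -> cap percentile mapping; illustrative only.
STRATEGY_PERCENTILE = {"conservative": 0.99, "recommended": 0.95, "aggressive": 0.90}

def recommend_cap(monthly_spend: list, strategy: str = "recommended"):
    """Per-user cap = the chosen percentile of observed monthly spend per seat.
    Tighter percentiles shrink the capped pool demand but raise cap-hit risk."""
    q = STRATEGY_PERCENTILE[strategy]
    spend = sorted(monthly_spend)
    idx = min(len(spend) - 1, int(q * len(spend)))
    return spend[idx]
```

Summing each class's cap times its headcount gives the worst-case capped pool demand compared against the section-3 optimum.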

Cap effect on the monthly spend distribution
Same per-seat distribution as section 2, but with each persona's recommended cap drawn as a horizontal line. Bars above the line are the spend a cap would ring-fence; the labelled total is the worst-case pool demand at these caps.
Power cap Regular cap Light cap Dormant cap

Enterprise overage budget — the hard monthly ceiling above the pool

The enterprise budget limit is the maximum overage spend GitHub will charge in any month — when it's reached, premium requests stop being billed (they revert to free models or are blocked). Setting it correctly is a stochastic problem: month-to-month spend swings ±20–30% around its trend. We model that as a log-normal multiplier on the projected monthly need, then for each candidate budget compute (a) the probability that a month exceeds the budget — i.e. you hit the cap and lose access — and (b) the expected paid overage. The recommendation is the smallest budget where cap-hit probability ≤ the strategy's target risk.
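A minimal Monte Carlo sketch of that recommendation rule, assuming a mean-1 log-normal multiplier (σ ≈ 0.25 roughly matches the ±20–30% swing quoted above). The smallest budget whose cap-hit probability is ≤ the target risk is simply the (1 − risk) quantile of simulated months:

```python
import math, random

def recommend_budget(projected_need: float, target_risk: float = 0.05,
                     sigma: float = 0.25, trials: int = 20000, seed: int = 1):
    """Smallest monthly budget whose cap-hit probability <= target_risk,
    under a mean-1 log-normal multiplier on the projected monthly need.
    sigma and target_risk are assumed placeholder parameters."""
    rng = random.Random(seed)
    mu = -sigma * sigma / 2          # E[exp(N(mu, sigma^2))] = 1
    draws = sorted(projected_need * math.exp(rng.gauss(mu, sigma))
                   for _ in range(trials))
    k = min(trials - 1, int((1 - target_risk) * trials))
    return draws[k]                  # (1 - target_risk) quantile of months
```

Expected paid overage at a candidate budget falls out of the same draws: average `min(draw, budget)` minus the pool entitlement, floored at zero.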

Switch strategies above to change this target.
Pool entitlement
$ / month
Projected monthly need
at current trend
Recommended monthly budget
expected spend / mo
Worst-case month under cap
vs uncapped P99 tail
Overage budget vs cap-hit risk and expected spend
X-axis = budget as a multiple of the projected overage. Solid line = expected paid overage at that budget; dashed line = probability the budget is hit (right axis). The vertical green line marks the recommended budget at your target risk.
Expected paid overage / mo Probability of hitting the cap Recommended budget

5 · Model substitution — same work, fewer credits

Each row below is a model in your usage with a credit-equivalent recommendation drawn from the analysis (Performance-per-dollar leaders). The credits-saved column assumes 100% substitution; the waterfall chart caps each swap at 70% adoption (the typical pace at which teams actually change their habitual model picks).

Savings waterfall — applying each substitution at 70% adoption
Bars descend left-to-right: starting from baseline credits/mo, each swap shaves the bar; the final green bar is the optimised total.
Per-model substitution table
"% solved per $1" comes from SWE-bench bash-only resolve rate ÷ typical-chat token cost. Swap = recommended replacement (same or better solve rate, lower $).
Current model Tier Interactions Credits / mo SWE-bench / $1 → Recommended Δ Credits / chat Saving / mo (70% adopt)

6 · Workflow tier strategy — right model in right mode

Token cost scales linearly with model price but quadratically with workflow size — agent mode sends 15× the tokens of a chat, CLI sends 50×, the cloud agent up to 200×. The optimisation rule from the analysis is simple: cheap models for cheap modes, premium models reserved for synthesis turns only. Below: each Copilot feature in your file, today's average cost, and the “routed” cost if every interaction used the workflow-appropriate model.
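The mode multipliers above (15×, 50×, 200× a chat) make the routing arithmetic concrete. The per-chat-unit model rates below are illustrative placeholders, not published prices — the point is that the saving from routing a cheap model into an expensive mode scales with the mode multiplier:

```python
# Mode token multipliers quoted in the text; model rates are assumed placeholders.
MODE_TOKEN_MULTIPLIER = {"chat": 1, "agent": 15, "cli": 50, "cloud_agent": 200}
MODEL_CREDITS_PER_UNIT = {"cheap": 1.0, "premium": 15.0}  # credits per chat-sized unit

def interaction_cost(mode: str, model: str) -> float:
    """Credits for one interaction = chat-unit model rate x mode multiplier."""
    return MODEL_CREDITS_PER_UNIT[model] * MODE_TOKEN_MULTIPLIER[mode]

def routing_saving(mode: str) -> float:
    """Per-interaction saving from routing a premium-model habit to the cheap
    model in this mode (premium reserved for synthesis turns in chat)."""
    return interaction_cost(mode, "premium") - interaction_cost(mode, "cheap")
```

At these assumed rates, swapping models in chat saves 14 credits per interaction, but the same swap in the cloud agent saves 2,800 — which is why mode, not model, dominates the bill.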

Cost per interaction by feature — current vs routed
Each pair of bars compares observed average credits per interaction to the credits a workflow-appropriate model would consume in the same mode. Bigger gap = larger optimisation opportunity.
Current avg credits / interaction Routed (right-model) credits / interaction
Cumulative monthly saving from workflow routing alone
If you sent every interaction in this feature to its routed model, this is the credits-per-month you'd save.

7 · Scenario builder — pull every lever at once

Each slider below is an optimisation lever applied as a percentage of the addressable population. Dragging any slider recomputes the headline KPIs above, the model-mix spotlight chart on the right, and the 12-month forecast in §8.

70%
% of premium-tier interactions that move to the recommended cheaper-but-equivalent model.
50%
% of interactions routed to the workflow-appropriate model (Haiku for traversal, Opus for synthesis).
40%
% of input tokens served from the cached-input rate (~10% of full price).
20%
% reduction in tokens-per-agent-turn through summarisation, retrieval scoping, fewer redundant tool calls.
30%
% of seats moved from Business to Enterprise. Captures more pool headroom for power users at +$20/seat/mo.
50%
% of dormant seats removed at the next billing cycle. Removes both their cost and their pool contribution.
Optimisation lever attribution — credits saved per lever
Stacked horizontal bar showing how each lever contributes to the optimised monthly figure. Levers are computed in the order shown (each operates on the residual after prior ones).
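The ordered, operate-on-the-residual attribution described above can be sketched as a simple fold: each lever's fractional saving applies only to what the previous levers left behind, so the same two 50% levers never sum to 100%.

```python
def apply_levers(baseline: float, levers: list):
    """Apply each (name, fraction) lever to the residual left by prior levers.
    Returns (per-lever attribution, optimised residual). Order matters:
    earlier levers claim savings on a larger base."""
    residual, attribution = baseline, []
    for name, frac in levers:
        saved = residual * frac
        attribution.append((name, saved))
        residual -= saved
    return attribution, residual
```

This is why reordering the levers changes each bar's size in the stacked chart even though the optimised total stays the same.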

8 · 12-month forecast — Apr 2026 → Apr 2027

Projects monthly Copilot spend across the next year under three trajectories: baseline (no optimisation), optimised (scenario above with linear adoption ramp), and aggressive (scenario at +30 percentage points adoption on every lever). The shaded period is the introductory pool-bonus window (Jun 1 – Sep 1 2026); after Sep 1 the per-seat allowance steps down, which is what makes the Sep 2026 cliff so prominent on the baseline curve.
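A minimal sketch of the baseline-vs-optimised trajectories: compound seat and token growth on the baseline, with the scenario saving ramping linearly from 0 to full over the ramp window. The 40% combined-lever saving is an illustrative placeholder, and the promo-window allowance step-down is omitted for brevity.

```python
def forecast(base_cost: float, months: int = 12, seat_growth: float = 0.02,
             token_growth: float = 0.03, full_saving: float = 0.40,
             ramp_months: int = 6):
    """Monthly baseline and optimised cost series.
    full_saving = 0.40 is an assumed combined-lever figure, not a published one."""
    baseline, optimised = [], []
    for m in range(1, months + 1):
        cost = base_cost * (1 + seat_growth) ** m * (1 + token_growth) ** m
        adoption = min(1.0, m / ramp_months)          # linear 0 -> 100% ramp
        baseline.append(cost)
        optimised.append(cost * (1 - full_saving * adoption))
    return baseline, optimised
```

The cumulative 12-month saving is then the month-by-month sum of the gap between the two series; the aggressive trajectory just raises `full_saving` by the +30pp setting.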

+2.0%
Compound monthly seat growth applied to the fleet base.
+3.0%
Per-seat tokens grow as agent adoption deepens (negative if you expect maturation).
6 mo
Linear ramp from 0% → scenario-slider values over this many months.
+30pp
Percentage points added to every lever in the “aggressive” trajectory.
Monthly cost — three trajectories over 12 months
Stacked area for each scenario. Hover any month for the dollar figure and a breakdown by sub-component.
Baseline (do nothing) Optimised (your scenario) Aggressive Promo bonus window
Cumulative 12-month savings vs baseline
Area between Baseline and Optimised lines, summed month by month — your run-rate optimisation budget.
12-mo baseline spend
do-nothing trajectory
12-mo optimised spend
at your scenario
12-mo savings
vs baseline
Aggressive 12-mo savings
at +30pp adoption

9 · Top actions — sized to your data

A prioritised playbook ranked by absolute monthly saving. Each action is concrete enough to assign an owner.