Under GitHub AI Credits the cost levers stop being “how many premium requests did we burn?” and become “which seat is using which model in which mode?”. This page combines your uploaded usage with the published per-token rates and pooling rules to produce a concrete optimisation plan: an optimal Business/Enterprise mix, model substitutions ranked by saving, a workflow-tier strategy, and a 12-month forecast with an adoption ramp. Everything below recomputes live as you move the sliders.
Three numbers that frame the rest of the page: today’s baseline AI-Credit run-rate, the run-rate after applying every lever on this page at 100% adoption, and the cumulative 12-month delta if you ramp adoption from 0% → 100% over six months. Numbers refresh as you change scenario inputs below.
Heavy-tail usage is the single biggest reason pooling exists. We classify every active seat in your upload (or the synthetic demo fleet) into four bands by interactions per active day. Each band has a very different optimisation profile — and a very different argument for Business vs Enterprise plan placement.
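The classification step can be sketched in a few lines. This is a hypothetical illustration; the band names and thresholds below are assumptions, since the real bands are derived from your upload's distribution:

```python
# Illustrative bands keyed by interactions per active day; each entry is
# (band name, exclusive upper bound). Thresholds are assumed, not official.
BANDS = [("dormant", 1), ("light", 10), ("core", 40), ("power", float("inf"))]

def band(interactions_per_active_day: float) -> str:
    """Return the first band whose upper bound exceeds the seat's rate."""
    for name, upper in BANDS:
        if interactions_per_active_day < upper:
            return name
    return BANDS[-1][0]

print(band(0.5), band(25), band(120))  # → dormant core power
```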
Under pooling, the per-seat allowance ratio (Business 1,900 c · Enterprise 3,900 c at standard rates; 3,000 c · 7,000 c during the Jun–Aug 2026 promo) and the price differential ($19 vs $39) create a sweet spot that depends on fleet shape. We sweep every Business share from 0% to 100% and pick the configuration that covers your projected token cost with zero overage at the lowest total monthly bill.
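The sweep itself is a simple exhaustive search. A minimal sketch, assuming the standard per-seat rates and an illustrative demand figure (the real inputs come from your upload):

```python
# Standard-rate assumptions from the page: $/seat and pooled credits/seat.
BUSINESS_PRICE, ENTERPRISE_PRICE = 19, 39
BUSINESS_CREDITS, ENTERPRISE_CREDITS = 1_900, 3_900

def best_mix(total_seats: int, projected_credits: float):
    """Cheapest zero-overage mix: returns (business_seats, monthly_bill)."""
    best = None
    for business in range(total_seats + 1):
        enterprise = total_seats - business
        pool = business * BUSINESS_CREDITS + enterprise * ENTERPRISE_CREDITS
        if pool < projected_credits:
            continue  # this mix would incur overage, so it is excluded
        bill = business * BUSINESS_PRICE + enterprise * ENTERPRISE_PRICE
        if best is None or bill < best[1]:
            best = (business, bill)
    return best

# e.g. a 100-seat fleet projected to need 250,000 credits/month
print(best_mix(100, 250_000))  # → (70, 2500)
```

Because the bill falls as the Business share rises, the optimum is always the largest Business share whose pool still covers demand.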
Plan mix sets the size of the entitlement pool. Two more levers decide how that pool actually performs in production: per-user spending limits (so a single runaway agent loop can't drain everyone's allowance) and the org-level overage budget (the hard ceiling the enterprise admin sets above the pool; overage is billed only up to that ceiling). Both are derived directly from your usage distribution, not from generic percentages.
Three policies, each balancing cap-hit risk (users blocked when they exceed their monthly cap) against headroom (slack the org pays for). Conservative errs toward zero friction and accepts a larger overage budget. Aggressive compresses everyone into the pool with tight caps and a small budget, accepting more cap-hit incidents in exchange for the lowest total cost. Recommended sits in the middle. Pick one to drive the per-class detail and the budget recommendation below.
For each class we compute the actual P50, P90, P95 of monthly spend per user and recommend a per-user cap derived from the selected strategy. The grid below also shows the resulting capped pool demand — the worst-case month at these caps — vs the optimum pool from section 3, so you can see headroom at a glance.
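The cap derivation reduces to picking a percentile of each class's spend distribution. A minimal sketch, assuming an illustrative strategy→percentile mapping (the real mapping follows the selected policy above):

```python
def percentile(values, p):
    """Nearest-rank percentile (p in 0..100) of a non-empty list."""
    s = sorted(values)
    k = max(0, min(len(s) - 1, round(p / 100 * len(s)) - 1))
    return s[k]

# Assumed mapping: looser strategies cap at a higher spend percentile.
STRATEGY_PERCENTILE = {"conservative": 95, "recommended": 90, "aggressive": 50}

def recommend_cap(monthly_spend, strategy="recommended"):
    return percentile(monthly_spend, STRATEGY_PERCENTILE[strategy])

spend = [120, 340, 90, 800, 150, 210, 95, 400, 60, 1_200]  # credits/user/month
print(recommend_cap(spend, "recommended"))  # → 800
```

Summing the chosen caps across all active seats gives the “capped pool demand” shown in the grid: the worst possible month if every user hit their cap.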
The enterprise budget limit is the maximum overage spend GitHub will charge in any month; once it is reached, premium requests stop being billed (they revert to free models or are blocked). Setting it correctly is a stochastic problem: month-to-month spend swings ±20–30% around its trend. We model that as a log-normal multiplier on the projected monthly need, then for each candidate budget compute (a) the probability that a month's spend exceeds the budget, i.e. you hit the cap and lose access, and (b) the expected paid overage. The recommendation is the smallest budget whose cap-hit probability is at or below the strategy's target risk.
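The search can be sketched with a small Monte Carlo simulation. Everything numeric here is an assumption for illustration: the σ of 0.25 approximates the ±20–30% swing, and the 5%-of-projection candidate grid is arbitrary:

```python
import math
import random

def smallest_safe_budget(projected, target_risk, sigma=0.25,
                         trials=20_000, seed=7):
    """Smallest candidate budget whose simulated cap-hit rate <= target_risk."""
    rng = random.Random(seed)
    # Log-normal multiplier with mean 1 (mu = -sigma^2/2 corrects the mean).
    months = [projected * math.exp(rng.gauss(-sigma**2 / 2, sigma))
              for _ in range(trials)]
    step = max(1, int(projected * 0.05))
    for budget in range(int(projected), int(projected * 3), step):
        cap_hit_rate = sum(m > budget for m in months) / trials
        if cap_hit_rate <= target_risk:
            return budget, cap_hit_rate
    return None

budget, risk = smallest_safe_budget(10_000, target_risk=0.05)
```

With a 5% target risk the recommended budget lands noticeably above the raw projection, which is exactly the headroom the strategy pays for.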
Each row below is a model in your usage with a credit-equivalent recommendation drawn from the analysis (Performance-per-dollar leaders). The credits-saved column assumes 100% substitution; the waterfall chart caps each swap at 70% adoption (typical pace at which teams actually retrain their reflex model picks).
| Current model | Tier | Interactions | Credits / mo | SWE-bench / $1 | → Recommended | Δ Credits / chat | Saving / mo (70% adopt) |
|---|---|---|---|---|---|---|---|
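The saving column's arithmetic is straightforward. A minimal sketch with illustrative numbers (the real interaction counts and credit deltas come from your upload):

```python
# The waterfall chart caps each substitution at 70% adoption.
ADOPTION_CAP = 0.70

def monthly_saving(interactions: int, delta_credits_per_chat: float) -> float:
    """Credits saved per month from one model swap at the adoption cap."""
    return interactions * delta_credits_per_chat * ADOPTION_CAP

# e.g. 12,000 chats/month moved to a model costing 0.6 fewer credits each
print(monthly_saving(12_000, 0.6))  # ≈ 5,040 credits/month
```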
Token cost scales linearly with model price but multiplies with workflow size: agent mode sends 15× the tokens of a chat, the CLI sends 50×, and the cloud agent up to 200×. The optimisation rule from the analysis is simple: cheap models for cheap modes, with premium models reserved for synthesis turns only. Below: each Copilot feature in your file, today's average cost, and the “routed” cost if every interaction used the workflow-appropriate model.
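The routing rule can be made concrete with the multipliers from the text. The credit rates (1.0 for a premium model, 0.1 for a cheap one) are illustrative assumptions:

```python
# Token multipliers per workflow, relative to a single chat (from the text).
WORKFLOW_MULTIPLIER = {"chat": 1, "agent": 15, "cli": 50, "cloud_agent": 200}

def interaction_cost(base_credits: float, workflow: str) -> float:
    """Credit cost of one interaction: model rate times workflow multiplier."""
    return base_credits * WORKFLOW_MULTIPLIER[workflow]

# Unrouted: premium model (assumed 1.0 credits/chat) drives the cloud agent.
unrouted = interaction_cost(1.0, "cloud_agent")
# Routed: cheap model (assumed 0.1) runs the agent loop; the premium model
# handles only the final synthesis turn as a plain chat.
routed = interaction_cost(0.1, "cloud_agent") + interaction_cost(1.0, "chat")
print(unrouted, routed)  # roughly 200 vs 21 credits
```

The premium model's contribution shrinks to a single 1× turn, which is why routing dominates model choice in high-multiplier modes.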
Each slider below is an optimisation lever applied as a percentage of the addressable population. Dragging any slider recomputes the headline KPIs above, the model-mix spotlight chart on the right, and the 12-month forecast in §7.
Projects monthly Copilot spend across the next year under three trajectories: baseline (no optimisation), optimised (scenario above with linear adoption ramp), and aggressive (scenario at +30 percentage points adoption on every lever). The shaded period is the introductory pool-bonus window (Jun 1 – Sep 1 2026); after Sep 1 the per-seat allowance steps down, which is what makes the Sep 2026 cliff so prominent on the baseline curve.
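The three trajectories reduce to one ramp function. A minimal sketch, assuming a flat baseline and illustrative spend and saving figures (the real curves use your scenario inputs):

```python
def forecast(baseline: float, full_saving: float, months: int = 12,
             ramp_months: int = 6, adoption_boost: float = 0.0):
    """Monthly spend under a linear adoption ramp toward the full saving."""
    series = []
    for m in range(1, months + 1):
        adoption = min(1.0, m / ramp_months + adoption_boost)
        series.append(baseline - full_saving * adoption)
    return series

baseline = [10_000.0] * 12                                   # no optimisation
optimised = forecast(10_000, 3_000)                          # 6-month linear ramp
aggressive = forecast(10_000, 3_000, adoption_boost=0.30)    # +30 pp on every lever
```

Note this sketch omits the Sep 2026 allowance step-down; in the real chart that step is applied to all three curves, which is what produces the baseline cliff.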
A prioritised playbook ranked by absolute monthly saving. Each action is concrete enough to assign an owner.