Consulting OS pricing in 2026 — what we learned from four quarters of experiments

Why per-seat dies on AI, the 3 pricing experiments that failed, and the model that's held 4 quarters. Core seat + bundle + overage.

Pricing in 2026

Per-seat is dying for AI-heavy SaaS products. This isn't a clickbait take — it's the summary of four quarters of Consulting OS pricing experiments and the real data we have. Here's what we believe, what we tried, what failed, what stuck, and what NRR numbers we got.

Why per-seat dies on AI

Classic SaaS per-seat pricing assumes a user's monthly cost is roughly constant. Before AI that held up: a senior consultant actively uses Consulting OS for 5-6 hours a day while a junior glances at it 2-3 times a week, yet the machine cost (DB queries, frontend loads, log storage) of the two really was in the same order of magnitude.

With AI it isn't. The active senior initiates 40-60 chat interactions per day, loads documents, generates ADR drafts, synthesises audit findings. At the token layer that's 50,000-80,000 tokens per day. The junior who glances at it 2-3 times a week averages 200-500 tokens per day. At least a 100x difference. On per-seat pricing they pay the same monthly fee. The AI margin disappears.
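The size of that gap can be checked directly from the two ranges quoted above. The snippet below is a minimal sketch of the arithmetic only; the function name `usage_ratio_range` is illustrative, not something from the product.

```python
# Daily token ranges quoted in the text (tokens per day).
SENIOR_TOKENS = (50_000, 80_000)
JUNIOR_TOKENS = (200, 500)


def usage_ratio_range(heavy, light):
    """Return the (smallest, largest) ratio of heavy to light daily usage."""
    lowest = heavy[0] / light[1]   # least active heavy user vs. most active light user
    highest = heavy[1] / light[0]  # most active heavy user vs. least active light user
    return lowest, highest


low, high = usage_ratio_range(SENIOR_TOKENS, JUNIOR_TOKENS)
print(f"Heavy/light usage ratio: {low:.0f}x to {high:.0f}x")
```

Taking the ranges at face value, the gap runs from 100x at the narrow end to 400x at the wide end.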

Concretely: in Q1 2026, the top-10 users across Consulting OS tenants generated 38% of the entire engine bill. Everyone paid the same.

Where we moved: core seat + token bundle + overage

What we run now across 3 product lines (admin, finance, consulting-os):

  1. Core seat fee — 24 EUR per user per month. Covers hosting, infra, classic CRUD ops, normal monitoring, first-line support.
  2. Token bundle — every tenant gets 2M Engine tokens per month included. Roughly 12,000 chat messages or 600 document syntheses.
  3. Overage — past the bundle, 0.0028 EUR per 1k tokens. Soft warning at 80%, hard email at 100%. Past 120%, the tenant has to confirm they don't want tool calls blocked.

The overage is what protects the margin. Heavy users pay, light users don't overpay.
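The three components combine into a simple monthly invoice. What follows is a minimal sketch using the prices quoted in this post. The function and state names (`monthly_invoice`, `usage_state`, `"confirmation_required"`) are illustrative assumptions, not the actual Consulting OS billing code, and the sketch assumes the seat fee is charged per user while the bundle applies per tenant.

```python
from dataclasses import dataclass

# Prices quoted in the post.
CORE_SEAT_EUR = 24.0          # per user per month
BUNDLE_TOKENS = 2_000_000     # included per tenant per month
OVERAGE_EUR_PER_1K = 0.0028   # per 1k tokens past the bundle


@dataclass
class Invoice:
    seat_fee: float
    overage_fee: float
    state: str

    @property
    def total(self) -> float:
        return round(self.seat_fee + self.overage_fee, 2)


def usage_state(tokens_used: int, bundle: int = BUNDLE_TOKENS) -> str:
    """Map bundle consumption to the notification thresholds (hypothetical state names)."""
    # Integer comparisons avoid float rounding at the exact 80/100/120% boundaries.
    if tokens_used * 100 > bundle * 120:
        return "confirmation_required"  # past 120%: tenant must confirm
    if tokens_used >= bundle:
        return "hard_email"             # 100% reached
    if tokens_used * 100 >= bundle * 80:
        return "soft_warning"           # 80% reached
    return "ok"


def monthly_invoice(seats: int, tokens_used: int) -> Invoice:
    """Seat fees plus overage on tokens beyond the included bundle."""
    overage_tokens = max(0, tokens_used - BUNDLE_TOKENS)
    overage_fee = round(overage_tokens / 1000 * OVERAGE_EUR_PER_1K, 2)
    return Invoice(seats * CORE_SEAT_EUR, overage_fee, usage_state(tokens_used))


invoice = monthly_invoice(seats=5, tokens_used=3_000_000)
print(invoice.total, invoice.state)
```

In this example a five-seat tenant that uses 3M tokens pays 120 EUR in seat fees plus 2.80 EUR of overage for the 1M tokens beyond the bundle.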

The 3 pricing experiments that failed

Per-message

Every chat message has a price. Logical, but users immediately started writing longer messages ("five things at once") which made the chat experience worse. A few stopped using it entirely because they got anxious about every click. Reverted after 2 weeks.

Per-engagement

Fixed amount per engagement. Maybe the worst version. Consultants would crowd every question into one 4-hour engagement per week. Context bloated, model cost rose, and we ended up selling the service for less than it cost us to run. Reverted after a month.

Per-tenant flat — "unlimited"

"Pay X a month, do anything." Lasted 8 weeks. In 3 of those weeks two tenants blew through our monthly engine bill. One was outright loss-making. Reverted.

The renewal data

Q1 2026: 47 paying tenants. 44 renewed, 3 didn't. The 3 churns: one left consulting altogether (corporate restructuring), one moved to Pipedrive's PSA module (unfortunate, for us), one didn't pay and we blocked them. None of these is product dissatisfaction.
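For reference, the tenant-level renewal rate implied by those counts can be computed as below. This is a sketch of the arithmetic only. It counts tenants, not revenue, so it is a different measure from NRR.

```python
def renewal_rate(renewed: int, total: int) -> float:
    """Share of paying tenants that renewed (logo retention, not revenue retention)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return renewed / total


rate = renewal_rate(renewed=44, total=47)
print(f"Tenant renewal rate: {rate:.1%}")
```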

The heavy-user segment (top 10% token usage) had 0% churn. The light-user segment (bottom 30%) had 1 churn — 24 EUR core seat was too much for their actual usage. We're now working on a "viewer seat" tier (8 EUR, read-only, no AI assistant) for that segment.

NRR over 12 months: 94%. Hard number for a boutique-focused SaaS. The industry median is 105-115%, but that figure comes from vendors serving enterprise, where per-seat expansion is the main NRR engine. In boutique, 90-100% is realistic.

What we're not saying

This isn't the only model. Some markets (e.g. tooling with very predictable monthly load) still work with flat pricing. Some (e.g. heavy seasonality, small transactions) work with per-message pricing. For our B2B AI-tooling segment, the hybrid above is what landed.

The decisive factor isn't pricing cleverness. It's transparent communication. Every tenant sees in real time on the dashboard how much they've used and what the expected monthly bill is. Nobody gets surprised at invoice time. That matters more than any pricing trick.
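One way an "expected monthly bill" figure like this could be derived is a simple run-rate projection. The sketch below assumes usage continues at the month-to-date daily average; that projection method and the function name `projected_bill` are assumptions for illustration, not a description of how the dashboard actually computes it. The prices are the ones quoted earlier in this post.

```python
import calendar
from datetime import date

# Prices quoted in the post.
CORE_SEAT_EUR = 24.0
BUNDLE_TOKENS = 2_000_000
OVERAGE_EUR_PER_1K = 0.0028


def projected_bill(seats: int, tokens_so_far: int, today: date) -> float:
    """Project the month-end bill, assuming usage continues at the month-to-date daily average."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    projected_tokens = tokens_so_far / today.day * days_in_month
    overage_tokens = max(0.0, projected_tokens - BUNDLE_TOKENS)
    overage_fee = overage_tokens / 1000 * OVERAGE_EUR_PER_1K
    return round(seats * CORE_SEAT_EUR + overage_fee, 2)


bill = projected_bill(seats=5, tokens_so_far=1_500_000, today=date(2026, 4, 15))
print(f"Projected bill: {bill:.2f} EUR")
```

A linear projection is the simplest choice; a tenant with strongly uneven usage across the month would need something smarter.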

What they ask for, what we say no to

6 tenants asked for an "unlimited tokens" plan. It simply doesn't exist. We know per model exactly what a heavy enterprise tenant's 500k-tokens-a-day usage costs us: 14-17 EUR per day, or roughly 420-510 EUR per month. Wrapping that into a flat fee either eats margin or pushes the product upmarket. Our answer: you can prepay larger token bundles at a discount (12M tokens at 21 EUR/month, 60M tokens at 86 EUR/month). That's honest, and the heavy user gets the predictable monthly cost they wanted.
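To see how large the prepaid discount is, the bundle prices can be converted to an effective price per 1k tokens and compared with the 0.0028 EUR overage rate. This is a minimal sketch; comparing against the overage rate as the baseline is an assumption of this example, and `bundle_discount` is an illustrative name.

```python
OVERAGE_EUR_PER_1K = 0.0028

# Prepaid bundles quoted in the post: (tokens per month, price in EUR per month).
PREPAID_BUNDLES = [(12_000_000, 21.0), (60_000_000, 86.0)]


def bundle_discount(tokens: int, price_eur: float) -> tuple[float, float]:
    """Return (effective EUR per 1k tokens, discount relative to the overage rate)."""
    per_1k = price_eur / (tokens / 1000)
    discount = 1 - per_1k / OVERAGE_EUR_PER_1K
    return per_1k, discount


for tokens, price in PREPAID_BUNDLES:
    per_1k, discount = bundle_discount(tokens, price)
    print(f"{tokens // 1_000_000}M tokens: {per_1k:.5f} EUR per 1k, "
          f"{discount:.1%} below the overage rate")
```

On these numbers the 12M bundle works out to 0.00175 EUR per 1k tokens (37.5% below the overage rate) and the 60M bundle to about 0.00143 EUR per 1k tokens (about 48.8% below).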

The model keeps evolving

In Q3 2026 we plan to launch the viewer seat tier (8 EUR) and a team-bundle offer (10+ users, 18% off the core seat). Token purchases above 10M will be 6% cheaper. None of these changes is dramatic on its own, but the total ARR impact is significant.
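As a rough illustration of the planned team bundle, the sketch below applies the 18% discount to every core seat once a tenant reaches 10 seats. Whether the discount applies to all seats or only to seats beyond the tenth is not stated here, so the all-seats reading is an assumption, as is the function name `core_seat_total_cents`. Amounts are kept in integer cents to avoid float rounding.

```python
CORE_SEAT_CENTS = 2400   # 24 EUR core seat
TEAM_DISCOUNT_PCT = 18   # planned team-bundle discount
TEAM_MIN_SEATS = 10      # "10+ users"


def core_seat_total_cents(seats: int) -> int:
    """Monthly core-seat total in cents, assuming the discount covers all seats at 10+."""
    total = seats * CORE_SEAT_CENTS
    if seats >= TEAM_MIN_SEATS:
        total = total * (100 - TEAM_DISCOUNT_PCT) // 100
    return total


for seats in (9, 10, 12):
    print(f"{seats} seats: {core_seat_total_cents(seats) / 100:.2f} EUR")
```

Under this reading a 10-seat tenant (196.80 EUR) would pay less than a 9-seat tenant (216.00 EUR), which is the kind of threshold effect worth checking before launch.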

The deeper lesson: in the AI era, pricing isn't a static document. It's a living system you measure monthly and reform quarterly.

Nortinia Journal | Nortinia