Cost Arbitrage

This document captures the cost evidence required for DoD D.

What is measured

Tier distribution across the offline harness sample set.
B6 comparison between MoA and direct execution.
Canonical usage and cost in integer micro-RUB.
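As an illustration of what integer micro-RUB accounting can look like (1 RUB = 1,000,000 micro-RUB), here is a minimal sketch. It is not the code in cost.rs: the function name, the price unit (micro-RUB per one million tokens), and the round-up policy are all assumptions made for this example.

```rust
/// Hypothetical helper: cost of `tokens` tokens at a price quoted in
/// micro-RUB per one million tokens.
/// Assumed policy: round up, so fractional micro-RUB are never dropped.
/// Returns None if the result does not fit in a u64.
pub fn token_cost_micro_rub(tokens: u64, price_micro_rub_per_mtok: u64) -> Option<u64> {
    // Widen to u128 so the multiplication itself cannot overflow.
    let numerator = tokens as u128 * price_micro_rub_per_mtok as u128;
    // Ceiling division by one million tokens.
    let cost = (numerator + 999_999) / 1_000_000;
    u64::try_from(cost).ok()
}
```

Keeping every amount as an integer avoids floating-point drift when costs are summed or compared, which matters for a gate that can be met exactly at its threshold.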

Source of truth

Cost math: web/lib/billing.ts and crates/plyrum-billing-client/src/cost.rs.
MoA activation logic: web/lib/moa.ts.
Report output: target/cost_report.json.

Offline harness assumptions

The report uses explicit offline assumptions because live provider credentials are not available in this workspace:

provider calls are modeled from deterministic fixture traces, not live API responses;
token counts are fixed per scenario;
direct and MoA runs share the same pricing table and FX snapshot;
the B6 metric is a relative comparison of modeled micro-RUB cost and modeled output quality on the same harness inputs.
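The assumptions above can be sketched as a small data model: each run is a list of fixture calls with fixed token counts, and both runs are priced against one shared table. The type names, field names, and price unit below are hypothetical and are not taken from the harness.

```rust
use std::collections::HashMap;

/// One modeled provider call from a fixture trace (hypothetical shape).
pub struct FixtureCall {
    pub model: &'static str,
    pub input_tokens: u64,
    pub output_tokens: u64,
}

/// Assumed price unit: micro-RUB per one million tokens, with the FX
/// snapshot already applied.
#[derive(Clone, Copy)]
pub struct Price {
    pub input_per_mtok: u64,
    pub output_per_mtok: u64,
}

/// Sums the modeled cost of one run against the shared pricing table.
/// Returns None if a model is missing from the table or the total does
/// not fit in a u64.
pub fn run_cost_micro_rub(calls: &[FixtureCall], prices: &HashMap<&str, Price>) -> Option<u64> {
    // u128 leaves ample headroom for realistic token counts and prices.
    let mut total: u128 = 0;
    for call in calls {
        let price = prices.get(call.model)?;
        total += call.input_tokens as u128 * price.input_per_mtok as u128;
        total += call.output_tokens as u128 * price.output_per_mtok as u128;
    }
    // Round up once, at the end, to whole micro-RUB (assumed policy).
    u64::try_from((total + 999_999) / 1_000_000).ok()
}
```

Because the direct run and the MoA run would both go through the same function and the same table, any difference in the result comes only from the modeled calls, which is what makes the comparison relative.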

B6 metric

Scenario: напиши змейку 40x20 на Rust ("write a 40x20 Snake game in Rust").
Direct baseline: openai/gpt-5.5.
MoA plan: 3x openai/gpt-5.4-mini proposer lanes, 1x openai/gpt-5.4 aggregator, deterministic finalizer.
Acceptance gate: moa_cost_micro_rub <= direct_cost_micro_rub * 0.60 and judge_score >= 0.85.
Current offline harness result: the MoA modeled cost is 60.00% of the direct modeled cost, with judge_score = 0.87.
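Since costs are integers and the gate can be met right at the 60% limit, one way to evaluate it is with exact integer arithmetic instead of multiplying by 0.60 in floating point, where the product can land slightly off the exact value. The sketch below is an assumption about how such a check could be written; the function name is hypothetical and the repository may implement the gate differently.

```rust
/// Hypothetical check for the B6 acceptance gate:
/// moa_cost <= direct_cost * 0.60 and judge_score >= 0.85.
pub fn b6_gate_passes(moa_cost_micro_rub: u64, direct_cost_micro_rub: u64, judge_score: f64) -> bool {
    // moa <= direct * 0.60 is equivalent to moa * 100 <= direct * 60.
    // Integer math keeps the comparison exact; u128 avoids overflow.
    let cost_ok = moa_cost_micro_rub as u128 * 100 <= direct_cost_micro_rub as u128 * 60;
    cost_ok && judge_score >= 0.85
}
```

With made-up inputs (not values from the report), b6_gate_passes(600_000, 1_000_000, 0.87) passes, while raising the MoA cost by a single micro-RUB to 600_001 fails the cost condition.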

Validation commands

jq -e '.offline_harness.assumptions[]' target/cost_report.json
jq -e '.tier_distribution and .b6_moa_vs_direct' target/cost_report.json
jq -e '.b6_moa_vs_direct.moa_cost_micro_rub <= (.b6_moa_vs_direct.direct_cost_micro_rub * 0.6) and .b6_moa_vs_direct.judge_score >= 0.85' target/cost_report.json
cargo test -p plyrum-billing-client
pnpm -C web test

Notes

If live provider credentials are added later, regenerate this report from live traces and set the offline-harness flag to false.