From d13a29df0150ed78b85a8920ab2725de9793c710 Mon Sep 17 00:00:00 2001
From: Ovidiu U
Date: Sun, 3 May 2026 08:39:26 +0100
Subject: [PATCH] docs: amend prediction rebuild spec with implementation defaults and changelog v3

Adds two sections to the spec:

- Implementation defaults: pins the four open decisions settled before
  Phase 1 (naive baseline = zero-change, math = inline pure PHP,
  coefficients on the backtests row, BEIS retrain = manual CSV + cron)
  plus the namespace, scaler, and Pest conventions.
- Changelog v3: records the verdict-via-rule-gates architecture (gates
  not multipliers), removal of weeks_since_duty_change as a feature,
  lower 62% backtest gate, structural leak detector promoted to primary.

Captured here so a future session can resume implementation without
re-deriving them.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../2026-05-01-prediction-rebuild-design.md | 48 +++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/docs/superpowers/specs/2026-05-01-prediction-rebuild-design.md b/docs/superpowers/specs/2026-05-01-prediction-rebuild-design.md
index ba264e8..59026b9 100644
--- a/docs/superpowers/specs/2026-05-01-prediction-rebuild-design.md
+++ b/docs/superpowers/specs/2026-05-01-prediction-rebuild-design.md
@@ -598,6 +598,54 @@ encoded in the harness as assertions, not aspirations.
 
 ---
 
+## Implementation defaults (resolved 2026-05-01)
+
+These were settled before Phase 1 started. Captured here so a future
+session can implement without re-deriving them.
+
+| Question | Decision |
+|---|---|
+| Naive baseline definition (Phase 2) | **Zero-change**: predict `ΔULSP[t+1] = 0`. Matches Alquist/Kilian's no-change benchmark. |
+| Math library | **Inline pure PHP.** Ridge on 435 × ~7 is trivial linear algebra (`solve (XᵀX + λI)β = Xᵀy`). New helper class `app/Services/Forecasting/LinearAlgebra.php`. No external dependency. |
+| Trained-coefficient storage | **On the `backtests` row**, in a new `coefficients_json` column. Fits naturally next to the metrics; deleting a backtest deletes its artifact. No separate table. |
+| BEIS retrain trigger (Phase 4) | **Manual CSV refresh + cron retrain.** No auto-scrape of gov.uk in v1. Operator drops the new CSV, runs `php artisan forecast:retrain`, scheduler picks it up. |
+
+### Code-layout defaults
+
+- All forecasting services live under `app/Services/Forecasting/`. The
+  namespace deliberately differs from the deprecated
+  `app/Services/Prediction/Signals/` to signal "this replaces the old stack".
+- Single source of truth for feature values: `FeatureBuilder` (Phase 3).
+  Used identically by training, backtesting, and live forecasting. The
+  structural leak detector reads from `FeatureBuilder` and verifies every
+  feature's source date is strictly before the target Monday.
+- Eloquent models for new tables (`Backtest`, `WeeklyForecast`,
+  `ForecastOutcome`, `LlmOverlay`, `VolatilityRegime`, `WatchedEvent`).
+  Project convention is Eloquent everywhere.
+- Pest tests under `tests/Unit/Services/Forecasting/` and
+  `tests/Feature/Forecasting/`. TDD per project standard.
+- `final` classes for all services and value objects.
+
+### Phase 1 scope (precise)
+
+- `backtests` migration (with `coefficients_json` column).
+- `Backtest` Eloquent model + factory.
+- `WeeklyForecastModel` interface (the contract harness consumes).
+- `ForecastFeature` interface (lets harness query feature source dates).
+- `FeatureSpec` value object (immutable, hashes to deterministic
+  `model_version` string for audit linking).
+- `LeakDetector` service (per-feature source-date check).
+- `BacktestRunner` service (orchestrates leak check → train → eval →
+  persist). Computes directional accuracy, MAE, calibration table.
+- Pest tests for `LeakDetector`, `BacktestRunner`, `FeatureSpec`.
+- A test-stub `WeeklyForecastModel` (constant zero) for harness tests
+  only. The real naive baseline is Phase 2; the real ridge is Phase 3.
+
+Phase 1 ships when: migrations applied, all new Pest tests pass, the
+existing test suite still passes, `vendor/bin/pint --dirty` is clean.
+
+---
+
 ## Changelog (substantive design decisions)
 
 | When | Change | Why |