docs: amend prediction rebuild spec with implementation defaults and changelog v3

Adds two sections to the spec:
- Implementation defaults: pins the four open decisions settled before
  Phase 1 (naive baseline = zero-change, math = inline pure PHP,
  coefficients on the backtests row, BEIS retrain = manual CSV + cron)
  plus the namespace, scaler, and Pest conventions.
- Changelog v3: records the verdict-via-rule-gates architecture (gates
  not multipliers), removal of weeks_since_duty_change as a feature,
  lower 62% backtest gate, structural leak detector promoted to primary.

Captured here so a future session can resume implementation without
re-deriving them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Ovidiu U
2026-05-03 08:39:26 +01:00
parent c2c237a1b3
commit d13a29df01

View File

@@ -598,6 +598,54 @@ encoded in the harness as assertions, not aspirations.
---
## Implementation defaults (resolved 2026-05-01)
These were settled before Phase 1 started. Captured here so a future
session can implement without re-deriving them.
| Question | Decision |
|---|---|
| Naive baseline definition (Phase 2) | **Zero-change**: predict `ΔULSP[t+1] = 0`. Matches Alquist/Kilian's no-change benchmark. |
| Math library | **Inline pure PHP.** Ridge on 435 × ~7 is trivial linear algebra (`solve (XᵀX + λI)β = Xᵀy`). New helper class `app/Services/Forecasting/LinearAlgebra.php`. No external dependency. |
| Trained-coefficient storage | **On the `backtests` row**, in a new `coefficients_json` column. Fits naturally next to the metrics; deleting a backtest deletes its artifact. No separate table. |
| BEIS retrain trigger (Phase 4) | **Manual CSV refresh + cron retrain.** No auto-scrape of gov.uk in v1. Operator drops the new CSV, runs `php artisan forecast:retrain`, scheduler picks it up. |
### Code-layout defaults
- All forecasting services live under `app/Services/Forecasting/`. The
namespace deliberately differs from the deprecated
`app/Services/Prediction/Signals/` to signal "this replaces the old stack".
- Single source of truth for feature values: `FeatureBuilder` (Phase 3).
Used identically by training, backtesting, and live forecasting. The
structural leak detector reads from `FeatureBuilder` and verifies every
feature's source date is strictly before the target Monday.
- Eloquent models for new tables (`Backtest`, `WeeklyForecast`,
`ForecastOutcome`, `LlmOverlay`, `VolatilityRegime`, `WatchedEvent`).
Project convention is Eloquent everywhere.
- Pest tests under `tests/Unit/Services/Forecasting/` and
`tests/Feature/Forecasting/`. TDD per project standard.
- `final` classes for all services and value objects.
### Phase 1 scope (precise)
- `backtests` migration (with `coefficients_json` column).
- `Backtest` Eloquent model + factory.
- `WeeklyForecastModel` interface (the contract harness consumes).
- `ForecastFeature` interface (lets harness query feature source dates).
- `FeatureSpec` value object (immutable, hashes to deterministic
`model_version` string for audit linking).
- `LeakDetector` service (per-feature source-date check).
- `BacktestRunner` service (orchestrates leak check → train → eval →
persist). Computes directional accuracy, MAE, calibration table.
- Pest tests for `LeakDetector`, `BacktestRunner`, `FeatureSpec`.
- A test-stub `WeeklyForecastModel` (constant zero) for harness tests
only. The real naive baseline is Phase 2; the real ridge is Phase 3.
Phase 1 ships when: migrations applied, all new Pest tests pass, the
existing test suite still passes, `vendor/bin/pint --dirty` is clean.
---
## Changelog (substantive design decisions)
| When | Change | Why |