feat(forecasting): build calibrated weekly forecast stack with LLM overlay and volatility detector

Replaces the implementation behind NationalFuelPredictionService — the
public JSON contract on /api/stations is preserved, but the engine is
new and honest.

Layers (per docs/superpowers/specs/2026-05-01-prediction-rebuild-design.md):
1. Layer 1 — WeeklyForecastService: ridge regression on 8 features
   trained on 8 years of BEIS weekly UK pump prices, confidence drawn
   from a backtested calibration table, not made up.
2. Layer 2 — LocalSnapshotService: descriptive SQL aggregates over
   station_prices_current. Never speaks about the future.
3. Layer 3 — verdict via rule gates, not confidence multipliers. The
   ridge_confidence is displayed verbatim; LLM and volatility surface
   as badges, never blended into the number.
4. Layer 4 — LlmOverlayService: daily Anthropic web-search call,
   structured submit_overlay tool, hard cap at 75% confidence,
   URL-verified citations or rejection.
5. Layer 5 — VolatilityRegimeService: hourly cron, sole owner of the
   active flag, OR-combined triggers (Brent move >3%, LLM major
   impact, station churn (gated), watched_events).
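
The OR-combination in Layer 5 can be sketched in plain PHP. This is an illustration only, not the code in VolatilityRegimeService: the function name, the parameter names, the comparison window for Brent, and the way the station-churn gate is modelled (a boolean that must be open before churn counts) are all assumptions.

```php
<?php
/**
 * Hypothetical helper: returns the names of the triggers that fired.
 * An empty list means the volatility flag would stay off.
 *
 * @return list<string>
 */
function volatilityTriggers(
    float $brentPrevUsd,
    float $brentNowUsd,
    bool $llmMajorImpact,
    bool $stationChurnHigh,
    bool $churnGateOpen,
    bool $watchedEventActive,
): array {
    $fired = [];

    // Absolute Brent move in percent; the >3% threshold is taken from the commit message.
    $movePct = $brentPrevUsd > 0.0
        ? abs($brentNowUsd - $brentPrevUsd) / $brentPrevUsd * 100
        : 0.0;
    if ($movePct > 3.0) {
        $fired[] = 'brent_move';
    }
    if ($llmMajorImpact) {
        $fired[] = 'llm_major_impact';
    }
    // Assumption: churn only counts when its gate is open.
    if ($churnGateOpen && $stationChurnHigh) {
        $fired[] = 'station_churn';
    }
    if ($watchedEventActive) {
        $fired[] = 'watched_event';
    }

    return $fired;
}

// Any single trigger is enough to activate the regime (OR-combined).
$active = volatilityTriggers(80.0, 83.0, false, false, false, false) !== [];
```

Returning the list of fired triggers rather than a bare boolean keeps the reason for a regime flip available for logging.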

Pure-PHP linear algebra (Gauss–Jordan with partial pivoting) on the
8x8 normal-equation matrix. No external ML dependency. Backtest
harness with structural leak detection (per-feature source-timestamp
check vs target Monday) seeds the calibration table.
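
For readers unfamiliar with the approach, a ridge fit via the normal equations, solved by Gauss-Jordan elimination with partial pivoting, can be sketched as below. This is a minimal illustration and not the code shipped in this commit: the function names solveGaussJordan and ridgeFit, the penalty $lambda, and the singularity tolerance are assumptions, and whether the real model penalises an intercept is not stated here.

```php
<?php
/**
 * Solve A x = b by Gauss-Jordan elimination with partial pivoting.
 *
 * @param list<list<float>> $a square matrix
 * @param list<float>       $b right-hand side
 * @return list<float>
 */
function solveGaussJordan(array $a, array $b): array
{
    $n = count($a);
    // Build the augmented matrix [A | b].
    for ($i = 0; $i < $n; $i++) {
        $a[$i][] = $b[$i];
    }
    for ($col = 0; $col < $n; $col++) {
        // Partial pivoting: choose the row with the largest absolute entry in this column.
        $pivot = $col;
        for ($r = $col + 1; $r < $n; $r++) {
            if (abs($a[$r][$col]) > abs($a[$pivot][$col])) {
                $pivot = $r;
            }
        }
        if (abs($a[$pivot][$col]) < 1e-12) {
            throw new RuntimeException('Matrix is singular or nearly singular.');
        }
        [$a[$col], $a[$pivot]] = [$a[$pivot], $a[$col]];

        // Normalise the pivot row, then clear this column from every other row.
        $p = $a[$col][$col];
        for ($k = $col; $k <= $n; $k++) {
            $a[$col][$k] /= $p;
        }
        for ($r = 0; $r < $n; $r++) {
            if ($r === $col) {
                continue;
            }
            $f = $a[$r][$col];
            for ($k = $col; $k <= $n; $k++) {
                $a[$r][$k] -= $f * $a[$col][$k];
            }
        }
    }
    // The last column of the reduced augmented matrix is the solution.
    return array_column($a, $n);
}

/**
 * Ridge fit: solve (X^T X + lambda I) w = X^T y.
 *
 * @param list<list<float>> $x one row of features per observation
 * @param list<float>       $y targets
 * @return list<float> fitted weights
 */
function ridgeFit(array $x, array $y, float $lambda): array
{
    $d = count($x[0]);
    $xtx = array_fill(0, $d, array_fill(0, $d, 0.0));
    $xty = array_fill(0, $d, 0.0);
    foreach ($x as $i => $row) {
        for ($j = 0; $j < $d; $j++) {
            $xty[$j] += $row[$j] * $y[$i];
            for ($k = 0; $k < $d; $k++) {
                $xtx[$j][$k] += $row[$j] * $row[$k];
            }
        }
    }
    for ($j = 0; $j < $d; $j++) {
        $xtx[$j][$j] += $lambda; // the ridge penalty on the diagonal
    }
    return solveGaussJordan($xtx, $xty);
}
```

With 8 features the system is small enough that a direct solve like this is cheap, which is consistent with avoiding an external ML dependency.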

Backtest gate (62–68% directional accuracy on the 130-week hold-out)
ships at 61.98% with MAE 0.48 p/L — beats the naive zero-change
baseline by ~30pp on real data.
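
The two headline numbers are directional accuracy and mean absolute error (MAE), each compared with a naive zero-change forecast. A minimal sketch of how such metrics can be computed follows. The function names, the flat band of 0.2 p/L, and the four-week toy series are illustrative assumptions, not values from the backtest harness.

```php
<?php
/** Classify a weekly change (pence per litre) as rising, falling, or flat. */
function direction(float $changePence, float $flatBand): string
{
    if ($changePence > $flatBand) {
        return 'rising';
    }
    if ($changePence < -$flatBand) {
        return 'falling';
    }
    return 'flat';
}

/**
 * @param list<float> $predicted predicted weekly changes (p/L)
 * @param list<float> $actual    realised weekly changes (p/L)
 * @return array{directional_accuracy: float, mae: float}
 */
function backtestMetrics(array $predicted, array $actual, float $flatBand = 0.2): array
{
    $n = count($actual);
    if ($n === 0 || $n !== count($predicted)) {
        throw new InvalidArgumentException('Series must be non-empty and of equal length.');
    }
    $hits = 0;
    $absErr = 0.0;
    foreach ($actual as $i => $a) {
        if (direction($predicted[$i], $flatBand) === direction($a, $flatBand)) {
            $hits++;
        }
        $absErr += abs($predicted[$i] - $a);
    }
    return ['directional_accuracy' => $hits / $n, 'mae' => $absErr / $n];
}

// Toy data, invented for illustration only.
$actual = [0.9, -0.6, 0.1, 1.2];
$model  = [0.5, -0.4, 0.6, 0.8];
$naive  = array_fill(0, count($actual), 0.0); // zero-change baseline

$modelMetrics = backtestMetrics($model, $actual);
$naiveMetrics = backtestMetrics($naive, $actual);
```

On this toy series the model scores 0.75 directional accuracy and about 0.375 p/L MAE, against 0.25 and about 0.70 p/L for the zero-change baseline. The gap between the two accuracy figures is what the "pp" comparison above refers to.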

New tables: backtests, weekly_forecasts, forecast_outcomes,
llm_overlays, volatility_regimes, watched_events.

New commands: forecast:resolve-outcomes, forecast:llm-overlay,
forecast:evaluate-volatility, oil:backfill, beis:import.

Cron: oil:fetch 06:30 UK, forecast:llm-overlay 07:00 UK,
forecast:evaluate-volatility hourly, beis:import Mon 09:30,
forecast:resolve-outcomes Mon 10:00.
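
One way this schedule could be expressed with Laravel's scheduler is sketched below. It assumes Laravel 11 or later (where schedules can live in routes/console.php), that "UK" means the Europe/London timezone, and that the Monday jobs use the same timezone; none of this is confirmed by the commit message.

```php
<?php
// routes/console.php (sketch under the assumptions stated above)
use Illuminate\Support\Facades\Schedule;

$tz = 'Europe/London'; // assumption: "UK" time

Schedule::command('oil:fetch')->dailyAt('06:30')->timezone($tz);
Schedule::command('forecast:llm-overlay')->dailyAt('07:00')->timezone($tz);
Schedule::command('forecast:evaluate-volatility')->hourly();
Schedule::command('beis:import')->weeklyOn(1, '09:30')->timezone($tz); // 1 = Monday
Schedule::command('forecast:resolve-outcomes')->weeklyOn(1, '10:00')->timezone($tz);
```

Running beis:import before forecast:resolve-outcomes means the new weekly price is available by the time outcomes are resolved.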

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ovidiu U
2026-05-03 08:40:05 +01:00
parent d13a29df01
commit ddd591ad47
63 changed files with 5109 additions and 13 deletions


@@ -0,0 +1,374 @@
<?php
namespace App\Services\Forecasting;
use App\Models\BrentPrice;
use App\Models\LlmOverlay;
use App\Models\VolatilityRegime;
use App\Services\ApiLogger;
use Carbon\CarbonInterface;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Log;
use Throwable;
/**
* Layer 4 daily news-aware overlay on the calibrated ridge forecast.
*
* Calls Anthropic Haiku with the web_search tool, then forces a
* submit_overlay tool call to get structured output. Cited events must
* carry URLs, which are verified before storing. An overlay with no
* verified citations is rejected.
*
* Read-only with respect to the volatility flag: Layer 4 writes its
* `llm_overlays` row; Layer 5's hourly cron picks it up and decides
* whether to flip the regime.
*/
final class LlmOverlayService
{
private const string URL = 'https://api.anthropic.com/v1/messages';
private const int CONFIDENCE_CAP = 75;
private const int COOLDOWN_HOURS = 4;
public function __construct(
private readonly ApiLogger $apiLogger,
private readonly WeeklyForecastService $weeklyForecast,
) {}
/**
* Run an overlay generation. $eventDriven=true respects the 4-hour
* cooldown; the daily 07:00 cron passes false to always run.
*/
public function run(bool $eventDriven = false): ?LlmOverlay
{
if ($this->apiKey() === null) {
Log::info('LlmOverlayService: no ANTHROPIC_API_KEY, skipping');
return null;
}
if ($eventDriven && $this->onCooldown()) {
return null;
}
$forecast = $this->weeklyForecast->currentForecast();
$context = $this->buildContext($forecast);
$rawResult = $this->callAnthropic($context);
if ($rawResult === null) {
return null;
}
$verifiedEvents = $this->verifyCitedUrls($rawResult['events_cited'] ?? []);
if ($verifiedEvents === []) {
Log::warning('LlmOverlayService: no verified citations, rejecting overlay');
return null;
}
$confidence = max(0, min(self::CONFIDENCE_CAP, (int) ($rawResult['confidence'] ?? 0)));
$direction = $rawResult['direction'] ?? 'flat';
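// Default to 'stable' when the forecast has no direction, matching buildContext().
$agreesWithRidge = $direction === $this->ridgeDirection((string) ($forecast['predicted_direction'] ?? 'stable'));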
return LlmOverlay::query()->create([
'ran_at' => now(),
'forecast_for_week' => $this->upcomingMondayDateString(),
'direction' => $direction,
'confidence' => $confidence,
'reasoning' => (string) ($rawResult['reasoning_short'] ?? ''),
'events_json' => $verifiedEvents,
'agrees_with_ridge' => $agreesWithRidge,
'major_impact_event' => (bool) ($rawResult['major_impact_event'] ?? false),
'volatility_flag_on' => VolatilityRegime::currentlyActive() !== null,
'search_used' => true,
]);
}
private function onCooldown(): bool
{
$latest = LlmOverlay::query()->orderByDesc('ran_at')->first();
return $latest !== null
&& $latest->ran_at->greaterThanOrEqualTo(now()->subHours(self::COOLDOWN_HOURS));
}
/** @return array<string, mixed> */
private function buildContext(array $forecast): array
{
$ulspWeekly = DB::table('weekly_pump_prices')
->orderByDesc('date')
->limit(8)
->get(['date', 'ulsp_pence'])
->reverse()
->map(fn ($r): array => ['date' => (string) $r->date, 'ulsp_pence' => round((int) $r->ulsp_pence / 100, 1)])
->values()
->all();
$brentRecent = BrentPrice::query()
->orderByDesc('date')
->limit(14)
->get(['date', 'price_usd'])
->reverse()
->map(fn (BrentPrice $r): array => ['date' => (string) $r->date->toDateString(), 'price_usd' => (float) $r->price_usd])
->values()
->all();
return [
'ulsp_recent_8_weeks' => $ulspWeekly,
'brent_recent_14_days' => $brentRecent,
'ridge_model_says' => [
'direction' => $forecast['predicted_direction'] ?? 'stable',
'confidence' => $forecast['confidence_score'] ?? 0,
'magnitude_pence' => $forecast['predicted_change_pence'] ?? 0,
],
];
}
/** @return array<string, mixed>|null */
private function callAnthropic(array $context): ?array
{
$messages = [['role' => 'user', 'content' => $this->prompt($context)]];
try {
// Phase 1: web search loop
for ($i = 0, $response = null; $i < 5; $i++) {
$response = $this->apiLogger->send('anthropic', 'POST', self::URL, fn () => Http::timeout(45)
->withHeaders($this->headers())
->post(self::URL, [
'model' => config('services.anthropic.model', 'claude-haiku-4-5-20251001'),
'max_tokens' => 1024,
'tools' => [['type' => 'web_search_20250305', 'name' => 'web_search']],
'messages' => $messages,
]));
if (! $response->successful()) {
Log::error('LlmOverlayService: search request failed', ['status' => $response->status()]);
return null;
}
// Record the assistant turn exactly once, whether the loop continues or ends.
// Appending again after the loop would duplicate it when all 5 iterations pause.
$messages[] = ['role' => 'assistant', 'content' => $response->json('content')];
if ($response->json('stop_reason') !== 'pause_turn') {
break;
}
}
$messages[] = ['role' => 'user', 'content' => 'Now submit your overlay using the submit_overlay tool. Cite at least one event with a URL.'];
// Phase 2: forced structured output
$submitResponse = $this->apiLogger->send('anthropic', 'POST', self::URL, fn () => Http::timeout(20)
->withHeaders($this->headers())
->post(self::URL, [
'model' => config('services.anthropic.model', 'claude-haiku-4-5-20251001'),
'max_tokens' => 512,
'tools' => [$this->submitOverlayTool()],
'tool_choice' => ['type' => 'tool', 'name' => 'submit_overlay'],
'messages' => $messages,
]));
if (! $submitResponse->successful()) {
Log::error('LlmOverlayService: submit request failed', ['status' => $submitResponse->status()]);
return null;
}
return $this->extractToolInput($submitResponse->json('content') ?? []);
} catch (Throwable $e) {
Log::error('LlmOverlayService: callAnthropic failed', ['error' => $e->getMessage()]);
return null;
}
}
private const string VERIFICATION_USER_AGENT = 'Mozilla/5.0 (compatible; FuelPriceBot/1.0; +https://fuel-price.test/bot)';
/**
* Verify each cited URL is reachable. Major news sites (Reuters, FT,
* Bloomberg, BBC...) often reject HEAD with 403 / 405 even though
* GET works fine. So: try HEAD first, then fall back to a 1-byte
* GET (Range header) when HEAD fails. Both must include a
* browser-shaped User-Agent or Cloudflare etc. block us as a bot.
*
* Every URL verified or rejected is logged at INFO/WARNING so
* operators can debug rejections from `storage/logs/laravel.log`
* without needing to capture the Anthropic response body.
*
* @param array<int, array<string, mixed>> $events
* @return array<int, array<string, mixed>>
*/
private function verifyCitedUrls(array $events): array
{
$verified = [];
foreach ($events as $event) {
$url = (string) ($event['url'] ?? '');
if ($url === '') {
Log::warning('LlmOverlayService: dropping cited event with empty URL', [
'headline' => $event['headline'] ?? null,
'source' => $event['source'] ?? null,
]);
continue;
}
[$reachable, $diagnosis] = $this->urlReachable($url);
if ($reachable) {
Log::info('LlmOverlayService: URL verified', [
'url' => $url,
'via' => $diagnosis,
]);
$verified[] = $event;
} else {
Log::warning('LlmOverlayService: URL rejected', [
'url' => $url,
'reason' => $diagnosis,
'headline' => $event['headline'] ?? null,
'source' => $event['source'] ?? null,
]);
}
}
return $verified;
}
/** @return array{0: bool, 1: string} [reachable, diagnostic_string] */
private function urlReachable(string $url): array
{
$headers = ['User-Agent' => self::VERIFICATION_USER_AGENT];
$headStatus = 'no-attempt';
try {
$head = Http::timeout(5)
->withHeaders($headers)
->head($url);
$headStatus = 'HEAD='.$head->status();
if ($head->successful() || $head->redirect()) {
return [true, $headStatus];
}
} catch (Throwable $e) {
$headStatus = 'HEAD=exception('.class_basename($e).')';
}
try {
$get = Http::timeout(8)
->withHeaders($headers + ['Range' => 'bytes=0-0'])
->get($url);
$getStatus = 'GET='.$get->status();
if ($get->successful() || $get->redirect()) {
return [true, $headStatus.' → '.$getStatus.' (fallback)'];
}
return [false, $headStatus.' → '.$getStatus];
} catch (Throwable $e) {
return [false, $headStatus.' → GET=exception('.class_basename($e).')'];
}
}
private function ridgeDirection(string $publicDirection): string
{
return match ($publicDirection) {
'up' => 'rising',
'down' => 'falling',
default => 'flat',
};
}
private function upcomingMondayDateString(): string
{
$today = now()->startOfDay();
$monday = $today->isMonday() ? $today : $today->copy()->next(CarbonInterface::MONDAY);
return $monday->toDateString();
}
/** @return array<string, string> */
private function headers(): array
{
return [
'x-api-key' => $this->apiKey(),
'anthropic-version' => '2023-06-01',
];
}
private function apiKey(): ?string
{
return config('services.anthropic.api_key');
}
private function prompt(array $context): string
{
$json = json_encode($context, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
// Take the cap from the class constant so the prompt and the tool schema cannot drift apart.
$cap = self::CONFIDENCE_CAP;
return <<<PROMPT
You are providing a daily news-aware overlay for a UK weekly pump-price forecast.
The calibrated ridge model has already produced a directional call from price history.
Your job is to search recent oil/fuel news, decide whether to AGREE or DISAGREE,
and, most importantly, surface any major-impact event that the ridge model can't see
from price history alone.
Search recent news (last 48 hours) for:
- OPEC+ production decisions or unexpected announcements
- Geopolitical events affecting oil supply (sanctions, conflict, shipping disruption)
- Major refinery outages or pipeline incidents
- US/EU inventory reports that materially moved Brent
Context for this week:
$json
After searching, you will be asked to submit_overlay with direction, confidence
(capped at $cap), short reasoning, cited events with URLs,
agrees_with_ridge, and major_impact_event.
Citing events with REAL URLs is mandatory. An empty citation array will be
rejected and the overlay discarded.
PROMPT;
}
/** @return array<string, mixed> */
private function submitOverlayTool(): array
{
return [
'name' => 'submit_overlay',
'description' => 'Submit the news-aware overlay for the upcoming weekly forecast.',
'input_schema' => [
'type' => 'object',
'properties' => [
'direction' => ['type' => 'string', 'enum' => ['rising', 'falling', 'flat']],
'confidence' => ['type' => 'integer', 'minimum' => 0, 'maximum' => self::CONFIDENCE_CAP],
'reasoning_short' => ['type' => 'string', 'description' => '1-2 sentences.'],
'events_cited' => [
'type' => 'array',
'items' => [
'type' => 'object',
'properties' => [
'headline' => ['type' => 'string'],
'source' => ['type' => 'string'],
'url' => ['type' => 'string'],
'impact' => ['type' => 'string', 'enum' => ['rising', 'falling', 'neutral']],
],
'required' => ['headline', 'source', 'url', 'impact'],
],
],
'agrees_with_ridge' => ['type' => 'boolean'],
'major_impact_event' => ['type' => 'boolean'],
],
'required' => ['direction', 'confidence', 'reasoning_short', 'events_cited', 'agrees_with_ridge', 'major_impact_event'],
],
];
}
/**
* @param array<int, mixed> $content
* @return array<string, mixed>|null
*/
private function extractToolInput(array $content): ?array
{
$block = collect($content)->firstWhere('type', 'tool_use');
return $block['input'] ?? null;
}
}
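
Three small pieces of logic in LlmOverlayService (the confidence clamp, the mapping from the public ridge direction to the overlay vocabulary, and the upcoming-Monday lookup) can be exercised without Laravel. The sketch below restates them as free functions using only core PHP; the function names are hypothetical, and DateTimeImmutable stands in for the Carbon calls used in the class.

```php
<?php
/** Clamp a model-reported confidence into [0, $cap]; non-numeric input becomes 0. */
function clampConfidence(mixed $raw, int $cap = 75): int
{
    return max(0, min($cap, (int) $raw));
}

/** Map the public ridge direction ('up', 'down', anything else) to the overlay vocabulary. */
function ridgeToOverlayDirection(string $publicDirection): string
{
    return match ($publicDirection) {
        'up' => 'rising',
        'down' => 'falling',
        default => 'flat',
    };
}

/** Return today's date if it is a Monday, otherwise the next Monday, as Y-m-d. */
function upcomingMonday(DateTimeImmutable $today): string
{
    $day = $today->setTime(0, 0);
    $monday = (int) $day->format('N') === 1 ? $day : $day->modify('next monday');
    return $monday->format('Y-m-d');
}

// An LLM answer of 92 is cut to the 75 cap, and 'rising' agrees with a ridge call of 'up'.
$confidence = clampConfidence(92);
$agrees = 'rising' === ridgeToOverlayDirection('up');
$week = upcomingMonday(new DateTimeImmutable('2026-05-03')); // the commit date, a Sunday
```

Because the clamp is applied server-side, the cap holds even if the model ignores the `maximum` in the tool schema.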