fuel-alert/app/Services/Forecasting/LlmOverlayService.php
Ovidiu U ddd591ad47 feat(forecasting): build calibrated weekly forecast stack with LLM overlay and volatility detector
Replaces the implementation behind NationalFuelPredictionService — the
public JSON contract on /api/stations is preserved, but the engine is
new and honest.

Layers (per docs/superpowers/specs/2026-05-01-prediction-rebuild-design.md):
1. Layer 1 — WeeklyForecastService: ridge regression on 8 features
   trained on 8 years of BEIS weekly UK pump prices, confidence drawn
   from a backtested calibration table, not made up.
2. Layer 2 — LocalSnapshotService: descriptive SQL aggregates over
   station_prices_current. Never speaks about the future.
3. Layer 3 — verdict via rule gates, not confidence multipliers. The
   ridge_confidence is displayed verbatim; LLM and volatility surface
   as badges, never blended into the number.
4. Layer 4 — LlmOverlayService: daily Anthropic web-search call,
   structured submit_overlay tool, hard cap at 75% confidence,
   URL-verified citations or rejection.
5. Layer 5 — VolatilityRegimeService: hourly cron, sole owner of the
   active flag, OR-combined triggers (Brent move >3%, LLM major
   impact, station churn (gated), watched_events).
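
Layer 4's two guards (the hard confidence cap and "verified citations or rejection") can be sketched as a pure function. This is only an illustration under stated assumptions: `finalizeOverlay` and the `$isReachable` callback are hypothetical names introduced here, and the callback stands in for the real HTTP reachability check.

```php
<?php
declare(strict_types=1);

const CONFIDENCE_CAP = 75;

/**
 * Hypothetical helper, not part of the service.
 *
 * @param array<string, mixed> $raw            model output (assumed shape)
 * @param callable(string): bool $isReachable  stand-in for the real URL check
 * @return array<string, mixed>|null           null means the overlay is rejected
 */
function finalizeOverlay(array $raw, callable $isReachable): ?array
{
    // Keep only events whose URL is non-empty and passes the reachability check.
    $verified = array_values(array_filter(
        $raw['events_cited'] ?? [],
        fn (array $e): bool => ($e['url'] ?? '') !== '' && $isReachable((string) $e['url']),
    ));

    if ($verified === []) {
        return null; // no verified citation: discard the whole overlay
    }

    return [
        'direction' => $raw['direction'] ?? 'flat',
        // Clamp into [0, CONFIDENCE_CAP] regardless of what the model claimed.
        'confidence' => max(0, min(CONFIDENCE_CAP, (int) ($raw['confidence'] ?? 0))),
        'events' => $verified,
    ];
}
```

Clamping after the fact means the cap holds even if the model ignores the schema maximum.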

Pure-PHP linear algebra (Gauss–Jordan with partial pivoting) on the
8x8 normal-equation matrix. No external ML dependency. Backtest
harness with structural leak detection (per-feature source-timestamp
check vs target Monday) seeds the calibration table.
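
A minimal sketch of what a ridge fit via the normal equations could look like in plain PHP, solving (XᵀX + λI)w = Xᵀy with Gauss-Jordan elimination and partial pivoting. The function names, the singularity tolerance, and the choice to penalise every coefficient equally are assumptions made for illustration; the actual WeeklyForecastService may differ.

```php
<?php
declare(strict_types=1);

/**
 * Solve A x = b by Gauss-Jordan elimination with partial pivoting.
 *
 * @param array<int, array<int, float>> $a square matrix
 * @param array<int, float> $b right-hand side
 * @return array<int, float> solution vector
 */
function gaussJordanSolve(array $a, array $b): array
{
    $n = count($b);
    // Build the augmented matrix [A | b].
    for ($i = 0; $i < $n; $i++) {
        $a[$i][] = $b[$i];
    }
    for ($col = 0; $col < $n; $col++) {
        // Partial pivoting: choose the row with the largest |entry| in this column.
        $pivot = $col;
        for ($r = $col + 1; $r < $n; $r++) {
            if (abs($a[$r][$col]) > abs($a[$pivot][$col])) {
                $pivot = $r;
            }
        }
        if (abs($a[$pivot][$col]) < 1e-12) {
            throw new RuntimeException('Matrix is singular or nearly singular.');
        }
        [$a[$col], $a[$pivot]] = [$a[$pivot], $a[$col]];
        // Scale the pivot row so the pivot becomes 1.
        $p = $a[$col][$col];
        for ($c = $col; $c <= $n; $c++) {
            $a[$col][$c] /= $p;
        }
        // Eliminate this column from every other row (above and below).
        for ($r = 0; $r < $n; $r++) {
            if ($r === $col) {
                continue;
            }
            $f = $a[$r][$col];
            for ($c = $col; $c <= $n; $c++) {
                $a[$r][$c] -= $f * $a[$col][$c];
            }
        }
    }
    return array_map(fn (array $row): float => $row[$n], $a);
}

/**
 * Fit ridge weights from rows of features $x and targets $y.
 *
 * @param array<int, array<int, float>> $x
 * @param array<int, float> $y
 * @return array<int, float>
 */
function ridgeFit(array $x, array $y, float $lambda): array
{
    $k = count($x[0]);
    $xtx = array_fill(0, $k, array_fill(0, $k, 0.0));
    $xty = array_fill(0, $k, 0.0);
    foreach ($x as $i => $row) {
        for ($p = 0; $p < $k; $p++) {
            $xty[$p] += $row[$p] * $y[$i];
            for ($q = 0; $q < $k; $q++) {
                $xtx[$p][$q] += $row[$p] * $row[$q];
            }
        }
    }
    // Ridge penalty: add lambda to the diagonal of X^T X.
    for ($p = 0; $p < $k; $p++) {
        $xtx[$p][$p] += $lambda;
    }
    return gaussJordanSolve($xtx, $xty);
}
```

With 8 features the system is only 8x8, so direct elimination is cheap and a linear-algebra library is not needed.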

The backtest gate is 62–68% directional accuracy on the 130-week
hold-out. The model ships at 61.98% with an MAE of 0.48 p/L, beating
the naive zero-change baseline by ~30 percentage points on real data.
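
For reference, directional accuracy and MAE over a hold-out could be computed roughly as below. This sketch assumes direction is judged purely by the sign of the weekly change; the real harness may use a flat band or a different rule, and `backtestMetrics` is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * @param array<int, float> $predicted predicted weekly change, pence per litre
 * @param array<int, float> $actual    realised weekly change, pence per litre
 * @return array{directional_accuracy: float, mae: float}
 */
function backtestMetrics(array $predicted, array $actual): array
{
    $n = count($predicted);
    if ($n === 0 || $n !== count($actual)) {
        throw new InvalidArgumentException('Need two non-empty series of equal length.');
    }

    $hits = 0;
    $absError = 0.0;
    foreach ($predicted as $i => $p) {
        // A hit means predicted and actual changes share a sign (-1, 0 or +1).
        if (($p <=> 0.0) === ($actual[$i] <=> 0.0)) {
            $hits++;
        }
        $absError += abs($p - $actual[$i]);
    }

    return [
        'directional_accuracy' => 100.0 * $hits / $n, // percent of weeks called correctly
        'mae' => $absError / $n,                       // mean absolute error in p/L
    ];
}
```

Under this sign rule, a zero-change baseline only scores a hit in weeks where the price did not move at all, which is why it is a weak directional benchmark.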

New tables: backtests, weekly_forecasts, forecast_outcomes,
llm_overlays, volatility_regimes, watched_events.

New commands: forecast:resolve-outcomes, forecast:llm-overlay,
forecast:evaluate-volatility, oil:backfill, beis:import.

Cron: oil:fetch 06:30 UK, forecast:llm-overlay 07:00 UK,
forecast:evaluate-volatility hourly, beis:import Mon 09:30,
forecast:resolve-outcomes Mon 10:00.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 08:40:05 +01:00

<?php
namespace App\Services\Forecasting;
use App\Models\BrentPrice;
use App\Models\LlmOverlay;
use App\Models\VolatilityRegime;
use App\Services\ApiLogger;
use Carbon\CarbonInterface;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Log;
use Throwable;
/**
* Layer 4 — daily news-aware overlay on the calibrated ridge forecast.
*
* Calls Anthropic Haiku with the web_search tool, then forces a
* submit_overlay tool call to get structured output. Cites events with
* URLs; URLs are verified before storing. Empty citations → rejection.
*
* Read-only with respect to the volatility flag — Layer 4 writes its
* `llm_overlays` row; Layer 5's hourly cron picks it up and decides
* whether to flip the regime.
*/
final class LlmOverlayService
{
private const string URL = 'https://api.anthropic.com/v1/messages';
private const int CONFIDENCE_CAP = 75;
private const int COOLDOWN_HOURS = 4;
public function __construct(
private readonly ApiLogger $apiLogger,
private readonly WeeklyForecastService $weeklyForecast,
) {}
/**
* Run an overlay generation. $eventDriven=true respects the 4-hour
* cooldown; the daily 07:00 cron passes false to always run.
*/
public function run(bool $eventDriven = false): ?LlmOverlay
{
if ($this->apiKey() === null) {
Log::info('LlmOverlayService: no ANTHROPIC_API_KEY, skipping');
return null;
}
if ($eventDriven && $this->onCooldown()) {
return null;
}
$forecast = $this->weeklyForecast->currentForecast();
$context = $this->buildContext($forecast);
$rawResult = $this->callAnthropic($context);
if ($rawResult === null) {
return null;
}
$verifiedEvents = $this->verifyCitedUrls($rawResult['events_cited'] ?? []);
if ($verifiedEvents === []) {
Log::warning('LlmOverlayService: no verified citations, rejecting overlay');
return null;
}
$confidence = max(0, min(self::CONFIDENCE_CAP, (int) ($rawResult['confidence'] ?? 0)));
$direction = $rawResult['direction'] ?? 'flat';
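// Default to 'stable' (as buildContext() does) so a forecast without a direction cannot raise here.
$agreesWithRidge = $direction === $this->ridgeDirection((string) ($forecast['predicted_direction'] ?? 'stable'));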
return LlmOverlay::query()->create([
'ran_at' => now(),
'forecast_for_week' => $this->upcomingMondayDateString(),
'direction' => $direction,
'confidence' => $confidence,
'reasoning' => (string) ($rawResult['reasoning_short'] ?? ''),
'events_json' => $verifiedEvents,
'agrees_with_ridge' => $agreesWithRidge,
'major_impact_event' => (bool) ($rawResult['major_impact_event'] ?? false),
'volatility_flag_on' => VolatilityRegime::currentlyActive() !== null,
'search_used' => true,
]);
}
/** True when the most recent overlay ran within the last COOLDOWN_HOURS hours. */
private function onCooldown(): bool
{
$latest = LlmOverlay::query()->orderByDesc('ran_at')->first();
return $latest !== null
&& $latest->ran_at->greaterThanOrEqualTo(now()->subHours(self::COOLDOWN_HOURS));
}
/** @return array<string, mixed> */
private function buildContext(array $forecast): array
{
$ulspWeekly = DB::table('weekly_pump_prices')
->orderByDesc('date')
->limit(8)
->get(['date', 'ulsp_pence'])
->reverse()
->map(fn ($r): array => ['date' => (string) $r->date, 'ulsp_pence' => round((int) $r->ulsp_pence / 100, 1)])
->values()
->all();
$brentRecent = BrentPrice::query()
->orderByDesc('date')
->limit(14)
->get(['date', 'price_usd'])
->reverse()
->map(fn (BrentPrice $r): array => ['date' => (string) $r->date->toDateString(), 'price_usd' => (float) $r->price_usd])
->values()
->all();
return [
'ulsp_recent_8_weeks' => $ulspWeekly,
'brent_recent_14_days' => $brentRecent,
'ridge_model_says' => [
'direction' => $forecast['predicted_direction'] ?? 'stable',
'confidence' => $forecast['confidence_score'] ?? 0,
'magnitude_pence' => $forecast['predicted_change_pence'] ?? 0,
],
];
}
/** @return array<string, mixed>|null */
private function callAnthropic(array $context): ?array
{
$messages = [['role' => 'user', 'content' => $this->prompt($context)]];
try {
// Phase 1: web search loop
for ($i = 0, $response = null; $i < 5; $i++) {
$response = $this->apiLogger->send('anthropic', 'POST', self::URL, fn () => Http::timeout(45)
->withHeaders($this->headers())
->post(self::URL, [
'model' => config('services.anthropic.model', 'claude-haiku-4-5-20251001'),
'max_tokens' => 1024,
'tools' => [['type' => 'web_search_20250305', 'name' => 'web_search']],
'messages' => $messages,
]));
if (! $response->successful()) {
Log::error('LlmOverlayService: search request failed', ['status' => $response->status()]);
return null;
}
// Record the assistant turn once per response. Appending again after the
// loop would duplicate the last turn when all five iterations pause.
$messages[] = ['role' => 'assistant', 'content' => $response->json('content')];
if ($response->json('stop_reason') !== 'pause_turn') {
break;
}
}
$messages[] = ['role' => 'user', 'content' => 'Now submit your overlay using the submit_overlay tool. Cite at least one event with a URL.'];
// Phase 2: forced structured output
$submitResponse = $this->apiLogger->send('anthropic', 'POST', self::URL, fn () => Http::timeout(20)
->withHeaders($this->headers())
->post(self::URL, [
'model' => config('services.anthropic.model', 'claude-haiku-4-5-20251001'),
'max_tokens' => 512,
'tools' => [$this->submitOverlayTool()],
'tool_choice' => ['type' => 'tool', 'name' => 'submit_overlay'],
'messages' => $messages,
]));
if (! $submitResponse->successful()) {
Log::error('LlmOverlayService: submit request failed', ['status' => $submitResponse->status()]);
return null;
}
return $this->extractToolInput($submitResponse->json('content') ?? []);
} catch (Throwable $e) {
Log::error('LlmOverlayService: callAnthropic failed', ['error' => $e->getMessage()]);
return null;
}
}
private const string VERIFICATION_USER_AGENT = 'Mozilla/5.0 (compatible; FuelPriceBot/1.0; +https://fuel-price.test/bot)';
/**
* Verify each cited URL is reachable. Major news sites (Reuters, FT,
* Bloomberg, BBC...) often reject HEAD with 403 / 405 even though
* GET works fine. So: try HEAD first, then fall back to a 1-byte
* GET (Range header) when HEAD fails. Both must include a
* browser-shaped User-Agent or Cloudflare etc. block us as a bot.
*
* Every URL — verified or rejected — is logged at INFO/WARNING so
* operators can debug rejections from `storage/logs/laravel.log`
* without needing to capture the Anthropic response body.
*
* @param array<int, array<string, mixed>> $events
* @return array<int, array<string, mixed>>
*/
private function verifyCitedUrls(array $events): array
{
$verified = [];
foreach ($events as $event) {
$url = (string) ($event['url'] ?? '');
if ($url === '') {
Log::warning('LlmOverlayService: dropping cited event with empty URL', [
'headline' => $event['headline'] ?? null,
'source' => $event['source'] ?? null,
]);
continue;
}
[$reachable, $diagnosis] = $this->urlReachable($url);
if ($reachable) {
Log::info('LlmOverlayService: URL verified', [
'url' => $url,
'via' => $diagnosis,
]);
$verified[] = $event;
} else {
Log::warning('LlmOverlayService: URL rejected', [
'url' => $url,
'reason' => $diagnosis,
'headline' => $event['headline'] ?? null,
'source' => $event['source'] ?? null,
]);
}
}
return $verified;
}
/** @return array{0: bool, 1: string} [reachable, diagnostic_string] */
private function urlReachable(string $url): array
{
$headers = ['User-Agent' => self::VERIFICATION_USER_AGENT];
$headStatus = 'no-attempt';
try {
$head = Http::timeout(5)
->withHeaders($headers)
->head($url);
$headStatus = 'HEAD='.$head->status();
if ($head->successful() || $head->redirect()) {
return [true, $headStatus];
}
} catch (Throwable $e) {
$headStatus = 'HEAD=exception('.class_basename($e).')';
}
try {
$get = Http::timeout(8)
->withHeaders($headers + ['Range' => 'bytes=0-0'])
->get($url);
$getStatus = 'GET='.$get->status();
if ($get->successful() || $get->redirect()) {
return [true, $headStatus.' → '.$getStatus.' (fallback)'];
}
return [false, $headStatus.' → '.$getStatus];
} catch (Throwable $e) {
return [false, $headStatus.' → GET=exception('.class_basename($e).')'];
}
}
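/** Map the public forecast vocabulary (up/down/stable) onto the overlay's (rising/falling/flat). */
private function ridgeDirection(string $publicDirection): string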
{
return match ($publicDirection) {
'up' => 'rising',
'down' => 'falling',
default => 'flat',
};
}
/** Today's date if today is a Monday, otherwise the date of the next Monday (Y-m-d). */
private function upcomingMondayDateString(): string
{
$today = now()->startOfDay();
$monday = $today->isMonday() ? $today : $today->copy()->next(CarbonInterface::MONDAY);
return $monday->toDateString();
}
/** @return array<string, string> */
private function headers(): array
{
return [
'x-api-key' => $this->apiKey(),
'anthropic-version' => '2023-06-01',
];
}
private function apiKey(): ?string
{
return config('services.anthropic.api_key');
}
private function prompt(array $context): string
{
$json = json_encode($context, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
// Heredocs cannot interpolate class constants directly, so copy the cap into a local.
$cap = self::CONFIDENCE_CAP;
return <<<PROMPT
You are providing a daily news-aware overlay for a UK weekly pump-price forecast.
The calibrated ridge model has already produced a directional call from price history.
Your job is to search recent oil/fuel news and decide whether to AGREE or DISAGREE
— and most importantly, surface any major-impact event that the ridge model can't see
from price history alone.
Search recent news (last 48 hours) for:
- OPEC+ production decisions or unexpected announcements
- Geopolitical events affecting oil supply (sanctions, conflict, shipping disruption)
- Major refinery outages or pipeline incidents
- US/EU inventory reports that materially moved Brent
Context for this week:
$json
After searching, you will be asked to submit_overlay with direction, confidence
(capped at $cap), short reasoning, cited events with URLs,
agrees_with_ridge, and major_impact_event.
Citing events with REAL URLs is mandatory. An empty citation array will be
rejected and the overlay discarded.
PROMPT;
}
/** @return array<string, mixed> */
private function submitOverlayTool(): array
{
return [
'name' => 'submit_overlay',
'description' => 'Submit the news-aware overlay for the upcoming weekly forecast.',
'input_schema' => [
'type' => 'object',
'properties' => [
'direction' => ['type' => 'string', 'enum' => ['rising', 'falling', 'flat']],
'confidence' => ['type' => 'integer', 'minimum' => 0, 'maximum' => self::CONFIDENCE_CAP],
'reasoning_short' => ['type' => 'string', 'description' => '1 to 2 sentences.'],
'events_cited' => [
'type' => 'array',
'items' => [
'type' => 'object',
'properties' => [
'headline' => ['type' => 'string'],
'source' => ['type' => 'string'],
'url' => ['type' => 'string'],
'impact' => ['type' => 'string', 'enum' => ['rising', 'falling', 'neutral']],
],
'required' => ['headline', 'source', 'url', 'impact'],
],
],
'agrees_with_ridge' => ['type' => 'boolean'],
'major_impact_event' => ['type' => 'boolean'],
],
'required' => ['direction', 'confidence', 'reasoning_short', 'events_cited', 'agrees_with_ridge', 'major_impact_event'],
],
];
}
/**
* @param array<int, mixed> $content
* @return array<string, mixed>|null
*/
private function extractToolInput(array $content): ?array
{
$block = collect($content)->firstWhere('type', 'tool_use');
return $block['input'] ?? null;
}
}
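
The "upcoming Monday" rule used for `forecast_for_week` can also be sketched without Carbon. This is a plain-PHP equivalent for illustration only; `upcomingMonday` is a hypothetical free function, not part of the service.

```php
<?php
declare(strict_types=1);

/**
 * Return today's date when $now falls on a Monday, otherwise the next Monday.
 */
function upcomingMonday(DateTimeImmutable $now): string
{
    $today = $now->setTime(0, 0);
    // ISO-8601 day of week: '1' is Monday.
    $monday = $today->format('N') === '1' ? $today : $today->modify('next monday');

    return $monday->format('Y-m-d');
}
```

The explicit Monday check matters because `modify('next monday')` on a Monday would skip ahead a full week.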