feat: add LLM prediction providers with structured output support
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -152,18 +152,17 @@ GET /api/stats/searches?period=month
### GET `/api/prediction`

National or regional fuel price direction forecast for the next 7 days, based on live price data signals. The endpoint always analyses E10, the most widely available fuel and the one with the most price history.

| Parameter | Type | Description |
|---|---|---|
| `lat` | float | Optional. Decimal latitude. Enables regional prediction (50km radius). |
| `lng` | float | Optional. Decimal longitude. Required if `lat` is provided. |

**Example request:**

```
GET /api/prediction
GET /api/prediction?lat=51.5074&lng=-0.1278
```

**Response:**

@@ -235,6 +234,13 @@ GET /api/prediction?fuel_type=petrol&lat=51.5074&lng=-0.1278
}
```

**`region_key` values:**

| Value | Meaning |
|---|---|
| `"national"` | No coordinates provided. `current_avg` and signals use national data. `regional_momentum` is disabled. |
| `"regional"` | Coordinates provided. `current_avg` uses stations within 50km. `regional_momentum` is the primary signal (50% weight). Falls back to national average if no stations found in radius. |

**Key fields:**

| Field | Values | Meaning |
@@ -243,7 +249,21 @@ GET /api/prediction?fuel_type=petrol&lat=51.5074&lng=-0.1278
| `action` | `"fill_now"`, `"wait"`, `"no_signal"` | Consumer-facing recommendation |
| `confidence_label` | `"high"` (≥70), `"medium"` (≥40), `"low"` (<40) | Signal strength |
| `predicted_change_pence` | float | Expected p/litre change over 7 days |
| `current_avg` | float | Average price in pence (e.g. `143.9` = 143.9p). Regional if lat/lng given, else national. |
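
As a reading aid, the `confidence_label` banding maps onto a simple `match`; a minimal sketch (the variable name is assumed):

```php
// Banding per the table above: ≥70 high, ≥40 medium, otherwise low.
$label = match (true) {
    $confidence >= 70 => 'high',
    $confidence >= 40 => 'medium',
    default => 'low',
};
```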

**Signal weights:**

| Scope | Signal | Weight |
|---|---|---|
| National | trend | 45% |
| National | brand_behaviour | 25% |
| National | day_of_week | 20% |
| National | price_stickiness | 10% |
| Regional | regional_momentum | 50% |
| Regional | trend | 20% |
| Regional | day_of_week | 15% |
| Regional | brand_behaviour | 10% |
| Regional | price_stickiness | 5% |
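
The docs do not spell out how these weights combine; purely as an illustration, a weighted vote over the enabled signals could look like the sketch below (the `direction`/`enabled` fields follow the signal structure in the next table; keying `signals` by name is an assumption):

```php
// Illustrative weighted vote using the national weights above.
$weights = [
    'trend' => 0.45,
    'brand_behaviour' => 0.25,
    'day_of_week' => 0.20,
    'price_stickiness' => 0.10,
];

$score = 0.0;
foreach ($signals as $name => $signal) {
    if (! $signal['enabled']) {
        continue; // skipped signals contribute nothing
    }
    $sign = match ($signal['direction']) {
        'up' => 1.0,
        'down' => -1.0,
        'stable' => 0.0,
    };
    $score += $sign * ($weights[$name] ?? 0.0);
}
// $score > 0 suggests prices rising; $score < 0 suggests falling.
```
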
**Signal structure** (each signal in `signals`):
@@ -254,12 +274,9 @@ GET /api/prediction?fuel_type=petrol&lat=51.5074&lng=-0.1278
| `direction` | `"up"` / `"down"` / `"stable"` | Signal direction |
| `detail` | string | Human-readable explanation |
| `data_points` | int | Number of price records used |
| `enabled` | bool | False if signal was skipped (missing data or coordinates) |
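
For illustration, a single entry in `signals` might look like this (values invented; only the fields documented above are shown):

```json
{
  "direction": "down",
  "detail": "National average fell 1.2p over the last 14 days",
  "data_points": 4210,
  "enabled": true
}
```
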
**LLM-backed prediction** — separately, the nightly `oil:predict` command generates an oil price direction from Brent crude data and stores it in `price_predictions`. This feeds into `AlertScoringService` (Signal 4) but is not exposed directly through this endpoint. See [LLM Prediction Providers](llm-prediction-providers.md).
---

docs/llm-prediction-providers.md (new file, 145 lines)

@@ -0,0 +1,145 @@

# LLM Prediction Providers

The oil price direction prediction supports multiple LLM backends behind a shared interface. The active provider is selected via environment variable. All providers return the same response shape and fall back to EWMA if not configured or if the API call fails.

## Selecting a Provider

Set `LLM_PREDICTION_PROVIDER` in `.env`:

```
LLM_PREDICTION_PROVIDER=anthropic # default
LLM_PREDICTION_PROVIDER=openai
LLM_PREDICTION_PROVIDER=gemini
```

Each provider needs its own API key. If the key is missing or empty the provider returns `null` and EWMA is used instead.
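
A minimal sketch of how that selection could be wired (the config key and provider class names are assumptions; per the final section, the real binding lives in `AppServiceProvider::register()`):

```php
// Bind the provider chosen by LLM_PREDICTION_PROVIDER.
$this->app->bind(OilPredictionProvider::class, fn () => match (config('services.llm.prediction_provider', 'anthropic')) {
    'openai' => new OpenAiPredictionProvider(),
    'gemini' => new GeminiPredictionProvider(),
    default => new AnthropicPredictionProvider(),
});
```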

---

## Providers

### Anthropic (default)

**Key:** `ANTHROPIC_API_KEY`
**Model:** `ANTHROPIC_MODEL` (default: `claude-sonnet-4-6`)

Uses **tool use** with a forced `submit_prediction` tool call — no JSON parsing, guaranteed schema. Structured output is enforced at the API level via `tool_choice: { type: "tool", name: "submit_prediction" }`.

Two-phase prediction flow:

1. **Context phase** — multi-turn web search (`web_search_20250305` tool) for recent oil/geopolitical news (up to 5 iterations, `pause_turn` loop)
2. **Submission phase** — once searches are complete, forces a `submit_prediction` tool call with the full conversation context
If the context phase fails, falls back to a single-turn basic prediction (tool use only, no web search).
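
A sketch of that flow (the client wrapper, tool variables, and prompt are assumptions; the `pause_turn` handling follows the Anthropic Messages API):

```php
// Phase 1: multi-turn web search. Keep going while the model pauses.
$messages = [['role' => 'user', 'content' => $contextPrompt]];

for ($turn = 0; $turn < 5; $turn++) {
    $response = $this->client->messages([
        'model' => $model,
        'messages' => $messages,
        'tools' => [$webSearchTool], // type: web_search_20250305
    ]);
    $messages[] = ['role' => 'assistant', 'content' => $response['content']];

    if (($response['stop_reason'] ?? '') !== 'pause_turn') {
        break; // searching finished (or never started)
    }
}

// Phase 2: force the structured submission over the full conversation.
$final = $this->client->messages([
    'model' => $model,
    'messages' => $messages,
    'tools' => [$submitPredictionTool],
    'tool_choice' => ['type' => 'tool', 'name' => 'submit_prediction'],
]);
```
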
```php
// Structured output schema (enforced by Anthropic)
'input_schema' => [
    'type' => 'object',
    'properties' => [
        'direction' => ['type' => 'string', 'enum' => ['rising', 'falling', 'flat']],
        'confidence' => ['type' => 'integer', 'minimum' => 0, 'maximum' => 85],
        'reasoning' => ['type' => 'string'],
    ],
    'required' => ['direction', 'confidence', 'reasoning'],
],
```
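
Because the call is forced, the result can be read straight from the `tool_use` content block; a sketch (the block shape follows the Anthropic Messages API):

```php
// Find the forced submit_prediction call and take its schema-validated input.
$input = null;
foreach ($response['content'] as $block) {
    if (($block['type'] ?? '') === 'tool_use' && $block['name'] === 'submit_prediction') {
        $input = $block['input']; // direction, confidence, reasoning
        break;
    }
}
```
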
`PredictionSource`: `llm_with_context` (web search succeeded) or `llm` (basic fallback).

---

### OpenAI

**Key:** `OPENAI_API_KEY`
**Model:** `OPENAI_MODEL` (default: `gpt-4o-mini`)

Uses `response_format: json_schema` with `strict: true`. The schema is sent to the API and the response is guaranteed to match it.
```php
'response_format' => [
    'type' => 'json_schema',
    'json_schema' => [
        'name' => 'oil_prediction',
        'strict' => true,
        'schema' => [
            'type' => 'object',
            'properties' => [
                'direction' => ['type' => 'string', 'enum' => ['rising', 'falling', 'flat']],
                'confidence' => ['type' => 'integer'],
                'reasoning' => ['type' => 'string'],
            ],
            'required' => ['direction', 'confidence', 'reasoning'],
            'additionalProperties' => false,
        ],
    ],
],
```
Response is extracted from `choices.0.message.content` (a JSON string) and decoded.
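
A sketch of that extraction (assuming `$response` is the already-decoded API payload):

```php
// strict: true guarantees the shape, but guard the decode anyway.
$json = $response['choices'][0]['message']['content'] ?? null;
$result = is_string($json) ? json_decode($json, true) : null;

if (! is_array($result)) {
    return null; // malformed response → orchestrator falls back to EWMA
}
```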
`PredictionSource`: `llm`

---

### Gemini

**Key:** `GEMINI_API_KEY`
**Model:** `GEMINI_MODEL` (default: `gemini-2.0-flash`)

Uses `responseMimeType: application/json` and `responseSchema` in `generationConfig`. The API key is passed as a query parameter.
```php
'generationConfig' => [
    'responseMimeType' => 'application/json',
    'responseSchema' => [
        'type' => 'OBJECT',
        'properties' => [
            'direction' => ['type' => 'STRING', 'enum' => ['rising', 'falling', 'flat']],
            'confidence' => ['type' => 'INTEGER'],
            'reasoning' => ['type' => 'STRING'],
        ],
        'required' => ['direction', 'confidence', 'reasoning'],
    ],
],
```
Response is extracted from `candidates.0.content.parts.0.text` (a JSON string) and decoded.
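
A sketch of the call and extraction (using Laravel's HTTP client; the endpoint follows the public Gemini REST pattern, and `$contents` is assumed to carry the prompt):

```php
use Illuminate\Support\Facades\Http;

// The key goes in the query string rather than a header.
$response = Http::post(
    "https://generativelanguage.googleapis.com/v1beta/models/{$model}:generateContent?key={$apiKey}",
    ['contents' => $contents, 'generationConfig' => $generationConfig]
)->json();

$json = $response['candidates'][0]['content']['parts'][0]['text'] ?? null;
$result = is_string($json) ? json_decode($json, true) : null;
```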

`PredictionSource`: `llm`

---

## Confidence Caps
All providers cap confidence at **85** regardless of what the model returns. EWMA is capped at **65**.
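
In code terms, something like (names assumed):

```php
$confidence = max(0, min($cap, (int) $raw)); // $cap is 85 for LLM, 65 for EWMA
```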

---

## EWMA Fallback
`OilPriceService::generatePrediction()` always runs EWMA first and stores its result. The LLM provider runs after; its result is stored and returned if non-null. If the provider returns null (key missing, API error, malformed response), EWMA is returned instead.
```
generatePrediction()
├── generateEwmaPrediction() → always stored
└── provider->predict()
    ├── on success → stored and returned (LLM wins)
    └── on null → EWMA returned
```
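
The same flow as a sketch (method names are taken from this page; the price source and bodies are assumed):

```php
public function generatePrediction(): PricePrediction
{
    // EWMA always runs first and is stored, so a baseline always exists.
    $ewma = $this->generateEwmaPrediction();

    // The provider may return null: missing key, API error, malformed JSON.
    $llm = $this->provider->predict($this->recentBrentPrices());

    return $llm ?? $ewma; // LLM wins when it produced a result
}
```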

---

## Adding a New Provider
1. Create `app/Services/LlmPrediction/YourProvider.php` implementing `OilPredictionProvider`
2. Add a case to the `match` in `AppServiceProvider::register()`
3. Add key/model config to `config/services.php` and document the `.env` vars
The interface requires one method:
```php
public function predict(Collection $prices): ?PricePrediction;
```
Return `null` on any failure — the orchestrator handles the fallback.
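
A skeleton for step 1 (the class name, config keys, and `PricePrediction` namespace are placeholders, not real project code):

```php
namespace App\Services\LlmPrediction;

use App\Models\PricePrediction;
use Illuminate\Support\Collection;

class YourProvider implements OilPredictionProvider
{
    public function predict(Collection $prices): ?PricePrediction
    {
        if (blank(config('services.your_provider.key'))) {
            return null; // unconfigured → orchestrator falls back to EWMA
        }

        try {
            // Call the API with a structured-output schema, decode the
            // result, clamp confidence to 85, and build a PricePrediction.
        } catch (\Throwable) {
            return null; // any failure → EWMA fallback
        }

        return null; // replace with the built PricePrediction
    }
}
```
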
@@ -171,7 +171,6 @@ Response shape: `{ data: Station[], meta: { count, lowest_pence, avg_pence } }`
### `/api/prediction`
| Param | Type | Notes |
|---|---|---|
| `lat` | float? | Optional — falls back to national |
| `lng` | float? | Optional |