wordsearch/wordsearch-specs.md
2026-05-04 09:45:17 +01:00

Word Search Puzzle Generator — Spec

Overview

A self-hosted web app that generates printable word search puzzles for kids (and grownups). Themed word lists are managed through the UI and stored as JSON files on disk. Every puzzle is configured explicitly per generation — no sticky difficulty presets.

Tech Stack

  • Python 3.14+
  • FastAPI — web framework
  • Jinja2 — server-rendered HTML templates (no JS framework, no build step)
  • reportlab — PDF generation
  • uvicorn — ASGI server
  • Vanilla JS — minimal, only for the theme editor (textarea + fetch)
  • Storage: JSON files on disk under themes/. No database.

Single language, single process, no build pipeline, no DB.

Deployment

  • Dockerfile + docker-compose.yml
  • Single container, single port (default 8000)
  • Mount ./themes as a volume
  • Behind Pangolin/Traefik if exposing on a subdomain; otherwise hit the LXC IP

Directory Layout

wordsearch/
├── app/
│   ├── main.py            # FastAPI routes
│   ├── generator.py       # Grid building + word placement
│   ├── normaliser.py      # Word normalisation + prefix stripping
│   ├── pdf.py             # PDF rendering (reportlab)
│   ├── themes.py          # Load / save / list theme JSON files
│   └── templates/
│       ├── base.html
│       ├── index.html     # Generate form
│       ├── themes.html    # Theme list
│       └── theme_edit.html
├── themes/                # JSON theme files (mounted volume)
├── static/style.css
├── requirements.txt
├── Dockerfile
└── docker-compose.yml

Web UI

/ — Generate Puzzle

Theme:        [ Mr Men Characters    ▾ ]

Grid size:    [ 12 ]    (5–25)
Words:        [ 10 ]    (1–30)
Min length:   [ 3 ]
Max length:   [ 12 ]    (clamped to grid size)

Title:        [                        ]   (optional, overrides theme name)

Directions:
  ☑ Horizontal (→)        — always on, locked
  ☑ Vertical (↓)          — always on, locked
  ☐ Diagonal (↘ ↗)
  ☐ Reversed (← ↑ ↖ ↙)

☐ Allow overlapping words

[ Generate ]
  • All optional toggles default off on every page load — no sticky state.
  • "Generate" → POSTs the form, streams the PDF straight back as a download (Content-Disposition: attachment; filename="<slug>_<timestamp>.pdf").
  • The server keeps no copy on disk; the browser is the only place the PDF lives.
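As a minimal sketch of the download header described above, the helpers below build the filename and the Content-Disposition value. The function names `pdf_filename` and `content_disposition` are hypothetical; the timestamp layout follows the `<slug>_<YYYY-MM-DD_HH-MM-SS>.pdf` pattern given in the acceptance criteria.

```python
from datetime import datetime


def pdf_filename(slug: str, now: datetime) -> str:
    """Return '<slug>_<YYYY-MM-DD_HH-MM-SS>.pdf' for the download."""
    return f"{slug}_{now.strftime('%Y-%m-%d_%H-%M-%S')}.pdf"


def content_disposition(slug: str, now: datetime) -> str:
    """Build the Content-Disposition header value for an attachment."""
    return f'attachment; filename="{pdf_filename(slug, now)}"'
```

Passing the clock value in as an argument, rather than calling `datetime.now()` inside, keeps the helper easy to test.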

/themes — Theme List

  • Table of existing themes: name, slug, word count, edit/delete buttons
  • "New theme" button → /themes/new

/themes/new and /themes/{slug}/edit — Theme Editor

Theme name:   [                        ]   (display name, e.g. "Sea Creatures")
Slug:         [                        ]   (filename; auto-generated on create, locked on edit)

Words (one per line):
┌────────────────────────────────────────┐
│ Mr Tickle                              │
│ Mr Happy                               │
│ Little Miss Sunshine                   │
│ ...                                    │
└────────────────────────────────────────┘

Live preview:
  Mr Tickle              → TICKLE
  Mr Happy               → HAPPY
  Little Miss Sunshine   → LITTLEMISSSUNSHINE

[ Save ]   [ Delete ]
  • The live preview shows what each word will look like in the grid after normalisation. Updates on textarea change (debounced).
  • Save writes themes/<slug>.json. Delete confirms then removes.
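The spec does not define how the slug is auto-generated on create. One plausible rule, assumed here because it matches the lowercase, hyphenated starter filenames such as sea-creatures.json, is sketched below; `slugify` is a hypothetical helper name.

```python
import re


def slugify(name: str) -> str:
    """Lowercase the name and join its alphanumeric runs with hyphens."""
    # Assumption: only a-z and 0-9 survive; everything else is a separator.
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))
```

An empty result (for example, a name made only of punctuation) would need to be rejected as an invalid slug by the caller.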

Auth

None. Homelab use. Add HTTP basic auth via FastAPI middleware if exposed publicly later.

Routes

Method Path Purpose
GET / Generate form
POST /generate Build puzzle, return PDF download
GET /themes List themes
GET /themes/new New theme form
GET /themes/{slug}/edit Edit theme form
POST /themes Create theme
POST /themes/{slug} Update theme
POST /themes/{slug}/delete Delete theme
GET /api/themes JSON list of themes (dropdown source)
POST /api/normalise Preview normalisation for a list of words (used by the editor live preview)

Word Normalisation

For every input word, the generator produces two forms:

  • Display form — original string, untouched. Goes on the PDF word list.
  • Grid form — what gets placed in the grid.

Grid form rules

  1. Strip any leading prefix tokens (case-insensitive, with or without trailing dot). Stripping is token-based, not substring — "Misty" does not match "Miss".
  2. Uppercase the result.
  3. Strip all whitespace and punctuation.

Stripped prefixes

A constant in normaliser.py:

PREFIXES = {
    "mr", "mrs", "ms", "miss",
    "dr", "sir", "dame", "lord", "lady", "master",
    "captain", "capt", "cpt",
    "professor", "prof",
    "saint", "st",
}

Examples

Input Display form Grid form
Mr Tickle Mr Tickle TICKLE
Mr. Bump Mr. Bump BUMP
Little Miss Sunshine Little Miss Sunshine LITTLEMISSSUNSHINE
Dr Octopus Dr Octopus OCTOPUS
Sir Lancelot Sir Lancelot LANCELOT
Misty Misty MISTY
cucumber cucumber CUCUMBER
Captain America Captain America AMERICA

Prefix-only edge cases

  • Multiple consecutive prefixes get stripped: "Mr Dr Strange" → STRANGE.
  • Word that is only a prefix: "Mr" → keep as MR, log a warning to stderr (probably a theme typo).
  • After stripping, if grid form is empty, skip the word and warn.
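The rules and edge cases above can be sketched roughly as follows. This is an illustration under stated assumptions, not the definitive normaliser.py: the function name `normalise`, its return shape (stripped prefix tokens plus grid form), and the choice to treat every non-alphanumeric character as whitespace or punctuation are all assumptions.

```python
import sys

PREFIXES = {
    "mr", "mrs", "ms", "miss",
    "dr", "sir", "dame", "lord", "lady", "master",
    "captain", "capt", "cpt",
    "professor", "prof",
    "saint", "st",
}


def _prefix_key(token: str) -> str:
    """Lowercase a token and drop one trailing dot, so 'Mr.' matches 'mr'."""
    return token.lower().removesuffix(".")


def normalise(word: str) -> tuple[list[str], str]:
    """Return (stripped prefix tokens, grid form) for one theme entry."""
    tokens = word.split()
    stripped: list[str] = []
    # Token-based match: whole tokens only, so "Misty" never matches "Miss".
    while tokens and _prefix_key(tokens[0]) in PREFIXES:
        stripped.append(tokens.pop(0))
    if not tokens and stripped:
        # Prefix-only entry such as "Mr": keep it as the word and warn.
        print(f"warning: {word!r} is only a prefix", file=sys.stderr)
        tokens, stripped = stripped, []
    # Uppercase, then keep letters and digits only (assumed reading of
    # "strip all whitespace and punctuation").
    grid = "".join(ch for ch in "".join(tokens).upper() if ch.isalnum())
    return stripped, grid
```

A caller would skip (and warn about) any entry whose grid form comes back empty, for example a line containing only punctuation.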

Word Selection

  1. Load theme word list.
  2. Compute grid form for each word.
  3. Filter by length: min_length ≤ len(grid_form) ≤ min(max_length, grid_size).
  4. Shuffle.
  5. Place words one by one until either:
    • the requested word count is reached, or
    • the filtered list is exhausted.
  6. If fewer words are placed than requested, generate the puzzle anyway and log a warning to stderr. There is no result page, so the warning is not surfaced to the user, who should check the word count shown in the theme dropdown against their target before generating.

Length filter validation

If the filter matches fewer words than requested, generate with what's available and warn (e.g. "asked for 10 words; only 4 in the theme matched your length filter"). Don't block — the user might genuinely want a sparse puzzle.
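A small sketch of the length filter, shuffle, and shortfall warning might look like this. The names `candidate_words` and `shortfall_warning` are hypothetical, and the optional `rng` parameter is an assumption added so the shuffle can be seeded in tests.

```python
import random


def candidate_words(grid_forms, min_length, max_length, grid_size, rng=random):
    """Filter grid forms by length, then shuffle them for placement."""
    upper = min(max_length, grid_size)  # a word can never exceed the grid
    matches = [w for w in grid_forms if min_length <= len(w) <= upper]
    rng.shuffle(matches)
    return matches


def shortfall_warning(requested, available):
    """Return a warning string when the filter matched too few words."""
    if available < requested:
        return (f"asked for {requested} words; only {available} "
                "in the theme matched your length filter")
    return None
```

The warning is returned rather than printed so the caller can decide where it goes; per the spec that is stderr, and generation continues either way.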

Word Placement Rules

Direction set

The active directions are computed from the toggles:

base = { → , ↓ }                              # always on

if diagonal_toggle:
    base |= { ↘ , ↗ }

if reversed_toggle:
    base |= { reverse(d) for d in base }      # add reversal of everything in base

So:

Diag Rev Active directions
☐ ☐ → ↓
☑ ☐ → ↓ ↘ ↗
☐ ☑ → ↓ ← ↑
☑ ☑ → ↓ ↘ ↗ ← ↑ ↖ ↙ (all 8)

Direction vectors (Δrow, Δcol):

Symbol Δrow Δcol
→ 0 +1
↓ +1 0
↘ +1 +1
↗ −1 +1
← 0 −1
↑ −1 0
↖ −1 −1
↙ +1 −1
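Expressed with the (Δrow, Δcol) vectors directly, the direction-set rule could be sketched as below. The function name `active_directions` is hypothetical; reversing a direction is taken to mean negating both deltas, which agrees with the vector table.

```python
def active_directions(diagonal: bool, reversed_: bool) -> set[tuple[int, int]]:
    """Return the active (d_row, d_col) vectors for the given toggles."""
    base = {(0, 1), (1, 0)}                      # → and ↓, always on
    if diagonal:
        base |= {(1, 1), (-1, 1)}                # ↘ and ↗
    if reversed_:
        # The reversal of a direction negates both of its deltas.
        base |= {(-dr, -dc) for dr, dc in base}
    return base
```

Because the reversal step runs last, it also picks up the diagonals when both toggles are on, giving all eight directions.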

Placement algorithm

For each word:

  1. Pick a random direction from the active set.
  2. Pick a random valid starting cell such that the entire word fits within the grid (compute bounds from word length + direction vector).
  3. Check collision against already-placed words:
    • If overlap toggle is off: every cell the word would occupy must be currently empty.
    • If overlap toggle is on: every cell must either be empty OR contain the same letter the word would place there (letter-sharing intersections).
  4. If the placement is valid, commit it. Otherwise retry up to 200 attempts per word (re-rolling direction + start each time), then skip with a warning.
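The placement steps above can be sketched as a single retry loop. This is one possible shape, assuming a square grid stored as a list of lists with `None` for empty cells; `try_place` and its parameters are hypothetical names.

```python
import random


def try_place(grid, word, directions, allow_overlap, rng=random, attempts=200):
    """Try to place word into grid in place; return True on success."""
    size = len(grid)
    dirs = sorted(directions)          # rng.choice needs a sequence
    span = len(word) - 1
    for _ in range(attempts):
        dr, dc = rng.choice(dirs)
        # Start ranges that keep the last letter inside the grid.
        rows = range(max(0, -dr * span), min(size, size - dr * span))
        cols = range(max(0, -dc * span), min(size, size - dc * span))
        if not rows or not cols:
            continue                   # word cannot fit in this direction
        r, c = rng.choice(rows), rng.choice(cols)
        cells = [(r + i * dr, c + i * dc) for i in range(len(word))]
        # Each cell must be empty, or (with overlap on) hold the same letter.
        if all(grid[y][x] is None or (allow_overlap and grid[y][x] == ch)
               for (y, x), ch in zip(cells, word)):
            for (y, x), ch in zip(cells, word):
                grid[y][x] = ch
            return True
    return False                       # caller skips the word and warns
```

Computing the valid start range from the word length and direction vector means every attempt is spent on collision checks rather than on out-of-bounds starts.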

Grid fill

After all words are placed, fill the remaining empty cells with random letters A–Z. All letters are uppercase.
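A minimal sketch of the fill step, assuming empty cells are represented as `None` (an assumption; the spec does not fix the grid representation) and using the hypothetical name `fill_grid`:

```python
import random
import string


def fill_grid(grid, rng=random):
    """Replace every empty (None) cell with a random uppercase letter."""
    for row in grid:
        for i, cell in enumerate(row):
            if cell is None:
                row[i] = rng.choice(string.ascii_uppercase)
    return grid
```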

Theme File Format

themes/<slug>.json:

{
  "name": "Mr Men Characters",
  "words": [
    "Mr Tickle",
    "Mr Happy",
    "Mr Bump",
    "Little Miss Sunshine",
    "Mr Strong",
    "Mr Tall",
    "Mr Small"
  ]
}
  • name — human-readable, shown in dropdown and on PDF.
  • words — list of strings, one per line in the editor textarea.
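Reading and writing this format needs only the standard library. The sketch below assumes hypothetical helper names (`save_theme`, `load_theme`, `words_from_textarea`) and a minimal shape check; the real themes.py may differ.

```python
import json
from pathlib import Path


def words_from_textarea(text: str) -> list[str]:
    """Split the editor textarea into words, one per non-blank line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def save_theme(themes_dir: Path, slug: str, name: str, words: list[str]) -> Path:
    """Write themes/<slug>.json and return its path."""
    path = themes_dir / f"{slug}.json"
    payload = {"name": name, "words": words}
    path.write_text(json.dumps(payload, indent=2, ensure_ascii=False),
                    encoding="utf-8")
    return path


def load_theme(themes_dir: Path, slug: str) -> dict:
    """Read themes/<slug>.json and check it has the two expected keys."""
    data = json.loads((themes_dir / f"{slug}.json").read_text(encoding="utf-8"))
    if (not isinstance(data, dict)
            or not isinstance(data.get("name"), str)
            or not isinstance(data.get("words"), list)):
        raise ValueError(f"theme {slug!r} needs a string 'name' and a list 'words'")
    return data
```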

Theme curation guidance (in README)

  • Aim for 20–30 words per theme with a healthy spread of lengths (some short, some long) so length filters give useful results across age groups.
  • Avoid hyphens and apostrophes when possible, since they get stripped. "Spider-Man" → SPIDERMAN is fine, but "don't" → DONT may surprise.

PDF Layout

A4 portrait, single page, plain.

Top

  • Title (theme name or override), centred, large.

Middle: Grid

  • Centred, monospace font, generous letter spacing.
  • No cell borders (cleaner look, easier to scan).
  • Sized so a 12×12 grid is comfortable to read; scales down for larger grids.

Bottom: Word list

  • Heading: "Find these words"
  • 2–4 columns depending on word count.
  • Each entry rendered as:
    • GRIDFORM (bold, uppercase) followed by (original prefix tokens) in lighter weight — only if the word had a stripped prefix.
    • Words with no prefix render bare: just GRIDFORM in bold uppercase.
  • Examples in the rendered list:
    TICKLE (Mr)        BUMP (Mr)
    HAPPY (Mr)         STRONG (Mr)
    LITTLEMISSSUNSHINE
    CUCUMBER           TOMATO
    AMERICA (Captain)
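The text of each entry can be sketched as a small formatting helper. `word_list_entry` is a hypothetical name, and the sketch covers only the string content; the bold and lighter weights would be applied by the PDF renderer.

```python
def word_list_entry(prefix_tokens: list[str], grid_form: str) -> str:
    """Return 'GRIDFORM (prefix)' if a prefix was stripped, else 'GRIDFORM'."""
    if prefix_tokens:
        return f"{grid_form} ({' '.join(prefix_tokens)})"
    return grid_form
```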
    

Active options subtitle

Directly under the title, in small uppercase letter-spaced grey text, list any extra directions or modes that are enabled beyond the always-on horizontal + vertical baseline. The labels render only when at least one of diagonal, reversed, or allow_overlap is on:

DIAGONAL  ·  REVERSED  ·  OVERLAPPING

If none of them are enabled, render nothing — no empty line, no spacing.
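As an illustration, the subtitle text could be assembled like this. `options_subtitle` is a hypothetical name; returning `None` signals to the renderer that nothing, not even spacing, should be drawn.

```python
from typing import Optional


def options_subtitle(diagonal: bool, reversed_: bool,
                     allow_overlap: bool) -> Optional[str]:
    """Return the subtitle for the enabled options, or None if all are off."""
    toggles = (
        (diagonal, "DIAGONAL"),
        (reversed_, "REVERSED"),
        (allow_overlap, "OVERLAPPING"),
    )
    labels = [label for enabled, label in toggles if enabled]
    return "  ·  ".join(labels) if labels else None
```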

Page stays clean: no timestamp, no branding, no toggle state in the footer. (The download filename carries the timestamp.)

No answer key

v1 ships puzzle-only. Solution PDF is a future enhancement.

Initial Themes to Ship

Pre-populate themes/ with starter files (20–30 words each, varied length):

  • mr-men.json
  • sea-creatures.json
  • superheroes.json
  • farm-animals.json
  • villains.json
  • transformers.json
  • wild-animals.json
  • precious-stones.json
  • common-birds.json — songbirds plus raptors (owl, eagle, hawk, falcon, etc.)
  • science-physics.json — forces, energy, motion, electricity
  • science-chemistry.json — atoms, molecules, elements, reactions
  • science-biology.json — cells, organs, microbes, ecology

Dockerfile

FROM python:3.14-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app/ ./app/
COPY static/ ./static/
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

docker-compose.yml

services:
  wordsearch:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - ./themes:/app/themes
    restart: unless-stopped

Acceptance Criteria

  • docker compose up starts the app, accessible at http://<host>:8000.
  • Generate a 12×12 puzzle from a pre-shipped theme with default settings, download a valid PDF.
  • Toggling diagonal, reversed, and overlap each visibly changes the puzzle.
  • Min/max length filtering works: setting min=8 excludes short words.
  • Theme editor: create new theme, see live normalisation preview, save, appear in dropdown, generate from it.
  • Edit and delete existing themes via UI.
  • Words with prefixes (Mr, Dr, etc.) show the stripped form in the grid and, on the word list, the grid form followed by the stripped prefix in parentheses.
  • When fewer words can be placed than requested, the PDF still generates (warnings only go to stderr — no result page).
  • Bad input (invalid slug, empty word list, max length < min length) shows a clear error message, not a stack trace.
  • Downloaded PDFs are named <slug>_<YYYY-MM-DD_HH-MM-SS>.pdf.