Firehose Weekly fuel for the dev firehose

Ssg_implementation_checklist

SSGv3 Implementation Checklist - Test-First Approach

Testing Strategy

  • Unit tests: Test each component in isolation with mocks
  • Integration tests: Test stage interactions with real file fixtures
  • E2E tests: Full pipeline runs on sample sites
  • Property tests: Use hypothesis for cache key determinism, URL normalization idempotence

Test Data Principles

  • Use realistic content (actual markdown with frontmatter)
  • Cover boundary conditions (empty files, missing fields)
  • Include unicode in slugs and content
  • Test with various date formats and timezones
  • Include invalid inputs to verify error handling

Phase 1: Foundation - Data Structures & Models

Core Models

  • Write test: BuildItem creation with minimal fields
  • Write test: BuildItem state transitions (SCANNED → BUILT → WRITTEN)
  • Write test: BuildItem.to_dict() serialization
  • Write test: BuildItem.from_dict() deserialization
  • Implement BuildItem base class
  • Write test: ContentItem with templates_used list
  • Write test: AssetItem with preserve_mtime flag
  • Write test: IndexItem with pagination fields
  • Implement ContentItem, AssetItem, IndexItem subclasses

Error Models

  • Write test: SSGError with message and suggestion
  • Write test: BuildError with code, src, message
  • Write test: CollisionError with url and multiple sources
  • Write test: TemplateError for missing template
  • Implement error class hierarchy
  • Write test: ErrorCollector accumulates multiple errors
  • Write test: ErrorCollector.format_report() output
  • Implement ErrorCollector class

Configuration

  • Write test: Config loads from TOML with all fields
  • Write test: Config validates content_dir exists
  • Write test: Config validates output_dir not inside content_dir
  • Write test: Config validates permalink template syntax
  • Write test: Config blacklist includes default patterns
  • Implement Config dataclass with validation
  • Write test: Config detects invalid page_size (< 1)
  • Write test: Config normalizes paths to absolute

Phase 2: Utilities Layer

File Utilities

  • Write test: FileUtils.read_file() handles UTF-8
  • Write test: FileUtils.read_file() raises on invalid encoding
  • Write test: FileUtils.stat_file() returns mtime and size
  • Write test: FileUtils.is_ignored() matches patterns
  • Implement FileUtils class
  • Write test: FileUtils handles symlinks (don’t follow by default)

Date Parsing

  • Write test: DateParser.parse() handles ISO format
  • Write test: DateParser.parse() handles “YYYY-MM-DD” format
  • Write test: DateParser.parse() handles human formats (“Jan 15, 2025”)
  • Write test: DateParser.to_iso() produces standard format
  • Write test: DateParser.extract_from_filename() finds “YYYY-MM-DD-title.md”
  • Implement DateParser class
  • Write test: DateParser handles timezones consistently

Path Resolution

  • Write test: PathResolver.resolve() makes paths absolute
  • Write test: PathResolver.make_relative() produces relative path
  • Write test: PathResolver.is_within() detects containment
  • Write test: PathResolver.is_within() rejects traversal attempts
  • Implement PathResolver class

Slug Generation

  • Write test: SlugGenerator.generate() lowercases input
  • Write test: SlugGenerator.generate() transliterates unicode (ö→o, é→e)
  • Write test: SlugGenerator.generate() removes punctuation except hyphens
  • Write test: SlugGenerator.generate() collapses whitespace to single hyphen
  • Write test: SlugGenerator.generate() strips leading/trailing hyphens
  • Implement SlugGenerator class
  • Write test: SlugGenerator with preserve_unicode=True keeps diacritics

Phase 3: Metadata System

Metadata Extraction

  • Write test: MetadataExtractor.parse_frontmatter() extracts YAML
  • Write test: MetadataExtractor.parse_frontmatter() handles empty frontmatter
  • Write test: MetadataExtractor.parse_frontmatter() raises on invalid YAML
  • Write test: MetadataExtractor.get_system_defaults() uses filename for slug
  • Write test: MetadataExtractor.get_system_defaults() uses mtime for date
  • Write test: MetadataExtractor.get_path_derived() extracts category from parent
  • Write test: MetadataExtractor.get_path_derived() handles top-level files
  • Implement MetadataExtractor class
  • Write test: MetadataExtractor.merge_metadata() respects precedence
  • Write test: MetadataExtractor.extract() combines all three sources
  • Write test: Frontmatter category overrides directory structure

Phase 4: Template System

Template Dependency Tracking

  • Write test: TemplateDependencyTracker.build_graph() finds includes
  • Write test: TemplateDependencyTracker.build_graph() parses Jinja2
  • Write test: TemplateDependencyTracker.detect_cycles() finds A→B→A
  • Write test: TemplateDependencyTracker.detect_cycles() handles self-reference
  • Write test: TemplateDependencyTracker.detect_cycles() returns empty for DAG
  • Implement TemplateDependencyTracker class
  • Write test: TemplateDependencyTracker.compute_hash() includes content
  • Write test: TemplateDependencyTracker.compute_hash() includes all includes transitively
  • Write test: TemplateDependencyTracker.compute_hash() is deterministic
  • Write test: TemplateDependencyTracker raises error on circular dependency

Template Selection

  • Write test: TemplateSelector.select() uses frontmatter template if present
  • Write test: TemplateSelector.select() uses category template if exists
  • Write test: TemplateSelector.select() falls back to default.html
  • Write test: TemplateSelector.resolve_category_template() checks file exists
  • Implement TemplateSelector class
  • Write test: TemplateSelector raises error if default.html missing

Template Rendering

  • Write test: TemplateRenderer.load_templates() loads from directory
  • Write test: TemplateRenderer.render() produces HTML
  • Write test: TemplateRenderer.render() provides content in context
  • Write test: TemplateRenderer.render() provides metadata in context
  • Write test: TemplateRenderer.get_templates_used() tracks includes
  • Implement TemplateRenderer with Jinja2
  • Write test: TemplateRenderer handles missing template gracefully

Phase 5: URL System

URL Normalization

  • Write test: URLNormalizer.normalize() adds trailing slash
  • Write test: URLNormalizer.normalize() lowercases URL
  • Write test: URLNormalizer.normalize() collapses multiple slashes
  • Write test: URLNormalizer.normalize() ensures leading slash
  • Implement URLNormalizer class
  • Write test: URLNormalizer.normalize() is idempotent

Permalink Generation

  • Write test: PermalinkGenerator.parse_template() finds placeholders
  • Write test: PermalinkGenerator.parse_template() handles format specs ({year:04d})
  • Write test: PermalinkGenerator.apply_template() substitutes category
  • Write test: PermalinkGenerator.apply_template() substitutes date parts
  • Write test: PermalinkGenerator.apply_template() substitutes slug
  • Implement PermalinkGenerator class
  • Write test: PermalinkGenerator.generate() returns normalized URL
  • Write test: PermalinkGenerator.resolve_output_path() appends index.html
  • Write test: PermalinkGenerator validates template has required placeholders

Collision Detection

  • Write test: CollisionDetector.build_url_map() creates mapping
  • Write test: CollisionDetector.find_collisions() detects duplicate URLs
  • Write test: CollisionDetector.find_collisions() normalizes before comparing
  • Write test: CollisionDetector.format_error() lists all source files
  • Implement CollisionDetector class
  • Write test: CollisionDetector.detect() handles empty list
  • Write test: CollisionDetector catches index URL vs content URL collision

Phase 6: Content Processing

Markdown Transformation

  • Write test: MarkdownTransformer.transform() converts headers
  • Write test: MarkdownTransformer.transform() converts lists
  • Write test: MarkdownTransformer.transform() handles code blocks
  • Write test: MarkdownTransformer.configure_extensions() enables extensions
  • Implement MarkdownTransformer class
  • Write test: MarkdownTransformer with tables extension

Content Processor

  • Write test: ContentProcessor.read_file() loads content
  • Write test: ContentProcessor.parse_content() splits frontmatter and body
  • Write test: ContentProcessor.should_process() returns true for .md files
  • Write test: ContentProcessor.process() combines all steps
  • Implement ContentProcessor class
  • Write test: ContentProcessor.process() sets templates_used
  • Write test: ContentProcessor.process() resolves metadata
  • Write test: ContentProcessor.process() renders through template

Phase 7: Cache System

Cache Key Computation

  • Write test: CacheManager.compute_cache_key() includes content hash
  • Write test: CacheManager.compute_cache_key() includes metadata
  • Write test: CacheManager.compute_cache_key() includes template hash
  • Write test: CacheManager.compute_cache_key() includes permalink template
  • Write test: CacheManager.compute_cache_key() is deterministic
  • Write test: CacheManager.compute_cache_key() changes when content changes
  • Implement CacheManager.compute_cache_key()

Cache Operations

  • Write test: CacheManager.needs_processing() returns true on cold start
  • Write test: CacheManager.needs_processing() returns false on cache hit
  • Write test: CacheManager.needs_processing() returns true when key differs
  • Write test: CacheManager.load_cached() returns stored HTML
  • Write test: CacheManager.mark_processed() stores cache entry
  • Implement CacheManager class with in-memory storage
  • Write test: CacheManager persists across instances

Manifest Management

  • Write test: ManifestManager.load() parses JSON manifest
  • Write test: ManifestManager.load() returns None if missing
  • Write test: ManifestManager.save_atomic() writes to temp then renames
  • Write test: ManifestManager.save_atomic() includes all item fields
  • Write test: ManifestManager.compare() finds orphaned files
  • Implement ManifestManager class
  • Write test: ManifestManager handles corrupted manifest gracefully
  • Write test: ManifestManager includes template hashes in manifest

Phase 8: Index System

Pagination

  • Write test: Paginator.paginate() splits items by page_size
  • Write test: Paginator.paginate() handles empty list
  • Write test: Paginator.paginate() handles partial last page
  • Write test: Paginator.get_page_url() generates /page/2/ format
  • Write test: Paginator.get_page_url() generates /category/page/2/ for categories
  • Write test: Paginator.get_page_url() generates /index.html for page 1
  • Implement Paginator class

Index Generation

  • Write test: IndexGenerator.create_main_index() sorts by date desc
  • Write test: IndexGenerator.create_main_index() paginates correctly
  • Write test: IndexGenerator.create_category_indexes() groups by category
  • Write test: IndexGenerator.compute_cache_key() includes items_on_page
  • Write test: IndexGenerator.compute_cache_key() includes pagination context
  • Write test: IndexGenerator.compute_cache_key() includes total_pages
  • Implement IndexGenerator class
  • Write test: IndexGenerator.should_rebuild() detects membership change
  • Write test: IndexGenerator.should_rebuild() detects ordering change
  • Write test: IndexGenerator.generate() creates IndexItems with correct URLs

Phase 9: Stage 1 - SCAN

Directory Walking

  • Write test: ScanStage.walk_directory() finds all files
  • Write test: ScanStage.walk_directory() skips blacklisted paths
  • Write test: ScanStage.walk_directory() skips dot directories
  • Write test: ScanStage.walk_directory() skips output directory
  • Write test: ScanStage.should_ignore() matches blacklist patterns
  • Implement ScanStage.walk_directory()

File Classification

  • Write test: ScanStage.classify_file() returns ‘content’ for .md
  • Write test: ScanStage.classify_file() returns ‘asset’ for .css
  • Write test: ScanStage.classify_file() returns ‘asset’ for images
  • Implement ScanStage.classify_file()

BuildItem Creation

  • Write test: ScanStage.create_build_item() extracts initial_slug from filename
  • Write test: ScanStage.create_build_item() extracts initial_category from parent
  • Write test: ScanStage.create_build_item() handles top-level files
  • Write test: ScanStage.create_build_item() captures file stats
  • Implement ScanStage.create_build_item()

Scan Integration

  • Write test: ScanStage.run() returns sorted list of BuildItems
  • Write test: ScanStage.run() marks all items as SCANNED
  • Write test: ScanStage.run() on sample directory structure
  • Implement ScanStage.run()
  • Write test: ScanStage deterministic ordering (same input → same output)

Phase 10: Stage 2 - BUILD Phase 1

Content Item Processing

  • Write test: BuildStage.process_content_item() reads file
  • Write test: BuildStage.process_content_item() parses frontmatter
  • Write test: BuildStage.process_content_item() merges metadata
  • Write test: BuildStage.process_content_item() transforms markdown
  • Write test: BuildStage.process_content_item() renders template
  • Write test: BuildStage.process_content_item() computes cache_key
  • Write test: BuildStage.process_content_item() generates permalink
  • Implement BuildStage.process_content_item()

Asset Item Processing

  • Write test: BuildStage.process_asset_item() computes cache_key
  • Write test: BuildStage.process_asset_item() generates output path
  • Write test: BuildStage.process_asset_item() preserves relative path
  • Implement BuildStage.process_asset_item()

Cache Integration

  • Write test: BuildStage.run_phase_one() checks cache before processing
  • Write test: BuildStage.run_phase_one() loads cached HTML on hit
  • Write test: BuildStage.run_phase_one() processes only cache misses
  • Write test: BuildStage.run_phase_one() assigns cache_key to all items
  • Implement BuildStage.run_phase_one()

Error Collection

  • Write test: BuildStage.collect_errors() accumulates parse errors
  • Write test: BuildStage.collect_errors() accumulates template errors
  • Write test: BuildStage.collect_errors() continues processing after errors
  • Implement BuildStage.collect_errors()

Phase 11: Stage 2 - BUILD Phase 2

Index Creation

  • Write test: BuildStage.run_phase_two() generates main index
  • Write test: BuildStage.run_phase_two() generates category indexes
  • Write test: BuildStage.run_phase_two() uses cache_keys from phase 1
  • Write test: BuildStage.run_phase_two() computes index cache keys
  • Write test: BuildStage.run_phase_two() checks index cache
  • Write test: BuildStage.run_phase_two() renders index templates
  • Implement BuildStage.run_phase_two()

Collision Detection Integration

  • Write test: BuildStage.detect_collisions() checks content items
  • Write test: BuildStage.detect_collisions() checks index items
  • Write test: BuildStage.detect_collisions() normalizes URLs first
  • Write test: BuildStage.detect_collisions() returns CollisionErrors
  • Implement BuildStage.detect_collisions()

Build Integration

  • Write test: BuildStage.run() executes phase 1 then phase 2
  • Write test: BuildStage.run() detects collisions after phase 2
  • Write test: BuildStage.run() returns errors if any found
  • Write test: BuildStage.run() returns built items if no errors
  • Implement BuildStage.run()
  • Write test: BuildStage.run() on complete sample site

Phase 12: Stage 3 - WRITE

Output Writing

  • Write test: OutputWriter.write_text() creates parent directories
  • Write test: OutputWriter.write_text() writes UTF-8 content
  • Write test: OutputWriter.copy_file() preserves mtime for assets
  • Write test: OutputWriter.ensure_directory() creates nested paths
  • Write test: OutputWriter.fsync() ensures durability
  • Implement OutputWriter class

Timestamped Directories

  • Write test: WriteStage.create_timestamped_directory() uses YYYYMMDD_HHMMSS format
  • Write test: WriteStage.create_timestamped_directory() creates directory
  • Implement WriteStage.create_timestamped_directory()

Atomic Swapping

  • Write test: AtomicSwapper.verify_symlink_support() detects platform support
  • Write test: AtomicSwapper.create_symlink() creates symlink
  • Write test: AtomicSwapper.update_symlink_atomic() uses temp+rename pattern
  • Write test: AtomicSwapper.swap() updates symlink atomically
  • Implement AtomicSwapper class
  • Write test: AtomicSwapper.swap() preserves old symlink target on failure

Manifest Writing

  • Write test: WriteStage.write_manifest() before symlink update
  • Write test: WriteStage.write_manifest() includes all items
  • Write test: WriteStage.write_manifest() includes template hashes
  • Write test: WriteStage.write_manifest() uses atomic write
  • Implement WriteStage.write_manifest()

Cleanup

  • Write test: OutputCleaner.find_orphans() compares manifests
  • Write test: OutputCleaner.delete_old_outputs() keeps N recent
  • Write test: OutputCleaner.keep_recent() sorts by timestamp
  • Write test: OutputCleaner.cleanup() removes orphan files
  • Implement OutputCleaner class
  • Write test: OutputCleaner.cleanup() skips non-empty directories

Write Integration

  • Write test: WriteStage.run() creates timestamped directory
  • Write test: WriteStage.run() writes all items to temp directory
  • Write test: WriteStage.run() writes manifest
  • Write test: WriteStage.run() updates symlink
  • Write test: WriteStage.run() cleans old outputs
  • Write test: WriteStage.run() on complete sample site
  • Implement WriteStage.run()
  • Write test: WriteStage.run() leaves previous output on failure

Phase 13: Pipeline Integration

Build Context

  • Write test: BuildContext.add_item() stores item
  • Write test: BuildContext.get_items() returns all items
  • Write test: BuildContext filters items by state
  • Implement BuildContext class

Pipeline Orchestration

  • Write test: Pipeline.run() executes SCAN stage
  • Write test: Pipeline.run() executes BUILD stage
  • Write test: Pipeline.run() executes WRITE stage
  • Write test: Pipeline.run() aborts before WRITE if BUILD errors
  • Write test: Pipeline.run() returns success on complete build
  • Implement Pipeline.run()
  • Write test: Pipeline.run_dry() skips WRITE stage
  • Implement Pipeline.run_dry()

End-to-End Tests

  • Write test: Full pipeline on minimal site (1 page, 1 asset)
  • Write test: Full pipeline on multi-category site
  • Write test: Full pipeline with pagination (11+ posts)
  • Write test: Incremental build (change 1 file, rebuild)
  • Write test: Template change invalidation
  • Write test: Collision detection prevents build
  • Write test: Invalid frontmatter aborts before WRITE
  • Write test: Second build uses cache (fast rebuild)

Phase 14: Edge Cases & Robustness

Encoding & Parsing

  • Write test: ContentProcessor handles UTF-8 with BOM
  • Write test: ContentProcessor handles mixed line endings
  • Write test: MetadataExtractor handles TOML frontmatter
  • Write test: MetadataExtractor handles missing frontmatter delimiter

Template Edge Cases

  • Write test: TemplateRenderer handles template with no includes
  • Write test: TemplateRenderer handles deeply nested includes
  • Write test: TemplateDependencyTracker handles partial with same name as template

Permalink Edge Cases

  • Write test: PermalinkGenerator handles missing date in metadata
  • Write test: PermalinkGenerator handles empty category
  • Write test: PermalinkGenerator handles special characters in slug
  • Write test: PermalinkGenerator validates required placeholders present

Cache Edge Cases

  • Write test: CacheManager handles corrupted cache gracefully
  • Write test: CacheManager handles schema version mismatch
  • Write test: ManifestManager handles missing manifest file

Write Edge Cases

  • Write test: WriteStage handles disk full error
  • Write test: WriteStage handles permission denied
  • Write test: AtomicSwapper handles existing symlink
  • Write test: AtomicSwapper rollback on failure
  • Write test: OutputCleaner handles permission errors on delete

Index Edge Cases

  • Write test: IndexGenerator handles zero posts
  • Write test: IndexGenerator handles exactly one page of posts
  • Write test: Paginator handles page_size=1
  • Write test: IndexGenerator handles category with one post

Phase 15: Performance Validation

Benchmarks

  • Benchmark: SCAN 1000 files
  • Benchmark: BUILD 1000 posts (cold)
  • Benchmark: BUILD 1000 posts with 1 change (incremental)
  • Benchmark: Template change affecting 100 posts
  • Benchmark: WRITE 1000 files
  • Benchmark: Full pipeline 1000 posts

Memory Profiling

  • Profile: Memory usage during BUILD with 1000 posts
  • Profile: Memory usage during index generation
  • Test: Verify no memory leaks across multiple builds

Stress Tests

  • Test: Build with 5000 posts
  • Test: Site with 50 categories
  • Test: Site with 1000 tags
  • Test: Deeply nested directory structure (10+ levels)
  • Test: Very long post (10MB markdown file)
  • Test: 100+ posts in single category (pagination)

Implementation Notes

Test Fixtures Structure

tests/
├── fixtures/
│   ├── minimal_site/
│   │   ├── content/
│   │   │   └── hello.md
│   │   ├── templates/
│   │   │   └── default.html
│   │   └── config.toml
│   ├── multi_category/
│   │   ├── content/
│   │   │   ├── python/
│   │   │   │   ├── intro.md
│   │   │   │   └── advanced.md
│   │   │   └── rust/
│   │   │       └── ownership.md
│   │   └── templates/
│   │       ├── default.html
│   │       └── python.html
│   └── pagination_site/
│       ├── content/
│       │   └── posts/ (15 markdown files)
│       └── templates/
│           ├── default.html
│           └── index.html
└── unit/
    ├── test_models.py
    ├── test_metadata.py
    ├── test_templates.py
    ├── test_urls.py
    ├── test_cache.py
    ├── test_indexes.py
    └── test_stages.py