07 Nov 2025

SSGv3 Implementation Checklist - Test-First Approach

Testing Strategy

Unit tests: Test each component in isolation with mocks
Integration tests: Test stage interactions with real file fixtures
E2E tests: Full pipeline runs on sample sites
Property tests: Use Hypothesis to check cache key determinism and URL normalization idempotence

Test Data Principles

Use realistic content (actual markdown with frontmatter)
Cover boundary conditions (empty files, missing fields)
Include Unicode (non-ASCII) characters in slugs and content
Test with various date formats and timezones
Include invalid inputs to verify error handling

Phase 1: Foundation - Data Structures & Models

Core Models

Write test: BuildItem creation with minimal fields
Write test: BuildItem state transitions (SCANNED → BUILT → WRITTEN)
Write test: BuildItem.to_dict() serialization
Write test: BuildItem.from_dict() deserialization
Implement BuildItem base class
Write test: ContentItem with templates_used list
Write test: AssetItem with preserve_mtime flag
Write test: IndexItem with pagination fields
Implement ContentItem, AssetItem, IndexItem subclasses
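
A minimal sketch of what the Core Models tests above could pin down, assuming a dataclass-based design. The src and metadata fields, the BuildState enum, and the advance() helper are illustrative assumptions; only the SCANNED → BUILT → WRITTEN order, to_dict() and from_dict() come from the checklist.

```python
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Any, Dict


class BuildState(Enum):
    SCANNED = "scanned"
    BUILT = "built"
    WRITTEN = "written"


# Allowed forward transitions, following the SCANNED -> BUILT -> WRITTEN order.
_NEXT_STATE = {
    BuildState.SCANNED: BuildState.BUILT,
    BuildState.BUILT: BuildState.WRITTEN,
}


@dataclass
class BuildItem:
    src: str  # hypothetical field: path of the source file
    state: BuildState = BuildState.SCANNED
    metadata: Dict[str, Any] = field(default_factory=dict)  # hypothetical field

    def advance(self) -> None:
        """Move to the next state; raise if the item is already WRITTEN."""
        next_state = _NEXT_STATE.get(self.state)
        if next_state is None:
            raise ValueError(f"no transition out of {self.state.name}")
        self.state = next_state

    def to_dict(self) -> Dict[str, Any]:
        data = asdict(self)
        data["state"] = self.state.value  # store the enum as a plain string
        return data

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "BuildItem":
        return cls(
            src=data["src"],
            state=BuildState(data["state"]),
            metadata=dict(data.get("metadata", {})),
        )
```

A round trip through to_dict() and from_dict() should return an equal item, which is a convenient single assertion for the serialization tests.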

Error Models

Write test: SSGError with message and suggestion
Write test: BuildError with code, src, message
Write test: CollisionError with url and multiple sources
Write test: TemplateError for missing template
Implement error class hierarchy
Write test: ErrorCollector accumulates multiple errors
Write test: ErrorCollector.format_report() output
Implement ErrorCollector class
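
One possible shape for the error hierarchy and collector, sketched under assumptions: the add() and has_errors() methods and the exact report layout are hypothetical, since the checklist only names the classes, their fields, and format_report().

```python
from typing import List


class SSGError(Exception):
    """Base error carrying a message and an optional suggestion."""

    def __init__(self, message: str, suggestion: str = "") -> None:
        super().__init__(message)
        self.message = message
        self.suggestion = suggestion


class BuildError(SSGError):
    """Error tied to a source file, with a short error code."""

    def __init__(self, code: str, src: str, message: str, suggestion: str = "") -> None:
        super().__init__(message, suggestion)
        self.code = code
        self.src = src


class ErrorCollector:
    def __init__(self) -> None:
        self.errors: List[SSGError] = []

    def add(self, error: SSGError) -> None:  # hypothetical method name
        self.errors.append(error)

    def has_errors(self) -> bool:  # hypothetical method name
        return bool(self.errors)

    def format_report(self) -> str:
        # Assumed layout: a count line, then one line per error plus a hint line.
        lines = [f"{len(self.errors)} error(s):"]
        for err in self.errors:
            prefix = f"[{err.code}] {err.src}: " if isinstance(err, BuildError) else ""
            lines.append(f"  {prefix}{err.message}")
            if err.suggestion:
                lines.append(f"    hint: {err.suggestion}")
        return "\n".join(lines)
```

Collecting errors instead of raising on the first one is what lets a stage continue past a bad file and still report every problem at the end.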

Configuration

Write test: Config loads from TOML with all fields
Write test: Config validates content_dir exists
Write test: Config validates output_dir not inside content_dir
Write test: Config validates permalink template syntax
Write test: Config blacklist includes default patterns
Implement Config dataclass with validation
Write test: Config detects invalid page_size (< 1)
Write test: Config normalizes paths to absolute

Phase 2: Utilities Layer

File Utilities

Write test: FileUtils.read_file() handles UTF-8
Write test: FileUtils.read_file() raises on invalid encoding
Write test: FileUtils.stat_file() returns mtime and size
Write test: FileUtils.is_ignored() matches patterns
Implement FileUtils class
Write test: FileUtils handles symlinks (don’t follow by default)

Date Parsing

Write test: DateParser.parse() handles ISO format
Write test: DateParser.parse() handles “YYYY-MM-DD” format
Write test: DateParser.parse() handles human formats (“Jan 15, 2025”)
Write test: DateParser.to_iso() produces standard format
Write test: DateParser.extract_from_filename() finds “YYYY-MM-DD-title.md”
Implement DateParser class
Write test: DateParser handles timezones consistently
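
The parsing tests above could be driven against something like the following sketch. It is standard library only and makes two simplifying assumptions: it returns a plain date (dropping any time and UTC offset, so it does not address the timezone test), and it tries ISO first, then a fixed list of human formats.

```python
import re
from datetime import date, datetime
from typing import Optional

# Matches a leading "YYYY-MM-DD-" prefix such as "2025-01-15-title.md".
_FILENAME_DATE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-")

# Human-readable formats tried after ISO parsing fails (assumed list).
_HUMAN_FORMATS = ("%b %d, %Y", "%B %d, %Y")


def parse(text: str) -> date:
    """Parse an ISO or human-readable date string into a date."""
    text = text.strip()
    try:
        return datetime.fromisoformat(text).date()
    except ValueError:
        pass
    for fmt in _HUMAN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def extract_from_filename(filename: str) -> Optional[date]:
    """Return the date encoded in a filename prefix, or None."""
    match = _FILENAME_DATE.match(filename)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        return None  # digits matched but do not form a real calendar date
```

Note that %b and %B depend on the process locale, so tests for the human formats are most reliable when run under an English or C locale.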

Path Resolution

Write test: PathResolver.resolve() makes paths absolute
Write test: PathResolver.make_relative() produces relative path
Write test: PathResolver.is_within() detects containment
Write test: PathResolver.is_within() rejects traversal attempts
Implement PathResolver class
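
A containment check along the lines of PathResolver.is_within() might look like this sketch (written as a free function for brevity). Resolving both paths first is what defeats ".." traversal, and comparing path components rather than string prefixes avoids treating content-old as inside content.

```python
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


def is_within(child: PathLike, parent: PathLike) -> bool:
    """Return True if child is parent itself or located somewhere below it."""
    child_resolved = Path(child).resolve()
    parent_resolved = Path(parent).resolve()
    try:
        # relative_to() raises ValueError when child is not under parent.
        child_resolved.relative_to(parent_resolved)
    except ValueError:
        return False
    return True
```

The same check could back the Config rule that output_dir must not sit inside content_dir.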

Slug Generation

Write test: SlugGenerator.generate() lowercases input
Write test: SlugGenerator.generate() transliterates Unicode to ASCII (ö→o, é→e)
Write test: SlugGenerator.generate() removes punctuation except hyphens
Write test: SlugGenerator.generate() collapses whitespace to single hyphen
Write test: SlugGenerator.generate() strips leading/trailing hyphens
Implement SlugGenerator class
Write test: SlugGenerator with preserve_unicode=True keeps diacritics
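
The slug rules listed above can be sketched with unicodedata and a couple of regular expressions. This is an illustration, not the project's implementation: transliteration here simply drops combining marks after NFKD decomposition, so characters without an ASCII decomposition (for example ß) are removed rather than mapped, and underscores are left untouched.

```python
import re
import unicodedata


def generate(text: str, preserve_unicode: bool = False) -> str:
    """Turn arbitrary text into a lowercase, hyphen-separated slug."""
    if preserve_unicode:
        # Keep diacritics, but use one canonical composed form.
        text = unicodedata.normalize("NFC", text)
    else:
        # Decompose accented letters, then drop everything non-ASCII.
        decomposed = unicodedata.normalize("NFKD", text)
        text = decomposed.encode("ascii", "ignore").decode("ascii")
    text = text.lower()
    text = re.sub(r"[^\w\s-]", "", text)  # remove punctuation except hyphens
    text = re.sub(r"[\s-]+", "-", text)   # collapse whitespace/hyphen runs
    return text.strip("-")                # no leading or trailing hyphens
```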

Phase 3: Metadata System

Metadata Extraction

Write test: MetadataExtractor.parse_frontmatter() extracts YAML
Write test: MetadataExtractor.parse_frontmatter() handles empty frontmatter
Write test: MetadataExtractor.parse_frontmatter() raises on invalid YAML
Write test: MetadataExtractor.get_system_defaults() uses filename for slug
Write test: MetadataExtractor.get_system_defaults() uses mtime for date
Write test: MetadataExtractor.get_path_derived() extracts category from parent
Write test: MetadataExtractor.get_path_derived() handles top-level files
Implement MetadataExtractor class
Write test: MetadataExtractor.merge_metadata() respects precedence
Write test: MetadataExtractor.extract() combines all three sources
Write test: Frontmatter category overrides directory structure
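
The three-source merge could be as small as the sketch below. The checklist confirms that frontmatter overrides the directory-derived category; the assumption made here is the full order: system defaults lowest, then path-derived values, then frontmatter. The mtime_iso parameter is a hypothetical stand-in for the file's modification time.

```python
from pathlib import PurePosixPath
from typing import Any, Dict


def get_system_defaults(rel_path: str, mtime_iso: str) -> Dict[str, Any]:
    """Lowest-precedence values: slug from the filename, date from mtime."""
    return {"slug": PurePosixPath(rel_path).stem, "date": mtime_iso}


def get_path_derived(rel_path: str) -> Dict[str, Any]:
    """Category from the parent directory; top-level files get none."""
    parent = PurePosixPath(rel_path).parent.name
    return {"category": parent} if parent else {}


def merge_metadata(
    defaults: Dict[str, Any],
    path_derived: Dict[str, Any],
    frontmatter: Dict[str, Any],
) -> Dict[str, Any]:
    # Later dicts win, so frontmatter overrides path-derived values,
    # which in turn override system defaults.
    return {**defaults, **path_derived, **frontmatter}
```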

Phase 4: Template System

Template Dependency Tracking

Write test: TemplateDependencyTracker.build_graph() finds includes
Write test: TemplateDependencyTracker.build_graph() parses Jinja2 template syntax
Write test: TemplateDependencyTracker.detect_cycles() finds A→B→A
Write test: TemplateDependencyTracker.detect_cycles() handles self-reference
Write test: TemplateDependencyTracker.detect_cycles() returns empty for DAG
Implement TemplateDependencyTracker class
Write test: TemplateDependencyTracker.compute_hash() includes content
Write test: TemplateDependencyTracker.compute_hash() covers all transitively included templates
Write test: TemplateDependencyTracker.compute_hash() is deterministic
Write test: TemplateDependencyTracker raises error on circular dependency
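
Cycle detection and the transitive hash can both be sketched over a plain adjacency mapping. The graph shape (template name to list of included names) and the sources mapping are assumptions for illustration; extracting includes from Jinja2 templates is left out here.

```python
import hashlib
from typing import Dict, List


def detect_cycles(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Depth-first search; returns one path per back edge, [] for a DAG."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    stack: List[str] = []
    cycles: List[List[str]] = []

    def visit(node: str) -> None:
        color[node] = GRAY
        stack.append(node)
        for dep in sorted(graph.get(node, [])):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is on the current path, so the path from dep closes a cycle.
                cycles.append(stack[stack.index(dep):] + [dep])
            elif state == WHITE:
                visit(dep)
        stack.pop()
        color[node] = BLACK

    for node in sorted(graph):
        if color[node] == WHITE:
            visit(node)
    return cycles


def compute_hash(name: str, graph: Dict[str, List[str]], sources: Dict[str, str]) -> str:
    """Hash a template together with everything it includes, transitively."""
    seen = set()
    pending = [name]
    while pending:
        current = pending.pop()
        if current not in seen:
            seen.add(current)
            pending.extend(graph.get(current, []))
    digest = hashlib.sha256()
    for template in sorted(seen):  # sorted order keeps the hash deterministic
        digest.update(template.encode("utf-8") + b"\0")
        digest.update(sources[template].encode("utf-8") + b"\0")
    return digest.hexdigest()
```

Because compute_hash() tracks visited templates, it terminates even on a cyclic graph, but a cycle should still be reported as an error before hashing.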

Template Selection

Write test: TemplateSelector.select() uses frontmatter template if present
Write test: TemplateSelector.select() uses category template if exists
Write test: TemplateSelector.select() falls back to default.html
Write test: TemplateSelector.resolve_category_template() checks file exists
Implement TemplateSelector class
Write test: TemplateSelector raises error if default.html missing

Template Rendering

Write test: TemplateRenderer.load_templates() loads from directory
Write test: TemplateRenderer.render() produces HTML
Write test: TemplateRenderer.render() provides content in context
Write test: TemplateRenderer.render() provides metadata in context
Write test: TemplateRenderer.get_templates_used() tracks includes
Implement TemplateRenderer with Jinja2
Write test: TemplateRenderer handles missing template gracefully

Phase 5: URL System

URL Normalization

Write test: URLNormalizer.normalize() adds trailing slash
Write test: URLNormalizer.normalize() lowercases URL
Write test: URLNormalizer.normalize() collapses multiple slashes
Write test: URLNormalizer.normalize() ensures leading slash
Implement URLNormalizer class
Write test: URLNormalizer.normalize() is idempotent
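
The four normalization rules above compose into a short function. This sketch assumes it is applied to page URLs only (an asset such as /style.css would not want a trailing slash) and ignores query strings and fragments.

```python
import re


def normalize(url: str) -> str:
    """Lowercase, single slashes, leading slash, trailing slash."""
    url = "/" + url.strip().lower()    # ensure a leading slash
    url = re.sub(r"/{2,}", "/", url)   # collapse runs of slashes
    if not url.endswith("/"):
        url += "/"                     # add the trailing slash
    return url
```

Idempotence follows from each step being a no-op on already-normalized input, which is the property the Hypothesis test would assert: normalize(normalize(u)) == normalize(u) for any u.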

Permalink Generation

Write test: PermalinkGenerator.parse_template() finds placeholders
Write test: PermalinkGenerator.parse_template() handles format specs ({year:04d})
Write test: PermalinkGenerator.apply_template() substitutes category
Write test: PermalinkGenerator.apply_template() substitutes date parts
Write test: PermalinkGenerator.apply_template() substitutes slug
Implement PermalinkGenerator class
Write test: PermalinkGenerator.generate() returns normalized URL
Write test: PermalinkGenerator.resolve_output_path() appends index.html
Write test: PermalinkGenerator validates template has required placeholders
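
Python's own format-string machinery covers most of the permalink tests above, including format specs such as {year:04d}. In this sketch the set of required placeholders (just slug) and the month and day placeholder names are assumptions; the checklist does not say which placeholders are mandatory.

```python
from datetime import date
from string import Formatter
from typing import List

REQUIRED_PLACEHOLDERS = {"slug"}  # assumption: only the slug is mandatory


def parse_template(template: str) -> List[str]:
    """Return placeholder names in order of appearance."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def validate_template(template: str) -> None:
    missing = REQUIRED_PLACEHOLDERS - set(parse_template(template))
    if missing:
        raise ValueError(f"permalink template is missing {sorted(missing)}")


def apply_template(template: str, *, category: str, slug: str, when: date) -> str:
    # Unused keyword arguments are ignored by str.format().
    return template.format(
        category=category, slug=slug,
        year=when.year, month=when.month, day=when.day,
    )


def resolve_output_path(url: str) -> str:
    """Map a directory-style URL to a relative index.html path."""
    trimmed = url.strip("/")
    return f"{trimmed}/index.html" if trimmed else "index.html"
```

Formatter().parse() also raises ValueError on malformed templates such as an unclosed brace, which could serve the Config test for permalink template syntax.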

Collision Detection

Write test: CollisionDetector.build_url_map() creates mapping
Write test: CollisionDetector.find_collisions() detects duplicate URLs
Write test: CollisionDetector.find_collisions() normalizes before comparing
Write test: CollisionDetector.format_error() lists all source files
Implement CollisionDetector class
Write test: CollisionDetector.detect() handles empty list
Write test: CollisionDetector catches index URL vs content URL collision
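
Collision detection reduces to grouping source files by normalized URL. The sketch below takes (url, source) pairs, which is an assumed input shape, and carries its own small _normalize() stand-in so it runs in isolation; the real detector would reuse URLNormalizer.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def _normalize(url: str) -> str:
    # Stand-in for URLNormalizer.normalize(): lowercase, single slashes,
    # leading and trailing slash.
    url = re.sub(r"/{2,}", "/", "/" + url.strip().lower())
    return url if url.endswith("/") else url + "/"


def find_collisions(items: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each contested URL to the sorted list of sources claiming it."""
    url_map: Dict[str, List[str]] = defaultdict(list)
    for url, src in items:
        url_map[_normalize(url)].append(src)
    return {url: sorted(srcs) for url, srcs in url_map.items() if len(srcs) > 1}


def format_error(url: str, sources: List[str]) -> str:
    listing = "\n".join(f"  - {src}" for src in sources)
    return f"URL collision at {url}:\n{listing}"
```

Feeding index items through the same function is enough to catch the index-versus-content case, since both end up as entries keyed by URL.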

Phase 6: Content Processing

Markdown Transformation

Write test: MarkdownTransformer.transform() converts headers
Write test: MarkdownTransformer.transform() converts lists
Write test: MarkdownTransformer.transform() handles code blocks
Write test: MarkdownTransformer.configure_extensions() enables extensions
Implement MarkdownTransformer class
Write test: MarkdownTransformer with tables extension

Content Processor

Write test: ContentProcessor.read_file() loads content
Write test: ContentProcessor.parse_content() splits frontmatter and body
Write test: ContentProcessor.should_process() returns true for .md files
Write test: ContentProcessor.process() combines all steps
Implement ContentProcessor class
Write test: ContentProcessor.process() sets templates_used
Write test: ContentProcessor.process() resolves metadata
Write test: ContentProcessor.process() renders through template

Phase 7: Cache System

Cache Key Computation

Write test: CacheManager.compute_cache_key() includes content hash
Write test: CacheManager.compute_cache_key() includes metadata
Write test: CacheManager.compute_cache_key() includes template hash
Write test: CacheManager.compute_cache_key() includes permalink template
Write test: CacheManager.compute_cache_key() is deterministic
Write test: CacheManager.compute_cache_key() changes when content changes
Implement CacheManager.compute_cache_key()
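
A deterministic key over the four inputs named above can be built by hashing a canonical JSON encoding. The function signature is an assumption; the important detail is sort_keys=True, so that dictionary ordering in the metadata never changes the key.

```python
import hashlib
import json
from typing import Any, Dict


def compute_cache_key(
    content: bytes,
    metadata: Dict[str, Any],
    template_hash: str,
    permalink_template: str,
) -> str:
    payload = {
        "content": hashlib.sha256(content).hexdigest(),
        "metadata": metadata,
        "template": template_hash,
        "permalink": permalink_template,
    }
    # Canonical form: sorted keys, fixed separators; default=str covers
    # values such as dates that json cannot encode natively.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```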

Cache Operations

Write test: CacheManager.needs_processing() returns true on cold start
Write test: CacheManager.needs_processing() returns false on cache hit
Write test: CacheManager.needs_processing() returns true when key differs
Write test: CacheManager.load_cached() returns stored HTML
Write test: CacheManager.mark_processed() stores cache entry
Implement CacheManager class with in-memory storage
Write test: CacheManager persists across instances

Manifest Management

Write test: ManifestManager.load() parses JSON manifest
Write test: ManifestManager.load() returns None if missing
Write test: ManifestManager.save_atomic() writes to temp then renames
Write test: ManifestManager.save_atomic() includes all item fields
Write test: ManifestManager.compare() finds orphaned files
Implement ManifestManager class
Write test: ManifestManager handles corrupted manifest gracefully
Write test: ManifestManager includes template hashes in manifest
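
The temp-then-rename pattern behind save_atomic() can be sketched as follows. Treating a corrupted manifest the same as a missing one (returning None so the build falls back to a cold start) is an assumption about what "gracefully" means here.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


def save_atomic(path: Path, manifest: Dict[str, Any]) -> None:
    """Write to a temp file in the same directory, then rename over path."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(manifest, handle, sort_keys=True, indent=2)
            handle.flush()
            os.fsync(handle.fileno())  # push data to disk before the rename
        os.replace(tmp_name, path)     # atomic within a single filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)        # do not leave partial temp files behind
        raise


def load(path: Path) -> Optional[Dict[str, Any]]:
    """Return the manifest, or None if it is missing or unreadable."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None
```

Creating the temp file next to the target matters: os.replace() is only atomic when source and destination are on the same filesystem.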

Phase 8: Index System

Pagination

Write test: Paginator.paginate() splits items by page_size
Write test: Paginator.paginate() handles empty list
Write test: Paginator.paginate() handles partial last page
Write test: Paginator.get_page_url() generates /page/2/ format
Write test: Paginator.get_page_url() generates /category/page/2/ for categories
Write test: Paginator.get_page_url() generates /index.html for page 1
Implement Paginator class
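
The pagination tests above map onto two small functions. The URL shapes follow the checklist; the one assumption is that page 1 of a category lives at /category/index.html, by analogy with /index.html for the main index.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def paginate(items: Sequence[T], page_size: int) -> List[List[T]]:
    """Split items into consecutive pages; the last page may be partial."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return [list(items[i:i + page_size]) for i in range(0, len(items), page_size)]


def get_page_url(page: int, category: str = "") -> str:
    """URL for a 1-based page number, optionally scoped to a category."""
    base = f"/{category}/" if category else "/"
    if page == 1:
        return f"{base}index.html"
    return f"{base}page/{page}/"
```

With this definition an empty item list yields zero pages, so the index generator has to decide separately whether to emit an empty page 1.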

Index Generation

Write test: IndexGenerator.create_main_index() sorts by date desc
Write test: IndexGenerator.create_main_index() paginates correctly
Write test: IndexGenerator.create_category_indexes() groups by category
Write test: IndexGenerator.compute_cache_key() includes items_on_page
Write test: IndexGenerator.compute_cache_key() includes pagination context
Write test: IndexGenerator.compute_cache_key() includes total_pages
Implement IndexGenerator class
Write test: IndexGenerator.should_rebuild() detects membership change
Write test: IndexGenerator.should_rebuild() detects ordering change
Write test: IndexGenerator.generate() creates IndexItems with correct URLs

Phase 9: Stage 1 - SCAN

Directory Walking

Write test: ScanStage.walk_directory() finds all files
Write test: ScanStage.walk_directory() skips blacklisted paths
Write test: ScanStage.walk_directory() skips dot directories
Write test: ScanStage.walk_directory() skips output directory
Write test: ScanStage.should_ignore() matches blacklist patterns
Implement ScanStage.walk_directory()

File Classification

Write test: ScanStage.classify_file() returns ‘content’ for .md
Write test: ScanStage.classify_file() returns ‘asset’ for .css
Write test: ScanStage.classify_file() returns ‘asset’ for images
Implement ScanStage.classify_file()

BuildItem Creation

Write test: ScanStage.create_build_item() extracts initial_slug from filename
Write test: ScanStage.create_build_item() extracts initial_category from parent
Write test: ScanStage.create_build_item() handles top-level files
Write test: ScanStage.create_build_item() captures file stats
Implement ScanStage.create_build_item()

Scan Integration

Write test: ScanStage.run() returns sorted list of BuildItems
Write test: ScanStage.run() marks all items as SCANNED
Write test: ScanStage.run() on sample directory structure
Implement ScanStage.run()
Write test: ScanStage deterministic ordering (same input → same output)

Phase 10: Stage 2 - BUILD, Phase One

Content Item Processing

Write test: BuildStage.process_content_item() reads file
Write test: BuildStage.process_content_item() parses frontmatter
Write test: BuildStage.process_content_item() merges metadata
Write test: BuildStage.process_content_item() transforms markdown
Write test: BuildStage.process_content_item() renders template
Write test: BuildStage.process_content_item() computes cache_key
Write test: BuildStage.process_content_item() generates permalink
Implement BuildStage.process_content_item()

Asset Item Processing

Write test: BuildStage.process_asset_item() computes cache_key
Write test: BuildStage.process_asset_item() generates output path
Write test: BuildStage.process_asset_item() preserves relative path
Implement BuildStage.process_asset_item()

Cache Integration

Write test: BuildStage.run_phase_one() checks cache before processing
Write test: BuildStage.run_phase_one() loads cached HTML on hit
Write test: BuildStage.run_phase_one() processes only cache misses
Write test: BuildStage.run_phase_one() assigns cache_key to all items
Implement BuildStage.run_phase_one()

Error Collection

Write test: BuildStage.collect_errors() accumulates parse errors
Write test: BuildStage.collect_errors() accumulates template errors
Write test: BuildStage.collect_errors() continues processing after errors
Implement BuildStage.collect_errors()

Phase 11: Stage 2 - BUILD, Phase Two

Index Creation

Write test: BuildStage.run_phase_two() generates main index
Write test: BuildStage.run_phase_two() generates category indexes
Write test: BuildStage.run_phase_two() uses cache_keys from phase 1
Write test: BuildStage.run_phase_two() computes index cache keys
Write test: BuildStage.run_phase_two() checks index cache
Write test: BuildStage.run_phase_two() renders index templates
Implement BuildStage.run_phase_two()

Collision Detection Integration

Write test: BuildStage.detect_collisions() checks content items
Write test: BuildStage.detect_collisions() checks index items
Write test: BuildStage.detect_collisions() normalizes URLs first
Write test: BuildStage.detect_collisions() returns CollisionErrors
Implement BuildStage.detect_collisions()

Build Integration

Write test: BuildStage.run() executes phase 1 then phase 2
Write test: BuildStage.run() detects collisions after phase 2
Write test: BuildStage.run() returns errors if any found
Write test: BuildStage.run() returns built items if no errors
Implement BuildStage.run()
Write test: BuildStage.run() on complete sample site

Phase 12: Stage 3 - WRITE

Output Writing

Write test: OutputWriter.write_text() creates parent directories
Write test: OutputWriter.write_text() writes UTF-8 content
Write test: OutputWriter.copy_file() preserves mtime for assets
Write test: OutputWriter.ensure_directory() creates nested paths
Write test: OutputWriter.fsync() ensures durability
Implement OutputWriter class

Timestamped Directories

Write test: WriteStage.create_timestamped_directory() uses YYYYMMDD_HHMMSS format
Write test: WriteStage.create_timestamped_directory() creates directory
Implement WriteStage.create_timestamped_directory()

Atomic Swapping

Write test: AtomicSwapper.verify_symlink_support() detects platform support
Write test: AtomicSwapper.create_symlink() creates symlink
Write test: AtomicSwapper.update_symlink_atomic() uses temp+rename pattern
Write test: AtomicSwapper.swap() updates symlink atomically
Implement AtomicSwapper class
Write test: AtomicSwapper.swap() preserves old symlink target on failure
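
On POSIX systems the temp+rename pattern for symlinks can be sketched as below. The ".tmp" suffix for the temporary link is an arbitrary choice, and the sketch assumes the live path is either absent or already a symlink; renaming over a real directory would fail.

```python
import os
from pathlib import Path


def swap(link: Path, target: Path) -> None:
    """Point link at target without a window where link is missing."""
    tmp_link = link.with_name(link.name + ".tmp")
    # Clear a stale temp link left behind by an interrupted run.
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()
    # If this fails, the existing link is untouched.
    os.symlink(target, tmp_link, target_is_directory=True)
    # rename() replaces the old symlink in a single step.
    os.replace(tmp_link, link)
```

Because the old link is only replaced by the final rename, a failure at any earlier step leaves the previous output being served, which is the behavior the last test in this section checks.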

Manifest Writing

Write test: WriteStage.write_manifest() before symlink update
Write test: WriteStage.write_manifest() includes all items
Write test: WriteStage.write_manifest() includes template hashes
Write test: WriteStage.write_manifest() uses atomic write
Implement WriteStage.write_manifest()

Cleanup

Write test: OutputCleaner.find_orphans() compares manifests
Write test: OutputCleaner.delete_old_outputs() keeps N recent
Write test: OutputCleaner.keep_recent() sorts by timestamp
Write test: OutputCleaner.cleanup() removes orphan files
Implement OutputCleaner class
Write test: OutputCleaner.cleanup() skips non-empty directories

Write Integration

Write test: WriteStage.run() creates timestamped directory
Write test: WriteStage.run() writes all items to temp directory
Write test: WriteStage.run() writes manifest
Write test: WriteStage.run() updates symlink
Write test: WriteStage.run() cleans old outputs
Write test: WriteStage.run() on complete sample site
Implement WriteStage.run()
Write test: WriteStage.run() leaves previous output on failure

Phase 13: Pipeline Integration

Build Context

Write test: BuildContext.add_item() stores item
Write test: BuildContext.get_items() returns all items
Write test: BuildContext filters items by state
Implement BuildContext class

Pipeline Orchestration

Write test: Pipeline.run() executes SCAN stage
Write test: Pipeline.run() executes BUILD stage
Write test: Pipeline.run() executes WRITE stage
Write test: Pipeline.run() aborts before WRITE if BUILD errors
Write test: Pipeline.run() returns success on complete build
Implement Pipeline.run()
Write test: Pipeline.run_dry() skips WRITE stage
Implement Pipeline.run_dry()

End-to-End Tests

Write test: Full pipeline on minimal site (1 page, 1 asset)
Write test: Full pipeline on multi-category site
Write test: Full pipeline with pagination (11+ posts)
Write test: Incremental build (change 1 file, rebuild)
Write test: Template change invalidation
Write test: Collision detection prevents build
Write test: Invalid frontmatter aborts before WRITE
Write test: Second build uses cache (fast rebuild)

Phase 14: Edge Cases & Robustness

Encoding & Parsing

Write test: ContentProcessor handles UTF-8 with BOM
Write test: ContentProcessor handles mixed line endings
Write test: MetadataExtractor handles TOML frontmatter
Write test: MetadataExtractor handles missing frontmatter delimiter

Template Edge Cases

Write test: TemplateRenderer handles template with no includes
Write test: TemplateRenderer handles deeply nested includes
Write test: TemplateDependencyTracker handles partial with same name as template

Permalink Edge Cases

Write test: PermalinkGenerator handles missing date in metadata
Write test: PermalinkGenerator handles empty category
Write test: PermalinkGenerator handles special characters in slug
Write test: PermalinkGenerator validates required placeholders present

Cache Edge Cases

Write test: CacheManager handles corrupted cache gracefully
Write test: CacheManager handles schema version mismatch
Write test: ManifestManager handles missing manifest file

Write Edge Cases

Write test: WriteStage handles disk full error
Write test: WriteStage handles permission denied
Write test: AtomicSwapper handles existing symlink
Write test: AtomicSwapper rollback on failure
Write test: OutputCleaner handles permission errors on delete

Index Edge Cases

Write test: IndexGenerator handles zero posts
Write test: IndexGenerator handles exactly one page of posts
Write test: Paginator handles page_size=1
Write test: IndexGenerator handles category with one post

Phase 15: Performance Validation

Benchmarks

Benchmark: SCAN 1000 files
Benchmark: BUILD 1000 posts (cold)
Benchmark: BUILD 1000 posts with 1 change (incremental)
Benchmark: Template change affecting 100 posts
Benchmark: WRITE 1000 files
Benchmark: Full pipeline 1000 posts

Memory Profiling

Profile: Memory usage during BUILD with 1000 posts
Profile: Memory usage during index generation
Test: Verify no memory leaks across multiple builds

Stress Tests

Test: Build with 5000 posts
Test: Site with 50 categories
Test: Site with 1000 tags
Test: Deeply nested directory structure (10+ levels)
Test: Very long post (10MB markdown file)
Test: 100+ posts in single category (pagination)

Implementation Notes

Test Fixtures Structure

tests/
├── fixtures/
│   ├── minimal_site/
│   │   ├── content/
│   │   │   └── hello.md
│   │   ├── templates/
│   │   │   └── default.html
│   │   └── config.toml
│   ├── multi_category/
│   │   ├── content/
│   │   │   ├── python/
│   │   │   │   ├── intro.md
│   │   │   │   └── advanced.md
│   │   │   └── rust/
│   │   │       └── ownership.md
│   │   └── templates/
│   │       ├── default.html
│   │       └── python.html
│   └── pagination_site/
│       ├── content/
│       │   └── posts/ (15 markdown files)
│       └── templates/
│           ├── default.html
│           └── index.html
└── unit/
    ├── test_models.py
    ├── test_metadata.py
    ├── test_templates.py
    ├── test_urls.py
    ├── test_cache.py
    ├── test_indexes.py
    └── test_stages.py