Anno 117: Pax Romana Documentation

Structured API-like documentation of all game elements in Anno 117: Pax Romana.

Workflow

1. Scrape pages

# Scrape 1 page (default)
venv/bin/python python/scraper.py

# Scrape multiple pages
venv/bin/python python/scraper.py -n 10

# Scrape all remaining pages
venv/bin/python python/scraper.py -n 9999

This scrapes unchecked URLs from scraping.md, saves each page as JSON in scraped_data/, and adds the saved files to processed.md as pending.
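As a rough illustration of how checkbox tracking like this can work, here is a minimal sketch. It is not the code in python/scraper.py. It assumes scraping.md uses markdown task-list lines such as `- [ ] <url>` for unchecked and `- [x] <url>` for done; the helper names `next_unchecked` and `mark_checked` are hypothetical.

```python
import re

# Assumed line format (not confirmed by this README):
#   "- [ ] <url>" for an unchecked URL, "- [x] <url>" for a scraped one.
CHECKBOX = re.compile(r"^\s*[-*]\s*\[(?P<mark>[ xX])\]\s*(?P<url>\S+)\s*$")


def next_unchecked(text: str, limit: int = 1) -> list[str]:
    """Return up to `limit` URLs whose checkbox is still empty."""
    urls: list[str] = []
    for line in text.splitlines():
        if len(urls) >= limit:
            break
        match = CHECKBOX.match(line)
        if match and match.group("mark") == " ":
            urls.append(match.group("url"))
    return urls


def mark_checked(text: str, url: str) -> str:
    """Tick the checkbox for `url`, leaving every other line unchanged."""
    out = []
    for line in text.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group("url") == url and match.group("mark") == " ":
            line = line.replace("[ ]", "[x]", 1)
        out.append(line)
    return "\n".join(out)
```

Under these assumptions, `-n 10` would correspond to calling `next_unchecked(text, limit=10)` and ticking each URL after its JSON has been saved.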

2. Process scraped data into docs/

# Process 1 file (default)
venv/bin/python python/process.py

# Process 5 files
venv/bin/python python/process.py -n 5

# Process all remaining files
venv/bin/python python/process.py -n 9999

# Process in parallel (e.g., 4 workers processing 10 files each)
for i in {1..4}; do venv/bin/python python/process.py -n 10 & done; wait

The script uses file locking, so several invocations can safely run in parallel. Each invocation:

Claims pending JSON files from processed.md
Calls Claude to parse them into the docs/ folder structure
Marks them as completed
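The claim step above can be sketched as follows. This is an illustration under stated assumptions, not the code in python/process.py: it assumes processed.md lists files as `- [ ] <file>` when pending, and it invents a `- [~]` marker for "claimed by a worker". The function name `claim_pending` is hypothetical. It uses `fcntl.flock`, which is available on POSIX systems only.

```python
import fcntl  # POSIX only; not available on Windows
from pathlib import Path

PENDING = "- [ ] "  # assumed marker for a pending file
CLAIMED = "- [~] "  # hypothetical marker for a file claimed by a worker


def claim_pending(tracker: Path, limit: int = 1) -> list[str]:
    """Atomically claim up to `limit` pending entries in the tracker file."""
    claimed: list[str] = []
    with open(tracker, "r+", encoding="utf-8") as fh:
        # Block until this process holds the exclusive lock on the tracker.
        fcntl.flock(fh, fcntl.LOCK_EX)
        try:
            lines = fh.read().splitlines()
            for i, line in enumerate(lines):
                if len(claimed) >= limit:
                    break
                if line.startswith(PENDING):
                    name = line[len(PENDING):].strip()
                    claimed.append(name)
                    lines[i] = CLAIMED + name
            if claimed:
                # Rewrite the file in place while the lock is still held.
                fh.seek(0)
                fh.write("\n".join(lines) + "\n")
                fh.truncate()
                fh.flush()
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
    return claimed
```

Because the read, update, and write all happen while the lock is held, two workers started at the same moment cannot claim the same file; the second one blocks until the first has rewritten the tracker.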

File Structure

scraping.md - URLs to scrape (checkboxes track progress)
processed.md - JSON files pending/processed into docs/
scraped_data/ - Raw scraped JSON files
docs/ - Structured documentation (see CLAUDE.md for structure)
python/scraper.py - Web scraper script
python/process.py - Processes scraped JSON files into docs/ (one file per run by default)

Data Flow

anno.land pages
      ↓ (scraper.py)
scraped_data/*.json
      ↓ (process.py → Claude)
docs/**/*.md