Anno 117: Pax Romana Documentation

Structured API-like documentation of all game elements in Anno 117: Pax Romana.

Workflow

1. Scrape pages

# Scrape 1 page (default)
venv/bin/python python/scraper.py

# Scrape multiple pages
venv/bin/python python/scraper.py -n 10

# Scrape all remaining pages
venv/bin/python python/scraper.py -n 9999

This scrapes unchecked URLs from scraping.md, saves each page as JSON in scraped_data/, and adds the saved files to processed.md as pending.
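To make the checkbox tracking concrete, here is a minimal sketch of how unchecked URLs could be picked out of scraping.md and ticked off afterwards. It is not the code in python/scraper.py. The line format ("- [ ] URL" for pending, "- [x] URL" for done) and the helper names unchecked_urls and mark_checked are assumptions made for illustration.

```python
import re

# Assumed line format: "- [ ] https://..." (pending) or "- [x] https://..." (done).
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(\S+)")


def unchecked_urls(markdown_text, limit=1):
    """Return up to `limit` URLs whose checkbox is still empty (hypothetical helper)."""
    urls = []
    for line in markdown_text.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            urls.append(match.group(2))
            if len(urls) >= limit:
                break
    return urls


def mark_checked(markdown_text, url):
    """Tick the checkbox on the line that lists `url` (hypothetical helper)."""
    out = []
    for line in markdown_text.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(2) == url:
            line = line.replace("[ ]", "[x]", 1)
        out.append(line)
    return "\n".join(out)
```

Under these assumptions, the `limit` argument plays the role of the `-n` flag: the default of 1 matches scraping a single page per run.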

2. Process scraped data into docs/

# Process 1 file (default)
venv/bin/python python/process.py

# Process 5 files
venv/bin/python python/process.py -n 5

# Process all remaining files
venv/bin/python python/process.py -n 9999

# Process in parallel (e.g., 4 workers processing 10 files each)
for i in {1..4}; do venv/bin/python python/process.py -n 10 & done; wait

The script uses file locking to safely run in parallel. Each invocation:

  1. Claims pending JSON files from processed.md
  2. Calls Claude to parse them into the docs/ folder structure
  3. Marks them as completed
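The claim step above can be sketched with an exclusive file lock. This is an illustration under stated assumptions, not the logic in python/process.py: it assumes processed.md lists one file per line as "- [ ] name.json", uses a hypothetical "- [~]" marker for claimed entries, and relies on fcntl.flock, which is available on Unix-like systems only.

```python
import fcntl


def claim_pending(path, count=1):
    """Mark up to `count` pending entries as claimed and return their names.

    Hypothetical helper: the "- [ ] " and "- [~] " markers are assumptions.
    """
    claimed = []
    with open(path, "r+") as handle:
        # Blocks until no other worker holds the lock on this file.
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            lines = handle.read().splitlines()
            for i, line in enumerate(lines):
                if len(claimed) >= count:
                    break
                if line.startswith("- [ ] "):
                    name = line[len("- [ ] "):]
                    claimed.append(name)
                    lines[i] = "- [~] " + name
            # Rewrite the file in place while still holding the lock.
            handle.seek(0)
            handle.truncate()
            handle.write("\n".join(lines) + "\n")
            handle.flush()
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
    return claimed
```

Because the read, the update, and the write all happen while the lock is held, two workers started at the same time cannot claim the same file. Each worker would then process its claimed files and mark them completed in a second locked update.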

File Structure

  • scraping.md - URLs to scrape (checkboxes track progress)
  • processed.md - JSON files pending/processed into docs/
  • scraped_data/ - Raw scraped JSON files
  • docs/ - Structured documentation (see CLAUDE.md for structure)
  • python/scraper.py - Web scraper script
  • python/process.py - Processes scraped JSON files into docs/ (one file per run by default)

Data Flow

anno.land pages
      ↓ (scraper.py)
scraped_data/*.json
      ↓ (process.py → Claude)
docs/**/*.md