Anno 117: Pax Romana Documentation
Structured API-like documentation of all game elements in Anno 117: Pax Romana.
Workflow
1. Scrape pages
# Scrape 1 page (default)
venv/bin/python python/scraper.py
# Scrape multiple pages
venv/bin/python python/scraper.py -n 10
# Scrape all remaining pages
venv/bin/python python/scraper.py -n 9999
The scraper reads unchecked URLs from scraping.md, saves each page as JSON in scraped_data/, and adds the resulting files to processed.md as pending.
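As a rough illustration of the selection step, the sketch below picks unchecked entries out of a checkbox list. It assumes scraping.md uses the common `- [ ] URL` / `- [x] URL` checkbox syntax; that format and the function name `unchecked_urls` are assumptions made for this example, not taken from scraper.py.

```python
import re

# Assumed line format: "- [ ] https://..." (unchecked) or "- [x] https://..." (done).
CHECKBOX = re.compile(r"^\s*[-*]\s*\[( |x|X)\]\s*(\S+)")


def unchecked_urls(markdown_text, limit=1):
    """Return up to `limit` URLs whose checkbox is still unchecked."""
    urls = []
    for line in markdown_text.splitlines():
        if len(urls) >= limit:
            break
        match = CHECKBOX.match(line)
        # group(1) is the character inside the brackets; a space means unchecked.
        if match and match.group(1) == " ":
            urls.append(match.group(2))
    return urls
```

Here `limit` plays the role of the `-n` option shown above, so a very large value would return every remaining URL.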
2. Process scraped data into docs/
# Process one file (can run multiple in parallel)
venv/bin/python python/process.py
# Process multiple in parallel (e.g., 4 at once)
for i in {1..4}; do venv/bin/python python/process.py & done; wait
# Process all remaining files (4 parallel workers)
while venv/bin/python python/process.py; do :; done &
while venv/bin/python python/process.py; do :; done &
while venv/bin/python python/process.py; do :; done &
while venv/bin/python python/process.py; do :; done &
wait
The script uses file locking to safely run in parallel. Each invocation:
- Claims one pending JSON file from processed.md
- Calls Claude to parse it into the docs/ folder structure
- Marks it as completed
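The claim step is what makes parallel runs safe, so a minimal sketch of one way to do it may help. It holds an exclusive `fcntl.flock` lock (Unix only) while it rewrites the tracking file. The `- [ ]` and `- [~]` markers and the name `claim_next` are hypothetical; the actual layout of processed.md and the locking code in process.py may differ.

```python
import fcntl

# Hypothetical markers: pending entries and entries claimed by a worker.
PENDING, CLAIMED = "- [ ] ", "- [~] "


def claim_next(path):
    """Claim the first pending entry under an exclusive lock.

    Returns the claimed file name, or None when nothing is pending.
    """
    with open(path, "r+", encoding="utf-8") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until other workers release the lock
        try:
            lines = fh.read().splitlines()
            for i, line in enumerate(lines):
                if line.startswith(PENDING):
                    name = line[len(PENDING):].strip()
                    lines[i] = CLAIMED + name
                    # Rewrite the whole file while still holding the lock.
                    fh.seek(0)
                    fh.write("\n".join(lines) + "\n")
                    fh.truncate()
                    fh.flush()  # make the change visible before unlocking
                    return name
            return None
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
```

Because the read, the update, and the write all happen under the same lock, two workers cannot claim the same entry. A worker built this way could exit with a nonzero status when `claim_next` returns None, which is what would let the `while ...; do :; done` loops above terminate.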
File Structure
- scraping.md - URLs to scrape (checkboxes track progress)
- processed.md - JSON files pending/processed into docs/
- scraped_data/ - Raw scraped JSON files
- docs/ - Structured documentation (see CLAUDE.md for structure)
- python/scraper.py - Web scraper script
- python/process.py - Process one JSON file into docs/
Data Flow
anno.land pages
↓ (scraper.py)
scraped_data/*.json
↓ (process.py → Claude)
docs/**/*.md