Anno 117: Pax Romana Documentation
Structured API-like documentation of all game elements in Anno 117: Pax Romana.
Workflow
1. Scrape pages
# Scrape 1 page (default)
venv/bin/python python/scraper.py
# Scrape multiple pages
venv/bin/python python/scraper.py -n 10
# Scrape all remaining pages
venv/bin/python python/scraper.py -n 9999
The scraper reads unchecked URLs from scraping.md, saves each page as JSON in scraped_data/, and adds each saved JSON file to processed.md as pending.
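As a rough illustration of the checkbox tracking described above, the sketch below picks the next unchecked URLs from a Markdown checklist. It is not the code in python/scraper.py: the function name `unchecked_urls`, the `limit` parameter (mirroring `-n`), and the assumption that scraping.md uses `- [ ]` / `- [x]` lines with one URL each are all hypothetical, and the example.com URLs are placeholders.

```python
import re
from typing import List

# Assumed line format: "- [ ] <url>" for unchecked, "- [x] <url>" for done.
UNCHECKED = re.compile(r"^\s*- \[ \]\s+(\S+)")


def unchecked_urls(markdown: str, limit: int = 1) -> List[str]:
    """Return up to `limit` URLs whose checkbox is still empty, in file order."""
    if limit <= 0:
        return []
    urls = []
    for line in markdown.splitlines():
        match = UNCHECKED.match(line)
        if match:
            urls.append(match.group(1))
            if len(urls) >= limit:
                break
    return urls


# Placeholder checklist; the real scraping.md content is not shown here.
sample = (
    "- [x] https://example.com/page-a\n"
    "- [ ] https://example.com/page-b\n"
    "- [ ] https://example.com/page-c\n"
)

print(unchecked_urls(sample))            # one page, like the default run
print(unchecked_urls(sample, limit=10))  # all remaining pages
```

A large limit simply returns every remaining unchecked URL, which matches the intent of `-n 9999`.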
2. Process scraped data into docs.md
Use this prompt with Claude:
Process pending scraped JSON files into docs.md ONE FILE AT A TIME.
LOOP: Repeat these steps until all files are processed:
1. Read `processed.md` to find the FIRST pending JSON file (marked with `- [ ]`)
2. If no pending files remain, stop
3. Read ONLY that one file from `scraped_data/`
4. Extract game entities (buildings, goods, production chains, etc.)
5. Translate German names to English:
   - Use the game's official English names where known
   - Common translations:
     - Latium = Latium (Roman region)
     - Albion = Albion (Celtic region)
     - Liberti = Liberti (Tier 1 Roman)
     - Plebejer = Plebeians (Tier 2 Roman)
     - Equites = Equites (Tier 3 Roman)
     - Patrizier = Patricians (Tier 4 Roman)
     - Wanderer = Waders (Tier 1 Celtic)
     - Schmiede = Smiths (Tier 2 Celtic)
     - Älteste = Elders (Tier 3 Celtic)
     - Mercatoren = Mercators (Tier 4 Celtic)
     - Edelmänner = Nobles (Tier 5 Celtic)
   - Building/Good names: translate to English equivalents
6. Format data according to the schemas defined in docs.md
7. Add new entities or update existing ones in docs.md
8. Mark this JSON file as processed in `processed.md` (change `- [ ]` to `- [x]`)
9. REPEAT from step 1
Focus on extracting:
- Buildings: name, category, region, build costs, maintenance, workforce, cycle time, inputs/outputs, area effects, requirements
- Goods: name, category, produced by, consumed by
- Production chains: steps, ratios, cycle times
Keep entries concise. Mark unknown values as "Unknown" rather than guessing.
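The checklist bookkeeping in steps 1 and 8 of the prompt can also be checked or scripted outside of Claude. The following is a minimal sketch under stated assumptions: `first_pending` and `mark_processed` are hypothetical helpers, not part of this repository, and the file names in the sample are invented. Only the `- [ ]` and `- [x]` markers come from the prompt itself.

```python
import re
from typing import Optional

# A pending entry is assumed to look like "- [ ] <file name>".
PENDING = re.compile(r"^(\s*- )\[ \](\s+)(.+?)\s*$")


def first_pending(markdown: str) -> Optional[str]:
    """Return the name on the first '- [ ]' line, or None if nothing is pending."""
    for line in markdown.splitlines():
        match = PENDING.match(line)
        if match:
            return match.group(3)
    return None


def mark_processed(markdown: str, name: str) -> str:
    """Flip '- [ ]' to '- [x]' on the first pending line that names `name`."""
    lines = markdown.splitlines(keepends=True)
    for i, line in enumerate(lines):
        match = PENDING.match(line)
        if match and match.group(3) == name:
            lines[i] = line.replace("[ ]", "[x]", 1)
            break
    return "".join(lines)


# Invented file names, for illustration only.
checklist = (
    "- [x] page_001.json\n"
    "- [ ] page_002.json\n"
    "- [ ] page_003.json\n"
)

name = first_pending(checklist)
updated = mark_processed(checklist, name)
print(name)
print(updated)
```

Working on the file contents as a string keeps the helpers easy to test; reading and writing processed.md itself would wrap these calls.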
File Structure
- scraping.md - URLs to scrape (checkboxes track progress)
- processed.md - JSON files pending/processed into docs.md
- scraped_data/ - Raw scraped JSON files
- docs.md - Final structured documentation
- python/scraper.py - Web scraper script
Data Flow
anno.land pages
↓ (scraper.py)
scraped_data/*.json
↓ (Claude)
docs.md