anno-117-docs/README.md
2025-12-30 15:42:12 +01:00

# Anno 117: Pax Romana Documentation
Structured API-like documentation of all game elements in Anno 117: Pax Romana.
## Workflow
### 1. Scrape pages
```bash
# Scrape 1 page (default)
venv/bin/python python/scraper.py
# Scrape multiple pages
venv/bin/python python/scraper.py -n 10
# Scrape all remaining pages
venv/bin/python python/scraper.py -n 9999
```
This scrapes the unchecked URLs listed in `scraping.md`, saves each page as JSON in `scraped_data/`, and adds the resulting files to `processed.md` as pending entries.
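The checkbox bookkeeping described above can be sketched as follows. This is only an illustration, not the code in `python/scraper.py`: the line format (`- [ ] <url>` for pending, `- [x] <url>` for scraped) and the helper names `unchecked_urls` and `mark_scraped` are assumptions.

```python
import re

# Assumed line format: "- [ ] https://..." (pending) or "- [x] https://..." (scraped).
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(\S+)")


def unchecked_urls(markdown: str, limit: int = 1) -> list[str]:
    """Return up to `limit` URLs whose checkbox is still empty."""
    urls: list[str] = []
    for line in markdown.splitlines():
        if len(urls) >= limit:
            break
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            urls.append(match.group(2))
    return urls


def mark_scraped(markdown: str, url: str) -> str:
    """Tick the checkbox on every line that lists exactly `url`."""
    out = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(2) == url:
            line = line.replace("[ ]", "[x]", 1)
        out.append(line)
    return "\n".join(out)
```

Here `limit` plays the role of the `-n` option: the default of 1 matches the single-page default, and a large value selects everything that is still unchecked.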
### 2. Process scraped data into docs/
```bash
# Process one file (can run multiple in parallel)
venv/bin/python python/process.py
# Process multiple in parallel (e.g., 4 at once)
for i in {1..4}; do venv/bin/python python/process.py & done; wait
# Process all remaining files (4 parallel workers);
# each loop repeats until process.py exits with a non-zero status
while venv/bin/python python/process.py; do :; done &
while venv/bin/python python/process.py; do :; done &
while venv/bin/python python/process.py; do :; done &
while venv/bin/python python/process.py; do :; done &
wait
```
The script uses file locking to safely run in parallel. Each invocation:
1. Claims one pending JSON file from `processed.md`
2. Calls Claude to parse it into the `docs/` folder structure
3. Marks it as completed
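The claim and complete steps above could look roughly like the sketch below. It is not the actual `python/process.py`: the entry markers (`- [ ]` pending, `- [~]` claimed, `- [x]` completed) and the function names are hypothetical, and `fcntl.flock` is only available on Unix-like systems.

```python
import fcntl
from pathlib import Path
from typing import Optional

# Hypothetical entry markers; the real processed.md format may differ.
PENDING, CLAIMED, DONE = "- [ ] ", "- [~] ", "- [x] "


def _flip(tracker: Path, old: str, new: str, name: Optional[str] = None) -> Optional[str]:
    """Under an exclusive lock, change the first matching entry from `old` to `new`."""
    with open(tracker, "r+", encoding="utf-8") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks while another worker holds the lock
        lines = fh.read().splitlines()
        for i, line in enumerate(lines):
            if not line.startswith(old):
                continue
            entry = line[len(old):]
            if name is not None and entry != name:
                continue
            lines[i] = new + entry
            fh.seek(0)
            fh.write("\n".join(lines) + "\n")
            fh.truncate()
            return entry
        return None  # nothing matched; the lock is released when the file closes


def claim_next(tracker: Path) -> Optional[str]:
    """Claim the first pending entry, or return None if none are left."""
    return _flip(tracker, PENDING, CLAIMED)


def mark_done(tracker: Path, name: str) -> Optional[str]:
    """Mark a previously claimed entry as completed."""
    return _flip(tracker, CLAIMED, DONE, name)
```

Because the read, the update, and the write all happen while the lock is held, two workers calling `claim_next` at the same time cannot receive the same file. A worker would exit with a non-zero status when `claim_next` returns `None`, which is what ends the `while` loops shown earlier.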
---
## File Structure
- `scraping.md` - URLs to scrape (checkboxes track progress)
- `processed.md` - Scraped JSON files and their processing status (pending or completed)
- `scraped_data/` - Raw scraped JSON files
- `docs/` - Structured documentation (see CLAUDE.md for structure)
- `python/scraper.py` - Web scraper script
- `python/process.py` - Processes one pending JSON file into `docs/`
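Since `scraping.md` tracks progress with checkboxes, a quick status check can be sketched as below. The `progress` helper is hypothetical and assumes standard Markdown task-list lines (`- [ ]` and `- [x]`).

```python
import re

# Assumed task-list format: "- [ ] item" (open) or "- [x] item" (done).
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+\S")


def progress(markdown: str) -> tuple[int, int]:
    """Return (checked, total) counted over all checkbox lines."""
    states = [m.group(1) for line in markdown.splitlines() if (m := CHECKBOX.match(line))]
    checked = sum(1 for state in states if state != " ")
    return checked, len(states)
```

For example, `progress(open("scraping.md", encoding="utf-8").read())` would report how many URLs have been scraped out of the total listed.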
## Data Flow
```
anno.land pages
↓ (scraper.py)
scraped_data/*.json
↓ (process.py → Claude)
docs/**/*.md
```