# Anno 117: Pax Romana Documentation

Structured API-like documentation of all game elements in Anno 117: Pax Romana.

## Workflow

### 1. Scrape pages

```bash
# Scrape 1 page (default)
venv/bin/python python/scraper.py

# Scrape multiple pages
venv/bin/python python/scraper.py -n 10

# Scrape all remaining pages
venv/bin/python python/scraper.py -n 9999
```

This scrapes the unchecked URLs listed in `scraping.md`, saves each page as JSON in `scraped_data/`, and adds the resulting files to `processed.md` as pending.
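As a rough illustration of the checkbox tracking described above, the sketch below picks unchecked URLs out of a Markdown checklist and ticks them off. The line format (`- [ ] <url>` / `- [x] <url>`) and the helper names `unchecked_urls` and `mark_checked` are assumptions made for this example; `scraper.py` may store and parse its list differently.

```python
import re

# Assumed line format (not confirmed by the scraper itself):
#   "- [ ] https://..."  -> not scraped yet
#   "- [x] https://..."  -> already scraped
CHECKBOX = re.compile(r"^- \[( |x|X)\] (\S+)\s*$")


def unchecked_urls(markdown_text, limit=1):
    """Return up to `limit` URLs whose checkbox is still empty."""
    urls = []
    for line in markdown_text.splitlines():
        match = CHECKBOX.match(line.strip())
        if match and match.group(1) == " ":
            urls.append(match.group(2))
    return urls[:limit]


def mark_checked(markdown_text, url):
    """Tick the checkbox of `url`, leaving every other line unchanged."""
    out = []
    for line in markdown_text.splitlines():
        match = CHECKBOX.match(line.strip())
        if match and match.group(2) == url:
            line = line.replace("[ ]", "[x]", 1)
        out.append(line)
    return "\n".join(out) + "\n"
```

Here `limit` plays the role of the `-n` option: the default of 1 mirrors the default run, and a large value such as 9999 simply returns every remaining URL.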

### 2. Process scraped data into docs/

```bash
# Process one file (can run multiple in parallel)
venv/bin/python python/process.py

# Process multiple in parallel (e.g., 4 at once)
for i in {1..4}; do venv/bin/python python/process.py & done; wait

# Process all remaining files (4 parallel workers); each loop stops
# as soon as process.py exits with a non-zero status
for i in {1..4}; do
  (while venv/bin/python python/process.py; do :; done) &
done
wait
```
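The "process all remaining files" loops above keep rerunning the script for as long as it exits successfully. The same pattern can be sketched in Python, under the assumption that `process.py` returns a non-zero exit status once nothing is pending. The helpers `drain` and `run_workers` are hypothetical names for this example and are not part of the repository.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def drain(command):
    """Rerun `command` until it exits non-zero; return the number of successful runs."""
    runs = 0
    while subprocess.run(command).returncode == 0:
        runs += 1
    return runs


def run_workers(command, workers=4):
    """Start `workers` drain loops in parallel and collect their run counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(drain, [command] * workers))
```

A call such as `run_workers(["venv/bin/python", "python/process.py"])` would then correspond to the four background loops. As with the shell version, a worker also stops if the script fails for any other reason, so a non-zero exit does not by itself prove that every file was processed.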

The script uses file locking so that several invocations can run in parallel safely. Each invocation:
1. Claims one pending JSON file from `processed.md`
2. Calls Claude to parse it into the `docs/` folder structure
3. Marks it as completed
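The claim step can be sketched with an advisory lock from Python's `fcntl` module (Unix only). This is a minimal illustration, not the actual code in `process.py`: the line format of `processed.md` and the intermediate `- [~]` "claimed" marker are assumptions introduced for the example.

```python
import fcntl

# Assumed line format (hypothetical):
#   "- [ ] name.json" pending, "- [~] name.json" claimed, "- [x] name.json" done
PENDING, CLAIMED = "- [ ] ", "- [~] "


def claim_next(list_path):
    """Claim the first pending entry under an exclusive lock; return its name or None."""
    with open(list_path, "r+", encoding="utf-8") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks while another worker holds the lock
        try:
            lines = handle.read().splitlines()
            for index, line in enumerate(lines):
                if line.startswith(PENDING):
                    name = line[len(PENDING):].strip()
                    lines[index] = CLAIMED + name
                    # Rewrite the whole list in place while still holding the lock.
                    handle.seek(0)
                    handle.write("\n".join(lines) + "\n")
                    handle.truncate()
                    handle.flush()  # make the change visible before unlocking
                    return name
            return None
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Because the read, the update, and the write all happen while the lock is held, two workers started at the same moment cannot claim the same file: the second one waits and then sees the entry already marked as claimed.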

---

## File Structure

- `scraping.md` - URLs to scrape (checkboxes track progress)
- `processed.md` - Scraped JSON files and their status (pending or processed into `docs/`)
- `scraped_data/` - Raw scraped JSON files
- `docs/` - Structured documentation (see `CLAUDE.md` for the structure)
- `python/scraper.py` - Web scraper script
- `python/process.py` - Processes one pending JSON file into `docs/`

## Data Flow

```
anno.land pages
      ↓ (scraper.py)
scraped_data/*.json
      ↓ (process.py → Claude)
docs/**/*.md
```