
# Anno 117: Pax Romana Documentation

Structured API-like documentation of all game elements in Anno 117: Pax Romana.

## Workflow

### 1. Scrape pages

```bash
# Scrape 1 page (default)
venv/bin/python python/scraper.py

# Scrape multiple pages
venv/bin/python python/scraper.py -n 10

# Scrape all remaining pages
venv/bin/python python/scraper.py -n 9999
```

This scrapes the next unchecked URLs from `scraping.md`, saves each page as JSON in `scraped_data/`, and adds the resulting JSON files to `processed.md` as pending.
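As a rough illustration of the checkbox tracking, the sketch below picks the next unchecked URLs and ticks them off. It assumes `scraping.md` uses task-list lines such as `- [ ] <url>`; that format, and the helper names `next_unchecked` and `mark_checked`, are hypothetical and may not match what `scraper.py` actually does.

```python
import re

# Assumed line format (not confirmed by the repo): "- [ ] <url>" or "- [x] <url>"
CHECKBOX = re.compile(r"^\s*[-*] \[(?P<mark>[ xX])\] (?P<url>\S+)\s*$")


def next_unchecked(markdown: str, limit: int = 1) -> list[str]:
    """Return up to `limit` URLs whose checkbox is still empty, in file order."""
    urls: list[str] = []
    for line in markdown.splitlines():
        if len(urls) >= limit:
            break
        match = CHECKBOX.match(line)
        if match and match.group("mark") == " ":
            urls.append(match.group("url"))
    return urls


def mark_checked(markdown: str, url: str) -> str:
    """Return the text with the checkbox for exactly `url` ticked."""
    out = []
    for line in markdown.splitlines(keepends=True):
        match = CHECKBOX.match(line)
        if match and match.group("mark") == " " and match.group("url") == url:
            line = line.replace("[ ]", "[x]", 1)
        out.append(line)
    return "".join(out)
```

Matching whole lines rather than substrings avoids ticking a URL that merely shares a prefix with the one just scraped.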

### 2. Process scraped data into docs/

```bash
# Process 1 file (default)
venv/bin/python python/process.py

# Process 5 files
venv/bin/python python/process.py -n 5

# Process all remaining files
venv/bin/python python/process.py -n 9999

# Process in parallel (e.g., 4 workers processing 10 files each)
for i in {1..4}; do venv/bin/python python/process.py -n 10 & done; wait
```

The script uses file locking, so several invocations can safely run in parallel. Each invocation:
1. Claims pending JSON files from `processed.md`
2. Calls Claude to parse them into the `docs/` folder structure
3. Marks them as completed
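The claim step can be sketched as follows. This is a minimal sketch, not the code in `process.py`: it assumes a POSIX system (it uses `fcntl.flock`), and the `- [ ] ` / `- [~] ` markers and the `claim_pending` helper are hypothetical stand-ins for whatever format `processed.md` really uses.

```python
import fcntl

# Hypothetical markers: pending and claimed (in progress) entries.
PENDING, CLAIMED = "- [ ] ", "- [~] "


def claim_pending(path: str, limit: int = 1) -> list[str]:
    """Claim up to `limit` pending entries under an exclusive lock."""
    claimed: list[str] = []
    with open(path, "r+", encoding="utf-8") as fh:
        # Blocks until no other worker holds the lock on this file.
        fcntl.flock(fh, fcntl.LOCK_EX)
        try:
            lines = fh.readlines()
            for i, line in enumerate(lines):
                if len(claimed) >= limit:
                    break
                if line.startswith(PENDING):
                    rest = line[len(PENDING):]
                    claimed.append(rest.strip())
                    lines[i] = CLAIMED + rest
            # Rewrite the file in place while still holding the lock.
            fh.seek(0)
            fh.writelines(lines)
            fh.truncate()
            fh.flush()
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
    return claimed
```

Because reading and rewriting happen inside one locked section, two workers started at the same moment cannot claim the same entry; the second simply sees the first worker's entries as already claimed.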

---

## File Structure

- `scraping.md` - URLs to scrape (checkboxes track progress)
- `processed.md` - Scraped JSON files and their status (pending or processed into docs/)
- `scraped_data/` - Raw scraped JSON files
- `docs/` - Structured documentation (see CLAUDE.md for structure)
- `python/scraper.py` - Web scraper script
- `python/process.py` - Processes scraped JSON files into docs/ (one file per run by default)

## Data Flow

```
anno.land pages
      ↓ (scraper.py)
scraped_data/*.json
      ↓ (process.py → Claude)
docs/**/*.md
```