update
40
README.md
@@ -6,18 +6,28 @@ Structured API-like documentation of all game elements in Anno 117: Pax Romana.
 
 ### 1. Scrape pages
 
+**From anno.land:**
 ```bash
 # Scrape 1 page (default)
-venv/bin/python python/scraper.py
+venv/bin/python python/scraper_anno_world.py
 
 # Scrape multiple pages
-venv/bin/python python/scraper.py -n 10
+venv/bin/python python/scraper_anno_world.py -n 10
 
 # Scrape all remaining pages
-venv/bin/python python/scraper.py -n 9999
+venv/bin/python python/scraper_anno_world.py -n 9999
 ```
 
-This scrapes unchecked URLs from `scraping.md`, saves JSON to `scraped_data/`, and adds them to `processed.md` as pending.
+**From IGN wiki:**
+```bash
+# Scrape 1 page (default)
+venv/bin/python python/scraper_ign.py
+
+# Scrape multiple pages
+venv/bin/python python/scraper_ign.py -n 10
+```
+
+This scrapes unchecked URLs from `scraping.md` (anno.land) or `scraping_ign.md` (IGN), saves JSON to `scraped_data/`, and adds them to `processed.md` as pending.
 
 ### 2. Process scraped data into docs/
 
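The README text in this hunk says each scraper picks up unchecked URLs from a checkbox list and that `-n` limits how many are handled per run. The scraper sources are not part of this commit, so the following is only a minimal sketch of that selection step. It assumes the list uses GitHub-style task items (`- [ ] <url>` for unchecked, `- [x] <url>` for done); the function name `next_unchecked_urls` and the example URLs are hypothetical.

```python
import re

# Assumption: unchecked entries look like "- [ ] <url>"; checked ones use "[x]".
UNCHECKED = re.compile(r"^\s*[-*]\s+\[ \]\s+(\S+)")


def next_unchecked_urls(markdown_text: str, limit: int = 1) -> list[str]:
    """Return up to `limit` URLs from unchecked checkbox lines, in file order."""
    urls: list[str] = []
    for line in markdown_text.splitlines():
        if len(urls) >= limit:
            break  # mirrors the `-n` cap described in the README
        match = UNCHECKED.match(line)
        if match:
            urls.append(match.group(1))
    return urls


# Hypothetical contents of a scraping list file.
sample = """\
- [x] https://example.com/done
- [ ] https://example.com/a
- [ ] https://example.com/b
"""

print(next_unchecked_urls(sample, limit=1))
```

Under these assumptions, a very large limit such as `9999` simply returns every remaining unchecked URL, which matches the "scrape all remaining pages" usage shown in the hunk.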
@@ -56,18 +66,26 @@ Replace X with desired total (e.g., 20).
 
 ## File Structure
 
-- `scraping.md` - URLs to scrape (checkboxes track progress)
+- `scraping.md` - anno.land URLs to scrape (checkboxes track progress)
+- `scraping_ign.md` - IGN wiki URLs to scrape (checkboxes track progress)
 - `processed.md` - JSON files pending/processed into docs/
 - `scraped_data/` - Raw scraped JSON files
 - `docs/` - Structured documentation (see CLAUDE.md for structure)
-- `python/scraper.py` - Web scraper script
+- `python/scraper_anno_world.py` - anno.land web scraper
+- `python/scraper_ign.py` - IGN wiki web scraper
 
 ## Data Flow
 
 ```
-anno.land pages
-    ↓ (scraper.py)
-scraped_data/*.json
-    ↓ (Claude sub-agents)
-docs/**/*.md
+anno.land pages          IGN wiki pages
+      ↓                        ↓
+(scraper_anno_world.py)  (scraper_ign.py)
+      ↓                        ↓
+      └──────────┬─────────────┘
+                 ↓
+        scraped_data/*.json
+                 ↓
+        (Claude sub-agents)
+                 ↓
+           docs/**/*.md
 ```
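The updated data flow has both scrapers writing into the same `scraped_data/` directory and registering their output in `processed.md` as pending. The commit does not show how the scripts do this, so the sketch below is only one possible shape for that shared step. The helper name `save_scraped_page`, the `<source>_<slug>.json` naming scheme, and the `- [ ] <file>` pending format are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_scraped_page(root: Path, source: str, slug: str, payload: dict) -> Path:
    """Write `payload` as JSON under scraped_data/ and append a pending entry."""
    data_dir = root / "scraped_data"
    data_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming: prefix with the source so that anno.land and IGN
    # pages with the same slug do not overwrite each other.
    out_path = data_dir / f"{source}_{slug}.json"
    out_path.write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )

    # Assumed pending format: an unchecked checkbox line per JSON file.
    with (root / "processed.md").open("a", encoding="utf-8") as fh:
        fh.write(f"- [ ] {out_path.name}\n")
    return out_path


# Demo in a throwaway directory, so no real project files are touched.
demo_root = Path(tempfile.mkdtemp())
save_scraped_page(demo_root, "anno_land", "villa", {"name": "Villa"})
save_scraped_page(demo_root, "ign", "villa", {"name": "Villa"})
print((demo_root / "processed.md").read_text(encoding="utf-8"))
```

Prefixing the file name with the source is a design choice of this sketch: once two scrapers share one output directory, some collision-avoidance rule is needed, but the actual scripts may handle it differently.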