adding data
80
README.md
@@ -1,7 +1,81 @@
# Anno 117: Pax Romana Documentation

Structured API-like documentation of all game elements in Anno 117: Pax Romana.

## Workflow

### 1. Scrape pages

```bash
# Scrape 1 page (default)
venv/bin/python python/scraper.py

# Scrape multiple pages
venv/bin/python python/scraper.py -n 10

# Scrape all remaining pages
venv/bin/python python/scraper.py -n 9999
```

The scraper reads unchecked URLs from `scraping.md`, saves each page as JSON in `scraped_data/`, and adds the new files to `processed.md` as pending.
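
The checkbox bookkeeping in `scraping.md` can be sketched in a few lines of Python. This is a minimal sketch, not the code in `python/scraper.py`: the line format `- [ ] <url>`, the helper names `unchecked_urls` and `mark_checked`, and the `limit` parameter (standing in for `-n`) are all assumptions made for illustration.

```python
import re

# Assumed line format (not confirmed by the repo):
#   "- [ ] https://example.com/page"  -> not scraped yet
#   "- [x] https://example.com/page"  -> already scraped
CHECKBOX = re.compile(r"^- \[( |x)\] (\S+)\s*$")


def unchecked_urls(markdown: str, limit: int = 1) -> list[str]:
    """Return up to `limit` URLs whose checkbox is still empty."""
    urls = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line.strip())
        if match and match.group(1) == " ":
            urls.append(match.group(2))
    return urls[:limit]


def mark_checked(markdown: str, url: str) -> str:
    """Tick the checkbox for `url`; every other line is left untouched."""
    lines = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line.strip())
        if match and match.group(2) == url:
            line = f"- [x] {url}"
        lines.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(lines) + ("\n" if markdown.endswith("\n") else "")
```

Matching whole lines rather than using a plain string replace avoids ticking the wrong entry when one URL is a prefix of another.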

### 2. Process scraped data into docs.md

Use this prompt with Claude:

---

**Prompt for Claude:**

```
Process the pending scraped JSON files into docs.md.

1. Read `processed.md` to find pending JSON files (marked with `- [ ]`)
2. For each pending file, read it from `scraped_data/`
3. Extract game entities (buildings, goods, production chains, etc.)
4. Translate German names to English:
   - Use the game's official English names where known
   - Common translations:
     - Latium = Latium (Roman region)
     - Albion = Albion (Celtic region)
     - Liberti = Liberti (Tier 1 Roman)
     - Plebejer = Plebeians (Tier 2 Roman)
     - Equites = Equites (Tier 3 Roman)
     - Patrizier = Patricians (Tier 4 Roman)
     - Wanderer = Waders (Tier 1 Celtic)
     - Schmiede = Smiths (Tier 2 Celtic)
     - Älteste = Elders (Tier 3 Celtic)
     - Mercatoren = Mercators (Tier 4 Celtic)
     - Edelmänner = Nobles (Tier 5 Celtic)
   - Building/Good names: translate to English equivalents
5. Format data according to the schemas defined in docs.md
6. Add new entities or update existing ones in docs.md
7. Mark the JSON file as processed in `processed.md` by changing `- [ ]` to `- [x]`

Focus on extracting:
- Buildings: name, category, region, build costs, maintenance, workforce, cycle time, inputs/outputs, area effects, requirements
- Goods: name, category, produced by, consumed by
- Production chains: steps, ratios, cycle times

Keep entries concise. Mark unknown values as "Unknown" rather than guessing.
```

---
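
Steps 1, 4, and 7 of the prompt are mechanical, so they could also be scripted. The following is a minimal sketch under stated assumptions: the entry format `- [ ] <file>.json` in `processed.md` and the helper names `pending_files`, `mark_processed`, and `translate_tier` are hypothetical, and the translation table only repeats the tier names listed in the prompt.

```python
import re

# Assumed entry format in processed.md (not confirmed by the repo):
#   "- [ ] page_001.json"  -> pending
#   "- [x] page_001.json"  -> processed
ENTRY = re.compile(r"^- \[( |x)\] (\S+\.json)\s*$")

# Tier names copied from the prompt's translation list.
# Names that are identical in both languages (Liberti, Equites) are omitted.
TIER_NAMES = {
    "Plebejer": "Plebeians",
    "Patrizier": "Patricians",
    "Wanderer": "Waders",
    "Schmiede": "Smiths",
    "Älteste": "Elders",
    "Mercatoren": "Mercators",
    "Edelmänner": "Nobles",
}


def pending_files(processed_md: str) -> list[str]:
    """Step 1: list JSON files whose checkbox is still empty."""
    result = []
    for line in processed_md.splitlines():
        match = ENTRY.match(line.strip())
        if match and match.group(1) == " ":
            result.append(match.group(2))
    return result


def mark_processed(processed_md: str, filename: str) -> str:
    """Step 7: change "- [ ]" to "- [x]" for exactly one file."""
    lines = []
    for line in processed_md.splitlines():
        match = ENTRY.match(line.strip())
        if match and match.group(2) == filename:
            line = f"- [x] {filename}"
        lines.append(line)
    return "\n".join(lines) + ("\n" if processed_md.endswith("\n") else "")


def translate_tier(name: str) -> str:
    """Step 4: translate a German tier name; unlisted names pass through."""
    return TIER_NAMES.get(name, name)
```

Entity extraction and schema formatting (steps 3, 5, and 6) depend on the scraped page content and on the schemas in `docs.md`, so they are left to the prompt.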

## File Structure

- `scraping.md` - URLs to scrape (checkboxes track progress)
- `processed.md` - JSON files pending/processed into docs.md
- `scraped_data/` - Raw scraped JSON files
- `docs.md` - Final structured documentation
- `python/scraper.py` - Web scraper script

## Data Flow

```
anno.land pages
    ↓ (scraper.py)
scraped_data/*.json
    ↓ (Claude)
docs.md
```