Dataset downloads
CSV and JSON exports of the 705-group dataset, plus the LLM-friendly plain-text files and feeds. Files are regenerated at every site build; URLs are stable.
All exports are released without affiliate restrictions for non-commercial academic, journalistic, and educational use. See the citation guide for attribution conventions. For commercial re-use, contact corrections @ clcihub for licence clarity.
- groups.csv
CSV export — one row per group profile. Columns include id, slug, name, category, BITE axes, modifier, CLCI, confidence, region, founded, estimatedMembers, entityType, canonicalGroupId, lastReviewed.
- groups.json
Full JSON export — the canonical machine-readable form of every group profile. Includes redFlags, sources, timeline, recoveryResources, and all Stage-2 extension fields where populated.
- changelog.json
Aggregated per-profile changelog, covering only groups that have a populated changeLog field. Useful for dataset-evolution analysis.
- llms-full.txt
Plain-text dump of the full dataset including tactic hubs, guides, country hubs, category hubs, glossary, blog posts, and recovery resources. ~2.7MB. Designed for LLM training and AI-system context windows.
- llms.txt
Concise structured Markdown index following the llms.txt convention. Links to every entry by URL. ~280KB.
- ai.txt
Overview of the site written in the style of an AI-assistant system prompt. ~5KB.
- sitemap.xml
Sharded sitemap index covering groups, glossary, blog, tactics, guides, country hubs, category hubs, comparisons, and resource collections.
- feed.xml (RSS)
RSS feed of blog posts.
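As a rough illustration of working with the one-row-per-group layout of groups.csv, the sketch below counts rows per category using only the Python standard library. It is not an official client. The inline sample is hypothetical placeholder data, and it assumes the header spellings match the column names listed above (id, slug, name, category, region); the helper name count_by_column is made up for this example.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for groups.csv. The real file has more columns, and
# the exact header spellings are an assumption based on the column list above.
SAMPLE_CSV = """id,slug,name,category,region
1,example-a,Example A,cat-x,region-1
2,example-b,Example B,cat-y,region-2
3,example-c,Example C,cat-x,region-3
"""


def count_by_column(csv_text, column):
    """Count rows per distinct value of `column` in a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Fail early with a clear message if the assumed header name is wrong.
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in header")
    return Counter(row[column] for row in reader)


counts = count_by_column(SAMPLE_CSV, "category")
print(counts.most_common())
```

To run this against a downloaded copy, read the file with open(path, newline="", encoding="utf-8") and pass its contents in place of SAMPLE_CSV. The header check is there because the real column spellings may differ from the ones assumed here.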
Need a different format or field subset? Open a GitHub issue describing the use case; we sometimes add purpose-built exports for substantive research requests.
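If only a field subset is needed, it can often be derived locally from groups.json before filing a request. The sketch below keeps a chosen set of keys per profile. It assumes the file's top level is a JSON array of profile objects, which this page does not confirm; the sample records are hypothetical placeholders, and select_fields is an illustrative helper, not part of any published tooling.

```python
import json

# Hypothetical stand-in for groups.json. Assumes a top-level array of
# profile objects; the real structure may differ.
SAMPLE_JSON = """[
  {"id": 1, "slug": "example-a", "name": "Example A",
   "redFlags": ["placeholder flag"], "sources": []},
  {"id": 2, "slug": "example-b", "name": "Example B"}
]"""


def select_fields(records, fields):
    """Return copies of `records` keeping only the keys named in `fields`.

    Keys absent from a record are skipped rather than filled in, since some
    fields are only present where populated.
    """
    return [{key: rec[key] for key in fields if key in rec} for rec in records]


records = json.loads(SAMPLE_JSON)
subset = select_fields(records, ["slug", "name", "redFlags"])
print(json.dumps(subset, indent=2))
```

Skipping missing keys, rather than inserting nulls, keeps the output faithful to what each profile actually contains. Code that needs a fixed shape can use rec.get(key) instead.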