The Model Registry

Available models for Guildhall workflows, optimized for Blackthorn’s hardware (768GB RAM, 80 threads, 2× RTX 2080 8GB).

Current assignments (Quorum)

| Model | Size (Q4) | Inference | Seats | Notes |
|---|---|---|---|---|
| Qwen3-30B-A3B | ~18GB | CPU | 1, 5, 8 | MoE, only 3B active. Fast for its capability. |
| Cogito 14B | ~9GB | CPU | 2, 9 | Hybrid reasoning, deep thinking mode |
| Qwen3.5-9B | ~7GB | GPU 1 | 3 | Fast instinctive responses |
| Qwen3.5-27B | ~17GB | CPU | 4 | User-perspective reasoning, vision-capable |
| Mistral 7B Instruct | ~5GB | GPU 2 | 6 | Punchy persuasive framing |
| Qwen3.5-35B | ~24GB | CPU | 7 | MoE, creative/unpredictable outputs |
| Qwen3-8B | ~5GB | CPU | 11 | Structured competitive analysis |
| DeepSeek R1 14B | ~9GB | CPU | 10 | Systems/incentive reasoning |
| DeepSeek R1 32B | ~20GB | CPU | 12 | Transparent reasoning for mandated dissent |
| Qwen3-32B | ~20GB | CPU | 13 | Synthesis across all outputs |
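One way to enforce the GPU split above is to run a separate inference server per card. A sketch assuming Ollama as the runtime; the ports are illustrative, not the actual Blackthorn configuration:

```shell
# Pin one server to each RTX 2080 so seat 3 and seat 6 each get a
# dedicated card. CUDA_VISIBLE_DEVICES selects the GPU; OLLAMA_HOST
# gives each server its own port.
CUDA_VISIBLE_DEVICES=0 OLLAMA_HOST=127.0.0.1:11434 ollama serve &  # GPU 1: Qwen3.5-9B
CUDA_VISIBLE_DEVICES=1 OLLAMA_HOST=127.0.0.1:11435 ollama serve &  # GPU 2: Mistral 7B Instruct
```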

Total estimated RAM footprint (all models loaded): ~134GB
Remaining RAM: ~634GB
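A quick sanity check of the RAM math, summing the approximate Q4 sizes from the table (each model is counted once even when shared across seats):

```python
# Approximate Q4 footprints from the registry table, in GB.
MODEL_SIZES_GB = {
    "Qwen3-30B-A3B": 18,       # seats 1, 5, 8 (shared, loaded once)
    "Cogito 14B": 9,           # seats 2, 9 (shared, loaded once)
    "Qwen3.5-9B": 7,
    "Qwen3.5-27B": 17,
    "Mistral 7B Instruct": 5,
    "Qwen3.5-35B": 24,
    "Qwen3-8B": 5,
    "DeepSeek R1 14B": 9,
    "DeepSeek R1 32B": 20,
    "Qwen3-32B": 20,
}

TOTAL_RAM_GB = 768  # Blackthorn

footprint = sum(MODEL_SIZES_GB.values())
remaining = TOTAL_RAM_GB - footprint
print(f"Footprint: ~{footprint}GB, remaining: ~{remaining}GB")
# → Footprint: ~134GB, remaining: ~634GB
```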

Model selection principles

  1. Capability match: Use the smallest model that handles the seat’s cognitive task well
  2. Diversity for tension: Seats that check each other’s work run different models
  3. Efficiency for agreement: Seats with orthogonal (non-adversarial) perspectives can share models
  4. GPU for latency: Only seats needing fast responses get GPU allocation
  5. MoE preference: Mixture-of-experts models (Qwen3-30B-A3B, Qwen3.5-35B) offer better capability-per-active-parameter for CPU inference
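Principle 2 is mechanically checkable: given the seat-to-model map from the table, any pair of mutually checking seats must resolve to different models. A sketch; the adversarial pairs below are illustrative placeholders, not the real Quorum adjacency:

```python
# Seat → model, from the registry table above.
SEAT_MODEL = {
    1: "Qwen3-30B-A3B", 2: "Cogito 14B", 3: "Qwen3.5-9B",
    4: "Qwen3.5-27B", 5: "Qwen3-30B-A3B", 6: "Mistral 7B Instruct",
    7: "Qwen3.5-35B", 8: "Qwen3-30B-A3B", 9: "Cogito 14B",
    10: "DeepSeek R1 14B", 11: "Qwen3-8B", 12: "DeepSeek R1 32B",
    13: "Qwen3-32B",
}

# Hypothetical examples only; substitute the actual checking pairs.
ADVERSARIAL_PAIRS = [(12, 13), (2, 10)]

def diversity_violations(seat_model, pairs):
    """Return the adversarial pairs whose seats share a model."""
    return [(a, b) for a, b in pairs if seat_model[a] == seat_model[b]]

print(diversity_violations(SEAT_MODEL, ADVERSARIAL_PAIRS))  # [] means OK
```

Running this after any reassignment catches accidental violations of the diversity principle before they reach a live workflow.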

Models to evaluate

  • Llama 3.1 70B — alternative for Facilitator (Seat 13) if Qwen3-32B underperforms on synthesis
  • Gemma 4 — recently released, tool-calling native, worth testing for structured seats
  • GLM-5.1 — currently cloud-only via Ollama, local weights not yet available. Monitor for release.
  • Qwen3-Coder 30B — potential for any seats that need code analysis or technical evaluation

Lugh model assignments

TBD — Lugh’s pipeline stages have different requirements than Quorum’s seats. The Feynman tutor needs conversational depth. The research/synthesis stage needs factual grounding + RAG. The script generator needs narrative capability. These may share some Quorum models but will likely need their own assignments.