Topic Organization & Pre-packaging

The distribution insight

The most expensive part of Lugh’s pipeline is Stage 0 — topic discovery and curriculum design. Every user who types “Understanding Organic Chemistry” triggers the same breadth-first research and syllabus generation. Pre-computing this work and shipping it with the app transforms the first-run experience from “wait while I research this” to “pick a topic and start.”

Dewey Decimal as the organizing taxonomy

The Dewey Decimal Classification provides a ready-made hierarchical taxonomy of all human knowledge. Its top three levels, in order of increasing specificity, are:

  • 10 main classes (000-900): Computer Science, Philosophy, Religion, Social Sciences, Language, Science, Technology, Arts, Literature, History
  • 100 divisions: Physics (530), Chemistry (540), Biology (570), Economics (330), etc.
  • 1,000 sections: Electricity (537), Organic Chemistry (547), Genetics (576), etc.

The third level (sections) maps almost exactly to the right grain size for a Lugh course. “Understanding Electricity” and “Understanding Organic Chemistry” each work as a coherent 8-12 episode podcast series. The second level (divisions) is too broad, closer to a degree than a course, and levels below the section are too narrow, closer to a single episode.

Not all 1,000 sections are viable Lugh topics. After filtering out classification-only categories (library operations, geographic subdivisions, literature-by-language categories), roughly 400-600 sections remain as learnable subjects.
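The three-level hierarchy can be read directly off a section number, since each level adds one significant digit. A minimal sketch, assuming sections are stored as three-digit strings; the names DeweyPath and dewey_path are illustrative and not part of Lugh:

```python
from typing import NamedTuple


class DeweyPath(NamedTuple):
    main_class: str  # e.g. "500" (Science)
    division: str    # e.g. "540" (Chemistry)
    section: str     # e.g. "547" (Organic Chemistry)


def dewey_path(section: str) -> DeweyPath:
    """Split a three-digit Dewey section number into its three levels."""
    if len(section) != 3 or not section.isdigit():
        raise ValueError(f"expected a three-digit section, got {section!r}")
    return DeweyPath(
        main_class=section[0] + "00",  # first digit selects the main class
        division=section[:2] + "0",    # first two digits select the division
        section=section,               # all three digits select the section
    )


print(dewey_path("547"))
# DeweyPath(main_class='500', division='540', section='547')
```

Keeping the codes as strings rather than integers preserves leading zeros, which matters for the 000 class.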

Non-Dewey topics

Dewey covers academic knowledge well but doesn’t capture everything people want to learn. Real learning needs that fall outside the taxonomy include:

  • Health conditions: Understanding MCAS, Understanding ADHD, Understanding Type 1 Diabetes
  • Practical skills: Understanding Your Home Electrical System, Understanding 3D Printing
  • Hobby/creative: Understanding D&D Encounter Design, Understanding Sourdough
  • Technology-specific: Understanding Kubernetes, Understanding Rails, Understanding LLMs
  • Personal/professional: Understanding Project Management, Understanding Negotiation

These add roughly 200-400 topics. Summing the two ranges gives 600-1,000; the working estimate used here is approximately 700-1,000 distinct topics.

What ships with the app

Not all 700-1,000 topics need to be pre-built for v1. Search interest tends to follow a power-law distribution, so a relatively small number of topics should cover most first queries.

v1 target: 200-300 pre-built topics covering:

  • The most popular Dewey sections across all ten main classes
  • High-demand non-Dewey topics identified by cross-referencing multiple signals

Each pre-built topic includes syllabi at all three Learning Depths (Awareness, Functional Literacy, Deep Understanding), so 600-900 syllabi total.

The remaining topics are generated on the fly via Stage 0. The user waits longer for the first episode, but the rest of the pipeline works identically.
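The split between shipped topics and on-the-fly generation amounts to a lookup with a fallback. The sketch below is one possible shape for it, not Lugh's actual code: the depth identifiers, the (topic, depth) key, and the run_stage0 callable are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

# Depth names follow the three Learning Depths; the identifiers are assumptions.
DEPTHS = ("awareness", "functional_literacy", "deep_understanding")

Syllabus = Dict[str, object]
PrebuiltIndex = Dict[Tuple[str, str], Syllabus]


def normalize(topic: str) -> str:
    """Lower-case and collapse whitespace so lookups ignore trivial differences."""
    return " ".join(topic.lower().split())


def get_syllabus(
    topic: str,
    depth: str,
    prebuilt: PrebuiltIndex,
    run_stage0: Callable[[str, str], Syllabus],
) -> Tuple[Syllabus, bool]:
    """Return (syllabus, was_prebuilt)."""
    if depth not in DEPTHS:
        raise ValueError(f"unknown depth: {depth!r}")
    key = (normalize(topic), depth)
    if key in prebuilt:
        return prebuilt[key], True           # shipped with the app: no wait
    return run_stage0(topic, depth), False   # generated on the fly: slower


# Stand-in for Stage 0 (hypothetical; the real stage does research).
def fake_stage0(topic: str, depth: str) -> Syllabus:
    return {"topic": topic, "depth": depth, "episodes": []}


prebuilt = {
    ("understanding organic chemistry", "awareness"): {
        "topic": "Understanding Organic Chemistry",
        "depth": "awareness",
        "episodes": [],
    }
}

_, hit = get_syllabus("Understanding  Organic Chemistry", "awareness", prebuilt, fake_stage0)
_, miss = get_syllabus("Understanding Beekeeping", "awareness", prebuilt, fake_stage0)
print(hit, miss)  # True False
```

Both branches return the same syllabus shape, so everything downstream of Stage 0 does not need to know which path produced it.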

Identifying the top topics

Multiple data sources can be triangulated to identify which topics have the highest demand:

Academic demand

  • Open Syllabus Project — 20+ million university syllabi from 80+ countries, with 77 million extracted citations. Their Galaxy tool visualizes the most-assigned texts by field. This reveals what’s most commonly taught.
  • MIT OpenCourseWare — full syllabi with readings, freely available. Course catalog shows what MIT considers worth teaching.
  • For Dummies catalog — 2,500+ titles represent a market-validated list of “things non-experts want to understand.” The catalog is essentially the popular-demand counterpart to Open Syllabus.
  • Khan Academy / Coursera / edX — most-enrolled courses show what self-learners actually choose.

Current interest

  • Google Trends — queries like “how does X work” and “what is X” show current search interest.
  • Wikipedia most-viewed articles — available via their API, shows what people are actively looking up.

A topic that appears in Open Syllabus (academic demand), For Dummies (popular demand), and Google Trends (current interest) is a strong candidate for pre-packaging. Something like “Understanding Economics” hits all three; “Understanding Iguanas” likely hits one at most.
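Triangulation can be approximated by counting how many independent signals mention each topic. The sketch below assumes each source has already been reduced to a set of topic names; the set contents are placeholders that mirror the example in the text, not real measurements, and rank_by_signal_count is a hypothetical helper.

```python
from typing import Dict, List, Set, Tuple


def rank_by_signal_count(signals: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Rank topics by the number of signals that include them (ties by name)."""
    counts: Dict[str, int] = {}
    for topics in signals.values():
        for topic in topics:
            counts[topic] = counts.get(topic, 0) + 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Placeholder data for illustration only.
signals = {
    "open_syllabus": {"Understanding Economics"},
    "for_dummies": {"Understanding Economics"},
    "google_trends": {"Understanding Economics", "Understanding Iguanas"},
}

for topic, score in rank_by_signal_count(signals):
    print(score, topic)
# 3 Understanding Economics
# 1 Understanding Iguanas
```

A plain count treats every signal equally. Weighting the sources differently would be a natural refinement, but the text does not specify any weights.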

Dewey as a browsing UI

Beyond data organization, Dewey provides a browsing taxonomy for the app’s interface. Instead of typing a topic, the user can browse:

Science → Life Sciences → Genetics → “Understanding Genetics”

This solves the “I want to learn something but I don’t know what” problem. The user explores the knowledge tree and discovers topics they didn’t know they were interested in.

The hierarchy also enables related-topic suggestions: “You completed Understanding Genetics. You might be interested in Understanding Evolutionary Biology (which shares section 576 with genetics) or Understanding Molecular Biology (572).”
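One simple way to derive such suggestions is to treat topics whose sections fall in the same division as related. A minimal sketch, assuming a catalog that maps course titles to three-digit section strings; the catalog contents reuse section numbers mentioned in this document and the function name related_topics is hypothetical:

```python
from typing import Dict, List

# Hypothetical catalog: course title -> Dewey section number.
CATALOG: Dict[str, str] = {
    "Understanding Genetics": "576",
    "Understanding Molecular Biology": "572",
    "Understanding Organic Chemistry": "547",
    "Understanding Electricity": "537",
}


def related_topics(title: str, catalog: Dict[str, str]) -> List[str]:
    """Return other catalog titles in the same Dewey division as `title`."""
    division = catalog[title][:2]  # first two digits identify the division
    return sorted(
        other
        for other, section in catalog.items()
        if other != title and section[:2] == division
    )


print(related_topics("Understanding Genetics", CATALOG))
# ['Understanding Molecular Biology']
```

Matching on the first digit instead would widen the suggestions to the whole main class, at the cost of relevance.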

Community contribution (v2+)

Once the app is in use, users generate syllabi for topics not in the pre-built set. A contribution mechanism could let users submit their generated syllabi back to a shared repository:

  • User generates “Understanding Beekeeping” locally
  • Reviews it, confirms it’s good
  • Submits the syllabus (not the personalized episodes) to a community index
  • The next person who types “Understanding Beekeeping” gets the pre-built syllabus

The syllabus becomes a community artifact. The personalized episodes remain individual.
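The separation between the shared syllabus and the private episodes can be enforced at the point where a submission is built. The sketch below is an assumption about how that might look; the class names, fields, and the reviewed flag are illustrative and do not describe an existing Lugh API.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Syllabus:
    topic: str
    depth: str
    episode_titles: List[str]


@dataclass
class LocalCourse:
    syllabus: Syllabus
    personalized_episodes: List[str] = field(default_factory=list)
    reviewed: bool = False  # set once the user confirms the syllabus is good


def build_submission(course: LocalCourse) -> dict:
    """Build the payload for the community index: the syllabus only."""
    if not course.reviewed:
        raise ValueError("review the syllabus before submitting")
    # Personalized episodes are deliberately left out of the payload.
    return {"syllabus": asdict(course.syllabus)}


course = LocalCourse(
    syllabus=Syllabus("Understanding Beekeeping", "awareness", ["Episode 1"]),
    personalized_episodes=["episode text tailored to one listener"],
    reviewed=True,
)
print(build_submission(course))
```

Building the payload from the syllabus alone, rather than copying the whole course and deleting fields, means new private fields stay private by default.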

The math

  • ~700-1,000 total viable topics
  • 3 depth levels each = 2,100-3,000 syllabi
  • Pre-build 200-300 topics for v1 = 600-900 syllabi
  • Batch-generatable as a one-time project using API calls during development
  • Ships as static data with the app, no runtime cost for popular topics
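The syllabus counts above follow from multiplying each topic range by the three depth levels. A short check, where syllabus_range is a hypothetical helper written only for this calculation:

```python
from typing import Tuple

DEPTH_LEVELS = 3  # Awareness, Functional Literacy, Deep Understanding


def syllabus_range(topics_low: int, topics_high: int,
                   depths: int = DEPTH_LEVELS) -> Tuple[int, int]:
    """Return the (low, high) syllabus count for a range of topic counts."""
    return topics_low * depths, topics_high * depths


print(syllabus_range(700, 1000))  # (2100, 3000) for the full topic universe
print(syllabus_range(200, 300))   # (600, 900) for the v1 pre-built set
```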