Validation Issues

Source: Issues identified during external validation review (2026-04-10). Each issue has a severity, status, and recommended resolution.

Issue 1: Meal Plan Generator Constraint Solver Complexity

Severity: High — blocks MVP meal plan quality
Status: Researched — recommendation ready

The Meal Plan Generator spec describes a constraint satisfaction problem that gets combinatorially difficult with 10+ suspect foods across 5+ categories. Hard constraints (washout enforcement) combined with soft constraints (≥5 exposures, decorrelation, temporal distribution) produce a search space that a naive implementation won’t handle well.

Research complete: See Constraint Solver Research for full findings.

Recommendation: Use opt-rb gem with HiGHS backend for MVP. This is Andrew Kane’s solver-agnostic optimization DSL — clean Ruby API, MIT-licensed HiGHS solver, provably optimal MIP solutions, time-limit support. Upgrade path to or-tools CP-SAT if solution enumeration or anytime behavior is needed later. Both gems are from the same author.
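The combinatorial blow-up is easy to see in a naive implementation. The sketch below (plain Ruby, not opt-rb; the foods, washout length, and exposure target are all illustrative) backtracks over one-food-per-day assignments under the hard washout constraint:

```ruby
# Naive backtracking scheduler: at most one suspect food per day, with a
# hard washout (no food repeats within WASHOUT days) and a minimum number
# of exposures per food. Worst-case it visits an exponential number of
# partial plans, which is exactly why the recommendation is a MIP solver.
WASHOUT = 3        # days that must pass before the same food reappears
MIN_EXPOSURES = 2  # minimum exposures per suspect food

def schedule(foods, days, plan = [])
  if plan.length == days
    counts = plan.tally
    ok = foods.all? { |f| counts.fetch(f, 0) >= MIN_EXPOSURES }
    return ok ? plan : nil
  end
  # foods still inside their washout window are hard-excluded
  candidates = foods.reject do |f|
    last = plan.rindex(f)
    last && (plan.length - last) <= WASHOUT
  end
  (candidates + [nil]).each do |choice|  # nil = rest day, no suspect food
    result = schedule(foods, days, plan + [choice])
    return result if result
  end
  nil  # dead end: backtrack
end

plan = schedule(%w[soy tomato spinach], 10)
```

Even this toy version only handles the hard constraint; the soft constraints (decorrelation, temporal distribution) are what a MIP formulation hands to HiGHS as a weighted objective.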


Issue 2: Observation Independence Violation in 24h Trailing Window

Severity: Medium — doesn’t break MVP, needs v0.2 fix
Status: Documented as known limitation

The 24-hour trailing window attributes a single meal to every symptom logged in the subsequent 24 hours. If a user logs bloating at 2pm, headache at 6pm, and fatigue at 10pm, that lunch generates 3 observations for the same exposure event.

Within a single symptom type this is fine (one exposure → one symptom score). But across types, observations from the same meal are correlated — multi-symptom days cluster. Treating them as independent inflates the apparent sample size and can produce artificially narrow confidence intervals.

Impact at MVP scale (30 days, n=2): Minimal. The inflation is small relative to the noise floor. Impact at population scale: Needs addressing. The mixed effects model (v0.2) should include a meal-level random effect or cluster-robust standard errors.
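A toy sketch of the inflation, assuming the worst case where each meal's symptom observations are perfectly correlated (all numbers are illustrative):

```ruby
# With m meals each producing k fully-correlated observations, the honest
# effective sample size is m, not m * k, so a naive SE is too small by a
# factor of sqrt(k).
def standard_error(sd, n)
  sd / Math.sqrt(n)
end

meals = 30             # ~one tracked meal per day over the MVP window
symptoms_per_meal = 3  # e.g. bloating + headache + fatigue from one lunch
sd = 1.0

naive_se  = standard_error(sd, meals * symptoms_per_meal)
honest_se = standard_error(sd, meals)  # perfect within-meal correlation

understatement = honest_se / naive_se  # sqrt(3), ~1.73x too narrow
```

Cluster-robust standard errors (or a meal-level random effect) recover something between these two extremes, depending on the actual within-meal correlation.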

No action needed for MVP. Document as a known limitation.


Issue 3: Missing Quantity/Dose Column in Schema

Severity: Low-Medium — schema change is cheap now, expensive later
Status: Proposed fix — needs implementation

The meal_ingredients table has no amount field. A tablespoon of soy sauce and a bowl of miso soup register identically as binary glutamate exposure. The MVP correlator uses binary exposure (0/1), so this doesn’t affect current statistics — but the column should exist now to capture dose data for the v0.2 dose-response analysis.

Proposed schema addition:

Add to meal_ingredients:

| Column | Type | Notes |
| --- | --- | --- |
| quantity | decimal | nullable — amount consumed |
| unit | string | nullable — “g”, “ml”, “tbsp”, “serving”, etc. |
| quantity_confidence | integer | enum: exact=0, estimated=1, unknown=2 |

quantity_confidence matters because most users will estimate. The correlator can use this as a secondary weight (exact measurements contribute more to dose-response analysis).

Action: Add to Schema before first migration.
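A minimal migration sketch for the proposed columns. This assumes standard Rails conventions and the existing meal_ingredients table; the migration class name, Rails version tag, and the unknown default are illustrative:

```ruby
# Sketch only — not yet in the codebase. Enum values mirror the table above.
class AddQuantityToMealIngredients < ActiveRecord::Migration[7.1]
  def change
    add_column :meal_ingredients, :quantity, :decimal, null: true
    add_column :meal_ingredients, :unit, :string, null: true
    # exact: 0, estimated: 1, unknown: 2 (declared as an enum on the model)
    add_column :meal_ingredients, :quantity_confidence, :integer,
               null: false, default: 2
  end
end
```

Defaulting quantity_confidence to unknown keeps existing rows honest without a backfill.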


Issue 4: Food Knowledge Graph Coverage Beyond Histamine

Severity: Medium — partially mitigated by existing SIGHI data
Status: Researched — seeding strategy ready

The SIGHI dataset (already in Foods) covers histamine compatibility comprehensively but has limited coverage for salicylate, oxalate, FODMAP, lectin, glutamate, and capsaicin categories. The app’s 7-category system requires cross-category severity data that no single source provides.

Research complete: See Food Graph Sources Research for full findings.

Summary of sources:

  • Histamine → SIGHI ✅ (have this)
  • FODMAP → Monash University app (no API, manual extraction ~$9 + 4 hours)
  • Salicylate + Glutamate → RPAH Elimination Diet Handbook (~$30 + 6 hours)
  • Oxalate → Harvard oxalate table ✅ (free .xlsx download, import script)
  • Lectin → hand-curated from literature (~2 hours)
  • Capsaicin → Scoville references (~30 min)

Total estimated seeding effort for top 100 foods: ~15 hours + import scripts.
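The Harvard import script could be as simple as the sketch below, assuming the .xlsx has been exported to CSV first. The column headers ("Food", "Oxalate (mg)") and the severity cutoffs are illustrative assumptions, not values taken from the actual table:

```ruby
require "csv"

# Map a per-serving oxalate amount to a coarse severity bucket.
# Cutoffs are placeholders; calibrate against the real table.
def oxalate_severity(mg)
  case mg
  when 0...10  then :low
  when 10...50 then :moderate
  else              :high
  end
end

def import_oxalate(csv_text)
  CSV.parse(csv_text, headers: true).map do |row|
    mg = row["Oxalate (mg)"].to_f
    { food: row["Food"], oxalate_mg: mg, severity: oxalate_severity(mg) }
  end
end

# Inline sample standing in for the exported spreadsheet
rows = import_oxalate(<<~CSV)
  Food,Oxalate (mg)
  Spinach,755
  Almonds,122
  Rice,2
CSV
```

The other manually-extracted sources (Monash, RPAH) would feed the same record shape, keeping the seeder source-agnostic.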


Issue 5: Regulatory Language Discipline

Severity: Low — already handled well in docs, needs enforcement in UI
Status: Guidance documented

The docs consistently use “your data suggests” / “likely association” framing rather than diagnostic language. This discipline must extend to all user-facing copy. Specific guidelines:

  • Never say: “you have [condition]”, “you are sensitive to [food]”, “diagnosis”
  • Always say: “your data suggests”, “pattern detected”, “possible association”
  • Include the confidence classification (:significant / :suggestive / :insufficient_data) in every result
  • Add a persistent footer: “This is a self-tracking tool, not a medical device. Discuss results with your healthcare provider.”

No research needed. This is a copywriting constraint to enforce during UI build.
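One way to enforce the constraint during the UI build is to route all result copy through a single helper, so hedged framing lives in one place. The constant, method name, and template strings below are illustrative, not existing app code:

```ruby
# Centralized user-facing result copy keyed by confidence classification.
RESULT_COPY = {
  significant:       "Your data suggests a likely association with %s.",
  suggestive:        "Pattern detected: a possible association with %s.",
  insufficient_data: "Not enough data yet to assess %s."
}.freeze

def result_copy(classification, food)
  # fetch raises KeyError on unknown classifications, so diagnostic
  # language cannot sneak in through an unmapped code path
  format(RESULT_COPY.fetch(classification), food)
end

message = result_copy(:suggestive, "glutamate")
```

Views that bypass the helper are then easy to flag in review.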


Issue 6: SIGHI Category Mapping Gap

Severity: Medium — the SIGHI mechanism codes don’t map 1:1 to the app’s 7 categories
Status: Recommendation ready

The README maps SIGHI codes (H, H!, A, L, B) to histamine-related categories, but the app schema has 7 categories and SIGHI only directly addresses histamine pathway mechanisms. The README introduces biogenic_amines, histamine_liberator, and dao_inhibitor as separate categories — but the Sensitivity Categories doc defines only 7 top-level categories and doesn’t include these as distinct entries.

Decision needed: Are histamine liberators and DAO blockers sub-categories of histamine, or separate top-level categories? The biochemistry argues for sub-categories (they all ultimately affect histamine-mediated symptoms), but the quality scoring may need to distinguish them (a DAO blocker amplifies exogenous histamine but doesn’t directly trigger symptoms from histamine-free foods).

Recommendation: Keep 7 top-level categories. Add a mechanism enum to food_category_memberships: direct, liberator, blocker. The quality scorer and hypothesis engine can use this for finer-grained logic without proliferating top-level categories. See also Schema Considerations for the related FODMAP subcategory question.
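A sketch of how the quality scorer could use the mechanism enum. The weights are illustrative assumptions (a blocker amplifies exogenous histamine rather than triggering symptoms on its own, so it contributes less by itself), not calibrated values:

```ruby
# Mechanism-aware exposure scoring within one category (e.g. histamine).
MECHANISM_WEIGHTS = { direct: 1.0, liberator: 0.7, blocker: 0.4 }.freeze

# memberships: one food's memberships in the category,
# e.g. [{ mechanism: :direct, severity: 2 }]
def category_exposure_score(memberships)
  memberships.sum do |m|
    MECHANISM_WEIGHTS.fetch(m[:mechanism]) * m[:severity]
  end
end

score = category_exposure_score([
  { mechanism: :direct,  severity: 2 },  # e.g. aged cheese
  { mechanism: :blocker, severity: 1 }   # e.g. a DAO-inhibiting ingredient
])
# 1.0 * 2 + 0.4 * 1 = 2.4
```

The hypothesis engine can apply the same enum differently, for instance only counting blockers when a direct histamine source appears in the same meal.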