Symptom Correlator — Design Specification

Purpose

The symptom correlator is the core statistical engine. Given a user’s meals, symptoms, and daily controls over a date range, it produces per-ingredient scores reflecting how strongly each ingredient is associated with symptom outcomes. Everything else in the app feeds into or reads from this service.

Inputs

  • A user
  • A date range (start_date, end_date)
  • An optional target FDR (default 0.10)

Output

An ordered collection of result entries, each containing:

  • The ingredient
  • A correlation score (positive = associated with worse symptoms)
  • The number of observations (exposure-symptom pairings)
  • Per-symptom-type scores (not a flattened average)
  • A raw p-value from the correlation test
  • An FDR-adjusted significance flag (true/false)
  • A confidence classification: :significant, :suggestive, :insufficient_data
  • The raw symptom scores and observation weights used (for transparency and debugging)

Spec 1: 24-Hour Trailing Attribution Window

What

When a symptom is logged, every meal eaten in the preceding 24 hours is a potential contributor. Attribution is based on the symptom’s timestamp and each meal’s eaten_at timestamp — not on calendar day boundaries.

Why

Calendar-day correlation creates an arbitrary boundary at midnight. A dinner at 7pm and a symptom at 9am the next morning are 14 hours apart — well within the reaction window for every sensitivity category — but a day-boundary model treats them as separate days. The 24-hour trailing window eliminates this artifact.

Behavior

For each symptom log, find all meals where symptom.logged_at - meal.eaten_at is between 0 and 24 hours. Meals eaten after the symptom are excluded. Meals eaten more than 24 hours before the symptom are excluded. Each meal-symptom pairing becomes one observation in the correlation.
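
The window rule above can be sketched as follows. This is a minimal illustration, not the service's actual query: the Meal and Symptom structs and the meals_in_window helper are hypothetical stand-ins, and treating both window boundaries as inclusive is an assumption the spec does not state.

```ruby
WINDOW_SECONDS = 24 * 60 * 60

# Hypothetical minimal records; the real models are not shown in this spec.
Meal    = Struct.new(:name, :eaten_at)
Symptom = Struct.new(:type, :logged_at)

# Returns the meals eaten in the 24 hours before the symptom was logged.
# Assumption: both boundaries (0h and 24h) are inclusive.
def meals_in_window(symptom, meals)
  meals.select do |meal|
    elapsed = symptom.logged_at - meal.eaten_at # Time - Time gives Float seconds
    elapsed.between?(0, WINDOW_SECONDS)
  end
end

symptom = Symptom.new(:bloating, Time.utc(2024, 3, 3, 6, 0))
meals = [
  Meal.new("dinner, last night",   Time.utc(2024, 3, 2, 19, 0)), # 11h before
  Meal.new("lunch, yesterday",     Time.utc(2024, 3, 2, 12, 0)), # 18h before
  Meal.new("dinner, two days ago", Time.utc(2024, 3, 1, 19, 0)), # 35h before
  Meal.new("breakfast, today",     Time.utc(2024, 3, 3, 8, 0))   # after the symptom
]

attributed = meals_in_window(symptom, meals).map(&:name)
```

Here only the first two meals are attributed. Working from timestamps rather than dates is what lets a 7pm dinner pair with a 6am symptom across midnight.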

Edge Cases

  • A symptom logged at 6am attributes to last night’s dinner (7pm, 11h ago) and yesterday’s lunch (12pm, 18h ago) but not the previous day’s dinner (7pm the day before, 35h ago, so excluded).
  • A meal with no symptoms logged within the next 24 hours still counts as a “no symptom” observation — the absence of a symptom log is data. See Spec 8.
  • Multiple symptoms logged within hours of each other (bloating at 2pm and headache at 3pm) each independently query the 24h trailing window. The same meal can be attributed to both symptoms.

Spec 2: Time-Decay Weighting Within the Window

What

Within the 24-hour attribution window, meals eaten closer to the symptom contribute more to the correlation than meals eaten many hours earlier.

Why

Most food reactions have a peak onset window. Histamine reactions peak within 1-2 hours. Even delayed reactions (salicylates, lectins) have temporal profiles. A meal eaten 1 hour before a symptom is more likely to be causally related than one eaten 20 hours before.

Behavior

Each meal-symptom pairing receives a time-decay weight based on the elapsed hours between meal.eaten_at and symptom.logged_at. The default is linear decay from 1.0 at 0 hours to 0.2 at 24 hours:

decay_weight = 1.0 - (0.8 × hours_elapsed / 24.0)

This weight is multiplied into the observation’s contribution to the correlation, alongside the quality weight.
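
A minimal sketch of the default decay, assuming timestamps are Ruby Time objects. The names LINEAR_DECAY and decay_weight are illustrative, not part of the spec. The decay function is passed in as a callable so a different profile could be substituted without touching the caller.

```ruby
WINDOW_HOURS = 24.0
FLOOR_WEIGHT = 0.2

# Linear decay from 1.0 at 0 hours to FLOOR_WEIGHT at 24 hours.
LINEAR_DECAY = lambda do |hours_elapsed|
  1.0 - ((1.0 - FLOOR_WEIGHT) * hours_elapsed / WINDOW_HOURS)
end

# Hypothetical helper: computes the weight for one meal-symptom pairing.
# The decay callable is injectable so it can be swapped later.
def decay_weight(meal_eaten_at, symptom_logged_at, decay: LINEAR_DECAY)
  hours_elapsed = (symptom_logged_at - meal_eaten_at) / 3600.0
  unless hours_elapsed.between?(0, WINDOW_HOURS)
    raise ArgumentError, "meal is outside the 24h window"
  end
  decay.call(hours_elapsed)
end

# A dinner at 7pm and a symptom at 6am the next morning are 11 hours apart.
weight = decay_weight(Time.utc(2024, 3, 2, 19, 0), Time.utc(2024, 3, 3, 6, 0))
```

For the 11-hour gap above, the formula gives 1.0 - (0.8 × 11 / 24), roughly 0.63.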

Future Consideration

Per-category decay profiles would be more accurate (histamine decays faster than lectin), but linear decay across all categories is an acceptable MVP simplification. The architecture should allow swapping the decay function without restructuring the correlator.


Spec 3: Quality Score as Observation Weight

What

The daily control quality score determines how much statistical influence an observation has on the correlation. It does NOT modify the symptom score itself.

Why

Multiplying the symptom score by a quality weight distorts the data. A day with poor sleep (quality weight 0.3, say) and a symptom score of 5 would become a score of 1.5, making it look like symptoms were mild. The symptom was genuinely a 5. The question is whether that 5 was caused by food or by poor sleep. The correct approach is to reduce the observation’s influence on the fitted correlation while preserving the actual magnitude.

Behavior

The correlator computes a weighted correlation where:

  • x = ingredient exposure (binary: was the ingredient in the 24h window?)
  • y = raw symptom score (unmodified 1-5, or 0 for the implicit no-symptom observations described in Spec 8)
  • weight = quality_score × time_decay_weight

The weighted Pearson correlation formula uses these weights to determine each observation’s contribution to the covariance calculation. Higher-weight observations pull the fitted line more strongly.
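
One way to compute this is the standard weighted Pearson coefficient, where the weights enter the means, the variances, and the covariance. The sketch below is an illustration under that assumption; the function name weighted_pearson and the choice to return nil when the coefficient is undefined are not taken from the spec.

```ruby
# Weighted Pearson correlation of xs and ys with non-negative weights ws.
# Returns nil when undefined (no total weight, or zero variance in x or y).
def weighted_pearson(xs, ys, ws)
  total = ws.sum.to_f
  return nil if total.zero?

  mean_x = xs.zip(ws).sum { |x, w| w * x } / total
  mean_y = ys.zip(ws).sum { |y, w| w * y } / total

  cov = var_x = var_y = 0.0
  xs.each_index do |i|
    dx = xs[i] - mean_x
    dy = ys[i] - mean_y
    cov   += ws[i] * dx * dy
    var_x += ws[i] * dx * dx
    var_y += ws[i] * dy * dy
  end
  return nil if var_x.zero? || var_y.zero?

  cov / Math.sqrt(var_x * var_y)
end

exposure = [1, 1, 0, 0]          # x: ingredient present in the 24h window?
scores   = [5, 4, 1, 2]          # y: raw symptom scores, never rescaled
weights  = [1.0, 1.0, 1.0, 1.0]  # quality_score * time_decay_weight

r = weighted_pearson(exposure, scores, weights)
```

With equal weights this reduces to the ordinary Pearson coefficient. Lowering one observation's weight shrinks its share of every sum without changing its y value, which is the separation of influence from magnitude that this spec asks for.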

What This Means in Practice

A high-symptom day with quality 0.2 (sick, bad sleep) contributes to the correlation but weakly. If that day is an outlier, it won’t dominate the result. A high-symptom day with quality 0.9 (good sleep, no confounders) contributes strongly — this is a day where the food signal is cleanest.


Spec 4: Per-Category Quality Scoring

What

The quality score for a given day varies depending on which sensitivity category is being evaluated. Antihistamine days have zero quality for histamine analysis but full quality for FODMAP analysis.

Why

Antihistamines block H1/H2 receptors, masking histamine-mediated symptoms. They have no effect on FODMAP fermentation, oxalate crystal deposition, or capsaicin TRPV1 signaling. Discounting an entire day because of an antihistamine throws away valid data for every non-histamine category.

Similarly, NSAIDs confound salicylate and histamine analysis but not FODMAP or lectin analysis: aspirin is itself a salicylate, and NSAIDs inhibit DAO.

Behavior

The correlator requests a quality score per observation per category. When correlating ingredient X (which belongs to the histamine category) with symptoms, it uses the histamine-specific quality score. When correlating ingredient Y (FODMAP category) with symptoms, it uses the FODMAP-specific quality score.

For ingredients belonging to multiple categories, the correlator runs separate correlations per category context and reports the most relevant one (or all, for transparency).
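
The lookup could be shaped roughly like this. Everything here is a hypothetical sketch: the actual rules live in the Daily Control Quality Score document, the table and function names are invented for illustration, and only the antihistamine behavior (zero for histamine, untouched elsewhere) comes from this spec. The NSAID multiplier is a placeholder, not a specified value.

```ruby
# Hypothetical per-category multipliers. A category that is not listed for a
# confounder is unaffected by it (multiplier 1.0).
CONFOUNDER_MULTIPLIERS = {
  antihistamine: { histamine: 0.0 },                  # from this spec
  nsaid:         { histamine: 0.5, salicylate: 0.5 }  # placeholder values
}.freeze

# base_quality: the day's quality before category-specific adjustment (0.0..1.0)
# confounders:  controls logged that day, e.g. [:antihistamine]
# category:     the sensitivity category being evaluated
def category_quality(base_quality, confounders, category)
  confounders.reduce(base_quality) do |quality, confounder|
    quality * CONFOUNDER_MULTIPLIERS.fetch(confounder, {}).fetch(category, 1.0)
  end
end

histamine_q = category_quality(0.9, [:antihistamine], :histamine)
fodmap_q    = category_quality(0.9, [:antihistamine], :fodmap)
```

The same day yields a histamine quality of zero and an unchanged FODMAP quality, so the observation drops out of one analysis and stays at full strength in the other.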

See Daily Control Quality Score for the full category-specific scoring rules.


Spec 5: Minimum Exposure Threshold

What

An ingredient must appear in at least 5 meal-symptom observation pairings before the correlator reports a scored result for it.

Why

With fewer than 5 observations, the correlation is statistically meaningless. You can only detect very large effects (Cohen’s d > 1.5) with any confidence at that sample size. Reporting results from 2-3 observations gives users false precision and generates noise that obscures real signals.

Behavior

Ingredients with fewer than 5 pairings (after the 24h window attribution and quality filtering) are returned with confidence: :insufficient_data and no score. The user-facing output communicates this constructively: “Not enough data for [ingredient] yet — need [N] more meals with it to start seeing patterns.”
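
A minimal sketch of the threshold check, assuming observations have already been grouped per ingredient. The function name, the hash layout, and the needed field are illustrative assumptions. The needed count is expressed in pairings, which the caller would have to translate into the user-facing message.

```ruby
MIN_PAIRINGS = 5

# observations_by_ingredient: { "garlic" => [obs, obs, ...], ... }
# Returns [scorable, insufficient]. Entries in insufficient carry no score.
def partition_by_exposure(observations_by_ingredient, min_pairings: MIN_PAIRINGS)
  scorable, sparse = observations_by_ingredient.partition do |_ingredient, observations|
    observations.size >= min_pairings
  end

  insufficient = sparse.map do |ingredient, observations|
    {
      ingredient: ingredient,
      score: nil,
      confidence: :insufficient_data,
      observations: observations.size,
      needed: min_pairings - observations.size
    }
  end

  [scorable.to_h, insufficient]
end

grouped = { "garlic" => Array.new(6, :obs), "kale" => Array.new(2, :obs) }
scorable, insufficient = partition_by_exposure(grouped)
```

Only the scorable group goes on to the correlation and the FDR step, so sparse ingredients never contribute p-values to the correction.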


Spec 6: Benjamini-Hochberg FDR Correction

What

After computing correlations for all ingredient-symptom pairs, apply the Benjamini-Hochberg procedure to control the false discovery rate.

Why

With 7 sensitivity categories, 12 symptom types, and potentially dozens of ingredients, the correlator runs many parallel statistical tests. At α = 0.05, roughly 1 in 20 tests with no real effect will still come up positive by chance alone. With 84 tests (7 categories × 12 symptom types), that’s ~4 spurious hits that would show up as “likely associations” when they’re actually noise. FDR correction adjusts for this.

Behavior

  1. Collect all p-values from all ingredient-symptom correlations that meet the minimum exposure threshold.
  2. Sort p-values ascending.
  3. For each p-value at rank k (1-based) out of n total tests: check whether it is at or below its threshold, (k / n) × target_FDR.
  4. The largest p-value that meets its threshold, and all smaller p-values, are marked as significant.
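
The four steps above can be sketched as a single function. This is an illustration of the standard Benjamini-Hochberg step-up procedure, not the service's code. The function name and the choice to return flags in the original input order are assumptions.

```ruby
# Benjamini-Hochberg step-up procedure.
# p_values: raw p-values in any order.
# Returns an Array of booleans aligned with the input order.
def benjamini_hochberg(p_values, target_fdr: 0.10)
  n = p_values.size
  return [] if n.zero?

  # Pair each p-value with its original index, then sort ascending.
  ranked = p_values.each_with_index.sort_by { |pv, _i| pv }

  # Find the largest 1-based rank k whose p-value is <= (k / n) * target_fdr.
  cutoff_rank = 0
  ranked.each_with_index do |(pv, _i), idx|
    k = idx + 1
    cutoff_rank = k if pv <= (k.to_f / n) * target_fdr
  end

  # Everything at or below the cutoff rank is significant.
  significant = Array.new(n, false)
  ranked.first(cutoff_rank).each { |_pv, i| significant[i] = true }
  significant
end

flags = benjamini_hochberg([0.07, 0.9, 0.001, 0.06])
```

In this example the thresholds for ranks 1 to 4 are 0.025, 0.05, 0.075, and 0.10. The p-value 0.06 sits at rank 2 and misses its own threshold, but 0.07 at rank 3 meets its threshold, so ranks 1 through 3 are all marked significant. That is step 4 in action: the cutoff is set by the largest passing rank, not by the first failure.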

Target FDR default: 0.10 (10% false discovery rate). More permissive than clinical research (0.05) but appropriate for an exploratory consumer tool. The user sees “likely” and “possible” associations, not clinical diagnoses.

Output Classification

  • fdr_significant: true + strong effect size → :significant — “Your data suggests a likely association”
  • fdr_significant: false + moderate correlation → :suggestive — “There might be a pattern here, but more data is needed”
  • Below minimum exposure → :insufficient_data — “Not enough observations yet”

Spec 7: Per-Symptom-Type Resolution

What

The correlator maintains separate correlation scores per symptom type, not a single flattened average across all symptoms.

Why

A food that causes bloating (score 5) but no headache (score 1) should show that distinction. Flattening to an average of 3 loses the signal. Different sensitivity categories also have different symptom signatures — histamine is systemic (skin, respiratory, cardiovascular), FODMAPs are predominantly gut. Per-symptom resolution enables the hypothesis engine to match symptom profiles against expected category signatures.

Behavior

The output includes a symptom_scores hash keyed by symptom_type, with each entry containing the category-specific weighted correlation for that ingredient-symptom pair.

The top-level score is the maximum across symptom types (the strongest signal), not the average. This prevents a strong gut signal from being diluted by unaffected systemic scores.
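
As a small sketch, assuming symptom_scores maps each symptom type to a numeric correlation (or nil where none could be computed), the top-level score could be picked like this. The function name and the nil handling are illustrative assumptions.

```ruby
# symptom_scores: { bloating: 0.62, headache: 0.05, ... } for one ingredient.
# Returns [symptom_type, score] for the strongest signal, or nil if no
# symptom type has a score.
def top_level_score(symptom_scores)
  scored = symptom_scores.reject { |_type, score| score.nil? }
  scored.max_by { |_type, score| score }
end

top = top_level_score({ bloating: 0.62, headache: 0.05, rash: nil })
```

Returning the symptom type alongside the score keeps the report able to say which symptom drove the result.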


Spec 8: Handling Meals with No Subsequent Symptoms

What

A meal followed by no symptom logs within 24 hours is a valid observation — it’s a data point where the ingredient was present and symptoms were absent (or below the user’s logging threshold).

Why

Only counting meal-symptom pairings where symptoms are logged creates selection bias. People are more likely to log symptoms when they feel bad. Ignoring “ate the food and felt fine” observations inflates positive correlations and makes every tested food look like a trigger.

Behavior

If a meal has no symptom logs in the 24 hours following it, the correlator creates an implicit observation with symptom score = 0 for all symptom types. This observation receives the standard quality weight for that day and counts toward the correlation.

This means the correlator runs against the full set of meals, not just meals that happened to precede symptom logs.
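
A minimal sketch of generating the implicit observations, assuming in-memory records. The structs and the implicit_observations helper are hypothetical, and the inclusive 24-hour boundary is an assumption. Quality weights are left out here and would be attached afterwards as for any other observation.

```ruby
WINDOW_SECONDS = 24 * 60 * 60

# Hypothetical minimal records for illustration.
Meal        = Struct.new(:name, :eaten_at)
Symptom     = Struct.new(:type, :score, :logged_at)
Observation = Struct.new(:meal, :symptom_type, :score, :implicit)

# Emits one score-0 observation per symptom type for every meal that has no
# symptom log in the 24 hours after it.
def implicit_observations(meals, symptoms, symptom_types)
  meals.flat_map do |meal|
    followed = symptoms.any? do |s|
      (s.logged_at - meal.eaten_at).between?(0, WINDOW_SECONDS)
    end
    next [] if followed

    symptom_types.map { |type| Observation.new(meal, type, 0, true) }
  end
end

lunch  = Meal.new("lunch",  Time.utc(2024, 3, 1, 12, 0))
dinner = Meal.new("dinner", Time.utc(2024, 3, 4, 19, 0))
logs   = [Symptom.new(:bloating, 4, Time.utc(2024, 3, 1, 15, 0))] # 3h after lunch

zeros = implicit_observations([lunch, dinner], logs, [:bloating, :headache])
```

Lunch is followed by a symptom log, so it produces no implicit observations. Dinner has none in its window and produces one score-0 observation per symptom type.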


Dependencies

Consumers

  • Hypothesis Engine — consumes correlator output to generate suggestions
  • User-facing reports — the plain-language “you seem to react to X” output
  • Total Mediator Load — the biochemical model this service implements statistically