Tepna doesn't ask you to trust a black box. Every claim has a live tool you can run and a short paper that says exactly how sure we are — and labels honestly whether it ran on real devices, real detectors, or simulation. Poke at the methods. That's the point.
Stating a device's error when you never owned a gold-standard reference — and surviving 20k patients without a crash.
You can't calibrate against a reference you don't own. So we put the O2Ring, Polar H10 and Verity Sense in a three-cornered hat — each one's error falls out of how the trio disagrees, with none assumed to be truth. The O2Ring's pulse reads true (−0.34 bpm bias); the noisy optical leg sets a shared floor.
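To make the three-cornered hat concrete, here is a minimal Python sketch. It assumes the three devices' errors are independent of each other and that all three are sampled on a common time base. The function name, the simulated noise levels and the sample count are illustrative, not Tepna's code or data.

```python
import random
from statistics import pvariance


def three_cornered_hat(a, b, c):
    """Estimate each device's error variance from pairwise differences.

    With independent errors, var(a - b) = var_a + var_b, and likewise for
    the other two pairs, so three pair variances give three unknowns.
    """
    v_ab = pvariance([x - y for x, y in zip(a, b)])
    v_ac = pvariance([x - y for x, y in zip(a, c)])
    v_bc = pvariance([x - y for x, y in zip(b, c)])
    return (
        0.5 * (v_ab + v_ac - v_bc),  # device a
        0.5 * (v_ab + v_bc - v_ac),  # device b
        0.5 * (v_ac + v_bc - v_ab),  # device c
    )


# Simulated check: a shared true heart rate plus independent noise per device.
rng = random.Random(0)
truth = [60 + 10 * rng.random() for _ in range(20000)]
sd = (0.5, 1.0, 2.0)  # hypothetical error SDs in bpm
a, b, c = ([t + rng.gauss(0, s) for t in truth] for s in sd)

estimates = three_cornered_hat(a, b, c)
print([round(v, 2) for v in estimates])  # should land near 0.25, 1.0, 4.0
```

The true signal cancels in every pairwise difference, which is why no device has to be treated as truth. If two devices share an error source, the independence assumption fails and the split between them is biased.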
The power analysis for the trick above. A Monte-Carlo sweep shows one ~1-hour window pins the whole trio to ±0.5 bpm. The twist: sit still and you understate the fast sensors' error, because resting strips the shared heart-rate variability the method leans on. You have to move.
Nothing crashes — zero fatals, 99.7% cross-node fusion overlap. But it quietly miscounts: the worse the apnea, the more breathing events the oximeter misses, tripping the severe-recall flag in 92.8% of severe patients. The dangerous failure is the silent, severity-localized one — not a crash.
A parking note, not a result. Single-channel waveform synthesis is a solved problem; the open frontier our harness exposes is timestamp pathology, cross-sensor temporal coherence, and provenance — most of which need no learned generator at all. Written down so it isn't rediscovered later.
When two pieces of code that share nothing read the same heartbeat, do they land on the same number? Sometimes — and the exceptions matter.
The chest-strap ECG recovers the beat train equally well with or without apnea (99.8% either way). The optical wrist pulse fades exactly when perfusion drops, and the quality score stays green while it does, hiding the miss. That imperfect optical train inflates HRV by a median +16%.
Score the same heartbeats with three independent detectors. The two electrical ones (ECG and RR) are interchangeable — bias +0.02 ms, r = 0.9997, sharing no code or sampling rate. The optical one tells a different story (+32%): pulse-arrival jitter, not error. Report it separately; don't substitute.
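Bias and r are the two usual agreement numbers for paired measurements: the mean difference (the Bland-Altman bias) and the Pearson correlation. A minimal Python sketch, assuming window-matched rMSSD values from two detectors. The function name and the sample values are hypothetical, not the study's data.

```python
from math import sqrt
from statistics import mean


def agreement(x, y):
    """Return (mean of y - x, Pearson r) for two paired lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    bias = mean(b - a for a, b in zip(x, y))
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return bias, sxy / sqrt(sxx * syy)


# Hypothetical rMSSD values (ms) for the same five windows.
ecg_rmssd = [32.0, 41.5, 28.3, 55.1, 47.8]
rr_rmssd = [32.2, 41.4, 28.4, 55.0, 47.9]

bias, r = agreement(ecg_rmssd, rr_rmssd)
print(f"bias = {bias:+.2f} ms, r = {r:.4f}")
```

A high r alone does not show interchangeability, since a detector that reads consistently high still correlates well. That is why the bias is reported next to it.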
The production oximetry detector recovers only about a quarter of scored respiratory events, and misses most in the sickest patients (≈−30 events/h in severe disease). The shipped AHI ≈ ODI-4 × 1.1 rule of thumb has a 15/h error; a re-fit correction halves it. We also work out exactly how big a real validation study would need to be.
Reliability, confounds, treatment effects and cross-organ coupling — built on synthetic cohorts run through the real detectors, with the limitation that bounds each claim stated up front.
Age and apnea both lower HRV, about equally and additively: rMSSD falls ≈4.2 ms per decade of age and ≈2.2 ms per 10 points of apnea. A naive single-metric HRV screen flags a lot of old-but-healthy people, about a quarter of its alarms. Adjusting for age fixes most of it.
Test–retest reliability of three metrics. Apnea index and HRV are stable personal traits from a single night (ICC 0.89 / 0.93). Daily glucose variability never settles down — it's a day-to-day state, not a trait, so no recording length makes it reliable as a fixed number.
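The trait-versus-state contrast is easy to see in a small ICC calculation. The text does not say which ICC form was used, so this Python sketch assumes the one-way random-effects ICC(1,1). The function name and both toy datasets are hypothetical.

```python
from statistics import mean


def icc_oneway(rows):
    """ICC(1,1) for n subjects, each a list of k repeated measurements."""
    n, k = len(rows), len(rows[0])
    grand = mean(v for r in rows for v in r)
    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * sum((mean(r) - grand) ** 2 for r in rows) / (n - 1)
    msw = sum((v - mean(r)) ** 2 for r in rows for v in r) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Trait-like metric: subjects differ, repeat nights agree.
trait = [[10, 11], [20, 19], [30, 31], [40, 39]]
# State-like metric: repeats vary more than subjects do.
state = [[10, 40], [35, 12], [20, 30], [38, 15]]

icc_trait = icc_oneway(trait)
icc_state = icc_oneway(state)
print(round(icc_trait, 3), round(icc_state, 3))
```

When the within-subject spread dominates, the ICC stays low however many subjects are added, which is the sense in which a state cannot be made reliable as a fixed number.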
A change-point detector on the nightly oxygen and heart-rhythm trajectory finds the planted therapy-start night with a median error of zero nights. Fusing the respiratory and autonomic channels beats either alone: exact 97% of the time, within one night 99%.
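The detector itself is not described here, so the Python sketch below swaps in the simplest stand-in: a single mean-shift change point chosen by least squares, with channels fused by summing each channel's normalised split cost. All names and the nightly values are hypothetical.

```python
def sse(xs):
    """Sum of squared deviations from the mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def split_cost(series, t):
    """Two-segment SSE for a split before index t, as a fraction of total SSE."""
    total = sse(series) or 1.0  # guard against a constant series
    return (sse(series[:t]) + sse(series[t:])) / total


def change_point(*channels):
    """Index of the first night after the change, fused over all channels."""
    n = len(channels[0])
    return min(range(1, n), key=lambda t: sum(split_cost(ch, t) for ch in channels))


# Hypothetical nightly values with therapy starting on night index 5.
odi = [22, 25, 21, 24, 23, 8, 6, 7, 9, 5]            # respiratory channel
rmssd = [28, 30, 27, 29, 31, 40, 42, 39, 41, 43]     # autonomic channel

print(change_point(odi), change_point(rmssd), change_point(odi, rmssd))
```

Normalising each channel by its own total SSE keeps a channel with larger units from dominating the fused cost.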
Two sensors that share no code or input — CGM and RR→HRV — recover a coherent negative coupling: higher nocturnal glucose ↔ lower HRV. But it's not glucose hurting the heart. Partial out apnea burden and the link collapses: apnea is the shared driver of both.
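"Partial out" here means a partial correlation: the glucose-HRV correlation that remains once the linear effect of apnea burden is removed from both. A minimal Python sketch on simulated data where apnea drives both variables. The effect sizes, the sample size and the function names are assumptions for illustration only.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simulated cohort: apnea raises glucose and lowers HRV; no direct link.
rng = random.Random(1)
apnea = [rng.gauss(0, 1) for _ in range(5000)]
glucose = [z + rng.gauss(0, 1) for z in apnea]
hrv = [-z + rng.gauss(0, 1) for z in apnea]

raw = pearson(glucose, hrv)
adjusted = partial_corr(glucose, hrv, apnea)
print(f"raw r = {raw:.2f}, partial r = {adjusted:.2f}")
```

The raw correlation comes out clearly negative while the partial one sits near zero, which is the pattern a shared driver produces.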
The generators, runners and gates behind the studies above. All deterministic, all local — they live in the source repository (clone and open them locally). Heavier runs (20k patients, full-waveform lanes) stream to keep memory flat.
Drives up to 20,000 synthetic patients through the real detectors and fusion, resumable via IndexedDB.
The frozen-seed corpus engine — SpO₂, ECG, PPG, RR, CGM with planted pathology and timestamps.
Re-runs the cohort and diffs results against frozen baselines to catch drift.
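A drift check of this kind reduces to comparing a fresh result set against a stored one, key by key. The Python sketch below is one way to do that, not the tool's actual logic. It assumes results are flat mappings of metric name to number, and the metric names and tolerance are hypothetical.

```python
import math


def diff_against_baseline(current, baseline, rel_tol=1e-9):
    """Return {metric: (baseline_value, current_value)} for every drifted metric.

    A metric counts as drifted if it is missing on either side or if the
    two values differ by more than the relative tolerance.
    """
    drift = {}
    for key in sorted(set(current) | set(baseline)):
        new, old = current.get(key), baseline.get(key)
        if new is None or old is None or not math.isclose(new, old, rel_tol=rel_tol):
            drift[key] = (old, new)
    return drift


# Hypothetical frozen baseline and a re-run with one changed metric.
baseline = {"events_per_hour": 14.2, "overlap_fraction": 0.997}
rerun = {"events_per_hour": 14.2, "overlap_fraction": 0.981}

drift = diff_against_baseline(rerun, baseline)
print(drift or "no drift")
```

Because the runs are deterministic, the tolerance can stay very tight. Note that a purely relative tolerance never matches a baseline value of exactly zero unless the re-run is also exactly zero.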
The canonical regression gate — real modules, shared contract assertions, render-coverage.
Recomputes each bundle's build hash and audits committed exports for reproducibility.
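Recomputing a build hash means hashing the bundle's contents in a fixed order and comparing against the committed value. The hash scheme is not specified here, so this Python sketch assumes SHA-256 over path-sorted file contents. The function names and the example bundle are hypothetical.

```python
import hashlib


def bundle_hash(files):
    """SHA-256 over a mapping of relative path -> bytes, in sorted path order."""
    h = hashlib.sha256()
    for path in sorted(files):
        data = files[path]
        h.update(path.encode("utf-8"))
        h.update(b"\0")                        # separate the path from the content
        h.update(len(data).to_bytes(8, "big"))  # length prefix avoids ambiguity
        h.update(data)
    return h.hexdigest()


def audit(files, committed_hash):
    """True if the recomputed hash matches the committed one."""
    return bundle_hash(files) == committed_hash


# Hypothetical bundle, the same bundle in another order, and a tampered copy.
bundle = {"app.js": b"console.log(1)", "style.css": b"body{}"}
reordered = dict(reversed(list(bundle.items())))
tampered = {**bundle, "app.js": b"console.log(2)"}

committed = bundle_hash(bundle)
print(audit(reordered, committed), audit(tampered, committed))
```

Sorting the paths makes the hash independent of file-system listing order, so any mismatch points at changed content rather than at the environment.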
Utility to split long Polar H10 ECG captures into the analyzer's part files.
Every study declares its evidence base — real data, real detector, simulation, or perspective — and states the single limitation that bounds its claim. Simulation pilots run on synthetic cohorts through the actual production code, so a contract break shows up as a wrong number, not a passing test. Each paper regenerates its own tables and figures from a named local tool. Drafts, not peer-reviewed, not for clinical use.