Open lab · working preprints

The experiments behind every number.

Tepna doesn't ask you to trust a black box. Every claim has a live tool you can run and a short paper that says exactly how sure we are — and labels honestly whether it ran on real devices, real detectors, or simulation. Poke at the methods. That's the point.

real data · real detector · simulation · perspective
11 working preprints
10 runnable tools
20,000 synthetic patients stress-run
100% on-device · reproducible
01 / METROLOGY

How much can you trust a cheap sensor?

Stating a device's error when you never owned a gold-standard reference — and surviving 20k patients without a crash.

real data · real detectors · pilot complete

Three sensors, no gold standard — let them check each other

You can't calibrate against a reference you don't own. So we put the O2Ring, Polar H10 and Verity Sense in a three-cornered hat — each one's error falls out of how the trio disagrees, with none assumed to be truth. The O2Ring's pulse reads true (−0.34 bpm bias); the noisy optical leg sets a shared floor.

1.7 / 2.2 / 6.2 · reference-free σ in bpm for H10 / O2Ring / Verity, from a real ~2-hour co-recording
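For readers who want to poke at the method, here is a minimal sketch of the classic three-cornered-hat algebra. It assumes the three error streams are mutually uncorrelated, which may not match the exact estimator used in the preprint. The function names, the simulated heart rate and the noise levels (1.0, 2.0, 4.0 bpm) are all hypothetical and chosen only for illustration.

```python
import random
from statistics import variance


def three_cornered_hat(a, b, c):
    """Estimate each sensor's noise SD from pairwise differences alone.

    Assumes the three error streams are mutually uncorrelated, so that
    Var(a - b) = var_a + var_b (and likewise for the other two pairs).
    """
    v_ab = variance([x - y for x, y in zip(a, b)])
    v_ac = variance([x - z for x, z in zip(a, c)])
    v_bc = variance([y - z for y, z in zip(b, c)])
    var_a = 0.5 * (v_ab + v_ac - v_bc)
    var_b = 0.5 * (v_ab + v_bc - v_ac)
    var_c = 0.5 * (v_ac + v_bc - v_ab)
    # Sampling noise can push an estimate slightly below zero; clamp it.
    return tuple(max(v, 0.0) ** 0.5 for v in (var_a, var_b, var_c))


def simulate_trio(n, sigmas, seed=0):
    """Hypothetical test data: one shared heart rate plus independent noise."""
    rng = random.Random(seed)
    truth = [60.0 + 20.0 * rng.random() for _ in range(n)]
    return [[t + rng.gauss(0.0, s) for t in truth] for s in sigmas]


if __name__ == "__main__":
    a, b, c = simulate_trio(20000, (1.0, 2.0, 4.0), seed=1)
    print([round(s, 2) for s in three_cornered_hat(a, b, c)])
```

The true heart rate never appears in the estimator: it cancels in every pairwise difference, which is why no reference device is needed. Constant offsets cancel in the variances too, so a bias figure has to be computed separately.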
simulation · real detectors · pilot complete

How long must you wear all three to know how good each one is?

The power analysis for the trick above. A Monte-Carlo sweep shows one ~1-hour window pins the whole trio to ±0.5 bpm. The twist: sit still and you understate the fast sensors, because resting strips the shared heart-rate variability the method leans on — you have to move.

~1 hour of co-recording pins all three devices to ±0.5 bpm
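A Monte-Carlo sweep of this kind could be set up roughly as below. This is a simplified sketch, not the preprint's harness: it assumes independent white Gaussian errors and hypothetical noise levels, so it shows only how the spread of the estimate shrinks with recording length. It does not model the shared heart-rate-variability effect described above, and all names are illustrative.

```python
import random
from statistics import pstdev, variance


def hat_sigma_a(a, b, c):
    """Three-cornered-hat noise SD for the first sensor only."""
    v_ab = variance([x - y for x, y in zip(a, b)])
    v_ac = variance([x - z for x, z in zip(a, c)])
    v_bc = variance([y - z for y, z in zip(b, c)])
    return max(0.5 * (v_ab + v_ac - v_bc), 0.0) ** 0.5


def spread_of_estimate(n_samples, sigmas, trials, rng):
    """Run `trials` simulated co-recordings; return the SD of the estimates."""
    estimates = []
    for _ in range(trials):
        # Under the independence assumption the true heart rate cancels in
        # every pairwise difference, so only the error streams are simulated.
        a, b, c = ([rng.gauss(0.0, s) for _ in range(n_samples)] for s in sigmas)
        estimates.append(hat_sigma_a(a, b, c))
    return pstdev(estimates)


if __name__ == "__main__":
    rng = random.Random(42)
    for n in (60, 600, 3600):  # hypothetical sample counts per recording
        print(n, round(spread_of_estimate(n, (1.0, 2.0, 4.0), 30, rng), 3))
```

Reading off the shortest recording whose spread falls under a target precision gives the required wear time under these assumptions.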
simulation · real detectors · benchmark complete

20,000 synthetic patients vs the real pipeline. What breaks?

Nothing crashes — zero fatals, 99.7% cross-node fusion overlap. But it quietly miscounts: the worse the apnea, the more breathing events the oximeter misses, tripping the severe-recall flag in 92.8% of severe patients. The dangerous failure is the silent, severity-localized one — not a crash.

0 crashes across 20,000 patients · the one real failure is a quiet, predictable under-count
perspective · parked note

The hard part of fake physiology isn't the waveform

A parking note, not a result. Single-channel waveform synthesis is a solved problem; the open frontier our harness exposes is timestamp pathology, cross-sensor temporal coherence, and provenance — most of which need no learned generator at all. Written down so it isn't rediscovered later.

agenda · names the narrowest first paper: a deterministic timestamp-pathology benchmark
02 / SIGNAL FIDELITY

Do independent detectors agree?

When two pieces of code that share nothing read the same heartbeat, do they land on the same number? Sometimes — and the exceptions matter.

simulation · real detectors · pilot complete

When breathing stops, which signal keeps its beat?

The chest-strap ECG recovers the beat train equally well with or without apnea (99.8% either way). The optical wrist pulse fades exactly when perfusion drops, and the quality score stays green while it does, hiding the miss. That imperfect optical train inflates HRV by a median +16%.

99.8% vs 96.4% beat recovery in apnea: ECG holds, optical pulse dips
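One plausible mechanism for that inflation, sketched under assumptions: when a detector misses a beat, two intervals merge into one long one, and rMSSD (a common HRV metric) reacts strongly to the jump. The interval values below are made up, and `miss_beat` is a hypothetical helper, not part of the Tepna pipeline.

```python
def rmssd(rr_ms):
    """Root mean square of successive differences between beat intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return (sum(d * d for d in diffs) / len(diffs)) ** 0.5


def miss_beat(rr_ms, i):
    """Simulate a missed detection: intervals i and i+1 merge into one."""
    return rr_ms[:i] + [rr_ms[i] + rr_ms[i + 1]] + rr_ms[i + 2:]


if __name__ == "__main__":
    clean = [800, 810, 790, 805, 795, 800]  # made-up intervals in ms
    print(rmssd(clean), rmssd(miss_beat(clean, 2)))
```

A single merged interval dominates the squared differences, so even a small miss rate can move the metric if the quality gate does not catch it.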
simulation · real detectors · pilot complete

Three detectors, one tachogram — who agrees on HRV?

Score the same heartbeats with three independent detectors. The two electrical ones (ECG and RR) are interchangeable: bias +0.02 ms, r = 0.9997, sharing no code or sampling rate. The optical one reads +32% higher, and that gap is pulse-arrival jitter, not detector error. Report it separately; don't substitute.

r = 0.9997 · ECG ≡ RR · optical PRV diverges +12.6 ms (+32%)
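The two agreement numbers quoted here, bias and r, are standard statistics and can be computed as in the sketch below. The per-night rMSSD values are invented for illustration, and the function names are not taken from the Tepna codebase.

```python
def mean(xs):
    return sum(xs) / len(xs)


def bias(test, reference):
    """Mean paired difference (test minus reference), as in a Bland-Altman plot."""
    return mean([t - r for t, r in zip(test, reference)])


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


if __name__ == "__main__":
    ecg_rmssd = [31.0, 42.5, 28.2, 55.1, 39.9]  # made-up values, ms
    rr_rmssd = [31.1, 42.4, 28.3, 55.0, 40.0]   # made-up values, ms
    print(bias(rr_rmssd, ecg_rmssd), pearson_r(ecg_rmssd, rr_rmssd))
```

A high r alone does not show interchangeability, since two detectors can correlate perfectly while one reads consistently higher. That is why the bias is reported alongside it.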
real detector · n=5 real nights + power analysis · pilot → needs PSG

A $150 finger ring vs the sleep lab: it under-counts severe apnea

The production oximetry detector recovers only about a quarter of scored respiratory events, and misses most in the sickest patients (≈−30 events/h in severe disease). The shipped AHI ≈ ODI-4 × 1.1 rule of thumb has a 15/h error; a re-fit correction halves it. We also work out exactly how big a real validation study would need to be.

~150–300 paired PSG nights needed to publish this for real, bound by the severe stratum
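To make the rule of thumb and the idea of a re-fit concrete, here is a minimal sketch. It assumes the correction is a straight-line least-squares fit of AHI on ODI-4; the page does not specify the form of the actual re-fit. The paired values are invented, not study data.

```python
def rule_of_thumb(odi4):
    """The shipped approximation quoted above: AHI ≈ ODI-4 × 1.1."""
    return 1.1 * odi4


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ≈ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def rmse(pred, truth):
    """Root-mean-square error between predictions and reference values."""
    return (sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)) ** 0.5


if __name__ == "__main__":
    odi4 = [2, 8, 15, 30, 45]  # made-up ring ODI-4 values, events/h
    ahi = [4, 12, 25, 55, 80]  # made-up lab AHI values, events/h
    slope, intercept = fit_line(odi4, ahi)
    refit = [slope * x + intercept for x in odi4]
    rule = [rule_of_thumb(x) for x in odi4]
    print(rmse(rule, ahi), rmse(refit, ahi))
```

An in-sample fit always matches or beats the fixed rule on the nights it was fitted to, so a fair comparison needs held-out nights. That is part of why the validation study size matters.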
03 / COHORT & PHYSIOLOGY

What the signals say about people.

Reliability, confounds, treatment effects and cross-organ coupling — built on synthetic cohorts run through the real detectors, with the limitation that bounds each claim stated up front.

simulation · real detector · pilot complete

Low HRV — is it the apnea, or just getting older?

Both, about equally and additively: rMSSD falls ≈4.2 ms per decade of age and ≈2.2 ms per 10 points of apnea. A naive single-metric HRV screen flags a lot of old-but-healthy people — about a quarter of its alarms. Adjusting for age fixes most of it.

0.69 → 0.78 screening AUC for moderate+ apnea, before → after age-adjustment
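A minimal sketch of what an age adjustment could look like, assuming a simple linear correction built on the ≈4.2 ms per decade slope quoted above. The reference age and the screening cutoff are hypothetical placeholders, and the preprint's actual adjustment may differ.

```python
RMSSD_SLOPE_MS_PER_YEAR = 0.42  # from the ≈4.2 ms per decade figure above
REFERENCE_AGE = 50              # hypothetical reference age, years
THRESHOLD_MS = 25.0             # hypothetical screening cutoff, ms


def age_adjusted_rmssd(rmssd_ms, age_years, reference_age=REFERENCE_AGE):
    """Shift rMSSD to what it would be at the reference age (linear model)."""
    return rmssd_ms + RMSSD_SLOPE_MS_PER_YEAR * (age_years - reference_age)


def flags_low_hrv(rmssd_ms, threshold_ms=THRESHOLD_MS):
    """True when the value falls under the screening cutoff."""
    return rmssd_ms < threshold_ms


if __name__ == "__main__":
    raw = 22.0  # made-up rMSSD for a 70-year-old, ms
    adjusted = age_adjusted_rmssd(raw, 70)
    print(flags_low_hrv(raw), flags_low_hrv(adjusted))
```

In this toy case the raw value trips the alarm and the adjusted one does not, which is the old-but-healthy false alarm described above.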
simulation · real detector · pilot complete

How many nights before a number is really you?

Test–retest reliability of three metrics. Apnea index and HRV are stable personal traits from a single night (ICC 0.89 / 0.93). Daily glucose variability never settles down — it's a day-to-day state, not a trait, so no recording length makes it reliable as a fixed number.

1 night is enough for ODI-4 & rMSSD · CGM-CV stays a daily state (ICC ≈ 0)
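For reference, a sketch of one common ICC form: the one-way random-effects ICC(1,1), computed from between-person and within-person mean squares. The page does not say which ICC variant the preprint uses, so treat this as an assumption. The function name and input layout are illustrative.

```python
def icc_oneway(rows):
    """ICC(1,1): rows are people, each with the same number k of repeat nights."""
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    # Between-person mean square: how far each person's mean sits from the grand mean.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    # Within-person mean square: night-to-night scatter around each person's own mean.
    msw = sum((x - m) ** 2 for r, m in zip(rows, row_means) for x in r) / (
        n * (k - 1)
    )
    return (msb - msw) / (msb + (k - 1) * msw)


if __name__ == "__main__":
    trait_like = [[10, 11], [20, 19], [30, 31]]         # made-up: people differ, nights agree
    state_like = [[10, 30], [30, 10], [10, 30], [30, 10]]  # made-up: nights differ, people don't
    print(icc_oneway(trait_like), icc_oneway(state_like))
```

A value near 1 means one night already captures the person. A value near zero or below means the number mostly reflects the day, which is the pattern reported for glucose variability.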
simulation · real detector · pilot complete

Can the data alone spot the night you started CPAP?

A change-point detector on the nightly oxygen and heart-rhythm trajectory finds the planted therapy-start night with a median error of zero nights. Fusing the respiratory and autonomic channels beats either alone: exact 97% of the time, within one night 99%.

AUC 0.99 detecting a real treatment response vs flat-arc controls
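The idea can be sketched with the simplest possible detector: a single mean-shift change point found by least squares, with channels fused by standardizing each one and summing their costs. This is a stand-in technique under stated assumptions, not the preprint's detector, and the nightly values are made up.

```python
def zscore(xs):
    """Standardize a series so channels on different scales can be summed."""
    m = sum(xs) / len(xs)
    sd = (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5
    return [(x - m) / sd if sd > 0 else 0.0 for x in xs]


def sse(xs):
    """Sum of squared deviations from the segment mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def change_point(channels):
    """Index of the first night after the shift that best splits all channels."""
    scaled = [zscore(ch) for ch in channels]
    n = len(scaled[0])
    best_t, best_cost = None, float("inf")
    for t in range(1, n):
        # Cost of modelling each channel as one mean before t and another after.
        cost = sum(sse(ch[:t]) + sse(ch[t:]) for ch in scaled)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t


if __name__ == "__main__":
    odi4 = [20, 21, 19, 20, 5, 6, 4, 5]        # made-up nightly ODI-4
    rmssd = [30, 31, 29, 30, 40, 41, 39, 40]   # made-up nightly rMSSD, ms
    print(change_point([odi4, rmssd]))
```

Summing standardized costs lets a weak shift in one channel be confirmed by the other, which is one way fusion can beat either channel alone.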
simulation · real detector · pilot complete

Does overnight glucose move with your HRV?

Two sensors that share no code or input — CGM and RR→HRV — recover a coherent negative coupling: higher nocturnal glucose ↔ lower HRV. But it's not glucose hurting the heart. Partial out apnea burden and the link collapses: apnea is the shared driver of both.

18,741 co-generated nights · coupling r −0.23, vanishes when apnea is controlled
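"Partial out apnea burden" can be read as a first-order partial correlation, sketched below. The simulated data are hypothetical and built so that apnea drives both signals. They are not the co-generated cohort, and the preprint may control for apnea differently.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def partial_r(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


def simulate_confounded(n, seed=0):
    """Hypothetical nights where apnea raises glucose and lowers HRV independently."""
    rng = random.Random(seed)
    apnea = [rng.gauss(0.0, 1.0) for _ in range(n)]
    glucose = [a + rng.gauss(0.0, 1.0) for a in apnea]
    hrv = [-a + rng.gauss(0.0, 1.0) for a in apnea]
    return glucose, hrv, apnea


if __name__ == "__main__":
    glucose, hrv, apnea = simulate_confounded(5000, seed=3)
    print(round(pearson_r(glucose, hrv), 2), round(partial_r(glucose, hrv, apnea), 2))
```

In the simulation the raw correlation is clearly negative while the partial correlation sits near zero, the same signature the study reports for a shared driver.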
The lab bench

The harness everything runs on.

The generators, runners and gates behind the studies above. All deterministic, all local — they live in the source repository (clone and open them locally). Heavier runs (20k patients, full-waveform lanes) stream to keep memory flat.

Cohort Runner cohort-runner.html

Drives up to 20,000 synthetic patients through the real detectors + fusion, resumable via IndexedDB.

Synthetic Generator synth-gen.html

The frozen-seed corpus engine — SpO₂, ECG, PPG, RR, CGM with planted pathology and timestamps.

Cohort Regression cohort-regression.html

Re-runs the cohort and diffs results against frozen baselines to catch drift.

Dex Test Suite Dex-Test-Suite.html

The canonical regression gate — real modules, shared contract assertions, render-coverage.

Provenance Verifier verify-provenance.html

Recomputes each bundle's build hash and audits committed exports for reproducibility.

ECG Splitter ECG Splitter.html

Utility to split long Polar H10 ECG captures into the analyzer's part files.

How to read these

Honest labels, or it doesn't ship.

Every study declares its evidence base — real data, real detector, simulation, or perspective — and states the single limitation that bounds its claim. Simulation pilots run on synthetic cohorts through the actual production code, so a contract break shows up as a wrong number, not a passing test. Each paper regenerates its own tables and figures from a named local tool. Drafts, not peer-reviewed, not for clinical use.

© 2026 Michal Planicka · Tepna v1.0.0 · Apache-2.0 · Asheville, NC · not a medical device