Overview & Architecture
EMA Forge is an open-source toolkit for building, deploying, and analyzing Ecological Momentary Assessment (EMA) and digital-phenotyping studies. It is end-to-end serverless: your study is compiled into a static web app, participants complete it in their mobile browser, and response data is generated, stored, and downloaded entirely on the participant's device.
There are three things you'll interact with as a researcher, and one thing your participants will interact with:
You do not host the Builder yourself. Go to emaforge.keeganwhitacre.com, design your study there, click Export, and host the resulting bundle for participants. The Builder runs entirely in your browser — nothing you type gets sent to the server — so using the hosted version does not compromise privacy in any way.
No user account, no database, no backend. The entire study lifecycle — build → export → host → distribute → collect → analyze — runs without a server component maintained by you.
Why Serverless?
Traditional EMA platforms require standing up a backend server, authenticating users, and handling data-in-transit through a database that becomes a named component of every IRB protocol. The tradeoffs EMA Forge makes to avoid that:
- Zero backend infrastructure. Host on GitHub Pages, Netlify, Vercel, Cloudflare Pages, or any institutional static web server. No runtime, no DB, no SSL cert to renew.
- Data stays local until the participant hands it over. Response data is written to the participant's device as a file. You collect it via whatever channel your IRB has already approved for data return (secure email, REDCap upload, institutional SFTP, etc.).
- Offline-tolerant. Once the study app is loaded in the browser, it keeps functioning without network. Completed sessions queue locally and download when the participant taps "Save."
- Full transparency. Every line of runtime code is visible in the exported HTML. Reviewers, participants, and IRBs can inspect exactly what runs on a participant's device.
Tradeoffs to be honest about (see Known Limitations):
- There is no central "dashboard" of live enrollment — data is only visible once files are returned to the researcher.
- EMA Forge does not send notifications itself. You need an external scheduler/SMS service to deliver links at the right time. The beta Twilio dispatcher fills this gap (see Twilio Integration).
- The static host can see request metadata (IP, timestamp of page load). This is the same constraint as any web page — it's not a data transmission, but it's not nothing.
Feature Status Matrix
Keep this in front of you when planning a study. Anything marked BETA has shipped but hasn't yet been through external pilot validation; anything marked WIP or PLANNED is not yet usable.
| Feature | Status | Notes |
|---|---|---|
| Study Builder (questions, schedule, theme) | STABLE | Schema v1.5.0. |
| Single-file & static-bundle export | STABLE | Both include config.json. |
| Onboarding / consent flow | STABLE | Rich-text consent, progress bar. |
| EMA question types (slider, choice, multi, text, number) | STABLE | See Questions Tab. |
| Affect Grid (valence × arousal) | STABLE | Stored as {valence, arousal} in [−1, 1]. |
| Skip logic (compound AND/OR) | STABLE | Rules on any prior non-multi question. |
| Phase sequencing (multi-task windows) | STABLE | Arbitrary ordered EMA/Task/HR steps. |
| Conditional tasks (e.g. run ePAT only if HR > 80) | STABLE | See Conditional Tasks. |
| Session locking + crash recovery | STABLE | LocalStorage per-phase resume. |
| Dashboard (compliance, watchlist, CSV export) | STABLE | Runs locally in browser. |
| Heart-rate capture question type (PPG) | BETA | Requires rear camera + torch. iOS works; desktop no. |
| ePAT task (heartbeat perception) | BETA | Shares PPG core with HR capture. |
| Response latency in Dashboard | WIP | Needs external prompt-delivery timestamps — see Sending Prompts. |
| Twilio SMS dispatcher (Apps Script) | BETA | Generated from the Deployment tab; see Twilio Integration. Awaiting external pilot. |
| Webhook auto-upload on session complete | STABLE | Google Apps Script tested end-to-end; see Auto-upload with Webhooks. |
| IAT (Implicit Association Task) | STABLE | Mobile-optimized tap response, D-score computed in-browser. No camera required. |
| Stroop / additional task modules | PLANNED | The task system is built to be extended. |
System Requirements
For you (researcher)
- Any modern desktop browser. Chrome, Edge, Firefox, or Safari (from 2022 onward). The Builder and Dashboard run entirely in your browser at emaforge.keeganwhitacre.com.
- A way to host your exported study. GitHub Pages is the path of least resistance and free for researchers. Any institutional static-file server works too. (See Hosting Your Study — it is genuinely ten clicks.)
- A way to send participants links at the right times — an SMS service, institutional email scheduler, Twilio, or similar. EMA Forge generates the links; something else delivers them.
- R or Python (optional). The Dashboard handles most day-to-day compliance monitoring in the browser. You'll want R or Python for real analysis.
For participants
- iOS Safari 14.5+ or Chrome for Android 90+ (roughly anything from 2021 onward).
- For the HR / ePAT modules only: a rear-facing camera with a controllable torch (flashlight). All modern iPhones and most Android devices qualify; tablets without rear cameras do not. Desktop browsers will refuse to launch these modules.
- A working browser link. That's it — no app install, no account.
iOS quirk worth flagging during onboarding: Camera/torch access on iOS requires the page to be launched from Safari (not from inside the Messages app preview or from an SMS link-preview bubble). Your participant instructions should explicitly say "tap the link to open it in Safari" for studies using HR or ePAT.
Quick Start — 5 minutes to a working pilot
You do not need to install anything. The Builder and Dashboard are hosted and ready to use.
- Go to emaforge.keeganwhitacre.com and click Builder.
- In the Study tab, name your study.
- In Questions, click + Slider and add one mood item. Keep the default anchors.
- In Schedule, set study length to 3 days and keep the default morning/afternoon/evening windows.
- Click Export → Single HTML file. You'll get a file you can email to yourself.
- Open that file on your phone with ?id=1&day=1&session=w1 appended. You're running a study.
- When you finish the session, your phone downloads a CSV. Go back to emaforge.keeganwhitacre.com, open the Dashboard, import the folder, and you'll see your own compliance data.
That round trip tells you whether your device can open sessions and whether your data is landing where you expect. Always do this before enrolling a participant.
Please use the hosted version at emaforge.keeganwhitacre.com rather than cloning and self-hosting the Builder. The hosted site is actively maintained and always on the current schema version. Everything happens in your browser — nothing you type into the Builder is transmitted to the server, so "hosted" here just means "the files are served from one canonical place." You'll still self-host the exported study (see Hosting Your Study), because that's the file participants actually open.
Study Tab
Global study configuration. Set here:
- Study name & institution — appears in participant app title and CSV metadata.
- Theme & accent color — OLED (pure black, recommended for battery), dark, or light. Accent color inherits into the participant app.
- Response format — CSV (one row per answered question) or JSON (full structured payload). Both are downloaded; JSON is always included when a session contains high-density signal data (PPG/ePAT).
- Webhook URL (optional) — if set, session data is auto-uploaded to this URL instead of being manually downloaded by the participant. See Auto-upload with Webhooks for setup.
- Greetings — per-window header text ("Good Morning", "Check-In", etc.).
- Completion lock & crash recovery — see Runtime Behavior.
Onboarding & Consent
The Onboarding flow runs exactly once per participant, on their first link (?id=N&session=onboarding, Day 0). It covers:
- A progress-barred multi-screen intro.
- A rich-text consent screen that requires scroll-to-bottom before the "I agree" checkbox activates. Consent text accepts HTML so you can paste in your IRB-approved language with headings.
- An optional schedule-sanity screen letting the participant confirm they'll be available during each window.
- For studies with ePAT enabled: a two-phase practice (tone-to-tone, then tone-to-heartbeat) before the real trials start.
If onboarding is disabled, Day 0 is skipped entirely and participants land directly in their first EMA window.
Questions Tab
The Questions tab is where you write your survey items. Each question appears as a card — click to expand it, drag to reorder. Changes show up live in the phone preview as you type.
Question types
| Type | Stored as | Notes |
|---|---|---|
| slider | number | Min, max, step, unit, two anchor labels. |
| choice | string | Single-select from an option list. |
| checkbox | array of strings | Multi-select. Serialized as a;b;c in CSV. |
| text | string | Free-form text input. |
| numeric | number | Number input with keypad on mobile. |
| affect_grid | {valence, arousal} | 2D tap target. Both axes in [−1, 1]. Serialized as valence;arousal in CSV. |
| heart_rate | {bpm, sqi, ibi_series} | Camera PPG for a configurable duration. The BPM value is what participates in skip logic and conditional tasks. The "Show BPM to participant" toggle controls whether the number is rendered on screen; turn it off before HCT or any other task that could be biased by knowing the captured HR. See HR Capture. |
| page_break | n/a | Splits questions onto separate screens. Not a question per se. |
Per-question settings
- Show In (Phase): whether this question appears pre-task, post-task, or both within a phase-sequenced window.
- Active Sessions: restrict the question to specific time windows (e.g. only morning).
- Required: blocks advancement until answered.
- Skip Logic: compound AND/OR conditions on any prior question. Operators: eq, neq, gt, gte, lt, lte, includes. When a condition evaluates false, the question is silently skipped and not recorded.
- Response piping: insert {{q_id}} in a later question's text to substitute the participant's prior answer. Useful for contextualizing follow-ups ("You said your mood was {{q1}}. What was the cause?").
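Response piping is plain placeholder substitution, so you can dry-run your question text outside the Builder. The sketch below is an illustration, not the runtime's code: the function name pipe_responses is hypothetical, and leaving an unanswered placeholder untouched is an assumption about how you might want to spot missing references.

```python
import re

# Matches {{q1}}, {{ q_hr_1 }}, and similar placeholders.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def pipe_responses(text, responses):
    """Substitute {{question_id}} placeholders with prior answers.

    `responses` maps question IDs to raw values. Placeholders that point
    at an unanswered question are left as-is (an assumption, chosen so
    broken references stay visible during QA).
    """
    def substitute(match):
        question_id = match.group(1)
        if question_id in responses:
            return str(responses[question_id])
        return match.group(0)

    return PLACEHOLDER.sub(substitute, text)


piped = pipe_responses(
    "You said your mood was {{q1}}. What was the cause?", {"q1": 72}
)
```

Running your full question list through a helper like this before export is a cheap way to catch a placeholder that references a question shown later in the session.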
Skip logic semantics — edge cases worth knowing
Conditions reference the raw response value, which means:
- For sliders, comparisons are numeric. q1 gt 70 works as expected.
- For single-choice, comparisons are string equality against the option label (not index).
- For checkbox (multi-select), comparisons use includes. eq on a multi-select will nearly always be false because the stored value is an array.
- For Affect Grid, skip-logic is disabled in the UI (the value is a compound object, not a scalar).
- For Heart Rate, comparisons hit the bpm field directly, so q_hr_1 gt 80 does what you'd expect.
- If a question referenced by a condition was itself skipped or not-yet-presented, the rule evaluates false. Design accordingly.
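The edge cases above are easier to reason about with a small evaluator in hand. This is a minimal sketch of the documented semantics, not the runtime's implementation: the rule shape ({"mode": ..., "conditions": [...]}) and the function names are hypothetical, and only the operators and the bpm unwrapping come from the text.

```python
import operator

OPERATORS = {
    "eq": operator.eq,
    "neq": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    # includes only makes sense for multi-select (list) responses.
    "includes": lambda stored, target: isinstance(stored, list) and target in stored,
}


def comparable(value):
    """Heart-rate responses are objects; rules compare against bpm."""
    if isinstance(value, dict) and "bpm" in value:
        return value["bpm"]
    return value


def evaluate_condition(condition, responses):
    """Return False when the referenced question was skipped or not shown."""
    question_id = condition["question"]
    if question_id not in responses:
        return False
    stored = comparable(responses[question_id])
    try:
        return bool(OPERATORS[condition["op"]](stored, condition["value"]))
    except TypeError:
        # e.g. an ordering comparison between a list and a number
        return False


def evaluate_rule(rule, responses):
    """Combine conditions with AND or OR (hypothetical rule shape)."""
    results = [evaluate_condition(c, responses) for c in rule["conditions"]]
    return all(results) if rule["mode"] == "AND" else any(results)
```

Note how eq against a checkbox answer compares a list with a string and therefore comes out false, which is the behaviour the checkbox bullet warns about.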
Schedule & Phase Sequencing
A study consists of N days, each day containing K time windows, each window containing an ordered phase sequence of steps.
Days & windows
- Study length: 1–365 days.
- Days of week: restrict to weekdays only, etc.
- Windows: each has an ID, a display label, and a start/end time. The ID (e.g. morning) is what becomes the session= URL parameter.
Phase sequence
Inside a window, you build an ordered list of steps. A step is one of:
| Step kind | Meaning |
|---|---|
| ema (pre) | The pre-task EMA block: all questions whose "Show In" = Pre or Both. |
| ema (post) | The post-task EMA block: all questions whose "Show In" = Post or Both. |
| task | A task module (currently: epat). Optionally gated by a condition. |
| hr | A standalone heart-rate capture step with configurable duration. |
Steps run top-to-bottom. The same task can appear multiple times; you can have Pre-EMA → HR → ePAT → Post-EMA, or two tasks sandwiched between three EMA blocks — whatever the protocol calls for.
Response window timing: the expiry_minutes setting is enforced at runtime
by comparing against a t= URL parameter (a millisecond timestamp added by your
SMS/scheduler). If t is absent, links never expire — convenient for piloting, but turn this
on before real enrollment.
Tasks Tab
The Tasks tab is where you turn optional task modules on or off. Toggling a task makes it available to drop into your schedule — it doesn't insert it anywhere automatically; you still choose where in the session sequence it runs (see Schedule).
Currently available: ePAT, HCT, and IAT. No camera is required for the IAT — it is the only task module that works on any smartphone without hardware constraints.
Live Preview
The right-hand iframe in the Builder is a fully functional participant app, re-stitched on every edit. It
runs in preview mode (__PREVIEW_MODE__ = true), which disables link expiry and
session locking so you can walk through the same session repeatedly. Clicking Reset clears
the preview's local storage without touching your real project.
Preview is the fastest way to catch issues before you export anything. Touch-test it on your own phone by opening the Builder's URL on your device — the preview iframe works over any local network you're on.
Export Options
| Option | What you get | When to use |
|---|---|---|
| Single HTML file | One self-contained .html with config, JS, and CSS inlined. | Pilots, QA, or emailing a direct file to one participant. Config is baked in, so any edit means re-exporting. |
| Static-hosting bundle (RECOMMENDED) | A .zip containing index.html, config.json, css/, and js/. | Real deployment. Because config.json is separate, you can fix a typo in a question without re-exporting. Just edit the JSON on the host. |
Hosting Your Study
This section is about hosting the exported study bundle — the thing participants will open. You are not hosting the Builder itself (that already lives at emaforge.keeganwhitacre.com).
The simplest option for most academic labs is GitHub Pages. It is free, reliable, and requires no command-line work. The minimal path:
- Create a free GitHub account if you don't have one.
- Create a new repository (e.g. my-ema-study). Make it public: on free GitHub accounts, Pages publishes only from public repositories (private repositories need a paid plan).
- Click Add file → Upload files, and drop in the entire contents of your extracted Export zip (index.html, config.json, and the css/ and js/ folders).
- Go to Settings → Pages. Under "Source," choose main branch, root folder.
- Wait a minute. Your study is now live at https://<your-username>.github.io/my-ema-study/.
That URL is the base you give to the Builder's Deployment tab to generate participant links. You're done.
Prefer the command line, or deploying via another host?
Any static host works unchanged — Netlify, Vercel, Cloudflare Pages, S3 + CloudFront, or an Apache/Nginx directory on institutional infrastructure. No build step, no server runtime.
The git-native GitHub Pages recipe:
# From your extracted bundle folder
git init
git add .
git commit -m "study v1"
git branch -M main
git remote add origin https://github.com/your-lab/your-study.git
git push -u origin main
# Then in GitHub: Settings → Pages → Source = main branch, root
# Your study is live at https://your-lab.github.io/your-study/
Participant Routing
Because there is no backend, the participant's session is entirely determined by the URL they open. Five query parameters drive this:
| Parameter | Required | Description |
|---|---|---|
| id | yes | Participant identifier (any string, usually numeric). |
| day | yes (except onboarding) | Study day, typically 1–N. |
| session | yes | Window ID defined in the Schedule tab, or the literal string onboarding for Day 0. |
| t | recommended | Millisecond Unix timestamp of when the link was sent. Used for expiry enforcement. Your scheduler should inject this at send-time. |
| force | no | When set to 1, overrides session locking. For researcher use only (QA, rescues). Do not put this in participant links. |
https://your-lab.github.io/study/?id=104&day=2&session=morning&t=1732812000000
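If you want to sanity-check links before they go out, the routing rules in the table can be sketched as a small parser. This is an illustration under assumptions, not the runtime's parser: parse_link and the returned dictionary shape are hypothetical, and only the parameter names and requirements come from the table.

```python
from urllib.parse import parse_qs, urlsplit


def parse_link(url):
    """Extract and validate the routing parameters from a study link."""
    query = parse_qs(urlsplit(url).query)

    def first(key):
        return query[key][0] if key in query else None

    params = {
        "id": first("id"),
        "day": first("day"),
        "session": first("session"),
        "t": first("t"),
        "force": first("force") == "1",
    }
    if params["id"] is None or params["session"] is None:
        raise ValueError("id and session are required")
    if params["session"] != "onboarding" and params["day"] is None:
        raise ValueError("day is required for every session except onboarding")
    if params["day"] is not None:
        params["day"] = int(params["day"])
    if params["t"] is not None:
        params["t"] = int(params["t"])  # millisecond Unix timestamp
    return params


example = parse_link(
    "https://your-lab.github.io/study/?id=104&day=2&session=morning&t=1732812000000"
)
```

A check like this is also a convenient place to reject any participant-facing link that carries force=1.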
Generating Links in Bulk
The Builder's Deployment tab takes a base URL and a participant-ID range, and emits a CSV
with one row per (participant × day × session). Each row includes a Phase_Sequence column so
you can eyeball at a glance what each link will do.
Participant_ID,Day,Session,Phase_Sequence,URL
104,0,Setup,Onboarding,https://.../?id=104&session=onboarding
104,1,Morning,Pre-EMA → ePAT → Post-EMA,https://.../?id=104&day=1&session=w1
104,1,Evening,Pre-EMA,https://.../?id=104&day=1&session=w2
...
This CSV is the glue between EMA Forge and your delivery mechanism of choice — drop it into a Qualtrics
contacts list, a Twilio scheduled-SMS campaign, an institutional email merge, or a lab-built scheduler. Your
scheduler is responsible for appending a t= timestamp at send-time if you want expiry
enforcement.
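For a bring-your-own scheduler, the send-time step amounts to reading a row of the routing CSV and stamping its URL with t=. A minimal sketch, assuming the column names shown in the sample above; stamp_link, the example.org base URL, and the inline CSV text are placeholders for illustration.

```python
import csv
import io
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def stamp_link(url, sent_at_ms=None):
    """Return `url` with a t= millisecond timestamp, replacing any old one."""
    if sent_at_ms is None:
        sent_at_ms = int(time.time() * 1000)
    parts = urlsplit(url)
    query = [(key, value) for key, value in parse_qsl(parts.query) if key != "t"]
    query.append(("t", str(sent_at_ms)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder routing rows; in practice, open the CSV exported by the Deployment tab.
ROUTING_CSV = """Participant_ID,Day,Session,Phase_Sequence,URL
104,1,Morning,Pre-EMA,https://example.org/study/?id=104&day=1&session=w1
104,1,Evening,Pre-EMA,https://example.org/study/?id=104&day=1&session=w2
"""

rows = list(csv.DictReader(io.StringIO(ROUTING_CSV)))
stamped = [stamp_link(row["URL"], sent_at_ms=1732812000000) for row in rows]
```

Stamp at the moment of sending, not when you prepare the batch; otherwise the expiry clock starts before the participant ever receives the message.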
Sending Prompts
You have two paths. Both use the same routing CSV as their source of truth.
- Bring your own mechanism. Drop the routing CSV into an institutional email merge, a Qualtrics contacts list, a Power Automate flow, or any SMS tool that can schedule messages from a CSV. You handle timing and t= injection.
- Export the Twilio dispatcher (BETA). Generates an Apps Script that runs on a time trigger in a Google Sheet, handles timezone math, dedupes sends, and enforces t= expiry. See Twilio Integration for full setup.
Session Lifecycle
When a participant opens a link, the runtime:
- Parses URL parameters.
- Checks expiry (if t and expiry_minutes are both set).
- Checks the completion lock: has this (id, day, session) tuple already been submitted on this device?
- Checks for an in-progress resume state for this session.
- Renders the first step of the session (usually a pre-task EMA block).
- On each phase completion, writes the phase's responses to local storage and advances.
- When the last phase finishes, the full session payload is assembled and the "Save Local Copy" screen appears. Tapping it triggers a file download.
Expiry & Grace Period
- Expiry window (default 60 min): after this many minutes past t, the link renders a "Link Expired" screen and the Start button is disabled. Recorded as a missed ping by the Dashboard.
- Grace period: a secondary buffer primarily used for downstream compliance calculations; the link remains functional during grace, but responses inside the grace window can be flagged during analysis.
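The expiry rule reduces to one comparison, which is handy to replicate when you audit missed pings in your own analysis. A sketch under assumptions: link_status is a hypothetical name, and the grace period is left out because its exact runtime handling is not specified here.

```python
def link_status(sent_at_ms, now_ms, expiry_minutes=60):
    """Classify a link as "open" or "expired".

    sent_at_ms is the t= value from the URL (None when absent);
    links without t, or studies without an expiry window, never expire.
    """
    if sent_at_ms is None or expiry_minutes is None:
        return "open"
    elapsed_minutes = (now_ms - sent_at_ms) / 60_000
    return "expired" if elapsed_minutes > expiry_minutes else "open"


sent = 1732812000000
status_after_45_min = link_status(sent, sent + 45 * 60_000)
status_after_61_min = link_status(sent, sent + 61 * 60_000)
```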
Session Locking
By default, a participant who completes a session and then re-clicks the same link sees a "Session Complete — thanks, come back at the next prompt" screen. This prevents accidental double-submissions.
Researchers can override with ?force=1 appended to the URL (useful for QA walkthroughs or for
rescuing a participant whose submission didn't land). Do not include force in
participant-facing links.
Crash Recovery
If a participant's browser crashes mid-session (tab killed by iOS memory pressure, phone restart, app switch past timeout), reopening the same link will:
- Restore all completed phases (pre-EMA, finished task trials, etc.) — that data survives.
- Restart the current phase from the beginning. Partial data within the phase is discarded to avoid ambiguity.
This is intentionally conservative. A half-finished phase with an unknown number of missing questions is worse than a clean re-run.
Schema Versioning & Migrations
Every study carries a schema version (currently 1.5.0). When the Builder
loads a saved study — whether from your browser's local storage or a backup file you imported — it checks
that version:
- Equal → load as-is.
- Older → forward-migrate silently (new fields get sensible defaults).
- Newer → refuse to load. A newer schema might include fields this runtime doesn't know how to preserve; loading would silently corrupt them. Export a backup from the newer Builder or reset.
This is worth knowing when rolling out an EMA Forge update mid-study: finish in-progress waves on the old version, then upgrade.
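The three-way check above can be sketched as follows. This is an illustration rather than the Builder's code; load_decision and the returned labels are hypothetical. The one detail worth copying is comparing version parts as integers, since a string comparison would rank 1.10.0 below 1.5.0.

```python
def parse_version(version):
    """Turn "1.5.0" into (1, 5, 0) so parts compare numerically."""
    return tuple(int(part) for part in version.split("."))


def load_decision(saved_version, builder_version="1.5.0"):
    """Decide what to do with a saved study given its schema version."""
    saved, current = parse_version(saved_version), parse_version(builder_version)
    if saved == current:
        return "load"
    if saved < current:
        return "migrate"  # forward-migrate, filling new fields with defaults
    return "refuse"  # saved by a newer Builder; loading could drop fields


decisions = [load_decision(v) for v in ("1.5.0", "1.4.2", "1.10.0")]
```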
What Happens at Session End
When the last phase completes, one of two things happens depending on whether you've configured a webhook (Study tab → Webhook URL):
Manual return (default)
The participant sees a "Save Local Copy" button. Tapping it downloads one or two files to their device:
- ema_data_[ID]_[DAY]_[SESSION].csv: long-format, one row per question answered. Default output when the session contains only standard EMA responses.
- ema_data_[ID]_[DAY]_[SESSION].json: full structured payload, included automatically whenever the session contains high-density signal data (PPG samples, ePAT trial-level data). Written alongside the CSV, not instead of it.
The participant returns these files via whatever channel your IRB approved (secure email, REDCap file upload, institutional SFTP). The Dashboard expects a folder of JSON files for analysis; a CSV-only workflow is also viable for simpler studies.
Webhook return (opt-in)
If a Webhook URL is set, the same data POSTs silently to your endpoint and the participant sees "✓ Data uploaded successfully" instead of a download button. Setup, payload shape, and concrete Google Sheets / Microsoft recipes are in Auto-upload with Webhooks. If the POST fails (offline participant, endpoint down), the "Save Local Copy" button automatically reappears — data is never lost silently.
CSV Schema (Dashboard export)
When you use the Export Master CSV button in the Dashboard, you get one row per
(session, question) pair in long format. The columns:
| Column | Description |
|---|---|
| participant_id | From the URL ?id=. |
| day | From the URL ?day=. |
| session_id | Unique per-session identifier generated by the runtime. |
| window_id | Matches the Schedule-tab window ID (e.g. morning). |
| window_label | Human-readable window label. |
| block | pre, post, or blank. Which EMA block within the phase sequence the question belonged to. |
| session_started_at | ISO-8601 timestamp of session start. |
| session_submitted_at | ISO-8601 timestamp of final session save. |
| phase_started_at | ISO-8601 timestamp of when the block containing this question began. |
| phase_submitted_at | ISO-8601 timestamp of when the block was submitted. |
| question_id | Stable ID (e.g. q1, q_hr_1). |
| question_text | The question text at the time of export. |
| question_type | One of the types listed in Questions Tab. |
| presentation_order | 1-indexed order the question was shown in, accounting for skip logic. |
| response_value | The raw response, serialized (checkbox as a;b;c, Affect Grid as valence;arousal). |
| response_numeric | Numeric cast of the response if possible; blank otherwise. Convenient for sliders. |
| response_latency_ms | Milliseconds from phase start to this response. |
JSON Schema (per-session)
The raw JSON payload is what the Dashboard reads and is the source of truth. Shape (abbreviated):
{
"participantId": "104",
"sessionId": "s_a1b2c3",
"day": 2,
"type": "morning",
"status": "complete",
"startedAt": "2026-04-22T08:03:11Z",
"completedAt": "2026-04-22T08:06:48Z",
"data": [
{
"type": "ema_response",
"block": "pre",
"windowId": "w1",
"startedAt": "...",
"submittedAt": "...",
"presentationOrder": [["q1", "q2"], ["q3"]],
"responses": {
"q1": { "value": 72, "respondedAt": "..." },
"q2": { "value": "Working", "respondedAt": "..." },
"q3": { "value": {"valence": 0.4,
"arousal": -0.2}, "respondedAt": "..." },
"q_hr_1":{ "value": { "bpm": 74,
"sqi": 0.82,
"ibi_series": [810, 795, ...] },
"respondedAt": "..." }
}
},
{
"type": "epat_response",
"trials": [ { "trial": 1, "phase_ms": 342, "confidence": 3, "sqi": 0.91 }, ... ],
"summary": { "valid_trials": 18, "mean_abs_phase_ms": 218, "..." }
}
]
}
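If you work in Python rather than R, a per-session payload of this shape can be flattened to long-format rows with the standard library alone. A sketch under assumptions: flatten_session and serialize are hypothetical names, only a subset of the master-CSV columns is produced, and objects other than the Affect Grid are kept as JSON text because their CSV serialization is not documented above.

```python
import json


def serialize(value):
    """Mirror the documented CSV encodings for lists and Affect Grid values."""
    if isinstance(value, list):
        return ";".join(str(item) for item in value)
    if isinstance(value, dict) and {"valence", "arousal"} <= value.keys():
        return f"{value['valence']};{value['arousal']}"
    if isinstance(value, dict):
        # Assumption: heart_rate and other objects are kept as JSON text.
        return json.dumps(value)
    return str(value)


def flatten_session(payload):
    """Return one row per answered question in the EMA blocks of a session."""
    rows = []
    for block in payload["data"]:
        if block.get("type") != "ema_response":
            continue  # task payloads (e.g. epat_response) are handled separately
        for question_id, response in block["responses"].items():
            rows.append({
                "participant_id": payload["participantId"],
                "day": payload["day"],
                "window_id": block.get("windowId"),
                "block": block.get("block"),
                "question_id": question_id,
                "response_value": serialize(response["value"]),
            })
    return rows
```

Feed it the result of json.load on each returned file and pass the combined rows to csv.DictWriter or a data frame of your choice.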
Analysis in R
Because the Dashboard's CSV export is long-format, getting from a folder of per-session files to a tidy dataset is short. A typical starter pipeline:
library(tidyverse)
library(jsonlite)
# 1. Ingest a folder of per-session JSON files
files <- list.files("data/", pattern = "\\.json$", full.names = TRUE)
sessions <- map(files, ~ fromJSON(.x, simplifyVector = FALSE))
# 2. Flatten to long format (or just use the Dashboard's master CSV)
df <- read_csv("ema_master_dataset_2026-04-22.csv")
# 3. Compliance per participant
compliance <- df |>
distinct(participant_id, day, session_id) |>
count(participant_id) |>
rename(sessions_completed = n)
# 4. Mean mood by time-of-day, handling the long format
mood <- df |>
filter(question_id == "q1") |>
group_by(participant_id, window_label) |>
summarize(mean_mood = mean(response_numeric, na.rm = TRUE),
sd_mood = sd(response_numeric, na.rm = TRUE),
n = n(),
.groups = "drop")
# 5. Affect Grid (stored as "valence;arousal" in response_value)
affect <- df |>
filter(question_type == "affect_grid") |>
separate(response_value, into = c("valence", "arousal"),
sep = ";", convert = TRUE)
Dashboard
dashboard.html is a local analysis UI. Everything parses in your browser; no data is
transmitted anywhere.
What to feed it
The Dashboard accepts two input shapes. You can mix them in a single import — the parser sorts out which files are which.
- A folder of JSON files. One per-session .json as produced by the participant app, plus a single config.json (the same one inside your exported study). This is the right shape when participants return files manually, or when you've downloaded them out of REDCap/institutional storage.
- A CSV from the Google Sheets webhook (BETA). If you're using the Google Apps Script webhook recipe, your sheet has a Raw JSON column holding each session's payload. Export the sheet as .csv (File → Download → CSV) and import it alongside your config.json. The parser reads each row's Raw JSON cell as a full session payload.
Config caching. After your first successful import the Dashboard caches your
config.json in browser storage. On subsequent visits you only need to drop in the data
file(s) — the config is remembered. Clearing site data resets this.
Once imported, the Dashboard renders:
KPIs
- Avg. Compliance Rate — completed sessions / expected sessions, scoped by filters.
- Total Pings Delivered — expected sessions up to the current study day.
- Signal Noise / Invalid % — proportion of sessions flagged as "speeding" (total session duration < 30 seconds). A rough heuristic; tune in your own analysis.
- Avg. Time to Complete — mean session duration.
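To reproduce or tune these KPIs in your own analysis, the arithmetic is short. A sketch under assumptions: it works on per-session payloads with the startedAt, completedAt, and status fields from the JSON schema, the kpis function name is hypothetical, and the expected-session count is something you supply from your schedule.

```python
from datetime import datetime


def session_seconds(session):
    """Session duration in seconds from the ISO-8601 timestamps."""
    def parse(stamp):
        # Older Python versions do not accept a trailing "Z".
        return datetime.fromisoformat(stamp.replace("Z", "+00:00"))
    return (parse(session["completedAt"]) - parse(session["startedAt"])).total_seconds()


def kpis(sessions, expected_sessions, speeding_threshold_s=30):
    """Compliance rate, share of "speeding" sessions, and mean duration."""
    completed = [s for s in sessions if s.get("status") == "complete"]
    durations = [session_seconds(s) for s in completed]
    speeding = sum(1 for d in durations if d < speeding_threshold_s)
    return {
        "compliance_rate": len(completed) / expected_sessions if expected_sessions else 0.0,
        "invalid_pct": speeding / len(completed) if completed else 0.0,
        "avg_seconds": sum(durations) / len(durations) if durations else 0.0,
    }
```

The 30-second default matches the Dashboard's heuristic; pass a different speeding_threshold_s if your sessions are unusually short or long.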
Filters & views
- Aggregate vs. Per-Participant segmented view.
- Date range (by study day).
- Exclude missed prompts toggle (affects denominator of compliance).
- Filter rapid responders (<30s) toggle.
- Attention Required watchlist — participants whose compliance is below threshold, ready for a check-in text.
- Export Master CSV — flattens all imported sessions into one long-format file (schema above).
Latency is currently a placeholder. The Dashboard has fields for "latency" (time from prompt delivery to link open), but with no server, the runtime cannot know when the link was sent. This field is zero-filled until the Twilio/webhook integration lands, which will give the parser an external prompt-delivery timestamp to merge in.
Heart-Rate Capture (Camera PPG)
EMA Forge ships a lightweight photoplethysmography (PPG) pipeline that recovers heart rate from the participant's rear-facing smartphone camera, with the torch on as an illumination source. It powers two user-facing features:
- A heart_rate question type you can drop into an EMA block anywhere.
- A standalone HR step in a window's phase sequence (identical capture, structural placement differs).
How the PPG pipeline works
The runtime's ePATCore module handles PPG end-to-end. The current pipeline is the v5 architecture; an engineering-level deep-dive of what changed since v4 is in the v5 architecture section below. In brief:
- Camera acquisition. getUserMedia requests the rear camera at 30 fps (with 60 fps and "ideal 30" fallbacks) and the torch is engaged via applyConstraints({ torch: true }). A hidden <video> element receives the stream; cameras labeled "front", "facetime", "dual", "triple", "ultra", or "tele" are deprioritised so the chosen camera is the standard rear sensor.
- Sampling. A hidden <canvas> draws each video frame and averages pixel intensity over a central region of interest. Both the red and green channels are extracted and tracked separately: the red channel dominates with the torch on, but the green channel is kept available as a backup. Sampling is locked to the camera's actual paint rate (typically 30 Hz); a variable-frame-rate deduplication step prevents the 60/120 Hz display refresh from being mistaken for new data.
- Filtering. Each channel is bandpass-filtered with a pair of cascaded 2nd-order Butterworth biquad filters: a high-pass at 0.67 Hz to remove respiratory drift and finger micro-shifts, and a low-pass at 3.33 Hz to remove high-frequency sensor noise. The pass band (0.67–3.33 Hz, i.e. 40–200 BPM) covers the cardiac range. The biquads use Butterworth Q (1/√2) and standard digital biquad coefficients computed at the actual sample rate.
- Signal-quality index (SQI). SQI is computed every 1 s on a 2 s rolling window of the filtered signal as the peak-to-peak amplitude in the cardiac band. This directly measures the AC pulse amplitude that beat detection actually requires, and is scale-comparable across the red and green channels. The runtime exposes per-channel SQI plus the active channel via onSqiUpdateCb(sqiRed, sqiGreen, activeChannel). The "Sensor Warning" overlay enforces the floor: if SQI on the active channel drops below the warning threshold, the warning reappears and the participant is prompted to reposition their finger.
- Channel selection and failover. Both channels run a full filter + beat-detector pipeline continuously. After a 4.5 s settling window following finger detection (which lets the high-pass transient ring down; its time constant is ~0.24 s, so 5τ ≈ 1.2 s, leaving comfortable margin), per-channel baseline amplitudes are locked. The active channel is initially red. If the active channel's amplitude drops below 40% of its baseline AND the backup channel's amplitude exceeds 25% of its own baseline, sustained for two consecutive 1 s checks (anti-thrash), the runtime switches to the backup. Because both detectors are always running, the new active channel has already learned its thresholds, so there is no cold-start blackout on switch. This makes the system robust to drift in finger pressure, ambient lighting changes, and skin tone differences that affect torch saturation differently across channels.
- Beat detection. Each channel feeds an independent instance of a JavaScript port of the WABP (Waveform Analysis for Blood Pressure) algorithm (Zong, Heldt, Moody & Mark, 2003), adapted from arterial pressure to camera PPG. WABP is well-suited because both signals share an upstroke-dominant morphology. After a candidate onset is detected, a quadratic interpolation step refines the peak position in the slope-energy curve to sub-frame resolution. The camera is still 30 Hz, but onset times are no longer quantised to ~33 ms steps: IBIs are computed at finer resolution, which matters substantially for HRV-style analyses where ~30 ms quantisation noise is large relative to physiological variability.
- Dicrotic-notch rejection. PPG morphology contains a secondary peak (the dicrotic notch) per cardiac cycle that can be falsely detected as a beat, inflating HR by up to a factor of two when it happens systematically. The runtime maintains a running median of recent inter-beat intervals; candidate beats whose interval is less than 60% of the expected period are rejected as dicrotic. Rejection counts are exposed in diagnostics.dicroticRejects for quality auditing.
- Output. Each accepted beat is emitted via onBeatCb({ instantBPM, averageBPM, instantPeriod, averagePeriod, time }). averageBPM is the median of the last 10 inter-beat periods; this median is used in preference to the running mean because it is robust to the occasional missed or extra beat that any peak detector will produce in real-world capture conditions. For HR-question outputs, BPM is reported as 60 000 / median(IBIs) over the capture window. The full IBI series is stored alongside the BPM value for downstream HRV analysis, plus the mean SQI as a quality annotation.
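The last two steps (dicrotic rejection and the median-based BPM) are simple enough to re-implement when auditing a returned ibi_series. The sketch below follows the rules as described, but it is not the runtime's code: the function names are hypothetical, and the 10-interval history used for the running median is an assumption borrowed from the averageBPM description.

```python
from statistics import median


def reject_dicrotic(beat_times_ms, min_fraction=0.6, history=10):
    """Drop candidate beats that arrive too soon after the last accepted beat.

    A candidate is rejected when its interval is below min_fraction of the
    running median of recent accepted inter-beat intervals.
    Returns (accepted_beat_times, inter_beat_intervals).
    """
    if not beat_times_ms:
        return [], []
    accepted = [beat_times_ms[0]]
    ibis = []
    for beat_time in beat_times_ms[1:]:
        interval = beat_time - accepted[-1]
        expected = median(ibis[-history:]) if ibis else None
        if expected is not None and interval < min_fraction * expected:
            continue  # likely a dicrotic peak, not a new cardiac cycle
        accepted.append(beat_time)
        ibis.append(interval)
    return accepted, ibis


def bpm_from_ibis(ibis_ms):
    """BPM over a capture window: 60 000 / median(IBIs)."""
    return 60_000 / median(ibis_ms)


# Candidate at 1900 ms follows the previous beat by only 300 ms.
candidates = [0, 800, 1600, 1900, 2400, 3200]
beats, ibis = reject_dicrotic(candidates)
bpm = bpm_from_ibis(ibis)
```

Using the median here means a single missed beat (one doubled interval) barely moves the reported BPM, which is the robustness argument made in the Output step.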
This is not a medical-grade measurement. For research where the construct is "the participant's approximate HR at this moment under ecological conditions", the technique is well-established in the digital biomarker literature, and the pipeline above is materially closer to PhysioNet-grade processing than the typical smartphone-PPG demo.
v5 architecture: dual-channel + signal-quality (engineering deep-dive)
This section documents what changed in the signal pipeline between v4 and the current v5 architecture, for researchers replicating the methods or auditing the code. None of this is necessary to use EMA Forge — it's here so reviewers and replicators have an accurate target.
Why dual-channel
A single-channel PPG pipeline has a fundamental fragility: the channel's signal-to-noise depends on the interaction of skin tone, finger pressure, ambient light, and torch brightness in ways that drift across a session. v4 used the red channel exclusively; in practice this worked well in most cases but produced occasional sessions where the red channel saturated (torch + light skin + heavy pressure) or under-illuminated (torch + dark skin + light contact), and the entire session became unscoreable. v5 keeps red as the default but tracks the green channel in parallel and switches if conditions warrant. Importantly, the green pipeline is never "off" — both channels run the full filter and beat detector continuously, so the backup channel has already adapted its thresholds at the moment a switch fires.
Why the SQI metric changed
v4 computed SQI as a perfusion index, (max − min) / mean, on the raw camera channel. This had two pathologies:
- During finger placement, the raw signal ramps from ambient (~0.10) to finger-on (~0.40) inside the SQI window. (max − min) / mean over that ramp is dominated by the DC step, not the AC pulse. Green could routinely score 14+ versus red's 0.6, making channel selection meaningless.
- The threshold was calibrated against torch-saturated red (where raw PI is ~0.01). Green's baseline raw PI is ~0.35. Any threshold tuned for one channel's operating point is wrong for the other.
v5 SQI is the peak-to-peak amplitude of the filtered signal over a 2 s rolling window.
Because the bandpass strips the DC component, this metric measures the actual AC pulse amplitude — which
is what WABP cares about — and is scale-comparable across channels because both feed the same filter.
The raw perfusion index is still computed and exposed in diagnostics for callers that want
it, but no switching decision uses it.
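To make the difference between the two metrics concrete, here is a minimal sketch that evaluates both on a synthetic finger-placement ramp. The helper names, the 1.2 Hz pulse, and the use of the bare pulse as a stand-in for the filtered signal are all assumptions for illustration; this is not the runtime's code.

```javascript
// Illustrative helpers; names are assumptions, not EMA Forge's API.

// v4-style perfusion index on the raw channel: (max - min) / mean.
function perfusionIndex(raw) {
  const max = Math.max(...raw);
  const min = Math.min(...raw);
  const mean = raw.reduce((a, b) => a + b, 0) / raw.length;
  return (max - min) / mean;
}

// v5-style SQI: peak-to-peak amplitude of the filtered signal
// over the most recent windowSec seconds.
function peakToPeakSqi(filtered, sampleRateHz = 30, windowSec = 2) {
  const n = Math.round(sampleRateHz * windowSec);
  const recent = filtered.slice(-n);
  return Math.max(...recent) - Math.min(...recent);
}

// Synthetic 2 s window at 30 Hz: a small 1.2 Hz pulse riding on a 0.10 -> 0.40 ramp.
const pulse = Array.from({ length: 60 }, (_, i) =>
  0.002 * Math.sin((2 * Math.PI * 1.2 * i) / 30));
const raw = pulse.map((p, i) => 0.10 + 0.30 * (i / 59) + p);

console.log(perfusionIndex(raw));  // dominated by the DC ramp, not the pulse
console.log(peakToPeakSqi(pulse)); // reflects only the AC pulse amplitude
```

In this toy case the raw perfusion index is driven almost entirely by the ramp, while the peak-to-peak value on the DC-free signal stays at the pulse amplitude.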
Calibration state machine
After a finger-on transition, the calibration state machine moves from 'settling' →
'locked'. During settling (4.5 s), no channel switching can occur and no beats are
emitted — this gives the high-pass filter time to ring down past its transient. At lock time, each
channel's current filtered peak-to-peak becomes its baseline. From that point onward, failover
thresholds are expressed as fractions of each channel's own baseline (40% drop on active, 25% viability
on backup), which makes them dimensionless and channel-agnostic. After a switch, the new active
channel's baseline is updated to its current value, so subsequent failovers in either direction work
symmetrically.
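The lock-then-failover rule described above can be sketched as a small state holder. This is a hedged illustration under stated assumptions: the class name, method names, and the shape of the amplitude objects are hypothetical; only the 40% / 25% fractions, the two-consecutive-checks rule, and the re-baselining after a switch come from the text.

```javascript
// Hypothetical sketch of the failover rule; class and field names are assumptions.
class ChannelSelector {
  constructor() {
    this.active = 'red';      // red is the initial active channel
    this.baseline = null;     // { red, green }, set at lock time
    this.consecutiveHits = 0; // anti-thrash counter
  }

  // Called once when the settling window ends.
  lock(amplitudes) {
    this.baseline = { ...amplitudes };
  }

  // Called once per 1 s check with the current filtered peak-to-peak amplitudes.
  check(amplitudes) {
    if (!this.baseline) return this.active; // still settling: no switching
    const backup = this.active === 'red' ? 'green' : 'red';
    const activeWeak = amplitudes[this.active] < 0.40 * this.baseline[this.active];
    const backupViable = amplitudes[backup] > 0.25 * this.baseline[backup];

    this.consecutiveHits = activeWeak && backupViable ? this.consecutiveHits + 1 : 0;

    if (this.consecutiveHits >= 2) {
      this.active = backup;
      this.baseline[backup] = amplitudes[backup]; // re-baseline the new active channel
      this.consecutiveHits = 0;
    }
    return this.active;
  }
}
```

Because thresholds are fractions of each channel's own baseline, the same rule applies unchanged when the roles of red and green are reversed.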
Sub-frame interpolation
WABP returns the index of the slope-energy local maximum that triggered the detection. With a
30 Hz camera, naive use of this index quantises beat times to ~33.3 ms. v5 fits a quadratic to
the three samples around the local maximum and computes the fractional offset of the true peak:
Δ = ½ · (s₀ − s₂) / (s₀ − 2s₁ + s₂). The reported framesAgo is then
integer_framesAgo − Δ, so beat times resolve at finer than the frame interval. The camera
sample rate has not changed; the beat-time estimate just no longer has 30 ms quantisation noise on
top of the underlying physiological variability. For downstream HRV use this is meaningful — RMSSD on
quantised IBIs is biased upward.
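A minimal sketch of the three-point quadratic fit, assuming s₀, s₁, s₂ are the slope-energy samples before, at, and after the integer local maximum (so a positive Δ means the true peak is later than the centre sample). The function names are illustrative and not the runtime's own.

```javascript
// Fractional offset (in frames) of the fitted peak relative to the centre sample s1,
// given three equally spaced samples around a local maximum.
function parabolicOffset(s0, s1, s2) {
  const denom = s0 - 2 * s1 + s2;
  if (denom === 0) return 0; // collinear samples: no curvature to fit
  return 0.5 * (s0 - s2) / denom;
}

// Refined "frames ago" value: integer index minus the fractional offset.
function refineFramesAgo(integerFramesAgo, s0, s1, s2) {
  return integerFramesAgo - parabolicOffset(s0, s1, s2);
}

// Samples of y = 1 - (x - 0.25)^2 at x = -1, 0, 1: the true peak sits at x = 0.25.
const delta = parabolicOffset(-0.5625, 0.9375, 0.4375);
console.log(delta);                                         // 0.25 frames
console.log(refineFramesAgo(5, -0.5625, 0.9375, 0.4375));   // 4.75 frames ago
console.log((delta * 1000) / 30);                           // offset in ms at 30 Hz
```

For an exactly parabolic peak the fit recovers the true position; for real slope-energy curves it is an approximation that is usually much better than rounding to the nearest frame.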
Dicrotic-notch rejection
The dicrotic notch is the mid-diastolic secondary peak in the PPG waveform. WABP's onset criterion can
fire on it under certain morphologies, generating a phantom beat with an IBI roughly half the true
value. v5 maintains a 10-element running buffer of recent accepted IBIs; once it has at least 3 samples,
candidate beats whose interval to the previous accepted beat is less than 60% of the median expected
period are rejected as dicrotic and counted in dicroticRejectCount. The 60% threshold is
conservative — physiological IBI variability (HRV) under normal conditions is well below that ratio.
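The rejection rule can be sketched as a small gate that sits between the onset detector and the beat callback. This is an illustration under assumptions: the class and method names are hypothetical, and the choice to keep measuring from the last accepted beat after a rejection follows the wording above rather than the actual source.

```javascript
// Hypothetical sketch; class and field names are illustrative.
class DicroticGate {
  constructor() {
    this.ibis = [];           // up to 10 most recent accepted IBIs (ms)
    this.lastBeatTime = null; // timestamp of the last accepted beat (ms)
    this.rejectCount = 0;
  }

  static median(values) {
    const sorted = [...values].sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  }

  // Returns true if the candidate beat is accepted, false if rejected as dicrotic.
  offer(timeMs) {
    if (this.lastBeatTime === null) {
      this.lastBeatTime = timeMs;
      return true;
    }
    const interval = timeMs - this.lastBeatTime;
    // The rule only activates once at least 3 accepted IBIs are buffered.
    if (this.ibis.length >= 3 && interval < 0.6 * DicroticGate.median(this.ibis)) {
      this.rejectCount += 1;
      return false; // lastBeatTime is unchanged, so the next true beat is measured from it
    }
    this.ibis.push(interval);
    if (this.ibis.length > 10) this.ibis.shift();
    this.lastBeatTime = timeMs;
    return true;
  }
}

// Regular 800 ms rhythm with one phantom detection at 2800 ms.
const gate = new DicroticGate();
const accepted = [0, 800, 1600, 2400, 2800, 3200].map((t) => gate.offer(t));
console.log(accepted, gate.rejectCount);
```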
API contract notes
- BeatDetector.setCallbacks({...}) swaps callbacks without resetting filter or detector state. This is what allows ePAT and HCT to keep the camera session alive across baseline → trial transitions without a cold-start.
- BeatDetector.stop() tears down the camera, releases the torch, and resets all internal state. After this, the next call must be start().
- All beat-time stamps use performance.now(), which is monotonic and unaffected by wall-clock changes.
- onSqiUpdateCb(sqiRed, sqiGreen, activeChannel) reports both channels every 1 s; the active channel string is one of 'red' or 'green'.
Test the pipeline on your device: Open ppgtester.html to run a live camera capture and inspect beat
detection, raw waveform, SQI, and IBI output in real time — without needing a full study session. Useful
for validating that PPG works on a target device before deploying, or for checking signal quality in a
given environment.
ePAT — Ecological Phase Assessment Task
The ePAT is a cardiac interoceptive-accuracy task adapted for in-the-wild, mobile-first use. It is directly descended from the Phase Adjustment Task (PAT) and its refined successor PAT 2.0 — see Acknowledgments & Prior Work for primary sources. Participants align an auditory tone with their own felt heartbeats by rotating a dial until the tone "feels like it's landing on" each beat. Their phase offset (in ms, relative to ground-truth peaks detected by the camera PPG) is the dependent measure.
Task flow, trial by trial
- Baseline calibration. 10–20 s of still finger-on-camera capture establishes the participant's current HR and verifies that SQI is above threshold before any trials begin.
- Two-phase practice (optional, default on). First, tone-to-tone alignment (no heartbeat involved — teaches the dial mechanic). Second, tone-to-heartbeat alignment at a slow scaffolded pace.
- Trial block. trials valid trials (default 20), each trial_duration_sec seconds long (default 30). During a trial:
  - Live camera PPG streams in the background.
  - A tone plays at a predicted time, offset by a randomized phase.
  - The participant rotates the dial to shift the tone earlier or later until it subjectively aligns with their beat.
  - On "Confirm Timing", the offset is recorded relative to the ground-truth peak from the PPG.
- Per-trial confidence (optional): 1–5 rating of how sure they were the tone matched their heartbeat.
- Body map (optional): after each trial, where did they feel the beat? Chest / fingers / neck / ears / abdomen / legs / head / nowhere.
- Retry budget. retry_budget (default 30) is the maximum number of attempts. A trial can fail for low SQI, excessive movement, or participant cancellation. Once the number of valid trials reaches the target, the task ends; if the budget is exhausted first, the task ends with whatever valid trials were collected.
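The interaction between the valid-trial target and the retry budget can be sketched as a simple loop. The function name, the synchronous runTrial callback, and the result shape are assumptions made for illustration; the real task is asynchronous and event-driven.

```javascript
// Hypothetical sketch of the stopping rule; names and shapes are assumptions.
// runTrial(attemptNumber) is expected to return an object with a boolean `valid` field.
function runTrialBlock(runTrial, targetValid = 20, retryBudget = 30) {
  const valid = [];
  let attempts = 0;
  while (valid.length < targetValid && attempts < retryBudget) {
    attempts += 1;
    const result = runTrial(attempts);
    if (result.valid) valid.push(result); // failed attempts still consume budget
  }
  return { valid, attempts, budgetExhausted: valid.length < targetValid };
}

// Every second attempt fails: 3 valid trials take 6 attempts.
let n = 0;
const outcome = runTrialBlock(() => ({ valid: ++n % 2 === 0 }), 3, 30);
console.log(outcome.attempts, outcome.valid.length, outcome.budgetExhausted);
```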
High-precision timing. Audio stimulus scheduling uses the Web Audio API's
AudioContext (not setTimeout), which gives sample-accurate timing across
browsers. This is the difference between a task that has <2 ms jitter and one that has ~20 ms jitter,
which matters a great deal when your DV is measured in milliseconds.
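A minimal sketch of scheduling a tone on the audio clock rather than with setTimeout. The mapping from a performance.now() timestamp to AudioContext time, the function name, and the tone parameters are assumptions for illustration; the calls on the context (createOscillator, connect, start, stop, currentTime, destination) are standard Web Audio API.

```javascript
// Hypothetical helper: schedule a short tone at a predicted beat time.
// whenPerfMs and nowPerfMs are performance.now()-style timestamps in milliseconds.
function scheduleTone(audioCtx, whenPerfMs, nowPerfMs, durationSec = 0.05, freqHz = 440) {
  // Convert the delay to seconds and add it to the audio clock's current time.
  const delaySec = Math.max(0, (whenPerfMs - nowPerfMs) / 1000);
  const startAt = audioCtx.currentTime + delaySec;

  const osc = audioCtx.createOscillator();
  osc.frequency.value = freqHz;
  osc.connect(audioCtx.destination);
  osc.start(startAt);               // start time is honoured by the audio thread
  osc.stop(startAt + durationSec);
  return startAt;
}

// Browser usage (not run here):
//   const ctx = new AudioContext();
//   scheduleTone(ctx, predictedBeatMs, performance.now());
```

The start time is handed to the audio rendering thread, so main-thread load between scheduling and playback does not shift the tone.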
Configuration knobs (Builder → Tasks → ePAT)
| Setting | Default | Purpose |
|---|---|---|
| trials | 20 | Target number of valid trials. |
| trial_duration_sec | 30 | Max seconds per trial before auto-cancel. |
| retry_budget | 30 | Hard cap on total attempts (valid + failed). |
| sqi_threshold | 0.3 | Minimum perfusion index for trial acceptance. |
| confidence_ratings | on | Ask for 1–5 confidence after each trial. |
| two_phase_practice | on | Include tone-to-tone + tone-to-heartbeat practice. |
| body_map | on | Show the post-trial body-location picker. |
ePAT is flagged BETA. The algorithm is stable and the task is usable, but the published-psychometrics validation of the ecological adaptation is still in progress. If you publish ePAT data collected via EMA Forge, please cite both the underlying PAT lineage (Plans et al., 2021; Palmer et al., 2025) and this repository — full citation list in Acknowledgments.
Heartbeat Counting Task (HCT) BETA
The Heartbeat Counting Task is a classic measure of cardiac interoceptive accuracy
(Schandry, 1981). Participants silently count the heartbeats they perceive over a set
of timed intervals. The accuracy score for each interval is
1 - |actual - reported| / actual, where actual is the objective
beat count and reported is the participant's count. Mean accuracy across
intervals is the canonical Schandry score.
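As a small worked example of the formula above, the sketch below scores one interval and averages across several. The function names are illustrative, not the runtime's.

```javascript
// Per-interval accuracy: 1 - |actual - reported| / actual.
function hctAccuracy(actual, reported) {
  return 1 - Math.abs(actual - reported) / actual;
}

// Mean accuracy across intervals; each item needs actualBeats and reportedCount.
function meanHctAccuracy(intervals) {
  const scores = intervals.map((i) => hctAccuracy(i.actualBeats, i.reportedCount));
  return scores.reduce((a, b) => a + b, 0) / scores.length;
}

// 41 detected beats, 32 reported: 1 - 9/41.
console.log(hctAccuracy(41, 32).toFixed(4)); // "0.7805"
```

Note that the score is not bounded below by zero: a reported count more than twice the actual count yields a negative value, which analysts sometimes clip or flag.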
EMA Forge's HCT module reuses the same camera-PPG pipeline (ePATCore)
that powers the ePAT and the inline heart-rate question type, so objective beat counts
during each interval are computed with the WABP onset detector and signal-quality
gating described in HR Capture (PPG). Participants hold
their fingertip over the rear camera + flashlight for the duration of the task.
Methodological options
HCT validity has been debated extensively (see Desmedt et al., 2018; Ring & Brener, 2018). The Builder exposes the parameters most likely to matter for that debate:
- Instruction variant. "Count perceived heartbeats" (Schandry's original) versus "Estimate heartbeats" (Brener/Ring). These prompt different cognitive strategies and produce non-equivalent scores; pick deliberately.
- Custom instructions. Free-text override of the variant default, for labs running specific protocols.
- Counting screen visibility. Blank, subtle progress ring, elapsed timer, or both. Hiding the timer is closer to the original Schandry protocol; showing it matches some smartphone-adapted variants.
- Interval set. Comma-separated list. Default is the reduced 25/35/45 s set for EMA contexts; the classic Schandry set is 25, 35, 45, 50, 55, 100. Researcher-supplied sets are honored verbatim.
- Order randomization. Default on; toggle off for fixed-order replications.
- Practice interval. One short practice interval (default 15 s) precedes the real intervals when enabled. Practice trials are recorded but not included in summary statistics.
- Per-interval confidence. 0–10 slider after each interval. Required for Garfinkel et al.'s interoceptive awareness metric (Pearson r between accuracy and confidence across intervals), which the runtime computes when at least three valid (accuracy, confidence) pairs exist.
- Body map. Optional sensation-localization prompt every N intervals, identical to the ePAT body map.
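The interoceptive awareness metric mentioned in the per-interval confidence option is a Pearson correlation with a minimum-pairs guard. A hedged sketch follows; the function names and the choice to return null below three pairs are assumptions, not the runtime's exact behaviour.

```javascript
// Pearson correlation coefficient between two equal-length numeric arrays.
function pearsonR(xs, ys) {
  const n = xs.length;
  const mx = xs.reduce((s, v) => s + v, 0) / n;
  const my = ys.reduce((s, v) => s + v, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy); // NaN if either variable has zero variance
}

// Hypothetical wrapper: null unless at least three valid (accuracy, confidence) pairs exist.
function interoceptiveAwareness(intervals) {
  const valid = intervals.filter(
    (i) => Number.isFinite(i.accuracy) && Number.isFinite(i.confidence));
  if (valid.length < 3) return null;
  return pearsonR(valid.map((i) => i.accuracy), valid.map((i) => i.confidence));
}
```

With only three to six intervals per session the correlation is very noisy, so it is usually worth inspecting the underlying pairs as well as the coefficient.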
Quality control mirrors ePAT exactly: the same SQI watchdog overlays the same "make the circle red" prompts, and a session-level retry budget silently re-runs intervals that fail signal-quality gating (low SQI for >50% of the interval, or fewer than 5 detected beats).
HCT JSON output shape
{
"type": "hct_response",
"startedAt": "2026-04-30T08:04:11Z",
"baseline": { "recordedHR": [72, 71, ...], "totalBeats": 60, "finalSqi": 0.012, ... },
"intervals": [
{
"intervalIndex": 1, "isPractice": false,
"duration_sec": 35, "durationMs_actual": 35012,
"reportedCount": 32, "actualBeats": 41,
"accuracy": 0.7805,
"confidence": 6, "bodyPos": 1,
"recordedHR": [70, 71, ...], "ibi_series": [810, 795, ...],
"qualitySummary": { "cleanBeats": 39, "flaggedBeats": 2,
"sqiBadSeconds": 0.5, "sqiFinalValue": 0.012, ... }
}, ...
],
"practices": [...],
"summary": {
"valid_intervals": 3, "practice_intervals": 1,
"mean_accuracy": 0.81,
"mean_confidence": 5.7,
"interoceptive_awareness": 0.42,
"mean_sqi": 0.011,
"instruction_variant": "count",
"randomized": true,
"intervals_sec": [25, 35, 45]
}
}
Implicit Association Task (IAT)
The IAT (Greenwald, McGhee & Schwartz, 1998) measures the strength of automatic associations between concept pairs by comparing response times across compatible and incompatible sorting conditions. EMA Forge implements the standard 7-block IAT with D-score scoring (Greenwald, Nosek & Banaji, 2003) computed entirely in the participant's browser at session end. No server, no camera, no hardware requirements beyond a touchscreen.
Because the IAT is RT-based, mobile deployment requires deliberate engineering choices that differ from desktop implementations:
- touchstart, not click. Response time is recorded from touchstart, which fires at finger contact (~50–100 ms earlier than click). click fires as a fallback for desktop/preview.
- rAF-anchored stimulus onset. The stimulus word is written to the DOM and then t₀ = performance.now() is captured inside a requestAnimationFrame callback, not at the moment of the innerHTML write. This ensures t₀ reflects pixel-on-screen rather than JS execution time, which can diverge by a full frame (16 ms at 60 Hz) or more under load.
- Half-screen tap zones. The left and right halves of the screen are independent invisible buttons. A hairline center divider provides a visual affordance. Tap targets are intentionally the full screen height, not thumb-zone strips, to minimize motor error variance.
D-score algorithm
The implementation follows Greenwald, Nosek & Banaji (2003, JPSP), error replacement strategy B:
- Exclusions. Trials with RT < 300 ms (anticipatory) or > 10,000 ms (disengaged) are flagged as excluded and removed before all subsequent steps.
- Error penalty. For each pairing block set, compute the mean RT of correct trials. Each error trial's RT is replaced by that mean + 600 ms.
- Block means. Mean RT is computed across the error-replaced trial set for blocks 3+4 (pairing 1) and blocks 6+7 (pairing 2).
- Pooled SD. SD is computed across all valid trials from both pairing sets pooled together — not the average of within-block SDs. The same error-replaced RTs are used.
- D = (M₂ − M₁) / SD_pooled. Sign is oriented so positive D = target A + positive attribute faster (compatible). Whether pairing 1 is the compatible condition depends on block_order_variant (0 or 1), which is logged in the summary and determined by participantId % 2.
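The steps above can be sketched as follows. This is an illustration under assumptions, not the runtime's code: the trial shape ({ pairing, rt_ms, correct }) is simplified, the pooled SD uses the sample (n − 1) formula, and the final sign orientation by block_order_variant is left out.

```javascript
// Hypothetical D-score sketch. trials: [{ pairing: 1 | 2, rt_ms: number, correct: boolean }]
function mean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function sampleSd(xs) {
  const m = mean(xs);
  return Math.sqrt(xs.reduce((s, x) => s + (x - m) ** 2, 0) / (xs.length - 1));
}

function dScore(trials) {
  // 1. Exclusions: drop anticipatory (< 300 ms) and disengaged (> 10,000 ms) trials.
  const kept = trials.filter((t) => t.rt_ms >= 300 && t.rt_ms <= 10000);

  const means = {};
  const pooled = [];
  for (const pairing of [1, 2]) {
    const set = kept.filter((t) => t.pairing === pairing);
    // 2. Error penalty: error RTs become mean(correct RTs) + 600 ms.
    const penalty = mean(set.filter((t) => t.correct).map((t) => t.rt_ms)) + 600;
    const rts = set.map((t) => (t.correct ? t.rt_ms : penalty));
    // 3. Mean over the error-replaced trials for this pairing.
    means[pairing] = mean(rts);
    pooled.push(...rts);
  }
  // 4. One SD over all error-replaced trials from both pairings.
  // 5. D = (M2 - M1) / SD_pooled.
  return (means[2] - means[1]) / sampleSd(pooled);
}

const demo = [
  { pairing: 1, rt_ms: 500, correct: true },
  { pairing: 1, rt_ms: 700, correct: true },
  { pairing: 1, rt_ms: 400, correct: false }, // replaced by 600 + 600 = 1200
  { pairing: 1, rt_ms: 200, correct: true },  // excluded (< 300 ms)
  { pairing: 2, rt_ms: 900, correct: true },
  { pairing: 2, rt_ms: 1100, correct: true },
];
console.log(dScore(demo).toFixed(3));
```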
Fast-responder flag. If more than 10% of trials in any block
have RT < 300 ms, fast_responder: true is set in the
summary. This is a flag, not an auto-exclusion — the convention in the literature
is to report and let the analyst decide.
Block structure
| Block | Content | Default trials | Used for D |
|---|---|---|---|
| 1 | Target A practice | 20 | No |
| 2 | Attribute practice | 20 | No |
| 3 | Combined practice — pairing 1 | 20 | Yes |
| 4 | Combined critical — pairing 1 | 40 | Yes |
| 5 | Target reversal practice | 40 | No |
| 6 | Combined practice — pairing 2 | 20 | Yes |
| 7 | Combined critical — pairing 2 | 40 | Yes |
All trial counts are configurable in the Builder. Blocks 3+4 and 6+7 are the critical blocks used for D-score computation (Greenwald et al., 2003). Practice blocks can be disabled as a group in the Builder, though this is not recommended — participants need to learn the categorization before the combined critical blocks.
Counterbalancing
Which target category pairs with positive attributes in blocks 3/4 is determined
automatically by participantId % 2:
- Variant 0 (even PIDs): Target A + Positive on left in blocks 3/4; Target A + Negative on left in blocks 6/7.
- Variant 1 (odd PIDs): Target A + Negative on left in blocks 3/4; Target A + Positive on left in blocks 6/7.
This gives approximately equal assignment without researcher intervention and
is fully deterministic — you can always reconstruct which order a participant
received from block_order_variant in the output. Include it as a
covariate or between-subjects factor in any model where order effects are a
concern.
IAT JSON output shape
{
"type": "iat_response",
"startedAt": "2026-04-30T08:12:44Z",
"trials": [
{
"block_index": 0,
"block_id": 1,
"block_label": "Practice: Flowers",
"pairing_id": null,
"critical_for_d": false,
"trial_n_in_block": 1,
"trial_n_overall": 1,
"stimulus": "Orchid",
"category": "target_a",
"correct_side": "left",
"response_side": "left",
"correct": true,
"rt_ms": 621.4,
"excluded": false,
"exclude_reason": null,
"timestamp": "2026-04-30T08:12:45Z"
},
...
],
"summary": {
"d_score": 0.482,
"d_mean_pairing1": 712.3,
"d_mean_pairing2": 891.6,
"d_sd_pooled": 372.1,
"d_n_pairing1": 118,
"d_n_pairing2": 117,
"block_order_variant": 0,
"fast_responder": false,
"total_trials": 200,
"excluded_trials": 3,
"exclusion_rate": 0.015,
"target_a_label": "Flowers",
"target_b_label": "Insects",
"attr_pos_label": "Pleasant",
"attr_neg_label": "Unpleasant",
"block_stats": [
{
"block_id": 1,
"n_total": 20, "n_valid": 20, "n_excluded": 0,
"n_errors": 2, "error_rate": 0.1,
"mean_rt": 698.2,
"fast_trial_count": 0, "fast_trial_rate": 0.0
},
...
]
}
}
In R, to get one row per participant with the D-score and block order:
library(jsonlite)
library(dplyr)
library(purrr)
sessions <- list.files("data/", pattern = "\\.json$", full.names = TRUE) |>
map(read_json)
iat_summary <- sessions |>
map_dfr(function(s) {
iat <- keep(s$data, ~ .$type == "iat_response")
if (!length(iat)) return(NULL)
sm <- iat[[1]]$summary
tibble(
pid = s$participantId,
day = s$day,
d_score = sm$d_score,
block_order = sm$block_order_variant,
fast_responder = sm$fast_responder,
exclusion_rate = sm$exclusion_rate,
mean_rt_p1 = sm$d_mean_pairing1,
mean_rt_p2 = sm$d_mean_pairing2
)
})
# Trial-level data (for block-by-block RT analysis)
iat_trials <- sessions |>
map_dfr(function(s) {
iat <- keep(s$data, ~ .$type == "iat_response")
if (!length(iat)) return(NULL)
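# JSON nulls (e.g. pairing_id, exclude_reason) are read as NULL, which as_tibble()
# does not accept as a column value; map them to NA before building each row.
iat[[1]]$trials |>
map_dfr(function(tr) as_tibble(map(tr, function(v) if (is.null(v)) NA else v))) |>
mutate(pid = s$participantId, day = s$day)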
})
Conditional Tasks
A task step in a window's phase sequence can carry a condition that gates whether it runs at all:
{ kind: "task", id: "epat",
condition: { question_id: "q_hr_1", operator: "gt", value: 80 } }
At runtime, EMA Forge evaluates the condition against the participant's responses collected earlier in the same session. If the condition is false, the step is silently skipped — as if it were never in the sequence. Typical use cases:
- "Only run ePAT if resting HR is elevated" — gate on a prior HR capture's BPM.
- "Only show the post-task stress items if the participant reported feeling stressed beforehand" — gate on a slider threshold.
- "Skip the cognitive task on the 3rd daily window" — gate on window ID.
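A minimal sketch of how such a condition could be evaluated against earlier responses. Only the "gt" operator and the condition fields (question_id, operator, value) appear in the example above; the other operator names, the responses map, and the decision to skip a step when the referenced question is unanswered are assumptions for illustration.

```javascript
// Hypothetical operator table. Only "gt" is confirmed by the example above.
const OPERATORS = {
  gt: (a, b) => a > b,
  gte: (a, b) => a >= b,
  lt: (a, b) => a < b,
  lte: (a, b) => a <= b,
  eq: (a, b) => a === b,
  neq: (a, b) => a !== b,
};

// responses: plain object mapping question IDs to values collected earlier in the session.
function shouldRunStep(step, responses) {
  if (!step.condition) return true; // unconditional steps always run
  const { question_id, operator, value } = step.condition;
  const compare = OPERATORS[operator];
  const answered = Object.prototype.hasOwnProperty.call(responses, question_id);
  if (!compare || !answered) return false; // assumption: unknown operator or missing answer skips the step
  return compare(responses[question_id], value);
}

const epatStep = {
  kind: 'task', id: 'epat',
  condition: { question_id: 'q_hr_1', operator: 'gt', value: 80 },
};
console.log(shouldRunStep(epatStep, { q_hr_1: 92 })); // true
console.log(shouldRunStep(epatStep, { q_hr_1: 72 })); // false
```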
Privacy Model
Written plainly, because IRBs will ask you to restate this in your protocol:
- Response data transmission is opt-in. By default, data is generated, stored, and downloaded entirely on the participant's device, and you receive it via whatever channel the participant uses to return files. If the researcher configures a webhook (Study tab → Webhook URL), the session JSON is POSTed to that specific endpoint at session end instead. No transmission happens to any endpoint other than the one you configure. The webhook field is off by default.
- The static host sees page-load requests. Your GitHub Pages / Netlify / institutional host will receive HTTP requests when the participant opens a link. These requests include timestamp, IP, User-Agent, and the full URL (including ?id=). If participant ID alone is identifiable, choose an unlinkable ID (short opaque strings, not names or MRNs).
- No third-party analytics, no CDN fetches, no font-provider calls. The exported study has zero outbound network calls at runtime other than (a) the initial page load from your host, and (b) the single webhook POST at session end if configured. You can verify this in Chrome DevTools → Network.
- Camera / microphone access (HR, ePAT) is handled entirely through the browser's getUserMedia permission prompt. The media stream never leaves the device; only the derived signal (BPM, IBIs, phase offsets) is stored.
- Consent is captured as a boolean plus a timestamp in the onboarding JSON payload. The full consent text as the participant saw it is logged alongside.
IRB Boilerplate
Two variants depending on your data-return design — use the one that matches your protocol.
Variant A: Manual participant return (no webhook)
Participant response data will be collected via a custom web-based application (EMA Forge, an open-source static web tool). The application operates entirely in the participant's mobile browser: response data is generated and stored on the participant's device and is not transmitted to any central server in the course of normal operation. Participants return completed session files to the research team via [INSTITUTIONAL SECURE CHANNEL — e.g., REDCap file upload, institutional SFTP]. No protected health information or direct identifiers are included in the study data; participants are identified only by an opaque study ID. Physiological measurements (heart rate via photoplethysmography) are derived from the participant's smartphone camera locally; raw video is not stored or transmitted. The application source code is open and available for review at github.com/keeganwhitacre/emaforge.
Variant B: Webhook auto-upload to a defined endpoint
Participant response data will be collected via a custom web-based application (EMA Forge, an open-source static web tool). The application operates in the participant's mobile browser and, at the end of each session, transmits the session's response data via HTTPS POST to a single pre-configured endpoint controlled by the research team at [ENDPOINT DESCRIPTION — e.g., an institutionally-hosted endpoint writing to a secured REDCap project / an institutional Google Workspace sheet under the PI's account / an AWS Lambda writing to an encrypted S3 bucket in the institution's AWS organization]. No data is transmitted to any other endpoint. If network transmission fails, the application falls back to on-device storage and manual return via [FALLBACK CHANNEL]. No protected health information or direct identifiers are included in the study data; participants are identified only by an opaque study ID. Physiological measurements (heart rate via photoplethysmography) are derived from the participant's smartphone camera locally; raw video is not stored or transmitted. The application source code is open and available for review at github.com/keeganwhitacre/emaforge.
Known Limitations
- No live enrollment dashboard. Data is only visible once files are returned.
- Missed-ping detection is derived, not direct. The app cannot know a prompt was sent unless you record that externally.
- Device heterogeneity. HR/ePAT quality depends on camera sensor, torch brightness, and case thickness. Pilot across a range of devices before locking your protocol.
- Browser storage can be cleared. A participant who "clears website data" mid-study loses their in-progress resume state (but not already-downloaded session files). The completion lock also resets.
- Single-file exports don't let you patch typos. Use the static-hosting bundle if you anticipate edits after deployment.
- Accessibility is a work in progress. Screen-reader support for sliders and the affect grid is incomplete. Audit against your accessibility requirements before broad enrollment.
Roadmap
Auto-upload with Webhooks STABLE
By default, each session ends with the participant tapping "Save Local Copy" and emailing the resulting file to your lab. This works, but it has two costs: it depends on the participant remembering the step, and it depends on your return channel being IRB-approved.
The webhook option removes both. When you paste a URL into the Webhook URL field in the Builder's Study tab, the exported study silently POSTs the full session JSON to that URL at save-time. The participant sees "✓ Data uploaded successfully" instead of a download prompt. If the upload fails (no network, endpoint down), the manual download button automatically re-appears — data is never lost silently.
Design tradeoff worth understanding: enabling a webhook means you are now transmitting response data to an endpoint of your choosing. The core "serverless" guarantee (data stays on-device) holds only while the webhook field is left blank. If your IRB protocol specifies manual return, leave the field blank. If your IRB approves direct electronic return to a specific institutional endpoint, webhooks make the data pipeline hands-off for participants.
What gets sent
The runtime POSTs a JSON body to your endpoint with this shape:
{
"participant_id": "104",
"day": 2,
"window_id": "morning",
"session_data": {
"participantId": "104",
"sessionId": "s_a1b2c3",
"day": 2,
"type": "morning",
"status": "complete",
"startedAt": "2026-04-22T08:03:11Z",
"completedAt": "2026-04-22T08:06:48Z",
"data": [ /* ema_response, epat_response, etc. — full payload */ ]
}
}
The three top-level fields participant_id, day, and window_id are duplicated out of session_data for easy dispatch/filtering on your
receiving end. The session_data object is the same JSON documented in JSON Schema.
Content-Type is text/plain, not application/json. This is
deliberate — it's the standard workaround that keeps the browser from firing a CORS preflight (OPTIONS)
request, which Google Apps Script and many other simple receivers cannot answer. Your endpoint must
parse the body as JSON despite the text/plain header. Every code example below does this.
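For orientation, here is a hedged sketch of what the sending side of such a POST can look like. It is not the runtime's code: the function name, the injected fetchImpl parameter, the return shape, and the assumption that window_id is taken from the session's type field are all illustrative. The payload fields and the text/plain Content-Type come from the description above.

```javascript
// Hypothetical sender. fetchImpl is injectable so the logic can be exercised without a network.
async function postSession(webhookUrl, session, fetchImpl = globalThis.fetch) {
  const payload = {
    participant_id: session.participantId,
    day: session.day,
    window_id: session.type, // assumption: mirrors session.type, as in the example payload
    session_data: session,
  };
  try {
    const res = await fetchImpl(webhookUrl, {
      method: 'POST',
      // text/plain is a CORS-safelisted content type, so no preflight is triggered.
      headers: { 'Content-Type': 'text/plain' },
      body: JSON.stringify(payload),
    });
    return { uploaded: res.ok };
  } catch (err) {
    // Network failure: the caller should fall back to the manual download button.
    return { uploaded: false };
  }
}
```

On the receiving end the body is still JSON text, so a receiver parses it explicitly rather than relying on the Content-Type header.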
Webhook recipes by receiver
EMA Forge's webhook is a plain HTTPS POST with a JSON body. Anything that can accept that and respond with a 2xx works. The table below summarises the recipes documented in this section; pick the one whose tradeoffs match your IRB, your IT environment, and your willingness to do setup work.
| Receiver | Cost | Setup | When to pick it |
|---|---|---|---|
| Google Apps Script → Sheets | Free | ~5 min | Default starting point. Personal Google account is the only requirement. |
| Cloudflare Worker → R2 | Free up to 100k req/day | ~15 min | One file per session, no row-size limits, no Google account, no cold starts. |
| Azure Function → SharePoint | Free (consumption tier) | ~30 min + IT signoff | Institutional Microsoft 365 mandate. Strongest IRB story for university-managed tenants. |
| Power Automate → Excel | Premium licence required | ~10 min | You already have Power Automate Premium. Simpler than the Azure path if you do. |
| PHP proxy → REDCap | Free (uses existing lab server) | ~30 min + lab IT | IRB requires data stay on institutional infrastructure end-to-end. |
| AWS Lambda → S3 | ≈$0–3/month | ~45 min | Lab is standardised on AWS, or you want presigned-URL access patterns. |
Google Sheets via Apps Script STABLE
The fastest path to a working pipeline — no server, no billing, no institutional signoff beyond your existing Google account. Sessions fill a Google Sheet one row at a time as participants complete them.
Full step-by-step setup
- Open sheets.google.com and create a new blank spreadsheet. Name it whatever you want your study data labelled as.
- From the menu bar, Extensions → Apps Script.
- Delete the placeholder function myFunction() {} and paste the script below.
- Save. Name the project anything.
- Deploy → New deployment. Choose type Web app.
- Set Execute as: Me, Who has access: Anyone. Necessary because the exported study has no Google credentials. Your endpoint URL is effectively the credential — treat it as a secret.
- Click Deploy. Authorize when prompted. Copy the Web app URL (starts with https://script.google.com/macros/s/...).
- Paste into EMA Forge's Builder → Study tab → Webhook URL field.
- Re-export. Test end-to-end with a dummy participant ID.
function doPost(e) {
try {
const data = JSON.parse(e.postData.contents);
const sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
if (sheet.getLastRow() === 0) {
sheet.appendRow(['Timestamp', 'Participant ID', 'Day', 'Window', 'Raw JSON']);
sheet.getRange(1, 1, 1, 5).setFontWeight("bold");
sheet.setFrozenRows(1);
}
sheet.appendRow([
new Date(),
data.participant_id || 'Unknown',
data.day || 'Unknown',
data.window_id || 'Unknown',
JSON.stringify(data.session_data || data)
]);
return ContentService
.createTextOutput(JSON.stringify({status: 'success'}))
.setMimeType(ContentService.MimeType.JSON);
} catch (error) {
return ContentService
.createTextOutput(JSON.stringify({error: error.message}))
.setMimeType(ContentService.MimeType.JSON);
}
}
When editing, redeploy via Manage deployments → Edit → New version (keeps the same URL).
Apps Script rate limits are generous (~1000 sessions/day before issues). Beyond that, switch to Cloudflare or Azure.
Cloudflare Worker → R2 object storage STABLE
Cloudflare Workers are free up to 100,000 requests per day. R2 (Cloudflare's S3-compatible object storage) is free up to 10 GB stored with no egress fees. This combination scales further than Apps Script, stores one full JSON per session (so signal-heavy ePAT sessions don't bump against any row-size cap), and has no cold-start latency.
Full step-by-step setup
- Create a free Cloudflare account at cloudflare.com.
- Install wrangler CLI: npm install -g wrangler.
- wrangler login to authenticate.
- wrangler r2 bucket create ema-forge-sessions
- Create a directory ema-receiver/ with wrangler.toml:
  name = "ema-receiver"
  main = "worker.js"
  compatibility_date = "2025-01-01"

  [[r2_buckets]]
  binding = "EMA_BUCKET"
  bucket_name = "ema-forge-sessions"

  [vars]
  AUTH_SECRET = "" # optional: set to a random string to require ?auth= on POSTs
- Create worker.js with the code below.
- wrangler deploy. Copy the workers.dev URL it prints.
- Paste into EMA Forge's Webhook URL field. Re-export.
export default {
async fetch(request, env) {
const cors = {
'Access-Control-Allow-Origin': '*',
'Access-Control-Allow-Methods': 'POST, OPTIONS',
'Access-Control-Allow-Headers': 'Content-Type',
};
if (request.method === 'OPTIONS') return new Response(null, { headers: cors });
if (request.method !== 'POST') return new Response('POST only', { status: 405, headers: cors });
const url = new URL(request.url);
if (env.AUTH_SECRET && url.searchParams.get('auth') !== env.AUTH_SECRET) {
return new Response('unauthorized', { status: 401, headers: cors });
}
let body;
try { body = await request.json(); }
catch {
return new Response(JSON.stringify({error: 'bad json'}), {
status: 400, headers: { ...cors, 'Content-Type': 'application/json' }
});
}
const pid = String(body.participant_id || 'unknown');
const day = String(body.day || '0');
const win = String(body.window_id || 'na');
const ts = new Date().toISOString().replace(/[:.]/g, '-');
const key = `sessions/${pid}/day-${day}/${win}_${ts}.json`;
await env.EMA_BUCKET.put(key, JSON.stringify(body), {
httpMetadata: { contentType: 'application/json' },
customMetadata: { pid, day, win }
});
return new Response(JSON.stringify({status: 'success', key}), {
status: 200, headers: { ...cors, 'Content-Type': 'application/json' }
});
}
};
R2 speaks the S3 API, so any S3 client can pull the data for analysis. From R:
library(aws.s3)
Sys.setenv(
AWS_ACCESS_KEY_ID = "<your R2 access key>",
AWS_SECRET_ACCESS_KEY = "<your R2 secret key>",
AWS_S3_ENDPOINT = "<account_id>.r2.cloudflarestorage.com"
)
# Sync entire bucket to local
files <- get_bucket_df(bucket = "ema-forge-sessions", region = "auto")
for (key in files$Key) {
save_object(key, bucket = "ema-forge-sessions",
file = file.path("data", basename(key)))
}
Generate R2 API tokens at Cloudflare dashboard → R2 → Manage R2 API Tokens.
Azure Function → SharePoint List STABLE
The institutional Microsoft path. Best fit when your university mandates Microsoft 365 for research data storage, when your IRB prefers data stay inside the institutional tenant, or when your IT contact will resist approving non-Microsoft cloud services. Azure Functions on the consumption plan are free up to 1 million executions per month, which is well above any plausible EMA workload.
This recipe requires a one-time app registration in Azure AD (now Microsoft Entra ID), which typically means asking your central IT to grant the Sites.ReadWrite.All application permission. Most university IT shops process this routinely.
Full step-by-step setup
One-time Azure AD setup
- Azure Portal → App Registrations → New registration. Name: EMA-Forge-Receiver. Account type: single tenant. Click Register.
- From the new app's overview, copy Application (client) ID and Directory (tenant) ID.
- Certificates & secrets → New client secret. Copy the secret Value immediately; it is never shown again.
- API permissions → Add a permission → Microsoft Graph → Application permissions → Sites.ReadWrite.All. Add. Then Grant admin consent (your IT will need to do this click; the rest you can do yourself).
SharePoint list setup
- In your team's SharePoint site, create a new List called EMA Sessions.
- Add columns:
  - Participant: Single line of text
  - Day: Number
  - Window: Single line of text
  - SessionJSON: Multiple lines of text, plain text, increase limit to 250,000 characters
  - ReceivedAt: Date and Time
- Get the site ID and list ID via Graph Explorer:
  - Site ID: GET https://graph.microsoft.com/v1.0/sites/<tenant>.sharepoint.com:/sites/<site-name>
  - List ID: GET https://graph.microsoft.com/v1.0/sites/<site-id>/lists (find your list by displayName)
Azure Function deployment
- Install Azure Functions Core Tools and the Azure CLI.
- Create a new function project:
func init ema-receiver --javascript --model V4
cd ema-receiver
func new --name SaveSession --template "HTTP trigger" --authlevel "function"
- Replace src/functions/SaveSession.js with:
const { app } = require('@azure/functions');
const { Client } = require('@microsoft/microsoft-graph-client');
const { ClientSecretCredential } = require('@azure/identity');
require('isomorphic-fetch');
app.http('SaveSession', {
methods: ['POST', 'OPTIONS'],
authLevel: 'function',
handler: async (req, ctx) => {
const cors = {
'Access-Control-Allow-Origin': '*',
'Access-Control-Allow-Methods': 'POST, OPTIONS',
'Access-Control-Allow-Headers': 'Content-Type',
};
if (req.method === 'OPTIONS') return { status: 200, headers: cors };
try {
const raw = await req.text();
const data = JSON.parse(raw);
const credential = new ClientSecretCredential(
process.env.TENANT_ID,
process.env.CLIENT_ID,
process.env.CLIENT_SECRET
);
const graph = Client.initWithMiddleware({
authProvider: {
getAccessToken: async () =>
(await credential.getToken('https://graph.microsoft.com/.default')).token
}
});
const siteId = process.env.SHAREPOINT_SITE_ID;
const listId = process.env.SHAREPOINT_LIST_ID;
await graph
.api(`/sites/${siteId}/lists/${listId}/items`)
.post({
fields: {
Title: `${data.participant_id}_D${data.day}_${data.window_id}`,
Participant: String(data.participant_id || ''),
Day: Number(data.day) || 0,
Window: String(data.window_id || ''),
ReceivedAt: new Date().toISOString(),
SessionJSON: JSON.stringify(data.session_data || {}).slice(0, 250000)
}
});
return {
status: 200,
headers: { ...cors, 'Content-Type': 'application/json' },
body: JSON.stringify({ status: 'success' })
};
} catch (err) {
ctx.error(err);
return {
status: 500,
headers: { ...cors, 'Content-Type': 'application/json' },
body: JSON.stringify({ error: err.message })
};
}
}
});
- Install the dependencies from the project folder:
npm install @azure/functions @azure/identity @microsoft/microsoft-graph-client isomorphic-fetch
- Deploy:
az login
az functionapp create --resource-group <rg> --consumption-plan-location <region> \
  --runtime node --runtime-version 20 --functions-version 4 \
  --name ema-receiver-<you> --storage-account <storage>
func azure functionapp publish ema-receiver-<you>
- Set environment variables in the Function App Configuration: TENANT_ID, CLIENT_ID, CLIENT_SECRET, SHAREPOINT_SITE_ID, SHAREPOINT_LIST_ID.
- Get the function URL (includes a ?code= key acting as a shared secret): Azure Portal → your Function → Functions → SaveSession → Get Function Url.
- Paste into EMA Forge's Webhook URL field. Re-export.
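One detail of the handler above is worth a second look: `.slice(0, 250000)` cuts the serialized session at the column limit, and a string cut mid-way will generally no longer parse as JSON. A minimal sketch of one alternative, which stores a small valid marker instead of a cut-off string, is shown here. `fitSessionJson` and the marker shape are hypothetical, not something EMA Forge ships:

```javascript
// Hypothetical helper: serialize session data and report whether it fits.
const SESSION_JSON_LIMIT = 250000; // the SessionJSON column limit from the list setup

function fitSessionJson(sessionData, limit = SESSION_JSON_LIMIT) {
  const json = JSON.stringify(sessionData ?? {});
  if (json.length <= limit) {
    return { value: json, truncated: false, originalLength: json.length };
  }
  // Store a short, parseable marker rather than an unparseable fragment.
  const marker = JSON.stringify({
    error: 'payload_too_large',
    originalLength: json.length,
  });
  return { value: marker, truncated: true, originalLength: json.length };
}
```

In the handler, `SessionJSON` could then take `fitSessionJson(data.session_data).value`, and the `truncated` flag could be logged with `ctx.error` so oversized sessions show up during pilot rather than at analysis time.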
For Ohio State users specifically
OSU permits use of the university-configured Azure tenant for non-medical-center research data classified at S3 (Private) or below. Behavioural EMA data with coded participant IDs and no PHI is S3. The OSU Azure tenant lives under the university's Microsoft enterprise agreement and meets OSU's Information Security Standard for S3 data without requiring a separate Information Security Risk Assessment. Email your college Security Coordinator before submitting your IRB protocol to confirm the setup matches their expectations for your specific data.
Power Automate → Excel STABLE
Simpler than the Azure Function path, but the "When an HTTP request is received" trigger is a Premium Power Automate connector that most institutional Microsoft 365 plans do not include. Before investing time, check with IT whether your Microsoft 365 plan has Premium Power Automate. If not, use the Azure Function path instead.
Full step-by-step setup
Prepare the Excel destination
- Create a new Excel file in OneDrive for Business: EMA-Forge-Data.xlsx.
- Add headers in row 1: Timestamp | Participant_ID | Day | Window | Raw_JSON.
- Select the range and press Ctrl+T to format as a table. Name the table SessionData. Power Automate's Excel connector cannot see unformatted ranges, so the table step is required.
Build the flow
- Go to make.powerautomate.com. New flow → Instant cloud flow → "When an HTTP request is received".
- In the trigger, paste this Request Body JSON Schema:
{
  "type": "object",
  "properties": {
    "participant_id": {"type": "string"},
    "day": {"type": "integer"},
    "window_id": {"type": "string"},
    "session_data": {"type": "object"}
  }
}
- Add action: Compose. Inputs: @{string(triggerBody()?['session_data'])}. This stringifies the nested session data so Excel can store it as a single cell.
- Add action: Excel Online (Business) → Add a row into a table.
  - Location: OneDrive for Business
  - Document Library: OneDrive
  - File: /EMA-Forge-Data.xlsx
  - Table: SessionData
  - Row fields:
    - Timestamp: @{utcNow()}
    - Participant_ID: @{triggerBody()?['participant_id']}
    - Day: @{triggerBody()?['day']}
    - Window: @{triggerBody()?['window_id']}
    - Raw_JSON: @{outputs('Compose')}
- Add action: Response. Status code: 200. Body: {"status": "success"}. Must be the final action.
- Save the flow. Click back into the trigger card; the HTTP POST URL field now contains the endpoint URL. Copy it.
- Paste into EMA Forge's Webhook URL field. Re-export.
Performance caveat: Power Automate must return the Response within 120 seconds or the flow runs asynchronously and the runtime can't tell whether the save succeeded. For standard EMA payloads this is never a problem; for sessions with raw PPG samples (~MB-scale), monitor for timeouts during pilot.
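Before pointing a pilot at the flow, it can help to check a test payload against the four properties named in the schema above and to look at how large it is. The sketch below is a hypothetical pre-flight check (`checkPayload` and `payloadBytes` are illustrative names); it makes no claim about how Power Automate itself validates requests:

```javascript
// Hypothetical pre-flight check mirroring the four properties in the schema.
function checkPayload(payload) {
  const problems = [];
  if (typeof payload?.participant_id !== 'string') {
    problems.push('participant_id must be a string');
  }
  if (!Number.isInteger(payload?.day)) {
    problems.push('day must be an integer');
  }
  if (typeof payload?.window_id !== 'string') {
    problems.push('window_id must be a string');
  }
  const sd = payload?.session_data;
  if (sd === null || typeof sd !== 'object' || Array.isArray(sd)) {
    problems.push('session_data must be an object');
  }
  return problems;
}

// Size of the JSON body in bytes (UTF-8), useful for spotting MB-scale sessions.
function payloadBytes(payload) {
  return new TextEncoder().encode(JSON.stringify(payload)).length;
}
```

An empty array from `checkPayload` means the payload has the expected shape; `payloadBytes` gives a rough number to watch when sessions include raw PPG samples.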
REDCap via institutional PHP proxy STABLE
Direct browser-to-REDCap POSTs are blocked by CORS on most REDCap installations. The realistic path is a one-file PHP proxy hosted on your lab's institutional web server, which accepts the POST from the participant's phone and forwards it to REDCap's API using your project token. This is the path most IRBs prefer because the data path is institutional end-to-end — the participant's phone hits a university domain, which forwards to a university REDCap instance, with no external cloud services involved.
Full step-by-step setup
REDCap project setup
- In your REDCap project, enable the API via Project Setup → API.
- User Rights → assign yourself API rights → generate an API token. Copy it; you'll need it in the PHP file. Treat it as a credential.
- Online Designer → create a repeatable instrument called session. Add fields:
  - session_day: Text Box
  - session_window: Text Box
  - session_json: Notes Box (long text; verify your REDCap install allows >30k chars per field)
  - session_received: Text Box with date/time validation
PHP proxy
Drop this on your lab's Apache/nginx server. Most psych departments have one; ask your IT contact. The file is vanilla PHP with no external libraries. It needs only the cURL extension and works on any PHP version ≥7.4.
<?php
// ema-forge-redcap-proxy.php
// Forwards EMA Forge sessions into a REDCap project's repeatable instrument.
header('Access-Control-Allow-Origin: *');
header('Access-Control-Allow-Methods: POST, OPTIONS');
header('Access-Control-Allow-Headers: Content-Type');
header('Content-Type: application/json');
if ($_SERVER['REQUEST_METHOD'] === 'OPTIONS') {
http_response_code(200);
exit;
}
$REDCAP_URL = getenv('REDCAP_URL') ?: 'https://redcap.your-uni.edu/api/';
$REDCAP_TOKEN = getenv('REDCAP_TOKEN') ?: 'PASTE_YOUR_TOKEN_HERE';
$SHARED_SECRET = getenv('EMA_SECRET') ?: ''; // optional, recommended
if ($SHARED_SECRET && ($_GET['auth'] ?? '') !== $SHARED_SECRET) {
http_response_code(401);
echo json_encode(['error' => 'unauthorized']);
exit;
}
$body = file_get_contents('php://input');
$data = json_decode($body, true);
if (!$data) {
http_response_code(400);
echo json_encode(['error' => 'invalid JSON']);
exit;
}
$record = [[
'record_id' => $data['participant_id'] ?? 'unknown',
'redcap_event_name' => 'ema_arm_1', // edit to match your project
'redcap_repeat_instrument' => 'session',
'redcap_repeat_instance' => 'new', // REDCap auto-assigns
'session_day' => $data['day'] ?? '',
'session_window' => $data['window_id'] ?? '',
'session_json' => json_encode($data['session_data'] ?? []),
'session_received' => date('Y-m-d H:i:s'),
]];
$payload = http_build_query([
'token' => $REDCAP_TOKEN,
'content' => 'record',
'format' => 'json',
'type' => 'flat',
'data' => json_encode($record),
'overwriteBehavior' => 'normal',
'forceAutoNumber' => 'false',
'returnContent' => 'count',
]);
$ch = curl_init($REDCAP_URL);
curl_setopt_array($ch, [
CURLOPT_POST => true,
CURLOPT_POSTFIELDS => $payload,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_SSL_VERIFYPEER => true,
CURLOPT_TIMEOUT => 30,
]);
$result = curl_exec($ch);
$code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
if ($code >= 200 && $code < 300) {
http_response_code(200);
echo json_encode(['status' => 'success']);
} else {
http_response_code(502);
echo json_encode(['error' => 'redcap rejected', 'code' => $code]);
}
- Have lab IT place the file behind HTTPS at e.g. https://yourlab.dept.osu.edu/ema-proxy.php.
- Set environment variables (Apache: SetEnv in the vhost config; nginx + PHP-FPM: env[REDCAP_TOKEN] = ... in pool config).
- Paste https://yourlab.dept.osu.edu/ema-proxy.php?auth=<shared-secret> into EMA Forge's Webhook URL field.
- Re-export. Test end-to-end.
Edit the redcap_event_name to match your project's arm/event scheme; REDCap will reject the record otherwise.
AWS Lambda → S3 STABLE
For labs standardised on AWS. Cost is roughly $0–3 per month at EMA scale: Lambda's free tier covers ~1M requests/month indefinitely, and S3 storage runs about $0.023 per GB-month. Output shape mirrors the Cloudflare R2 recipe (one JSON file per session, partitioned by participant/day), so analysis pipelines are interchangeable.
Full step-by-step setup
- Create an S3 bucket: aws s3 mb s3://ema-forge-sessions-<you>.
- Block all public access on the bucket (default is correct).
- Create a Lambda function (Node.js 20.x runtime) with this code:
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
const s3 = new S3Client({ region: process.env.AWS_REGION });
const BUCKET = process.env.BUCKET_NAME;
export const handler = async (event) => {
const cors = {
'Access-Control-Allow-Origin': '*',
'Access-Control-Allow-Methods': 'POST, OPTIONS',
'Access-Control-Allow-Headers': 'Content-Type',
};
if (event.requestContext?.http?.method === 'OPTIONS') {
return { statusCode: 200, headers: cors };
}
try {
const body = JSON.parse(event.body);
const pid = String(body.participant_id || 'unknown');
const day = String(body.day || '0');
const win = String(body.window_id || 'na');
const ts = Date.now();
const key = `sessions/${pid}/day-${day}/${win}_${ts}.json`;
await s3.send(new PutObjectCommand({
Bucket: BUCKET,
Key: key,
Body: JSON.stringify(body),
ContentType: 'application/json',
Metadata: { participantid: pid, day, windowid: win }
}));
return {
statusCode: 200,
headers: { ...cors, 'Content-Type': 'application/json' },
body: JSON.stringify({ status: 'success', key })
};
} catch (err) {
return {
statusCode: 500,
headers: { ...cors, 'Content-Type': 'application/json' },
body: JSON.stringify({ error: err.message })
};
}
};
- Lambda execution role: attach an inline policy allowing s3:PutObject on arn:aws:s3:::ema-forge-sessions-<you>/*.
- Environment variables: BUCKET_NAME.
- Add an API Gateway HTTP API trigger (not REST API; HTTP API is cheaper and simpler). Configure a default route POST /.
- Enable CORS on the API: allow origin *, methods POST, OPTIONS, headers Content-Type.
- Copy the invoke URL. Paste into EMA Forge's Webhook URL field.
For analysis, sync via aws-cli or aws.s3::save_object() in R. The bucket layout (sessions/<pid>/day-<n>/<window>_<ts>.json) makes per-participant or per-day pulls trivial.
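To illustrate why that layout is convenient, here is a minimal sketch that splits object keys back into their parts and groups them by participant. It assumes keys follow the `sessions/<pid>/day-<n>/<window>_<ts>.json` pattern exactly and that the timestamp itself contains no underscore; `parseSessionKey` and `groupByParticipant` are hypothetical helper names:

```javascript
// Split a key of the form sessions/<pid>/day-<n>/<window>_<ts>.json.
// Returns null for keys that do not match the layout.
function parseSessionKey(key) {
  const m = /^sessions\/([^/]+)\/day-([^/]+)\/([^/]+)_([^_/]+)\.json$/.exec(key);
  if (!m) return null;
  return { pid: m[1], day: m[2], window: m[3], ts: m[4] };
}

// Group matching keys by participant ID, skipping anything unrecognised.
function groupByParticipant(keys) {
  const groups = new Map();
  for (const key of keys) {
    const parsed = parseSessionKey(key);
    if (!parsed) continue;
    if (!groups.has(parsed.pid)) groups.set(parsed.pid, []);
    groups.get(parsed.pid).push({ key, ...parsed });
  }
  return groups;
}
```

Because the window name is matched greedily up to the last underscore, a window ID such as `late_evening` is still recovered intact.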
Cautions and gotchas
- Your endpoint URL is effectively a credential. Anyone who has it can post synthetic data into your dataset. For serious studies, add a shared secret: have the study POST an ?auth=... query parameter that the script checks, and rotate it if leaked.
- Google Apps Script rate limits exist but are generous for EMA-scale traffic. Expect no issues below ~1000 participant-sessions per day. Beyond that, batch or switch to a dedicated endpoint.
- Apps Script responses don't always reach the client cleanly. The runtime treats any response.ok as success; if Apps Script happens to return a redirect to a final content URL, the fetch still counts as a success. Your row will appear in the sheet regardless. This is fine for the "did it save?" question; if you need byte-exact response parsing, use a real endpoint.
- If you change the webhook URL mid-study, participants who cached the old exported HTML will continue to post to the old URL until they clear site data or you push a new export. Make webhook changes at export boundaries, not mid-enrollment.
- Webhooks do not eliminate the manual-download safety net. By design. If connectivity fails at session end, the "Save Local Copy" button reappears and the participant can still return the file by your IRB-approved channel.
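The shared-secret advice in the first bullet can be sketched as follows. This assumes a Node-based endpoint (Cloudflare Workers, Apps Script, and PHP expose different crypto APIs), and `withAuth` and `secretMatches` are hypothetical helper names:

```javascript
const crypto = require('crypto');

// Builder side: append the shared secret to the webhook URL as ?auth=...
function withAuth(webhookUrl, secret) {
  const url = new URL(webhookUrl);
  url.searchParams.set('auth', secret);
  return url.toString();
}

// Endpoint side: compare in constant time. Hashing both values first gives
// buffers of equal length, which crypto.timingSafeEqual requires.
function secretMatches(provided, expected) {
  const a = crypto.createHash('sha256').update(String(provided ?? '')).digest();
  const b = crypto.createHash('sha256').update(String(expected)).digest();
  return crypto.timingSafeEqual(a, b);
}
```

Rotating the secret then means changing the value on the endpoint, pasting the new URL into the Webhook URL field, and re-exporting.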
Twilio Integration BETA
Beta — functional but not yet externally piloted. The dispatcher handles DST transitions, dedupes sends against a hidden log sheet, and enforces link expiry correctly. It has not yet run a full multi-participant study end-to-end. Pilot on your own phone before enrolling.
EMA Forge can now emit a ready-to-deploy Google Apps Script that dispatches scheduled SMS to participants via Twilio. It's a serverless companion to the serverless study app: the whole pipeline — scheduling, sending, retry, dedupe — lives in a spreadsheet attached to your Google account.
From the Builder's Deployment tab, alongside the existing Generate Routing CSV button, an Export Twilio Dispatcher (.gs) button produces a single .gs file pre-configured with your study's URL, study length, expiry window, and schedule. You paste it into an Apps Script project attached to a Google Sheet and it does the rest.
Setup
- In the Builder's Deployment tab, set the Hosted URL to where your exported study lives (e.g. https://your-lab.github.io/study/).
- Click Export Twilio Dispatcher (.gs). You'll get a file named like my-study-twilio-dispatcher.gs.
- Open sheets.google.com and create a blank spreadsheet. Extensions → Apps Script.
- Delete the placeholder myFunction. Paste the entire contents of the .gs file. Save.
- Close and reopen the spreadsheet. You'll see a new EMA Forge 🛠️ menu.
- Click 1. Setup Twilio & Roster. Supply your Twilio Account SID, Auth Token, and Twilio phone number (E.164 format, e.g. +15551234567). The wizard creates a Roster sheet with the schema below, plus a hidden _Dispatch_Log sheet for audit and dedupe.
- Fill in the Roster. One row per participant.
- Click 2. Start Automation (every 15 min). A time-based trigger is installed that runs dispatchPrompts every fifteen minutes.
Roster schema
| Column | What to put in it |
|---|---|
| Participant_ID | Matches the ?id= parameter in the generated links. Must be unique. |
| Phone | E.164 format (+15551234567). Numbers without the country code will be rejected by Twilio. |
| Timezone | IANA zone string (America/Los_Angeles, Europe/London, Asia/Tokyo). Not a numeric offset: DST transitions would silently break a fixed-offset scheme twice a year. |
| Start_Date | YYYY-MM-DD or MM/DD/YYYY. Interpreted as midnight in the participant's timezone. |
| Status | Active to enroll. Paused to stop sends (e.g. if a participant replies STOP; see two-way caveat below). Completed is set automatically when the last day's last window has fired. |
| Current_Day, Next_Window, Next_Ping_ISO, Last_Sent_ISO | Managed by the dispatcher. Don't write to these manually unless you're intentionally rescheduling. |
Columns are looked up by header name, not position. You can insert a Notes column or reorder existing columns without breaking anything. The menu's Setup command reconciles missing headers non-destructively on each run.
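The header-name lookup and the Phone and Timezone rules in the table can be illustrated with a short sketch. This is not the dispatcher's own code; `indexColumns`, `isE164`, `isIanaZone`, and `checkRosterRow` are hypothetical names, and the timezone check relies on whichever zone names your JavaScript engine recognises:

```javascript
// Map header names to column indexes so column order does not matter.
function indexColumns(headerRow) {
  const index = {};
  headerRow.forEach((name, i) => { index[String(name).trim()] = i; });
  return index;
}

// E.164: a plus sign, then up to 15 digits, the first of which is not zero.
function isE164(phone) {
  return /^\+[1-9]\d{1,14}$/.test(String(phone));
}

// Reject numeric offsets such as "-08:00" up front, then let Intl decide
// whether the name is a known zone (it throws RangeError if not).
function isIanaZone(tz) {
  if (typeof tz !== 'string' || tz === '' || /^[+-]/.test(tz)) return false;
  try {
    new Intl.DateTimeFormat('en-US', { timeZone: tz });
    return true;
  } catch {
    return false;
  }
}

// Return a list of problems for one Roster row (empty list means it looks fine).
function checkRosterRow(row, cols) {
  const problems = [];
  if (!String(row[cols.Participant_ID] ?? '').trim()) problems.push('Participant_ID is empty');
  if (!isE164(row[cols.Phone])) problems.push('Phone is not E.164');
  if (!isIanaZone(row[cols.Timezone])) problems.push('Timezone is not an IANA zone');
  return problems;
}
```

Running a check like this over the Roster before clicking Start Automation would catch the two mistakes the table warns about: a phone number without a country code and a numeric offset in place of a zone name.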
How dispatch actually works
Every 15 minutes the trigger runs dispatchPrompts, which for each Active participant:
- Computes the participant's current study day in their own timezone. Handles DST correctly: a spring-forward or fall-back won't shift anyone's morning ping.
- If a ping is already scheduled and its time has passed, sends it via Twilio, then clears the scheduled fields. The URL includes an &t= timestamp so the study's expiry_minutes setting is actually enforced.
- Otherwise, schedules the next window: picks the first window (sorted by end-time) whose end is still in the participant's future today, or rolls to tomorrow's first window if they've finished today. Within the window, a random minute is picked, which is the core behavior any EMA designer needs.
- Writes outcomes to the hidden _Dispatch_Log sheet with a sent:HTTPcode or fail:HTTPcode:message marker. Failures leave the ping scheduled for retry on the next tick; successes are recorded so a subsequent tick can't send a duplicate even if the sheet-write that clears Next_Ping_ISO somehow failed mid-operation.
The dedupe key is (Participant_ID, Day, Window_ID) scoped to today in the participant's timezone. A participant gets at most one SMS per window per day, regardless of what happens to the sheet cells in between.
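The steps above can be sketched in plain JavaScript. This is an illustration of the ideas (calendar-day counting in the participant's zone, a random minute inside a window, and a dedupe key), not the generated dispatcher's code. The `{ startMinute, endMinute }` window shape, the choice of the start date as day 1, and all function names are assumptions:

```javascript
// Calendar date (YYYY-MM-DD) of an instant as seen in an IANA timezone.
function localDate(instant, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone, year: 'numeric', month: '2-digit', day: '2-digit',
  }).formatToParts(instant);
  const get = (type) => parts.find((p) => p.type === type).value;
  return `${get('year')}-${get('month')}-${get('day')}`;
}

// Study day counted in calendar days, so a 23- or 25-hour DST day still
// counts as exactly one day. Assumption: the start date itself is day 1.
function studyDay(startDate, instant, timeZone) {
  const today = localDate(instant, timeZone);
  const elapsedMs =
    Date.parse(`${today}T00:00:00Z`) - Date.parse(`${startDate}T00:00:00Z`);
  return Math.round(elapsedMs / 86400000) + 1;
}

// Uniform random minute inside a window given as minutes after local midnight.
function pickMinute(win, rng = Math.random) {
  const span = win.endMinute - win.startMinute;
  return win.startMinute + Math.floor(rng() * span);
}

// One key per participant, study day, window, and local calendar date.
function dedupeKey(participantId, day, windowId, localDateString) {
  return [participantId, day, windowId, localDateString].join('|');
}
```

Comparing calendar dates rather than dividing elapsed milliseconds by 24 hours is what keeps the day count stable across a DST transition. Passing `rng` as a parameter makes the random pick reproducible in tests.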
Message format
The outbound SMS is templated from your study name and expiry window:
StudyName: time for your check-in. Expires in 60 min. https://.../?id=104&day=2&session=morning&t=1732812000000 Reply STOP to opt out.
The Reply STOP to opt out line is not optional. US A2P 10DLC and Twilio's own messaging policy require clear opt-out language in programmatic SMS. Your IRB will also want it.
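As an illustration of the link in that message, the sketch below builds a URL with the same four query parameters and shows one way an expiry check against `t` could work. How the exported study actually enforces `expiry_minutes` is not shown in this document, so `isExpired` is an assumption, and both function names are hypothetical:

```javascript
// Build a prompt link with the id, day, session, and t parameters
// seen in the example message.
function buildPromptLink(hostedUrl, { id, day, session, sentAt }) {
  const url = new URL(hostedUrl);
  url.searchParams.set('id', String(id));
  url.searchParams.set('day', String(day));
  url.searchParams.set('session', session);
  url.searchParams.set('t', String(sentAt)); // send time in epoch milliseconds
  return url.toString();
}

// Treat a link as expired when t is missing, not a number, or older than
// the expiry window.
function isExpired(link, expiryMinutes, now = Date.now()) {
  const raw = new URL(link).searchParams.get('t');
  if (raw === null) return true;
  const sentAt = Number(raw);
  if (!Number.isFinite(sentAt)) return true;
  return now - sentAt > expiryMinutes * 60000;
}
```

With a 60-minute window, a link opened 59 minutes after `t` would still be accepted and one opened 61 minutes after would not.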
Menu reference
| Menu item | What it does |
|---|---|
| 1. Setup Twilio & Roster | Stores credentials, creates/reconciles the Roster and _Dispatch_Log sheets. Safe to re-run; it won't clobber existing data. |
| 2. Start Automation (every 15 min) | Installs the time trigger. Idempotent — running it twice doesn't double the trigger rate. |
| 3. Pause Automation | Removes the dispatcher's triggers. The Roster is untouched; you can resume later by re-running Start. |
| Send test message to row 2 | One-shot send to whatever phone is in row 2 of the Roster. Use this to verify credentials end-to-end before enrolling. |
Known beta caveats
- Two-way SMS is not wired. Participants who reply STOP will stop receiving messages from Twilio (Twilio handles the opt-out carrier-side automatically), but the dispatcher won't know and will keep trying. You'll see the failures accumulate in _Dispatch_Log. The manual workaround: set that participant's Status to Paused.
- Credentials are "script-private," not "secure." Anyone with edit access to the Apps Script project can read them back with two lines of code. For the typical single-PI setup this is fine; for a shared lab Sheet, restrict Apps Script edit access explicitly.
- Delivery-receipt latency is not captured. The dispatcher records when Twilio accepted the request, not when the carrier delivered the SMS. True ping-to-open latency still needs a separate Twilio status-callback integration (planned).
- Retry back-off is coarse. A transient Twilio failure retries on the next 15-minute tick. If Twilio is down for an hour you'll accumulate four attempts per participant before success. This is fine for Twilio's actual reliability profile but worth knowing.
- Apps Script has a 6-minute execution limit per trigger run. At 15-minute ticks and typical send rates this is never an issue, but if you're running more than a few hundred participants in one dispatcher, split them across multiple sheets.
If you pilot this, please send a note about what broke. The beta caveats above are the ones I already know about; you'll find the ones I don't.
Under the Hood
You do not need any of this to run a study. It's here for the small subset of users who want to audit the code, contribute a new task module, or fork the project for a deeply custom use case.
Repository layout
The repository is split into three concerns: (1) what the researcher uses (Builder + Dashboard), (2) what gets compiled into the participant's study (templates/), and (3) shared styling.
ema-forge/
├── index.html            # Landing page
├── builder.html          # Study authoring environment
├── dashboard.html        # Local analysis dashboard
├── readme.html           # This file
├── ppgtester.html        # Standalone PPG pipeline tester
│
├── css/
│   └── studio.css        # Shared design tokens + component styles
│
├── js/
│   ├── state.js          # Central state + schema (SCHEMA_VERSION)
│   ├── storage.js        # LocalStorage + project import/export + migrations
│   ├── export.js         # Compiles study to HTML / zip bundle
│   ├── preview.js        # Live iframe preview
│   │
│   ├── tabs/             # Builder tab controllers
│   │   ├── study.js
│   │   ├── onboarding.js
│   │   ├── questions.js
│   │   ├── schedule.js   # Windows + phase_sequence editor
│   │   ├── tasks.js      # Pluggable module registry
│   │   └── deployment.js # Routing CSV generator
│   │
│   └── dashboard/
│       ├── parser.js     # Ingests participant JSON folder
│       └── dashboard.js  # Charts, filters, CSV export
│
└── templates/            # Source files stitched into the exported study
    ├── epat-core.js      # PPG pipeline + BeatDetector
    ├── study-base.js     # Runtime skeleton
    ├── module-onboarding.js
    ├── module-ema.js     # EMA block + inline HR capture
    └── module-epat.js    # ePAT task
Key boundary: js/ powers the Builder/Dashboard (what the researcher interacts with). templates/ is the code that gets stitched into the participant's study by js/export.js. Never edit templates hoping to change Builder behavior, or vice versa.
Adding a new task module
The task registry in js/tabs/tasks.js exposes a SETTINGS_RENDERERS object. Each entry is { html(mod), bind(card, mod) }: html() returns the settings-panel HTML for the Builder, bind() attaches event listeners. To register a module:
- Add an entry to state.modules in js/state.js with id, label, desc, enabled, and a settings object.
- Add a matching renderer in SETTINGS_RENDERERS in js/tabs/tasks.js.
- Create a runtime module file in templates/module-<id>.js that defines a lifecycle (start, teardown, getData) and push it into the export pipeline in js/export.js.
ePAT is the reference implementation; use it as a template.
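To make the three registration steps concrete, here is a minimal sketch for an invented "breathing" module. Only the shapes named above come from this document (`id`, `label`, `desc`, `enabled`, `settings`; `html(mod)` and `bind(card, mod)`; `start`, `teardown`, `getData`). The module itself, the `durationSec` setting, the factory function, and the lifecycle method arguments are all assumptions, so check module-epat.js for the real conventions:

```javascript
// Step 1 (js/state.js): hypothetical entry for state.modules.
const breathingModule = {
  id: 'breathing',
  label: 'Paced breathing',
  desc: 'Hypothetical example task',
  enabled: false,
  settings: { durationSec: 60 },
};

// Step 2 (js/tabs/tasks.js): hypothetical entry for SETTINGS_RENDERERS.
const breathingRenderer = {
  // Returns the settings-panel HTML for the Builder.
  html(mod) {
    return `<label>Duration (s) <input type="number" name="durationSec" value="${mod.settings.durationSec}"></label>`;
  },
  // Attaches listeners that write edits back into the module's settings.
  bind(card, mod) {
    const input = card.querySelector('input[name="durationSec"]');
    input.addEventListener('change', () => {
      mod.settings.durationSec = Number(input.value);
    });
  },
};

// Step 3 (templates/module-breathing.js): hypothetical runtime lifecycle.
function createBreathingRuntime(settings) {
  let startedAt = null;
  let endedAt = null;
  return {
    start() { startedAt = Date.now(); },
    teardown() { endedAt = Date.now(); },
    getData() {
      return { durationSec: settings.durationSec, startedAt, endedAt };
    },
  };
}
```

The renderer only touches the `card` element and the `mod` object it is given, which keeps Builder-side code (js/) separate from the runtime code that ends up in the exported study (templates/).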
Self-hosting the Builder (not recommended)
The Builder is hosted at emaforge.keeganwhitacre.com and that is the recommended way to use it. The hosted version is kept on the current schema and is always up to date; self-hosting introduces version-skew risk (see Schema Versioning) that can corrupt saved projects across researchers working on the same study.
That said, if you need to self-host — institutional policy forbids external tools, you want to run an older schema version indefinitely, or you're developing against the code — clone the repo and serve it with any static-file server:
git clone https://github.com/keeganwhitacre/emaforge.git
cd emaforge
python -m http.server 8000
# then open http://localhost:8000
Camera access requires a secure context: http://localhost works while you develop, but for HR/ePAT work on any other host you'll need HTTPS with a real certificate. file:// and plain HTTP over non-localhost will both fail.
Credit & Citation
EMA Forge was created by Keegan Whitacre at the Affective Science Lab, The Ohio State University. If you use EMA Forge in a published study, please cite:
Whitacre, K. (2026). EMA Forge: A serverless toolkit for
ecological momentary assessment and digital phenotyping.
Affective Science Lab, The Ohio State University.
https://github.com/keeganwhitacre/emaforge
For ePAT-specific methods, also describe the implementation (PPG pipeline + AudioContext-scheduled stimulus + dial-alignment response) in your Methods section. Issues, pull requests, and replication reports are welcome.
License
EMA Forge is released under the MIT License: free for academic and commercial use, modification, and redistribution. See LICENSE in the repository for the full text.
Acknowledgments & Prior Work
EMA Forge stands on a substantial body of prior work. The ePAT task in particular is not a novel invention but an ecological adaptation of an established psychophysical paradigm. Proper attribution here matters both for scholarship and in practice: if you publish using these modules, these are the citations your Methods section owes.
The Phase Adjustment Task (PAT) lineage
The core psychophysical logic of the ePAT — aligning an auditory tone to felt heartbeat via a continuous-adjustment response — is adapted from the Phase Adjustment Task developed by Plans, Ponzo, Morelli, Cairo, Ring, Keating, Cunningham, Catmur, Murphy & Bird, with subsequent refinements in PAT 2.0.
- Plans, D., Ponzo, S., Morelli, D., Cairo, M., Ring, C., Keating, C. T., Cunningham, A. C., Catmur, C., Murphy, J., & Bird, G. (2021). Measuring interoception: The phase adjustment task. Biological Psychology, 165, 108171.
- Palmer, C., Murphy, J., Bird, G., et al. (2025). Refinements of the Phase Adjustment Task (PAT 2.0). Preprint. doi:10.31219/osf.io/4qtwv.
- Original reference implementation (Swift/iOS): huma-engineering/Phase-Adjustment-Task.
The ePAT's contribution is specifically the ecological framing: porting the paradigm to a participant-owned smartphone, using camera PPG rather than a dedicated pulse oximeter, scheduling it within an EMA protocol rather than as a discrete lab session, and treating the resulting phase-offset trajectory as a time-varying individual-difference signal rather than a single-point measurement.
Beat detection — WABP
The PPG peak-detection routine is a JavaScript port of the WABP (Waveform Analysis for Blood Pressure) algorithm, originally developed for arterial blood pressure onset detection and released on PhysioNet. The algorithm generalizes well from arterial pressure to PPG because both signals share the characteristic upstroke-dominant morphology the algorithm was designed around.
- Zong, W., Heldt, T., Moody, G. B., & Mark, R. G. (2003). An open-source algorithm to detect onset of arterial blood pressure pulses. Computers in Cardiology, 30, 259–262.
- Reference C implementation: PhysioNet wabp.c.
Camera-based PPG acquisition
The browser-side camera acquisition approach — sampling the red channel of a rear-camera video stream under torch illumination — was informed by Richard Moore's open-source demonstrator, which established the feasibility of the pattern in a pure web environment.
- Moore, R. heart-rate-monitor. github.com/richrd/heart-rate-monitor.
Conceptual framework
The decision to operationalize interoceptive accuracy as an ecological, time-varying construct — and to embed it within a broader affective-science EMA protocol — is grounded in the Theory of Constructed Emotion and contemporary work on interoceptive predictive processing. The ePAT is one instrument within that larger program; it is not itself a complete theory.
If you use the ePAT in a published study, please cite the PAT and PAT 2.0 papers alongside EMA Forge. The ePAT is an implementation and ecological adaptation, not an independent paradigm — its validity inherits from that lineage.
Maintained by Keegan Whitacre, OSU Affective Science Lab.