Does HealFrame replace therapy?

No. It's a tool that supports stage-appropriate journaling — it does not replace therapy or diagnosis. Keeping that line bright is part of the design principles this post describes.

How is the crisis-detection pipeline validated?

With an LLM-judge eval harness run against a zero-tolerance bar for missed crisis signals (false negatives) on the eval set — a passing criterion, not a guarantee of zero misses in the wild. For the gaps the evals can't reach, the input stage fails closed: when classification is uncertain, it treats the entry as a crisis.

Development

Should Someone Like Me Be Building a Mental Health App?

이시형6/10/20264 min read

한국어로 읽기

In 2026 I borrowed Judith Herman's Trauma and Recovery from the library. I've long been interested in counseling and psychoanalysis, and there were personal reasons the model didn't read as foreign to me. I'll leave that part there.

What stayed with me was the structure: recovery moves through three stages — establishing safety, remembrance and mourning, reconnection. Reading it, I kept thinking the structure could be the skeleton of a writing tool. If what you should be writing differs by which stage you're in, then a tool could hand you prompts that match your stage. That's how HealFrame started.

The heaviest feature is the least visible one

In a mental health app where AI guides your writing, the technically heaviest part isn't clever prompts. It's reading crisis signals in what the user writes.

HealFrame runs a Gemini-based crisis-detection pipeline that classifies each entry as GREEN, AMBER, or RED. The most important design decision was asymmetry. Input classification fails closed — if the classification is uncertain or the system is degraded, it errs toward the safe interpretation and treats the entry as a crisis. Output fails open — a misbehaving safety check is not allowed to block the response a user needs from reaching them. The cost of missing a crisis and the cost of over-flagging are not symmetric, so the system's failure directions shouldn't be either.

Validation follows the same logic. I built an LLM-judge eval harness that repeatedly tests the pipeline against one non-negotiable bar: zero missed crisis signals on the eval set. Every other metric is up for discussion; that tolerance isn't. I want to be precise about what that bar is, though — it's a passing criterion for my eval set, not proof of zero misses in the wild, and an LLM judging an LLM carries an obvious circularity: the judge itself can be wrong. That's exactly why the asymmetry above matters — for whatever the evals don't reach, the fail-closed input layer that treats uncertainty as crisis is the last layer underneath.

But does it actually work?

That was the engineering part. Here's the honest part.

I'm not the only one building crisis detection. Plenty of AI companies are building it, with far more people and far more data. And still, out in the world, people keep attempting and completing suicide. Between "my eval harness passes" and "this system works on the darkest night of a real person's life" there is a distance I cannot prove across.

And underneath that sits a more basic question. I'm not a PhD. I'm not a clinician. I'm a developer who read a book, cares about this, and has some lived experience. Is someone like that allowed to build an app that touches people's minds? That question never went away while I was building. It still hasn't.

Lines instead of answers

I haven't answered the question. What I have instead is a set of lines I hold while building.

First, this app is not therapy, and it never claims to be — it's a tool that supports stage-appropriate writing, and that's where it stops. Second, the most dangerous failure (a missed crisis) gets a zero-tolerance bar, enforced by an eval harness rather than by good intentions. Third, I try not to forget that I don't know. I've come to see the anxiety of not being an expert less as something to get rid of and more as a safety mechanism — the thing that keeps me building carefully in a domain that demands it.

I still have no certainty that I should be building this. But lately I've come to think that, in a domain like this, building carefully without certainty may be safer than building with it.

If you're going through a hard time right now, you don't have to carry it alone. In the US, call or text 988 (Suicide & Crisis Lifeline). Elsewhere, findahelpline.com lists free, confidential services by country.

#healframe #mental-health #ai-safety #gemini

I Built an App to Fix My Hair-Pulling Habit (Not Fixed Yet)

6/10/2026

My Home Server Is a Laptop

7/17/2026

I'm Building a Sobriety App That Donates My Money If I Fail

6/10/2026

Comments

The heaviest feature is the least visible one

In a mental health app where AI guides your writing, the technically heaviest part isn't clever prompts. It's reading crisis signals in what the user writes.

But does it actually work?

That was the engineering part. Here's the honest part.

Lines instead of answers

I haven't answered the question. What I have instead is a set of lines I hold while building.

I still have no certainty that I should be building this. But lately I've come to think that, in a domain like this, building carefully without certainty may be safer than building with it.

If you're going through a hard time right now, you don't have to carry it alone. In the US, call or text 988 (Suicide & Crisis Lifeline). Elsewhere, findahelpline.com lists free, confidential services by country.

Should Someone Like Me Be Building a Mental Health App?

The heaviest feature is the least visible one

But does it actually work?

Lines instead of answers

Related Posts

I Built an App to Fix My Hair-Pulling Habit (Not Fixed Yet)

My Home Server Is a Laptop

I'm Building a Sobriety App That Donates My Money If I Fail

Comments

Should Someone Like Me Be Building a Mental Health App?

The heaviest feature is the least visible one

But does it actually work?

Lines instead of answers

Related Posts

I Built an App to Fix My Hair-Pulling Habit (Not Fixed Yet)

My Home Server Is a Laptop

I'm Building a Sobriety App That Donates My Money If I Fail

Comments