The Question Nobody Is Measuring
Last March, my three-year-old daughter asked me to read her the same book for the fourth time in a row. I was sitting on the floor of her room, back against the bed frame, the glow of a hallway nightlight cutting across the carpet. I said yes without checking my phone. That might sound unremarkable. Six months earlier, it would have been impossible. Six months earlier, I would have been at the kitchen table finishing clinic notes at 9 PM, my laptop screen reflecting in the dark window while my kids slept down the hall.
That distinction matters. The technology conversation around AI is almost entirely about doing more -- more output, more efficiency, more throughput. Every article I read about AI in medicine frames it through productivity metrics: notes per hour, documentation time reduced, patients per panel. I want to make the opposite case. The most important thing AI gave me was the ability to do less. And I have been tracking the results the way I would track any intervention: with specificity, over time, measuring outcomes that actually matter.
The Physician Time Tax
To understand what changed, you need to understand what physician schedules actually look like. Arndt and colleagues published a time-motion study in the Annals of Family Medicine (2017) that quantified something every doctor already feels: primary care physicians spend nearly 6 hours per day on EHR documentation. An earlier time-motion study by Sinsky and colleagues in the Annals of Internal Medicine (2016) put it as a ratio: for every 1 hour of direct patient care, physicians spend nearly 2 additional hours on electronic records and desk work.
Then there is what the literature calls "pajama time" -- the documentation that follows you home. Physicians average 1.5 hours per day completing EHR tasks outside of scheduled clinic hours. On days off, that number actually rises to 2.8 hours. I know those numbers are real because I lived them. Sundays meant charting. Tuesday evenings meant finishing notes from a Monday surgical day. My five-year-old son once asked my wife why Daddy was always "working on his computer at the table." She did not have a good answer. Neither did I. That was October 2025. I remember the month because it was the last time he asked. He stopped asking because the answer changed.
The downstream effects are well documented. Shanafelt and colleagues tracked physician burnout over a decade: 48.2% in 2023, declining to 41.9% by 2025, but still staggering for any profession. The Medscape 2024 survey found that 62% of burned-out physicians identified administrative burden as the primary driver -- not clinical complexity, not difficult patients, but paperwork. Dyrbye and Shanafelt's work in BMC Medicine (2016) connected this directly to family life: only about half of physicians reported being "extremely" or "very" satisfied with their marriages, and marital satisfaction correlated inversely with work-related stress.
I used to think the solution was discipline. Wake up earlier. Be more efficient. Protect family time through sheer willpower. That framing was wrong. You cannot willpower your way out of a systemic time deficit. And efficiency, as a value, has a ceiling. At some point you are not being efficient; you are just being fast at something that should not exist in the first place.
What Actually Changed
I started using ChatGPT in late 2025 the way most physicians do -- cautiously, for low-stakes administrative tasks. Drafting professional correspondence. Restructuring policy documents. Condensing 40-page board packets into briefing notes. Nothing clinical. Nothing that required my medical judgment to generate from scratch.
The results were immediate and measurable. Tasks that previously consumed an entire evening -- a letter of recommendation, a committee report, a response to an institutional inquiry -- dropped to 20 or 30 minutes of revision. I estimated roughly four to five hours per week reclaimed from administrative text work alone.
The data on this is emerging but consistent. Kaiser Permanente's deployment of ambient AI scribes across 7,260 physicians and 2.58 million clinical encounters showed savings of approximately 15,791 hours of documentation time (Kaiser Permanente, 2025). Divided out, that is about 2.2 hours per participating physician over the deployment, or roughly 22 seconds per encounter: a small gain per visit that adds up at scale. A study published in JMIR (2025) found that ChatGPT reduced documentation time by approximately 40% for complex supervisory notes. These are not theoretical projections. They are measured outcomes from real clinical environments.
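For readers who want to check the scale of the Kaiser totals, the per-physician and per-encounter rates follow from simple division. The sketch below is only a back-of-the-envelope calculation using the three figures quoted above; the function name is my own, and because the length of the deployment is not stated here, it deliberately does not compute a per-day rate.

```python
def per_unit_savings(hours_saved, physicians, encounters):
    """Divide a total time saving across physicians and encounters.

    Returns a dict of derived rates. This is plain arithmetic on the
    inputs, not a reconstruction of how the original report was analyzed.
    """
    return {
        "hours_per_physician": hours_saved / physicians,
        "seconds_per_encounter": hours_saved * 3600 / encounters,
        "encounters_per_physician": encounters / physicians,
    }


# Totals as quoted in the text (Kaiser Permanente, 2025).
rates = per_unit_savings(hours_saved=15_791, physicians=7_260, encounters=2_580_000)

print(f"Hours saved per physician (whole deployment): {rates['hours_per_physician']:.1f}")
print(f"Seconds saved per encounter: {rates['seconds_per_encounter']:.0f}")
print(f"Encounters per physician: {rates['encounters_per_physician']:.0f}")
```

Averages like these hide wide variation between heavy and occasional users, so they are best read as an order-of-magnitude check rather than a prediction for any one physician.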
But here is what none of the efficiency literature captures, and what I think is the actual story: the moment I realized I had not opened my laptop after dinner in three consecutive weeks. My wife mentioned it first. It was a Thursday in January 2026. I was on the living room floor helping our son sort LEGOs by color. She walked through the room, paused, and said, "You're actually here." She did not mean it as a compliment on my productivity. She meant I was physically and mentally present in a way I had not been in months. That sentence landed differently than any time-savings metric. It told me something the Kaiser data cannot: the unit of measurement for AI's value is not hours reclaimed. It is what fills those hours afterward.
The Anti-Productivity Thesis
Most of the AI discourse in healthcare frames these tools through the lens of output: see more patients, process more claims, generate more documentation. I want to explicitly argue against that framing, because I think it represents a fundamental misunderstanding of what these tools are actually worth.
If you use AI to reclaim five hours and then fill those five hours with more work, you have gained nothing. You have just compressed the same burnout into a tighter loop. The history of health IT is littered with tools that promised efficiency and delivered the opposite -- EHRs being the most obvious example. Wachter's The Digital Doctor (2015) documented exactly this pattern: technologies introduced to save time were captured by institutions and converted into additional throughput demands. The physician who uses ChatGPT to write notes faster and then takes on a heavier patient panel has not solved the problem. They have accelerated toward it.
We demand evidence for medical interventions. We run randomized controlled trials. We insist on outcome data before changing practice. But the central promise of AI -- "it will give you back your time" -- gets a pass. Nobody asks for the outcome data. Nobody measures what happens downstream of the saved hours. Did the physician sleep more? See their children more? Experience less burnout? Stay in the profession longer? Those are the outcomes that matter, and they are almost entirely absent from the literature.
I made a deliberate decision: the time AI gave back would go to my family, not to my employer. That meant leaving the office when the clinical work was done rather than staying to get ahead on administrative tasks. It meant closing the laptop when the kids came home from daycare. It meant choosing presence over productivity, which is a harder choice than it sounds when your entire professional formation taught you that more hours equals more commitment equals more value. I had to unlearn the equation. The AI did not teach me that. It just created the space where unlearning was possible.
Eleven Months of Data
Because I think about this the way I think about any clinical question -- show me the evidence -- I started tracking. Not formally, not with validated instruments, but consistently. Starting in January 2026, I logged three things each week: how many weekday evenings I was fully present with my family from dinner through bedtime, whether I opened my laptop after the kids came home, and one qualitative note about a specific family moment.
Before AI tools, the answer to the first question was typically two, maybe three evenings per week. The other evenings involved at least an hour of documentation or correspondence, usually after the kids were asleep but sometimes while they were still awake. By February, it was consistently all five weekday evenings. By April, the same held through the weekend in some weeks: seven evenings out of seven.
The qualitative notes are what I actually care about. February 8: my son asked me to help him build a "rocket ship" out of couch cushions, and I did, for forty-five minutes, without once thinking about the committee report I would normally have been writing. March 14: my daughter fell asleep on my chest on the couch after dinner. I sat there for an hour, not moving, listening to her breathing slow against the hum of the dishwasher. Before AI tools, that hour would have been documentation time. I would have moved her to bed and opened my laptop. Instead, I sat there until my wife came in and took a photo. I still have it.
April 23: my son's preschool had a "bring your parent to class" morning. I went. A year earlier, I would have declined because Wednesday mornings were chart-catch-up time. I watched him show me the block tower he had been building all week. He introduced me to his friends by saying, "This is my dad, he's a doctor, he helps people." He did not say, "This is my dad, he's always on his computer." That shift -- the way my children describe me to other people -- is an outcome metric that no AI benchmark captures.
Last Wednesday, my daughter climbed into my lap while I was sitting on the couch after dinner -- not working, just sitting -- and said, "Daddy, you're not busy." She said it with genuine surprise. That sentence will stay with me longer than any efficiency metric. She is three. She should not be surprised that her father is available. The fact that she was tells me something about what the previous baseline looked like from her height.
What I Will Not Use AI For
Boundaries matter here. I do not use ChatGPT to write clinical notes that bear my signature without thorough revision. I do not use it for medical decision-making. I do not use it to draft anything where the recipient deserves my unmediated voice -- condolence letters to patients' families, personal recommendations for colleagues I know well, communication with my own children's school.
There is a category of work that should remain slow because the slowness is the point. Writing a heartfelt letter takes time precisely because the time is what makes it heartfelt. AI is excellent at eliminating friction from tasks where friction adds no value. It has no place in the tasks where friction is the value.
I apply the same logic to family time. I do not use AI to "optimize" parenting -- no ChatGPT-generated activity schedules, no AI-curated educational content, no algorithmic suggestions for how to spend Saturday morning. The whole point of reclaiming those hours is to fill them with something unstructured, unoptimized, and fully human. My daughter does not need an optimized bedtime routine. She needs me on the floor, reading the same book again, present enough to do the voices differently each time.
The Outcome Question
The researchers who study physician well-being measure burnout in validated scales and survey instruments. I measure it differently now. I measure it in the sound of my kids laughing in the next room while I am actually in the next room, not absent behind a screen. I measure it in the weight of a three-year-old falling asleep on my chest, in my son's unsurprised expectation that I will be at his school event, in my wife's observation that I am "actually here" -- delivered not as praise but as a factual description of something that used to be rare.
Dyrbye and Shanafelt found that only 50% of physicians report high marital satisfaction. I do not know where I would have scored on that scale eighteen months ago. I know where I would score now. I know that my wife and I have had more uninterrupted conversations in the past eleven months than in the two years before. I know that the quality of those conversations changed when I stopped arriving at them pre-depleted by two hours of evening documentation. That is not a controlled experiment. It is a case study with an n of one. But if the AI industry is going to claim that these tools improve lives, someone should be measuring whether that is true at the level where life actually happens -- not in hours saved, but in what those hours became.
The models will keep improving. The capabilities will keep expanding. The question that matters is not what AI can do. It is what you will do with the hours it returns. I know my answer. I will be on the floor, reading the same book for the fourth time, saying yes again. Not because it is productive. Because it is the point.
Frequently Asked Questions
How much time can physicians realistically save using AI tools like ChatGPT?
Current evidence suggests meaningful savings. Kaiser Permanente's 2025 ambient AI deployment across 7,260 physicians saved approximately 15,791 hours of documentation time over 2.58 million encounters, which averages out to about 22 seconds per encounter. A JMIR (2025) study found ChatGPT reduced complex documentation time by roughly 40%. In my own experience, administrative text work dropped from six-plus hours weekly to about two hours -- a net gain of four to five hours that I redirect entirely to family time rather than additional work.
Does using AI for administrative tasks actually reduce physician burnout, or just shift it?
The outcome depends entirely on what physicians do with reclaimed time. The Medscape 2024 burnout survey found 62% of physicians identify administrative burden as their primary burnout driver, so reducing that burden should help. But if reclaimed hours are filled with more clinical volume, the benefit disappears. Shanafelt's longitudinal data shows burnout declining from 48.2% (2023) to 41.9% (2025), coinciding with broader AI adoption, though causation is difficult to isolate. The critical variable is whether physicians and their employers protect the time AI frees -- and whether we start measuring downstream human outcomes, not just hours saved.
What tasks should physicians avoid delegating to AI?
Any task where the physician's unmediated judgment, voice, or emotional investment is the core value. That includes clinical decision-making, any document the physician signs or attests to unless it has been thoroughly reviewed first, and communications where the effort of writing is itself meaningful -- condolence letters, personal references, sensitive patient communications. AI excels at eliminating low-value friction from administrative scaffolding. It should not replace the work where slowness and personal attention are the point.
How does AI-assisted time recovery affect physician family relationships specifically?
Research by Dyrbye and Shanafelt in BMC Medicine (2016) found that only about 50% of physicians report high marital satisfaction, with lower satisfaction tracking higher work-related stress. The mechanism is straightforward: administrative "pajama time" -- averaging 1.5 hours daily after clinic hours -- directly competes with family presence. In my case, eliminating most evening documentation shifted my fully present weekday evenings from two or three per week to consistently all five, with some weeks laptop-free through the weekend as well. I tracked this over eleven months. The change was visible enough that both my wife and my children noticed independently. My son, who once asked why I was always "working on his computer at the table," now introduces me at school as a doctor who helps people, without any reference to absence.