This morning I was sitting in the opening session of a homeschooling conference. I was taking notes on my laptop in Obsidian, and I wrote the first Markdown checkbox of the day to create a to-do for something the speaker had just said.
As soon as I did that, I realized I was probably not going to come back and do it. I was probably going to forget about it.
So I opened Discord, gave Hermes a prompt, and recorded the to-do in Todoist instead of burying it in the Obsidian file.
That sounds like the beginning of a distraction.
It was the opposite.
I came back to that same thread a few times during the day. Probably less than five minutes total, over a spotty internet connection, while I was trying to listen, take notes, and not get pulled out of the room mentally.
The first version was simple: I had a homeschool Todoist project, I was taking conference notes in Obsidian, and I wanted Hermes to watch the tasks I was throwing in there. If I had a question to research later, I did not want to bury it in a note and hope I remembered to pull it out that night.
So I prompted Hermes from Discord and asked it to monitor the project.
Later, with a couple of follow-up prompts, I asked it to start looking into the audio processing side too: watch for recordings, transcribe them, and write the results back into Obsidian as derivative notes.
That was basically it.
The normal failure mode
My normal conference failure mode is not lack of interest. It is capture debt.
I hear something useful. I write half a sentence. I add a task. I record a session. I think, “I’ll clean this up later.”
Then later comes, and now I have to do all the work again:
- pull to-dos out of notes
- remember why I cared
- research the thing I meant to research
- split one vague task into five real follow-ups
- find the recording
- transcribe it
- put the transcript somewhere useful
The annoying thing is that none of those steps is individually hard. They are just enough friction that the work starts leaking.
This setup was different because the follow-up loop started while the sessions were still happening.
That meant I did not have to choose between being present now and doing something useful later.
What I actually asked for
The setup started as a Todoist watcher.
Every 10 minutes, Hermes checks the homeschool Todoist project, looks for tasks I added on the fly, reads the live conference notes in Obsidian, and tries to do the next helpful thing. Sometimes that means researching the question and adding the answer back to the Todoist description. Sometimes that means creating subtasks. Sometimes that means flagging something that needs my judgment instead of pretending it can decide.
The second piece was the audio pipeline.
That one is more mechanical. It scans the same Todoist project plus recent conference notes for likely recording links. If it finds one, it downloads or copies the audio, compresses it with ffmpeg to a mono, 16 kHz, 48 kbps MP3, sends it to the transcription service running on my M5 Mac, and writes a derivative note back into Obsidian.
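A minimal sketch of that compression step, assuming ffmpeg is installed and on the PATH. The function names and file paths are illustrative, not the actual script; the ffmpeg flags are the standard ones for channel count, sample rate, and audio bitrate.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: Path, dst: Path) -> list[str]:
    """Return the ffmpeg argument list for a small, speech-friendly MP3."""
    return [
        "ffmpeg",
        "-y",             # overwrite dst if it already exists
        "-i", str(src),   # input recording
        "-vn",            # ignore any video stream
        "-ac", "1",       # mono
        "-ar", "16000",   # 16 kHz sample rate
        "-b:a", "48k",    # 48 kbps audio bitrate
        str(dst),         # the .mp3 extension selects the MP3 container
    ]


def compress_for_transcription(src: Path, dst: Path) -> Path:
    """Run ffmpeg and return the output path; raises CalledProcessError on failure."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Keeping the command builder separate from the call that runs it makes the flags easy to check without touching a real audio file.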
If it finds nothing, it says nothing.
That last part matters. A background system that tells me every 10 minutes that nothing happened is not an assistant. It is a toddler.
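Here is a rough sketch of the "find a likely recording, otherwise stay quiet" behavior. The extension list, the URL pattern, and both function names are assumptions for illustration; the real check may be looser or stricter.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Hypothetical list of extensions treated as audio.
AUDIO_EXTENSIONS = (".mp3", ".m4a", ".wav", ".ogg", ".flac")
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def find_audio_links(text: str) -> list[str]:
    """Return URLs whose path ends in an audio extension, in order, without repeats."""
    found: list[str] = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;")  # drop sentence punctuation stuck to the link
        path = urlparse(url).path.lower()
        if path.endswith(AUDIO_EXTENSIONS) and url not in found:
            found.append(url)
    return found


def summarize_run(text: str) -> Optional[str]:
    """Return a message only when there is something to report; None means stay silent."""
    links = find_audio_links(text)
    if not links:
        return None
    return f"Found {len(links)} recording link(s): " + ", ".join(links)
```

The caller only posts to Discord when `summarize_run` returns a string, so a run with nothing to do produces no message at all.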
The actual value was not Swamp by itself
This is not really a post about a Swamp revelation.
Swamp mattered. Todoist mattered. Obsidian mattered. The M5 Mac transcription service mattered. Tailscale mattered. Discord mattered.
But the real value was that the toolkit was already there.
Hermes was running at home in my homelab, behind Tailscale. My laptop at the conference did not need to do the heavy lifting. It was just the machine I was using to record, take notes, and send quick Discord messages. The compute, API calls, vault access, Todoist access, and transcription handoff could happen somewhere else.
That meant a spotty conference internet connection was enough.
I did not need to sit there and build a system from scratch. I did not need to open five dashboards. I did not need to stop listening so I could organize my thoughts perfectly.
I could just fire off: look into this, track that, process these recordings, keep an eye on this project.
And then go back to listening.
The homeschool app that appeared in the middle of the day
The other thing that happened in that first session was less polished, but maybe more interesting.
I had a practical homeschool problem: attendance, subject tracking, and the little bits of educational evidence that are easy to intend to record and hard to actually record. So I asked Hermes to research whether there was already a good homelab-friendly Docker container for this kind of thing.
It went looking.
Then it started building the work plan. Not in a separate “AI project management” tool. In Todoist, where I already live.
It created prompts for Claude Code as Todoist tasks. Those tasks became the handoff point for setting things up on the homeschool computer over SSH and tmux.
That sounds like a lot of machinery when I write it out.
But in the moment it was mostly me jotting down thoughts between conference notes.
I was listening to the session. I was transcribing in the background. I was taking notes on my own reactions instead of trying to capture every sentence from the stage. And every once in a while I would dump a small instruction into Hermes:
Research this.
Turn it into tasks.
Set it up over there.
Build the model around it.
By the end of that first session, I had the beginning of a Docker-hosted homeschool tracking app, plus Swamp models to support agentic recording of attendance and subject study time.
Not because I stopped paying attention to the conference.
Because I had a background system that could catch the thread when I handed it off.
And honestly, that is the part that smells like the future to me. Not “AI writes everything.” More like: my wife should eventually be able to talk into a simple wrapper app and say, “We did math for forty-five minutes, read two chapters, and went outside for nature study,” and the system should know how to turn that into the right records.
Not a dashboard she has to maintain.
A household memory layer that accepts normal human input.
The question I could ask because the notes already existed
There was another moment during a talk where this clicked for me.
The speaker mentioned math, and I had a very specific question:
What age did Joshua Sheats recommend starting formal math?
That is not really a Google question for me anymore.
A lot of my Snipd podcast highlights end up in my Obsidian vault. Radical Personal Finance has been influential enough in our homeschool thinking that I have already had Hermes crawl those notes, index Joshua’s homeschool series, and turn parts of his philosophy into reusable skills.
So during the talk, I asked Hermes.
A few seconds later I had the answer, the context around it, and a link back to the original episode note. Then I asked Hermes to add it as a callout in the Obsidian note I was actively taking.
That matters because it changed the kind of note I could take.
I was not writing, “speaker said something about math.” I was connecting the talk to an already-developed body of thought in my own vault.
Sometimes Hermes can even push back with something like:
Joshua would probably mention this here. You may be fixing the wrong thing.
That is a very different kind of tool.
It is not just recall. It is a personal context engine with enough history to be useful at the moment of thought.
The loop
The useful thing here is not that I have a “conference bot.” I have a loop:
- I capture live questions in Todoist.
- I take messy notes in Obsidian.
- Hermes checks both in the background.
- The research tasks start getting answered while the context is still fresh.
- Follow-up subtasks get added automatically.
- Existing notes and podcast highlights can be searched while I am still thinking.
- The audio pipeline watches for recordings.
- The M5 Mac transcribes them.
- The results come back into the places I already use.
Several of the to-dos basically started solving themselves before the sessions were even finished.
That is the difference.
Normally I would leave the conference with a pile of notes and a vague sense that I needed to process them. This time, the processing had already started. Not perfectly. Not completely. But enough that I was not starting from zero at the end of the day.
More importantly, I was not using my attention to hold all the loose threads together.
I could listen to the speaker, notice the idea, send the thread somewhere useful, and come back.
The Obsidian rule: canonical notes stay canonical
One of the better decisions in this setup is that generated notes do not overwrite my original notes.
My live conference notes are the source. My Todoist tasks are also source material. The transcription output is derivative.
That means the pipeline writes into an event-specific Processed Audio/ folder.
Those generated notes include frontmatter with the original source, the audio URL, and the transcription timestamp. The source task or source note stays intact.
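A minimal sketch of what writing one of those derivative notes could look like. The frontmatter key names (`source`, `audio_url`, `transcribed_at`) and the function names are my assumptions for illustration, not the exact fields the pipeline uses.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def render_derivative_note(source: str, audio_url: str, transcript: str,
                           transcribed_at: datetime) -> str:
    """Build the note text: YAML frontmatter followed by the transcript."""
    frontmatter = "\n".join([
        "---",
        # json.dumps gives a double-quoted string, which YAML also accepts.
        f"source: {json.dumps(source)}",
        f"audio_url: {json.dumps(audio_url)}",
        f"transcribed_at: {transcribed_at.isoformat()}",
        "---",
    ])
    return f"{frontmatter}\n\n{transcript.strip()}\n"


def write_derivative_note(folder: Path, name: str, text: str) -> Path:
    """Write into the processed-audio folder, refusing to overwrite anything."""
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.md"
    with path.open("x", encoding="utf-8") as handle:  # "x" raises if the file exists
        handle.write(text)
    return path
```

Opening the file in exclusive-create mode is one simple way to enforce the rule: a generated note can be deleted and regenerated, but it never silently replaces a file that is already there.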
This is one of those small rules that prevents a lot of future pain. If an automated transcript is bad, I can delete the derivative note. If the summary misses something, I can regenerate it. But my original hand notes remain mine.
That is part of why the system helped rather than distracted me. It did not compete with my live notes. It supported them.
The annoying parts are the important parts
The hard part was not “call an LLM.” That is almost never the hard part anymore.
The hard parts were things like:
- the conference folder might have the expected path, or it might not exist yet
- dictation might create a slightly different folder name than the one I intended
- a government PDF might have the word “download” near it, but that does not make it an audio recording
- a media link might be redacted, which means the pipeline should report it once and stop retrying forever
- the background job should only mark a recording processed after the derivative note exists
- the no-op path should be silent
- Hermes-created Obsidian files need the right permissions so Android sync does not fall over
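Two of those rules, "only mark a recording processed after the derivative note exists" and "report a redacted link once," can be sketched with a small JSON state file. The file layout, key names, and function names below are hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def load_state(state_path: Path) -> dict:
    """Read the state file, or start empty if it does not exist yet."""
    if state_path.exists():
        return json.loads(state_path.read_text(encoding="utf-8"))
    return {"processed": [], "reported_redacted": []}


def save_state(state_path: Path, state: dict) -> None:
    """Persist the state so the next run knows what was already handled."""
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def mark_processed(state: dict, url: str, note_path: Path) -> bool:
    """Record url as processed only if its derivative note is really on disk."""
    if not note_path.is_file():
        return False  # leave it unprocessed so the next run retries
    if url not in state["processed"]:
        state["processed"].append(url)
    return True


def report_redacted_once(state: dict, url: str) -> Optional[str]:
    """Return a message the first time a redacted link is seen, then stay silent."""
    if url in state["reported_redacted"]:
        return None
    state["reported_redacted"].append(url)
    return f"Recording link is redacted, skipping: {url}"
```

The ordering is the point: the note gets written first, and the state update only happens once the file is confirmed to exist, so a crash mid-run leads to a retry rather than a recording that is marked done with nothing to show for it.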
That is where the actual automation lives. Not in the impressive demo path, but in the boring edge cases that decide whether I trust it tomorrow.
And trust matters here because the point is not novelty. The point is attention.
If I do not trust the system, I have to babysit it. If I have to babysit it, I am not present in the room anymore.
What I want to reuse
This setup is specific, but the pattern is not.
Any live event could use the same shape:
- one project or inbox for questions
- one notes folder for live capture
- one deterministic watcher for recordings
- one agentic watcher for research and follow-up
- one personal-knowledge search path for prior notes, highlights, and skills
- one derivative-notes folder for generated artifacts
- one state file so the system knows what it already processed
That could work for a conference, a sermon series, a work offsite, a homeschool planning week, or even a long research project.
The next version should be less event-specific. The event name, Todoist project, Obsidian folders, transcription prompt, output folder, and prior-knowledge sources should be configuration, probably a Swamp model instead of a hand-shaped Python script.
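As a sketch of what "should be configuration" might mean, here is the same list of settings as a plain Python dataclass. This is not Swamp's model format, which I am not reproducing here; the field names, the `poll_minutes` default, and `config_from_dict` are illustrative assumptions.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class EventConfig:
    """Everything that changes from one live event to the next."""
    event_name: str
    todoist_project: str
    notes_folder: Path
    output_folder: Path
    transcription_prompt: str
    prior_knowledge_sources: tuple[str, ...] = ()
    poll_minutes: int = 10  # matches the 10-minute check described earlier


def config_from_dict(raw: dict) -> EventConfig:
    """Build a config from plain data, for example a parsed JSON or YAML file."""
    return EventConfig(
        event_name=raw["event_name"],
        todoist_project=raw["todoist_project"],
        notes_folder=Path(raw["notes_folder"]),
        output_folder=Path(raw["output_folder"]),
        transcription_prompt=raw["transcription_prompt"],
        prior_knowledge_sources=tuple(raw.get("prior_knowledge_sources", [])),
        poll_minutes=int(raw.get("poll_minutes", 10)),
    )
```

With something like this, pointing the loop at a sermon series or a work offsite means writing a new config file rather than editing the watcher.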
But the bigger lesson is not “make a model.”
The bigger lesson is that the best automation did not start with me sitting down to automate. It started with me living the moment, noticing the friction, and having enough tools ready that a two-minute Discord prompt could turn into a working loop.
I was not less present because the automation was running. I was more free to notice what I actually thought.