None of these were on my to-do list this morning. All three came from other people using things I built and finding the edges. That’s the part of open source I actually like — the bit where someone else’s problem improves your thing.


mcp-fantastical: From 7 Seconds to 100 Milliseconds

An MCP server that lets Claude read and create calendar events through Fantastical. It worked. Slowly.

The original implementation used AppleScript to talk to the calendar. AppleScript is fine for “open this app” automation, but when you’re querying a calendar with hundreds of events from inside an MCP subprocess, you hit two problems. First, AppleScript takes 7-30 seconds to enumerate a large calendar. Second, macOS sandboxing throws permission errors (specifically -1743) when AppleScript runs in certain subprocess contexts — like, say, inside an MCP server spawned by Claude Code.

A contributor (pdurlej) submitted a PR adding a native Swift helper that queries EventKit directly. Same data, no AppleScript intermediary. The result: sub-100ms calendar queries. The AppleScript path is still there as a fallback, so nothing breaks for anyone.
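The fast-path-with-fallback shape looks roughly like this. A sketch assuming a Node-based server; the helper path, argument names, and placeholder script are illustrative, not the project's actual code:

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Illustrative path to a compiled Swift helper that queries EventKit directly.
const HELPER = "./eventkit-helper";

// Placeholder for the slow AppleScript path; the real script is elided.
function appleScriptFor(start: string, end: string): string {
  return `-- placeholder: enumerate events between ${start} and ${end}`;
}

async function queryEvents(start: string, end: string): Promise<string> {
  try {
    // Fast path: native EventKit helper, sub-100ms.
    const { stdout } = await run(HELPER, [start, end]);
    return stdout;
  } catch {
    // Fallback: osascript, slow and prone to error -1743 in sandboxed
    // subprocess contexts, but it keeps existing setups working.
    const { stdout } = await run("osascript", ["-e", appleScriptFor(start, end)]);
    return stdout;
  }
}
```

The key property is that the fallback only runs when the helper is missing or fails, so existing installs degrade to the old behavior instead of breaking.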

This is the kind of fix I wouldn’t have found myself. My calendar has maybe 30 events in a typical view. Someone with a busier schedule hit the wall I never would have.


mcp-arr: The Bug That Only Breaks at Install Time

An MCP server for the *arr media management suite — Sonarr, Radarr, Lidarr, the whole stack. Two fixes landed today:

The first was a Lidarr search improvement from bndlfm. The search tool was too rigid about parameter names and returned generic fields instead of music-specific ones. Now it accepts term, query, artist, or name (because everyone phrases it differently) and returns artistName and disambiguation instead of a generic title that tells you nothing useful when you’re looking at music results.
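A minimal sketch of that normalization in TypeScript; the function names, and everything beyond artistName and disambiguation, are illustrative rather than the server's actual schema:

```typescript
type SearchArgs = Record<string, unknown>;

// Accept any of the common phrasings for "what to search for".
function extractSearchTerm(args: SearchArgs): string | undefined {
  for (const key of ["term", "query", "artist", "name"]) {
    const value = args[key];
    if (typeof value === "string" && value.trim() !== "") return value.trim();
  }
  return undefined;
}

interface LidarrArtist {
  artistName: string;
  disambiguation?: string;
}

// Surface music-specific fields instead of a generic title.
function formatResult(artist: LidarrArtist): string {
  return artist.disambiguation
    ? `${artist.artistName} (${artist.disambiguation})`
    : artist.artistName;
}
```

The disambiguation field matters for music because "the artist named X" is frequently ambiguous in a way that a generic title never surfaces.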

The second was the kind of bug that hides in plain sight. The @modelcontextprotocol/sdk package was listed under devDependencies instead of dependencies. This means nothing during development — the package is installed either way. But when someone installs via npx, devDependencies aren’t included, and the server crashes with ERR_MODULE_NOT_FOUND on first launch.
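The fix is a one-line move in package.json; version numbers here are illustrative:

```json
{
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.0.0"
  },
  "devDependencies": {
    "typescript": "^5.0.0"
  }
}
```

When npm pulls a package from the registry, which is what npx does, it installs only dependencies, so anything the server imports at runtime has to live there.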

I never caught it because I always run from the repo. Every user installing it fresh caught it immediately.

Released as v1.5.1.


metric-mandate: What Goodhart Knew About Your KPIs

A Claude Code skill that forces you to define measurable success criteria before starting an AI project. Five questions. No vague answers allowed.

The problem with the original version: you could define a perfectly specific, perfectly measurable metric that was also perfectly gameable. “Reduce ticket resolution time to under 2 hours” is specific and measurable. It also incentivises agents to close tickets without resolving them.

This is Goodhart’s Law: when a measure becomes a target, it ceases to be a good measure. Someone pointed this out in the issues, and they were right.

The skill now adds a sixth step after all five questions are answered. It asks: “How would someone hit this number without actually solving the problem?”

If the gaming vector is obvious, you need a companion metric — one sourced from different data that moves in the opposite direction when the primary metric is gamed. “Resolution time under 2 hours AND customer satisfaction stays above 4.2/5” is harder to game because tanking one number to inflate the other gets caught.

If the gaming vector is hard to identify, the metric is probably robust. Note it and move on.

One companion metric is enough. The point isn’t to build a dashboard of ten numbers that all watch each other. The point is to make sure the one number you chose can’t be hit by accident.

The same update went to both the Claude Code and ChatGPT versions.


None of Them Came From Me

Three updates. A performance fix, a packaging bug, and a conceptual gap — none of them came from me staring at my own code, all of them from someone using it in conditions I didn’t test.

The calendar was fast enough on my machine, the package installed fine from my repo, and the metric skill asked good questions — just not the one that mattered most.

This is the argument for shipping things, even small things, even imperfect things. Not because “done is better than perfect” — that’s a platitude. Because other people’s problems are different from yours, and different problems find different bugs.