<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Lab Notes</title>
  <link href="https://blog.parenstech.com/atom.xml" rel="self"/>
  <link href="https://blog.parenstech.com"/>
  <updated>2026-05-12T23:24:52+00:00</updated>
  <id>https://blog.parenstech.com</id>
  <author>
    <name>Yenda</name>
  </author>
  <entry>
    <id>https://blog.parenstech.com/2026-01-01-clojure-datastar-experiment.html</id>
    <link href="https://blog.parenstech.com/2026-01-01-clojure-datastar-experiment.html"/>
    <title>The Clojure Datastar Experiment: When Language Loyalty Becomes a Trap</title>
    <updated>2026-01-01T23:59:59+00:00</updated>
    <content type="html"><![CDATA[<div><p>I spent weeks building what I thought would be the obvious next step: a Clojure-native version of <a href="https://data-star.dev/">Datastar</a>. Same architecture, but with Clojure expressions instead of JavaScript, Transit instead of JSON, hiccup helpers instead of raw HTML attributes.</p><p>It was a waste of time. Here&apos;s why.</p><h2 id="the-siren-song">The Siren Song</h2><p>Datastar is a ~10KB JavaScript library for server-driven UI. The server owns all state. The client is a thin rendering layer. Click a button, POST to the server, receive SSE updates, morph the DOM. No React, no Redux, no client-side state management.</p><p>For a Clojure developer, there&apos;s a natural next question: <em>What if we had this, but in Clojure?</em></p><p>The pitch writes itself:</p><ul><li>Clojure expressions instead of JavaScript (<code>(toggle! :_open)</code> instead of <code>$_open = !$_open</code>)</li><li>Transit for richer data types</li><li>Hiccup helpers for ergonomic server rendering</li><li>ClojureScript on the client for consistency</li></ul><p>I called the experiment <a href="https://github.com/parenstech/aleth">Aleth</a> and got to work.</p><h2 id="what-i-built">What I Built</h2><p>The core took about 2,000 lines across client and server modules:</p><pre><code>Client modules:
  core.cljs    482 lines  (bindings, discovery)
  eval.cljs    308 lines  (expression evaluator)
  local.cljs   229 lines  (local signals)
  signals.cljs 164 lines  (signal store)
  sse.cljs     194 lines  (SSE client)
  morph.cljs   160 lines  (DOM morphing wrapper)

Server modules:
  core.clj     117 lines
  hiccup.clj   290 lines
  sse.clj      211 lines

Total: 2,155 lines
</code></pre><p>The expression evaluator alone - allowing Clojure syntax in DOM attributes - required:</p><ul><li>A whitelist of ~50 safe functions</li><li>Special form handling (if, when, let, do, and, or, cond)</li><li>Symbol resolution against signal maps</li><li>Reactive expression watching</li></ul><p>When the &quot;server owns truth&quot; principle made simple UI patterns sluggish (every dropdown toggle required a server round-trip), I implemented local signals - Datastar&apos;s solution to the same problem:</p><pre><code class="language-clojure">;; Aleth&apos;s attempted local signals
[:div (a/local {:_open false})
 [:button (a/on-local :click &apos;(toggle! :_open)) &quot;Toggle&quot;]
 [:div (a/show-expr &apos;_open) &quot;Content&quot;]]
</code></pre><p>Four hundred lines later, it worked. I felt accomplished.</p><p>Then I looked at the bundle size.</p><h2 id="the-numbers-don&apos;t-lie">The Numbers Don&apos;t Lie</h2><table><thead><tr><th>Library</th><th>Size (gzipped)</th><th>Ratio</th></tr></thead><tbody><tr><td>Datastar</td><td>~10.76 KB</td><td>1x</td></tr><tr><td>Aleth</td><td>80 KB</td><td>7.4x</td></tr></tbody></table><p>The gap is structural, not fixable. Aleth includes:</p><ul><li>ClojureScript core (~50KB alone)</li><li>Transit encoding/decoding</li><li>cljs.reader for parsing expressions</li><li>Custom evaluator</li><li><a href="https://github.com/metosin/malli">Malli</a> schemas</li></ul><p>Even stripping everything optional, ClojureScript&apos;s baseline makes parity impossible.</p><h2 id="the-fundamental-problem">The Fundamental Problem</h2><p>Every Aleth expression goes through:</p><pre><code>String -&gt; cljs.reader/read-string -&gt; AST -&gt; tree-walk evaluation -&gt; result
</code></pre><p>Every click, every reactive update, every signal change pays this parsing tax. I&apos;m interpreting an interpreter.</p><p>Datastar&apos;s expressions are native JavaScript:</p><pre><code class="language-javascript">$_open = !$_open
</code></pre><p>The browser&apos;s JavaScript engine evaluates this directly via the <code>Function()</code> constructor. No interpreter sits on top of the engine: parsing and compilation happen natively. It is battle-tested, with every edge case handled by decades of browser development.</p><p><strong>You cannot beat native JavaScript at being JavaScript.</strong></p><p>This should have been obvious from the start. I was so focused on the elegance of unified syntax that I ignored the fundamental constraint: the browser already has an expression language. It&apos;s optimized. It works. Adding a layer on top is pure overhead.</p><h2 id="the-honest-comparison">The Honest Comparison</h2><p>When I forced myself to answer &quot;What does Aleth offer over Datastar?&quot;, the answer was deflating:</p><table><thead><tr><th>Aspect</th><th>Aleth</th><th>Datastar</th></tr></thead><tbody><tr><td>DOM morphing</td><td>Idiomorph</td><td>Idiomorph (same)</td></tr><tr><td>SSE protocol</td><td>Custom events</td><td>Custom events (same)</td></tr><tr><td>Declarative attributes</td><td>Yes</td><td>Yes (same)</td></tr><tr><td>Local signals</td><td>Yes</td><td>Yes (same)</td></tr><tr><td>Bundle size (gzipped)</td><td>80 KB</td><td>~10 KB</td></tr><tr><td>Expression parsing</td><td>Custom interpreter</td><td>Native browser</td></tr><tr><td>Community</td><td>Just me</td><td>Growing ecosystem</td></tr><tr><td>Backend SDKs</td><td>Clojure only</td><td>Go, Python, PHP, Java, etc.</td></tr></tbody></table><p>The differentiator is &quot;Clojure syntax for expressions.&quot; That&apos;s it. And that differentiator adds complexity, size, and overhead without benefiting users.</p><h2 id="the-trap-pattern">The Trap Pattern</h2><p>I fell into a trap I&apos;ve seen before.
Call it &quot;language loyalty syndrome.&quot;</p><p>The pattern:</p><ol start="1"><li>Discover a tool that works well</li><li>Notice it&apos;s not in your preferred language</li><li>Conclude the solution is to rewrite it</li><li>Spend weeks reimplementing what already exists</li><li>End up with a worse version that you now have to maintain</li></ol><p>The justification sounds reasonable: &quot;We&apos;ll have Clojure all the way down!&quot; But the justification conflates two different things:</p><ul><li><strong>Server code</strong>, where language choice matters (you write a lot of it, it&apos;s complex, types and tooling matter)</li><li><strong>Client expressions</strong>, which are trivial one-liners (<code>$count++</code>, <code>$_open = !$_open</code>)</li></ul><p>Nobody writes complex logic in Datastar expressions. They&apos;re not meant for that. The server handles complexity. The client handles <code>$visible = true</code>.</p><p>Optimizing for &quot;Clojure syntax&quot; in the client is optimizing for something that doesn&apos;t need optimization.</p><h2 id="what-i-should-have-built">What I Should Have Built</h2><p>The valuable part of Aleth is the server side:</p><pre><code class="language-clojure">;; This is actually useful
(defn counter-view [count]
  [:div {:data-signals (json/encode {:count count})}
   [:button {:data-on-click &quot;$count++&quot;} &quot;+&quot;]
   [:span {:data-text &quot;$count&quot;} count]])

;; SSE helpers are useful
(a/sse-response
  (fn [sse]
    (a/patch! sse &quot;#counter&quot; (counter-view new-count))))
</code></pre><p>A Clojure SDK for Datastar would be:</p><ul><li>Hiccup helpers that emit Datastar-compatible attributes</li><li>Ring middleware for SSE responses</li><li>Transit encoding if you want richer data types</li></ul><p>Use Datastar&apos;s 10KB client as-is. Don&apos;t rewrite it. Don&apos;t wrap it. Include the CDN script and move on.</p><h2 id="the-lesson">The Lesson</h2><p>Before rewriting an existing tool in your preferred language, ask:</p><ol start="1"><li><strong>Where is the value?</strong> (Server logic vs. client expressions)</li><li><strong>What am I actually gaining?</strong> (Syntax consistency? Is that worth a bundle more than 7x the size?)</li><li><strong>What is the maintenance cost?</strong> (Tracking upstream changes, fixing edge cases, security audits)</li><li><strong>Who benefits?</strong> (You as the developer, or actual users?)</li></ol><p>The value of server-driven UI is in the <em>architecture</em>, not the <em>syntax</em>. Datastar already nailed the architecture. Wrapping it in Clojure syntax adds complexity without improving the architecture.</p><h2 id="the-hard-part">The Hard Part</h2><p>Abandoning the experiment was harder than I expected. I had working code. I had solved interesting problems. The expression evaluator was elegant in its way.</p><p>But &quot;I built a thing&quot; is not the same as &quot;I built a thing worth using.&quot;</p><p>The honest answer to &quot;what if Datastar, but in Clojure?&quot; is: &quot;Use Datastar. Write your server in Clojure. The expressions on the client are JavaScript, and that&apos;s fine.&quot;</p><p>Sometimes the right answer is to not build the thing.</p><hr /><p><em>The Aleth experiment is preserved at <a href="https://github.com/parenstech/aleth">github.com/parenstech/aleth</a>. The server-side SDK approach - hiccup helpers and SSE middleware for Datastar - is what I&apos;ll build next.</em></p></div>]]></content>
  </entry>
  <entry>
    <id>https://blog.parenstech.com/2025-12-31-aleth-server-driven-ui.html</id>
    <link href="https://blog.parenstech.com/2025-12-31-aleth-server-driven-ui.html"/>
    <title>Aleth: Server-Driven UI Without Client Lies</title>
    <updated>2025-12-31T23:59:59+00:00</updated>
    <content type="html"><![CDATA[<div><p>The ancient Greeks had a word for truth: <em>aletheia</em> - literally &quot;un-concealment.&quot; Truth wasn&apos;t something you asserted; it was something you revealed by stripping away what hid it.</p><p>This is the premise behind <a href="https://github.com/parenstech/aleth">Aleth</a>, a new server-driven UI library for Clojure. The core bet: <strong>no client-side computation</strong>. The server owns all truth. The client is a pure projection - it shows what it&apos;s told, nothing more.</p><p>In an age of increasingly complex frontend frameworks, this sounds almost naive. But there&apos;s a specific context where it makes profound sense: systems where humans supervise and AI writes the code.</p><h2 id="the-architectural-bet">The Architectural Bet</h2><p>Modern web UIs are distributed systems hiding inside your browser. State lives in Redux stores, component local state, URL parameters, localStorage, and server databases. When something goes wrong, you triangulate across all of them.</p><p>Aleth eliminates this by making a radical constraint: the client cannot compute.</p><p>Where <a href="https://data-star.dev/">Datastar</a> allows <code>$count + 1</code> in attributes, Aleth has no expressions. Where React components derive state locally, Aleth demands the server send exactly what to display. The client becomes a terminal - it receives instructions and renders them.</p><pre><code class="language-clojure">;; Server computes everything
(defn increment-handler [req]
  (sse-response
    (fn [sse]
      (let [{:keys [count]} (signals req)
            new-count (inc count)]
        (signals! sse {:count new-count})
        (patch! sse &quot;#counter&quot; [:div {:id &quot;counter&quot;} [:span (str new-count)]])
        (close! sse)))))
</code></pre><p>The client receives two things: a signal update (<code>{:count 6}</code>) and a DOM patch. It applies both. No logic, no decisions, no opportunities to diverge from truth.</p><h2 id="how-it-works">How It Works</h2><p>Aleth uses Transit+JSON over Server-Sent Events. The wire protocol has three operations:</p><ul><li><strong>patch</strong> - Update DOM via hiccup</li><li><strong>signals</strong> - Update reactive client state</li><li><strong>execute</strong> - Run JavaScript (the escape hatch, discussed later)</li></ul><p>The server sends hiccup, the client morphs it into the DOM using <a href="https://github.com/bigskysoftware/idiomorph">Idiomorph</a>. Signal changes trigger reactive bindings. That&apos;s the entire runtime.</p><pre><code class="language-clojure">;; Server renders initial page
(defn counter-page [count]
  [:html
   [:body
    [:div (a/signals {:count count})
     [:span (a/text :count) (str count)]
     [:button (a/action &quot;/increment&quot;) &quot;+&quot;]
     [:button (a/action &quot;/decrement&quot;) &quot;-&quot;]]]])
</code></pre><p>The <code>a/signals</code>, <code>a/text</code>, and <code>a/action</code> helpers emit <code>data-*</code> attributes. When Aleth&apos;s JavaScript loads, it discovers these attributes and wires them up. Click the button, POST to <code>/increment</code>, receive SSE response, update DOM. The HTML works before JavaScript loads; Aleth progressively enhances it.</p><h2 id="what&apos;s-good-about-this">What&apos;s Good About This</h2><p><strong>Determinism.</strong> <code>state -&gt; UI</code> is a pure function. Same signals, same render. No &quot;it works if you refresh,&quot; no race conditions between client and server state.</p><p><strong>Testability.</strong> You can property-test your entire UI:</p><pre><code class="language-clojure">(defspec render-is-deterministic 100
  (prop/for-all [state (mg/generator signals-schema)]
    (= (render state) (render state))))
</code></pre><p>Visual regression testing becomes trivial - render HTML, snapshot, compare. No client timing issues, no flaky tests.</p><p><strong>Observability.</strong> Everything is inspectable. Aleth includes devtools (SSE inspector, signal viewer, schema panel) that show every state transition. Debug by reading the event stream, not by reproducing timing-dependent bugs.</p><p><strong>Schema validation.</strong> <a href="https://github.com/metosin/malli">Malli</a> validates signals on both ends. Invalid states are rejected, not silently accepted.</p><p><strong>Clean API.</strong> The library exports a single entry point with sensible helpers. Hot reload works correctly (using WeakSet to prevent duplicate bindings). The devtools use Shadow DOM for style isolation.</p><h2 id="what&apos;s-concerning">What&apos;s Concerning</h2><p>This is an early-stage library with some serious issues.</p><p><strong>No tests.</strong> For a library whose core value proposition is correctness and determinism, this is ironic. The spec includes property-based test examples, but the implementation has none.</p><p><strong>Memory leaks.</strong> Signal watchers are never cleaned up. In a long-running session, this will accumulate. Multiple locations in the codebase add watchers without corresponding removal logic.</p><p><strong>The <code>execute</code> escape hatch.</strong> The wire protocol includes an <code>execute</code> operation that runs arbitrary JavaScript via <code>js/eval</code>. This is a security risk in any production system, even if intended only for debugging. It&apos;s the kind of backdoor that gets forgotten about.</p><p><strong>No recovery after SSE failure.</strong> The connection has retry logic with exponential backoff, but there&apos;s no recovery mechanism once retries exhaust. The client just stops.</p><p><strong>Stale DOM references.</strong> After morphing, references to old DOM nodes aren&apos;t invalidated. 
This can cause silent failures in bindings.</p><p><strong>URL injection.</strong> The redirect handler doesn&apos;t sanitize URLs, creating a potential vector for malicious redirects.</p><h2 id="who-should-consider-this">Who Should Consider This</h2><p>Aleth is right for:</p><ul><li><strong>Admin panels and internal tools</strong> - where latency to the server is low and instant UI response isn&apos;t critical</li><li><strong>AI-supervised development</strong> - where you want the simplest possible model for an LLM to reason about</li><li><strong>Forms-heavy applications</strong> - where most interactions are &quot;submit and wait for server response&quot;</li><li><strong>Dashboards with real-time data</strong> - the SSE broadcast pattern handles this cleanly</li></ul><p>Aleth is wrong for:</p><ul><li><strong>Consumer applications</strong> requiring instant responsiveness</li><li><strong>Offline-first or PWA</strong> - the library explicitly doesn&apos;t support offline (server owns truth)</li><li><strong>Complex interactions</strong> like drag-and-drop, real-time drawing, gaming</li><li><strong>Production use</strong> - the current implementation has too many gaps</li></ul><p>The spec is honest about this: &quot;For offline-first, consider <a href="https://fulcro.fulcrologic.com/">Fulcro</a>.&quot;</p><h2 id="the-latency-trade-off">The Latency Trade-off</h2><p>Every click goes to the server and back. On a local network, this is imperceptible. Over a 200ms round-trip, it&apos;s noticeable. The library&apos;s answer is &quot;optimize the server,&quot; not &quot;add client computation.&quot;</p><p>This is philosophically consistent but practically limiting. There are interactions where even 50ms of latency feels broken - typing in a search box, dragging to reorder a list, hovering to preview. Aleth doesn&apos;t try to solve these cases.</p><p>The comparison to <a href="https://hexdocs.pm/phoenix_live_view/">Phoenix LiveView</a> is instructive. 
LiveView makes the same server-centric bet but in Elixir&apos;s ecosystem where lightweight processes and low-latency WebSockets are first-class. Aleth is swimming upstream against browser realities.</p><h2 id="conclusion">Conclusion</h2><p>Aleth represents an interesting point in the design space: what if we maximally simplified the client at the cost of server round-trips? For the right use cases - internal tools, AI-assisted development, admin interfaces - this trade-off makes sense.</p><p>The ideas are sound. The implementation needs work.</p><p>If you&apos;re building something in the sweet spot (low-latency server, forms-heavy workflow, prioritizing correctness over responsiveness), Aleth is worth watching. If you need production-ready today, wait for the tests, the memory leak fixes, and the security audit.</p><p>The name promises truth as unconcealment. The library isn&apos;t there yet - but the architecture points in an interesting direction.</p></div>]]></content>
  </entry>
  <entry>
    <id>https://blog.parenstech.com/2025-12-30-building-heretic.html</id>
    <link href="https://blog.parenstech.com/2025-12-30-building-heretic.html"/>
    <title>Building Heretic: From ClojureStorm to Mutant Schemata</title>
    <updated>2025-12-30T23:59:59+00:00</updated>
    <content type="html"><![CDATA[<div><img src="/img/heretic-logo.webp" alt="Heretic" width="50%"><p><em>This is Part 2 of a series on mutation testing in Clojure. <a href="/2025-12-28-heretic-mutation-testing.html">Part 1</a> introduced the concept and why Clojure needed a purpose-built tool.</em></p><p>The previous post made a claim: mutation testing can be fast if you know which tests to run. This post shows how <a href="https://github.com/parenstech/heretic">Heretic</a> makes that happen.</p><p>We&apos;ll walk through the three core phases: collecting expression-level coverage with <a href="https://github.com/flow-storm/clojure">ClojureStorm</a>, transforming source code with <a href="https://github.com/clj-commons/rewrite-clj">rewrite-clj</a>, and the optimization techniques that keep mutation counts manageable.</p><h2 id="phase-1:-coverage-collection">Phase 1: Coverage Collection</h2><p>Traditional coverage tools track lines. Heretic tracks expressions.</p><p>The difference matters. Consider:</p><pre><code class="language-clojure">(defn process-order [order]
  (if (&gt; (:quantity order) 10)
    (* (:price order) 0.9)    ;; &lt;- Line 3: bulk discount
    (:price order)))
</code></pre><p>Line-level coverage would show line 3 as &quot;covered&quot; if any test enters the bulk discount branch. But expression-level coverage distinguishes between tests that evaluate <code>*</code>, <code>(:price order)</code>, and <code>0.9</code>. When we later mutate <code>0.9</code> to <code>1.1</code>, we can run only the tests that actually touched that specific literal - not every test that happened to call <code>process-order</code>.</p><h3 id="clojurestorm&apos;s-instrumented-compiler">ClojureStorm&apos;s Instrumented Compiler</h3><p><a href="https://github.com/flow-storm/clojure">ClojureStorm</a> is a fork of the Clojure compiler that instruments every expression during compilation. Created by <a href="https://github.com/jpmonettas">Juan Monetta</a> for the <a href="https://github.com/flow-storm/flow-storm-debugger">FlowStorm</a> debugger, it provides exactly the hooks Heretic needs. (Thanks to Juan for building such a solid foundation - Heretic would not exist without ClojureStorm.)</p><p>The integration is surprisingly minimal:</p><pre><code class="language-clojure">(ns heretic.tracer
  (:import [clojure.storm Emitter Tracer]))

(def ^:private current-coverage
  &quot;Atom of {form-id #{coords}} for the currently running test.&quot;
  (atom {}))

(defn record-hit! [form-id coord]
  (swap! current-coverage
         update form-id
         (fnil conj #{})
         coord))

(defn init! []
  ;; Configure what gets instrumented
  (Emitter/setInstrumentationEnable true)
  (Emitter/setFnReturnInstrumentationEnable true)
  (Emitter/setExprInstrumentationEnable true)

  ;; Set up callbacks
  (Tracer/setTraceFnsCallbacks
   {:trace-expr-fn (fn [_ _ coord form-id]
                     (record-hit! form-id coord))
    :trace-fn-return-fn (fn [_ _ coord form-id]
                          (record-hit! form-id coord))}))
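
;; Assumed helpers (not shown in the post; the names are taken from the
;; tracer/... calls used later in the collection workflow):
(defn reset-current-coverage! []
  (reset! current-coverage {}))

(defn get-current-coverage []
  @current-coverage)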
</code></pre><p>When any instrumented expression evaluates, ClojureStorm calls our callback with two pieces of information:</p><ul><li><strong>form-id</strong>: A unique identifier for the top-level form (e.g., an entire <code>defn</code>)</li><li><strong>coord</strong>: A path into the form&apos;s AST, like <code>&quot;3,2,1&quot;</code> meaning &quot;child 3, then its child 2, then that node&apos;s child 1&quot; (indices are zero-based)</li></ul><p>Together, <code>[form-id coord]</code> pinpoints exactly which subexpression executed. This is the key that unlocks targeted test selection.</p><h3 id="the-coordinate-system">The Coordinate System</h3><p>To connect a mutation in the source code to the coverage data, we need a way to uniquely address any subexpression. Think of it as a postal address for code - we need to say &quot;the <code>a</code> inside the <code>+</code> call inside the function body&quot; in a format that both the coverage tracer and mutation engine can agree on.</p><p>ClojureStorm addresses this with a path-based coordinate system. Consider this function as a tree:</p><pre><code>(defn foo [a b] (+ a b))
   │
   ├─[0] defn
   ├─[1] foo
   ├─[2] [a b]
   └─[3] (+ a b)
            │
            ├─[3,0] +
            ├─[3,1] a
            └─[3,2] b
</code></pre><p>Each number represents which child to pick at each level. The coordinate <code>&quot;3,2&quot;</code> means &quot;go to child 3 (the function body), then child 2 (the second argument to <code>+</code>)&quot;. That gives us the <code>b</code> symbol.</p><p>This works cleanly for ordered structures like lists and vectors, where children have stable positions. But maps are unordered - <code>{:name &quot;Alice&quot; :age 30}</code> and <code>{:age 30 :name &quot;Alice&quot;}</code> are the same value, so numeric indices would be unstable.</p><p>ClojureStorm solves this by hashing the printed representation of map keys. Instead of <code>&quot;0&quot;</code> for the first entry, a key like <code>:name</code> gets addressed as <code>&quot;K-1925180523&quot;</code>:</p><pre><code>{:name &quot;Alice&quot; :age 30}
   │
   ├─[K-1925180523] :name
   ├─[V-1925180523] &quot;Alice&quot;
   ├─[K-1524292809] :age
   └─[V-1524292809] 30
</code></pre><p>The hash ensures stable addressing regardless of iteration order.</p><p>With this addressing scheme, we can say &quot;test X touched coordinate 3,1 in form 12345&quot; and later ask &quot;which tests touched the expression we&apos;re about to mutate?&quot;</p><h3 id="the-form-location-bridge">The Form-Location Bridge</h3><p>Here&apos;s a problem we discovered during implementation: how do we connect the mutation engine to the coverage data?</p><p>The mutation engine uses <a href="https://github.com/clj-commons/rewrite-clj">rewrite-clj</a> to parse and transform source files. It finds a mutation site at, say, line 42 of <code>src/my/app.clj</code>. But the coverage data is indexed by ClojureStorm&apos;s form-id - an opaque identifier assigned during compilation. We need to translate &quot;file + line&quot; into &quot;form-id&quot;.</p><p>Fortunately, ClojureStorm&apos;s FormRegistry stores the source file and starting line for each compiled form. We build a lookup index:</p><pre><code class="language-clojure">(defn build-form-location-index [forms source-paths]
  (into {}
        (for [[form-id {:keys [form/file form/line]}] forms
              :when (and file line)
              :let [abs-path (resolve-path source-paths file)]
              :when abs-path]
          [[abs-path line] form-id])))
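
;; Hypothetical lookup (name and shape are assumptions, not Heretic&apos;s API):
;; pick the form in abs-path with the largest start line &lt;= line.
(defn form-id-for-line [index abs-path line]
  (when-let [hits (seq (filter (fn [[[path start] _]]
                                 (and (= path abs-path) (&lt;= start line)))
                               index))]
    (val (apply max-key (fn [[[_ start] _]] start) hits))))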
</code></pre><p>When the mutation engine finds a site at line 42, it searches the same file for the form whose start line is the largest value less than or equal to 42 - that is, the top-level form that contains the site. This gives us the ClojureStorm form-id, which we use to look up which tests touched that form.</p><p>This bridging layer is what allows Heretic to connect source transformations to runtime coverage, enabling targeted test execution.</p><h3 id="collection-workflow">Collection Workflow</h3><p>Coverage collection runs each test individually and captures what it touches:</p><pre><code class="language-clojure">(defn run-test-with-coverage [test-var]
  (tracer/reset-current-coverage!)
  (try
    (test-var)
    (catch Throwable t
      (println &quot;Test threw exception:&quot; (.getMessage t))))
  {(symbol test-var) (tracer/get-current-coverage)})
</code></pre><p>The result is a map from test symbol to coverage data:</p><pre><code class="language-clojure">{my.app-test/test-addition
  {12345 #{&quot;3&quot; &quot;3,1&quot; &quot;3,2&quot;}    ;; form-id -&gt; coords touched
   12346 #{&quot;1&quot; &quot;2,1&quot;}}
 my.app-test/test-subtraction
  {12345 #{&quot;3&quot; &quot;4&quot;}
   12347 #{&quot;1&quot;}}}
</code></pre><p>This gets persisted to <code>.heretic/coverage/</code> with one file per test namespace, enabling incremental updates. Change a test file? Only that namespace gets recollected.</p><p>At this point we have a complete map: for every test, we know exactly which <code>[form-id coord]</code> pairs it touched. Now we need to generate mutations and look up which tests are relevant for each one.</p><h2 id="phase-2:-the-mutation-engine">Phase 2: The Mutation Engine</h2><p>With coverage data in hand, we need to actually mutate the code. This means:</p><ol start="1"><li>Parsing Clojure source into a navigable structure</li><li>Finding locations where operators apply</li><li>Transforming the source</li><li>Hot-swapping the modified code into the running JVM</li></ol><h3 id="parsing-with-rewrite-clj">Parsing with rewrite-clj</h3><p><a href="https://github.com/clj-commons/rewrite-clj">rewrite-clj</a> gives us a zipper over Clojure source that preserves whitespace and comments - essential for producing readable diffs:</p><pre><code class="language-clojure">(defn parse-file [path]
  (z/of-file path {:track-position? true}))

(defn find-mutation-sites [zloc]
  (-&gt;&gt; (walk-form zloc)
       (remove in-quoted-form?)  ;; Skip &apos;(...) and `(...)
       (mapcat (fn [z]
                 (let [applicable (ops/applicable-operators z)]
                   (map #(make-mutation-site z %) applicable))))))
</code></pre><p>The <code>walk-form</code> function traverses the zipper depth-first. At each node, we check which operators match. An operator is a data map with a matcher predicate:</p><pre><code class="language-clojure">(def swap-plus-minus
  {:id :swap-plus-minus
   :original &apos;+
   :replacement &apos;-
   :description &quot;Replace + with -&quot;
   :matcher (fn [zloc]
              (and (= :token (z/tag zloc))
                   (symbol? (z/sexpr zloc))
                   (= &apos;+ (z/sexpr zloc))))})
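
;; Hypothetical sketch of the lookup behind ops/applicable-operators
;; (assumed shape): keep each operator whose :matcher accepts zloc.
(def operators [swap-plus-minus])

(defn applicable-operators [zloc]
  (filter #((:matcher %) zloc) operators))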
</code></pre><p>Each mutation site captures the file, line, column, operator, and - critically - the coordinate path within the form. This coordinate is what connects a mutation to the coverage data from Phase 1.</p><h3 id="coordinate-mapping">Coordinate Mapping</h3><p>The tricky part is converting between rewrite-clj&apos;s zipper positions and ClojureStorm&apos;s coordinate strings. We need bidirectional conversion for the round-trip:</p><pre><code class="language-clojure">(defn coord-&gt;zloc [zloc coord]
  (let [parts (parse-coord coord)]  ;; &quot;3,2,1&quot; -&gt; [3 2 1]
    (reduce
     (fn [z part]
       (when z
         (if (string? part)      ;; Hash-based for maps/sets
           (find-by-hash z part)
           (nth-child z part)))) ;; Integer index for lists/vectors
     zloc
     parts)))

(defn zloc-&gt;coord [zloc]
  (loop [z zloc
         coord []]
    (cond
      (root-form? z) (vec coord)
      (z/up z)
      (let [part (if (is-unordered-collection? z)
                   (compute-hash-coord z)
                   (child-index z))]
        (recur (z/up z) (cons part coord)))
      :else (vec coord))))
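
;; Hypothetical parse-coord (the post calls it but does not show it; in a
;; real namespace it would be defined before coord-&gt;zloc). Integer parts
;; become longs; hash parts such as &quot;K-1925180523&quot; stay strings.
(require &apos;[clojure.string :as str])

(defn parse-coord [coord]
  (if (str/blank? coord)
    []
    (mapv #(if (re-matches #&quot;\d+&quot; %) (Long/parseLong %) %)
          (str/split coord #&quot;,&quot;))))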
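</code></pre><p>The validation requirement is that these must be inverses. Because <code>coord-&gt;zloc</code> takes the coordinate string and <code>zloc-&gt;coord</code> returns the parsed vector, the check compares against the parsed form:</p><pre><code class="language-clojure">(= (parse-coord coord) (zloc-&gt;coord (coord-&gt;zloc zloc coord)))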
</code></pre><p>With correct coordinate mapping, we can take a mutation at a known location and ask &quot;which tests touched this exact spot?&quot; That query is what makes targeted test execution possible.</p><h3 id="applying-mutations">Applying Mutations</h3><p>Once we find a mutation site and can navigate to it, the actual transformation is straightforward:</p><pre><code class="language-clojure">(defn apply-mutation! [mutation]
  (let [{:keys [file form-id coord operator]} mutation
        operator-def (get ops/operators-by-id operator)
        original-content (slurp file)
        zloc (z/of-string original-content {:track-position? true})
        form-zloc (find-form-by-id zloc form-id)
        target-zloc (coord/coord-&gt;zloc form-zloc coord)
        replacement-str (ops/apply-operator operator-def target-zloc)
        modified-zloc (z/replace target-zloc
                                 (n/token-node (symbol replacement-str)))
        modified-content (z/root-string modified-zloc)]
    (spit file modified-content)
    (assoc mutation :backup original-content)))
</code></pre><h3 id="hot-swapping-with-clj-reload">Hot-Swapping with clj-reload</h3><p>After modifying the source file, we need the JVM to see the change. <a href="https://github.com/tonsky/clj-reload">clj-reload</a> handles this correctly:</p><pre><code class="language-clojure">(ns heretic.reloader
  (:require [clj-reload.core :as reload]))

(defn init! [source-paths]
  (reload/init {:dirs source-paths}))

(defn reload-after-mutation! []
  (reload/reload {:throw false}))
</code></pre><p>Why clj-reload specifically? It solves problems that <code>require :reload</code> doesn&apos;t:</p><ol start="1"><li><strong>Proper unloading</strong>: Calls <code>remove-ns</code> before reloading, preventing protocol/multimethod accumulation</li><li><strong>Dependency ordering</strong>: Topologically sorts namespaces, unloading dependents first</li><li><strong>Transitive closure</strong>: Automatically reloads namespaces that depend on the changed one</li></ol><p>The mutation workflow becomes:</p><pre><code class="language-clojure">(with-mutation [m mutation]
  (reloader/reload-after-mutation!)
  (run-relevant-tests m))
;; Mutation automatically reverted in finally block
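
;; One way a macro like with-mutation could be written. This is a sketch
;; under assumptions, not Heretic's actual source: it relies on
;; apply-mutation! returning the mutation with :backup attached, and it
;; would have to be defined before the call above.
(defmacro with-mutation-sketch [[sym mutation] &amp; body]
  `(let [~sym (apply-mutation! ~mutation)]
     (try
       ~@body
       (finally
         ;; Restore the original source even if the tests throw,
         ;; then reload so the JVM sees the original code again
         (spit (:file ~sym) (:backup ~sym))
         (reloader/reload-after-mutation!)))))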
</code></pre><p>At this point we have the full pipeline: parse source, find mutation sites, apply a mutation, hot-reload, run targeted tests, restore. But running this once per mutation is still slow for large codebases. Phase 3 addresses that.</p><h3 id="80+-clojure-specific-operators">80+ Clojure-Specific Operators</h3><p>The operator library is where Heretic&apos;s Clojure focus shows. Beyond the standard arithmetic and comparison swaps, we have:</p><p><strong>Threading operators</strong> - catch <code>-&gt;</code>/<code>-&gt;&gt;</code> confusion:</p><pre><code class="language-clojure">(-&gt; data (get :users) first)   ;; Original
(-&gt;&gt; data (get :users) first)  ;; Mutant: wrong arg position
</code></pre><p><strong>Nil-handling operators</strong> - expose nil punning mistakes:</p><pre><code class="language-clojure">(when (seq users) ...)   ;; Original: handles empty list
(when users ...)         ;; Mutant: body also runs for an empty list, since () is truthy
</code></pre><p><strong>Lazy/eager operators</strong> - catch chunking and realization bugs:</p><pre><code class="language-clojure">(map process items)    ;; Original: lazy
(mapv process items)   ;; Mutant: eager, different memory profile
</code></pre><p><strong>Destructuring operators</strong> - expose JSON interop issues:</p><pre><code class="language-clojure">{:keys [user-id]}   ;; Original: kebab-case
{:keys [userId]}    ;; Mutant: camelCase from JSON
</code></pre><p>The full set includes <code>first</code>/<code>last</code>, <code>rest</code>/<code>next</code>, <code>filter</code>/<code>remove</code>, <code>conj</code>/<code>disj</code>, <code>some-&gt;</code>/<code>-&gt;</code>, and qualified keyword mutations. These are the mistakes Clojure developers actually make.</p><p>With 80+ operators applied to a real codebase, mutation counts grow quickly. The next phase makes this tractable.</p><h2 id="phase-3:-optimization-techniques">Phase 3: Optimization Techniques</h2><p>A 1000-line project might generate 5000 mutations, and running the full test suite 5000 times is not practical.</p><p>Heretic uses several techniques to make this manageable.</p><h3 id="targeted-test-execution">Targeted Test Execution</h3><p>This is the big one, enabled by Phase 1. Instead of running all tests for every mutation, we query the coverage index:</p><pre><code class="language-clojure">(defn tests-for-mutation [coverage-map mutation]
  (let [form-id (resolve-form-id (:form-location-index coverage-map) mutation)
        coord (:coord mutation)]
    (get-in coverage-map [:coord-to-tests [form-id coord]] #{})))
</code></pre><p>A mutation at <code>(+ a b)</code> might only be covered by 2 tests out of 200. We run those 2 tests in milliseconds instead of the full suite in seconds.</p><p>This is where the Phase 1 coverage investment pays off. But we can go further by reducing the number of mutations we generate in the first place.</p><h3 id="equivalent-mutation-detection">Equivalent Mutation Detection</h3><p>Some mutations produce semantically identical code. Detecting these upfront avoids wasted test runs:</p><pre><code class="language-clojure">;; (* x 0) -&gt; (/ x 0) is NOT equivalent (divide by zero)
;; (* x 1) -&gt; (/ x 1) IS equivalent (both return x)

(def equivalent-patterns
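  [{:operator :swap-mult-div
    :context (fn [zloc]
               ;; Only trailing 1s qualify: (* x 1) -&gt; (/ x 1) is equivalent,
               ;; but (* 1 x) -&gt; (/ 1 x) is not.
               (let [[_ _ &amp; more] (z/child-sexprs (z/up zloc))]
                 (and (seq more) (every? #(= 1 %) more))))
    :reason &quot;Multiplying or dividing by a trailing one has no effect&quot;}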

   {:operator :swap-lte-eq
    :context (fn [zloc]
               (let [[_ left right] (z/child-sexprs (z/up zloc))]
                 (and (= 0 right)
                      (seq? left)   ;; left must be a call such as (count x)
                      (non-negative-fn? (first left)))))
    :reason &quot;(&lt;= (count x) 0) and (= (count x) 0) agree: count is never negative&quot;}])
</code></pre><p>The patterns cover boundary comparisons (<code>(&gt;= (count x) 0)</code> is always true), function contracts (<code>(nil? (str x))</code> is always false), and lazy/eager equivalences (<code>(vec (map f xs))</code> equals <code>(vec (mapv f xs))</code>).</p><p>Filtering equivalent mutations prevents false &quot;survived&quot; reports. But we can also skip mutations that would be redundant to test.</p><h3 id="subsumption-analysis">Subsumption Analysis</h3><p>Subsumption identifies when killing one mutation implies another would also be killed. If swapping <code>&lt;</code> to <code>&lt;=</code> is caught by a test, then swapping <code>&lt;</code> to <code>&gt;</code> would likely be caught too.</p><p>Based on the RORG (Relational Operator Replacement with Guard) research, we define subsumption relationships:</p><pre><code class="language-clojure">(def relational-operator-subsumption
  {&apos;&lt;  [:swap-lt-lte :swap-lt-neq :replace-comparison-false]
   &apos;&gt;  [:swap-gt-gte :swap-gt-neq :replace-comparison-false]
   &apos;&lt;= [:swap-lte-lt :swap-lte-eq :replace-comparison-true]
   ;; ...
   })
</code></pre><p>For each comparison operator, we only need to test the minimal set. The research shows this achieves roughly the same fault detection with 40% fewer mutations.</p><p>The subsumption graph also enables intelligent mutation selection:</p><pre><code class="language-clojure">(defn minimal-operator-set [operators]
  (set/difference
   operators
   ;; Remove any operator dominated by another in the set
   (reduce
    (fn [dominated op]
      (into dominated
            (set/intersection (dominated-operators op) operators)))
    #{}
    operators)))
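
;; Usage sketch. The dominance table is made-up illustration data, and
;; treating `dominated-operators` as a plain map (operator -&gt; set of
;; operators it dominates) is an assumption. `operators` must be a set.
(comment
  (def dominated-operators
    {:swap-lt-lte #{:swap-lt-gt}})

  (minimal-operator-set #{:swap-lt-lte :swap-lt-gt :swap-plus-minus})
  ;; =&gt; #{:swap-lt-lte :swap-plus-minus}
  )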
</code></pre><p>These techniques reduce mutation count. The final optimization reduces the cost of each mutation.</p><h3 id="mutant-schemata:-compile-once,-select-at-runtime">Mutant Schemata: Compile Once, Select at Runtime</h3><p>The most sophisticated optimization is mutant schemata. Instead of applying one mutation, reloading, testing, reverting, reloading for each mutation, we embed multiple mutations into a single compilation:</p><pre><code class="language-clojure">;; Original
(defn calculate [x] (+ x 1))

;; Schematized (with 3 mutations)
(defn calculate [x]
  (case heretic.schemata/*active-mutant*
    :mut-42-5-plus-minus (- x 1)
    :mut-42-5-1-to-0     (+ x 0)
    :mut-42-5-1-to-2     (+ x 2)
    (+ x 1)))  ;; original (default)
</code></pre><p>We reload once, then switch between mutations by binding a dynamic var:</p><pre><code class="language-clojure">(def ^:dynamic *active-mutant* nil)

(defmacro with-mutant [mutation-id &amp; body]
  `(binding [*active-mutant* ~mutation-id]
     ~@body))
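
;; Usage sketch with the schematized calculate shown earlier.
;; Caveat: `binding` is thread-local, so code under test that starts its
;; own raw threads may not see the active mutant.
(comment
  (with-mutant :mut-42-5-plus-minus
    (calculate 10))   ;; =&gt; 9, the (- x 1) branch

  (calculate 10)      ;; =&gt; 11, no mutant bound, so the default branch runs
  )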
</code></pre><p>The workflow becomes:</p><pre><code class="language-clojure">(defn run-mutation-batch [file mutations test-fn]
  (let [schemata-info (schematize-file! file mutations)]
    (try
      (reload!)  ;; Once!
      (doseq [[id mutation] (:mutation-map schemata-info)]
        (with-mutant id
          (test-fn id mutation)))
      (finally
        (restore-file! schemata-info)
        (reload!)))))  ;; Once!
</code></pre><p>For a file with 50 mutations, this means 2 reloads instead of 100. The overhead of <code>case</code> dispatch at runtime is negligible compared to compilation cost.</p><h3 id="operator-presets">Operator Presets</h3><p>Finally, we offer presets that trade thoroughness for speed:</p><pre><code class="language-clojure">(def presets
  {:fast #{:swap-plus-minus :swap-minus-plus
           :swap-lt-gt :swap-gt-lt
           :swap-and-or :swap-or-and
           :swap-nil-some :swap-some-nil}

   :minimal minimal-preset-operators  ;; Subsumption-aware

   :standard #{;; :fast plus...
               :swap-first-last :swap-rest-next
               :swap-thread-first-last}

   :comprehensive (set (map :id all-operators))})
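
;; Sketch of a lookup helper (hypothetical name, not part of Heretic),
;; assuming an unknown preset should fail loudly rather than silently
;; run with no operators:
(defn operators-for-preset [preset]
  (or (get presets preset)
      (throw (ex-info &quot;Unknown preset&quot;
                      {:preset preset
                       :known (keys presets)}))))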
</code></pre><p>The <code>:fast</code> preset uses ~15 operators that research shows catch roughly 99% of bugs. The <code>:minimal</code> preset uses subsumption analysis to eliminate redundant mutations. Both run much faster than <code>:comprehensive</code> while maintaining detection power.</p><h2 id="putting-it-together">Putting It Together</h2><p>A mutation testing run with Heretic looks like:</p><ol start="1"><li><strong>Collect coverage</strong> (once, cached): Run tests under ClojureStorm instrumentation, build expression-level coverage map</li><li><strong>Generate mutations</strong>: Parse source files, find all applicable operator sites</li><li><strong>Filter</strong>: Remove equivalent mutations, apply subsumption to reduce set</li><li><strong>Group by file</strong>: Prepare for schemata optimization</li><li><strong>For each file</strong>:<ul><li>Build schematized source with all mutations</li><li>Reload once</li><li>For each mutation: bind <code>*active-mutant*</code>, run targeted tests</li><li>Restore and reload</li></ul></li><li><strong>Report</strong>: Mutation score, surviving mutations, test effectiveness</li></ol><p>The result is mutation testing that runs in seconds for typical projects instead of hours.</p><hr /><p>This covers the core implementation. A future post will explore Phase 4: AI-powered semantic mutations and hybrid equivalent detection - using LLMs to generate the subtle, domain-aware mutations that traditional operators miss.</p><p><strong>Previously:</strong> <a href="/2025-12-28-heretic-mutation-testing.html">Part 1 - Heretic: Mutation Testing in Clojure</a></p></div>]]></content>
  </entry>
  <entry>
    <id>https://blog.parenstech.com/2025-12-28-heretic-mutation-testing.html</id>
    <link href="https://blog.parenstech.com/2025-12-28-heretic-mutation-testing.html"/>
    <title>Heretic: Mutation Testing in Clojure</title>
    <updated>2025-12-28T23:59:59+00:00</updated>
    <content type="html"><![CDATA[<div><img src="/img/heretic-logo.webp" alt="Heretic" width="50%"><p>Your tests pass. Your coverage is high. You deploy.</p><p>Three days later, a bug surfaces in a function your tests definitely executed. The coverage report confirms it: that line is green. Your test ran the code. So how did a bug slip through?</p><p>Because coverage measures execution, not verification.</p><pre><code class="language-clojure">(defn apply-discount [price user]
  (if (:premium user)
    (* price 0.8)
    price))

(deftest apply-discount-test
  (is (number? (apply-discount 100 {:premium true})))
  (is (number? (apply-discount 100 {:premium false}))))
</code></pre><p>Coverage: 100%. Every branch executed. Tests: green.</p><p>But swap <code>0.8</code> for <code>1.2</code>? Tests pass. Change <code>*</code> to <code>/</code>? Tests pass. Flip <code>(:premium user)</code> to <code>(not (:premium user))</code>? Tests pass.</p><p>The tests prove <em>some</em> number comes back. They say nothing about whether it&apos;s the right number.</p><h2 id="the-question-nobody&apos;s-asking">The Question Nobody&apos;s Asking</h2><p>Mutation testing asks a harder question: if I introduced a bug, would any test notice?</p><p>The technique is simple. Take your code, introduce a small change (a &quot;mutant&quot;), and run your tests. If a test fails, the mutant is &quot;killed&quot; - your tests caught the bug. If all tests pass, the mutant &quot;survived&quot; - you&apos;ve found a gap in your verification.</p><p>This isn&apos;t new. <a href="https://pitest.org/">PIT</a> does it for Java. <a href="https://stryker-mutator.io/">Stryker</a> does it for JavaScript. <a href="https://mutants.rs/">cargo-mutants</a> does it for Rust.</p><p>Clojure hasn&apos;t had a practical option.</p><p>The only dedicated tool, <a href="https://github.com/jstepien/mutant">jstepien/mutant</a>, was archived this year as &quot;wildly experimental.&quot; You can run PIT on Clojure bytecode, but bytecode mutations bear no relationship to mistakes Clojure developers actually make. You&apos;ll get mutations like &quot;swap IADD for ISUB&quot; when what you want is &quot;swap <code>-&gt;</code> for <code>-&gt;&gt;</code> &quot; or &quot;change <code>:user-id</code> to <code>:userId</code>.&quot;</p><h2 id="why-clojure-makes-this-hard">Why Clojure Makes This Hard</h2><p>Mutation testing has a performance problem everywhere. Run 500 mutations, execute your full test suite for each one, and you&apos;re measuring build times in hours. 
Most developers try it once, watch the clock, and never run it again.</p><p>But Clojure adds unique challenges:</p><p><strong>Homoiconicity cuts both ways.</strong> Code-as-data makes programmatic transformation elegant, but distinguishing &quot;meaningful mutation&quot; from &quot;syntactic noise&quot; gets subtle when everything is just nested lists.</p><p><strong>Macros muddy the waters.</strong> A mutation to macro input might not change the expanded code. A mutation inside a macro definition might break in ways that have nothing to do with your production logic.</p><p><strong>The bugs we make are language-specific.</strong> Threading macro confusion, nil punning traps, destructuring gotchas from JSON interop, keyword naming collisions - these aren&apos;t <code>+</code> becoming <code>-</code>. They&apos;re mistakes that come from thinking in Clojure.</p><h2 id="what-if-it-could-be-fast?">What If It Could Be Fast?</h2><p>The insight that makes <a href="https://github.com/parenstech/heretic">Heretic</a> practical: most mutations only need 2-3 tests.</p><p>When you mutate a single expression, you don&apos;t need your entire test suite. You need only the tests that exercise that expression. Usually that&apos;s a handful of tests, not hundreds.</p><p>The challenge is knowing which ones. Not just which functions they call, but which <em>subexpressions</em> they touch. The <code>+</code> inside <code>(if condition (+ a b) (* a b))</code> might be covered by different tests than the <code>*</code>.</p><p>Heretic builds this map using <a href="https://github.com/flow-storm/clojure">ClojureStorm</a>, the instrumented compiler behind <a href="https://github.com/flow-storm/flow-storm-debugger">FlowStorm</a>. Run your tests once under instrumentation. From then on, each mutation runs only the tests that actually touch that code.</p><p>Instead of running 200 tests per mutation, we run 2. 
Instead of hours, seconds.</p><h2 id="what-if-it-understood-clojure?">What If It Understood Clojure?</h2><p>Generic operators miss the bugs we actually make:</p><pre><code class="language-clojure">;; The mutation you want: threading macro confusion
(-&gt; data (get :users) first)     ; Original
(-&gt;&gt; data (get :users) first)    ; Mutant: wrong arg position, wrong result

;; The mutation you want: nil punning trap
(when (seq users) (map :name users))   ; Original (handles empty)
(when users (map :name users))         ; Mutant (breaks on empty list)

;; The mutation you want: destructuring gotcha
{:keys [user-id name]}           ; Original (kebab-case)
{:keys [userId name]}            ; Mutant (camelCase from JSON)
</code></pre><p>Heretic has 65+ mutation operators designed for Clojure idioms. Swap <code>first</code> for <code>last</code>. Change <code>rest</code> to <code>next</code>. Replace <code>-&gt;</code> with <code>some-&gt;</code>. Mutate qualified keywords. The mutations you see will be the bugs you recognize.</p><h2 id="what-if-it-could-think?">What If It Could Think?</h2><p>Here&apos;s a finding that should worry anyone relying on traditional mutation testing: <a href="https://research.chalmers.se/en/publication/536348">research shows</a> that nearly half of real-world faults have no strongly coupled traditional mutant. The bugs that escape to production aren&apos;t the ones that flip operators. They&apos;re the ones that invert business logic.</p><pre><code class="language-clojure">;; Traditional mutation: swap * for /
(* price 0.8)     ; Original
(/ price 0.8)     ; Mutant: absurd. Nobody writes this bug.

;; Semantic mutation: invert the discount
(* price 0.8)     ; Original
(* price 1.2)     ; Mutant: premium users pay MORE. Plausible bug.
</code></pre><p>A function called <code>apply-discount</code> should never increase the price. That&apos;s the invariant tests should verify. An AI can read function names, docstrings, and context to generate the mutations that <em>test whether your tests understand the code&apos;s purpose</em>.</p><p>This hybrid approach - fast deterministic mutations for the common cases, intelligent semantic mutations for the subtle ones - is where Heretic is heading. <a href="https://engineering.fb.com/2025/02/05/security/revolutionizing-software-testing-llm-powered-bug-catchers-meta-ach/">Meta&apos;s ACH system</a> proved the pattern works at industrial scale.</p><h2 id="why-&quot;heretic&quot;?">Why &quot;Heretic&quot;?</h2><p>Clojure discourages mutation. Values are immutable. State changes through controlled transitions. The design philosophy is that uncontrolled mutation leads to bugs.</p><p>So there&apos;s something a bit ironic about a tool that deliberately introduces mutations to find those bugs. We mutate your code to prove your tests would catch it if it happened accidentally - to verify that the discipline holds.</p><hr /><p>This is the first in a series on building Heretic. Upcoming posts will cover how ClojureStorm enables expression-level coverage mapping, how we use <a href="https://github.com/clj-commons/rewrite-clj">rewrite-clj</a> and <a href="https://github.com/tonsky/clj-reload">clj-reload</a> for hot-swapping mutants, and the optimization techniques that make this practical for real codebases.</p><p>If your coverage is high but bugs still slip through, you&apos;re measuring the wrong thing.</p></div>]]></content>
  </entry>
</feed>
