<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Lab Notes</title>
  <link href="https://blog.parenstech.com/atom.xml" rel="self"/>
  <link href="https://blog.parenstech.com"/>
  <updated>2025-12-30T13:15:24+00:00</updated>
  <id>https://blog.parenstech.com</id>
  <author>
    <name>Yenda</name>
  </author>
  <entry>
    <id>https://blog.parenstech.com/2025-12-30-building-heretic.html</id>
    <link href="https://blog.parenstech.com/2025-12-30-building-heretic.html"/>
    <title>Building Heretic: From ClojureStorm to Mutant Schemata</title>
    <updated>2025-12-30T23:59:59+00:00</updated>
    <content type="html"><![CDATA[<div><img src="/img/heretic-logo.webp" alt="Heretic" width="50%"><p><em>This is Part 2 of a series on mutation testing in Clojure. <a href="/2025-12-28-heretic-mutation-testing.html">Part 1</a> introduced the concept and why Clojure needed a purpose-built tool.</em></p><p>The previous post made a claim: mutation testing can be fast if you know which tests to run. This post shows how <a href="https://github.com/parenstech/heretic">Heretic</a> makes that happen.</p><p>We&apos;ll walk through the three core phases: collecting expression-level coverage with <a href="https://github.com/flow-storm/clojure">ClojureStorm</a>, transforming source code with <a href="https://github.com/clj-commons/rewrite-clj">rewrite-clj</a>, and the optimization techniques that keep mutation counts manageable.</p><h2 id="phase-1:-coverage-collection">Phase 1: Coverage Collection</h2><p>Traditional coverage tools track lines. Heretic tracks expressions.</p><p>The difference matters. Consider:</p><pre><code class="language-clojure">(defn process-order [order]
  (if (&gt; (:quantity order) 10)
    (* (:price order) 0.9)    ;; &lt;- Line 3: bulk discount
    (:price order)))
</code></pre><p>Line-level coverage would show line 3 as &quot;covered&quot; if any test enters the bulk discount branch. But expression-level coverage distinguishes between tests that evaluate <code>*</code>, <code>(:price order)</code>, and <code>0.9</code>. When we later mutate <code>0.9</code> to <code>1.1</code>, we can run only the tests that actually touched that specific literal - not every test that happened to call <code>process-order</code>.</p><h3 id="clojurestorm&apos;s-instrumented-compiler">ClojureStorm&apos;s Instrumented Compiler</h3><p><a href="https://github.com/flow-storm/clojure">ClojureStorm</a> is a fork of the Clojure compiler that instruments every expression during compilation. Created by <a href="https://github.com/jpmonettas">Juan Monetta</a> for the <a href="https://github.com/flow-storm/flow-storm-debugger">FlowStorm</a> debugger, it provides exactly the hooks Heretic needs. (Thanks to Juan for building such a solid foundation - Heretic would not exist without ClojureStorm.)</p><p>The integration is surprisingly minimal:</p><pre><code class="language-clojure">(ns heretic.tracer
  (:import [clojure.storm Emitter Tracer]))

(def ^:private current-coverage
  &quot;Atom of {form-id #{coords}} for the currently running test.&quot;
  (atom {}))

(defn record-hit! [form-id coord]
  (swap! current-coverage
         update form-id
         (fnil conj #{})
         coord))

(defn init! []
  ;; Configure what gets instrumented
  (Emitter/setInstrumentationEnable true)
  (Emitter/setFnReturnInstrumentationEnable true)
  (Emitter/setExprInstrumentationEnable true)

  ;; Set up callbacks
  (Tracer/setTraceFnsCallbacks
   {:trace-expr-fn (fn [_ _ coord form-id]
                     (record-hit! form-id coord))
    :trace-fn-return-fn (fn [_ _ coord form-id]
                          (record-hit! form-id coord))}))
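
;; Sketch, not part of the listing above: the two helpers that the
;; collection workflow calls around each test. Their names come from the
;; calls shown later in this post; the bodies are an assumption.
(defn reset-current-coverage! []
  (reset! current-coverage {}))

(defn get-current-coverage []
  @current-coverage)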
</code></pre><p>When any instrumented expression evaluates, ClojureStorm calls our callback. Two of its arguments matter here:</p><ul><li><strong>form-id</strong>: A unique identifier for the top-level form (e.g., an entire <code>defn</code>)</li><li><strong>coord</strong>: A path into the form&apos;s AST, like <code>&quot;3,2,1&quot;</code> meaning &quot;child 3, then child 2, then child 1&quot; (indices are zero-based)</li></ul><p>Together, <code>[form-id coord]</code> pinpoints exactly which subexpression executed. This is the key that unlocks targeted test selection.</p><h3 id="the-coordinate-system">The Coordinate System</h3><p>To connect a mutation in the source code to the coverage data, we need a way to uniquely address any subexpression. Think of it as a postal address for code - we need to say &quot;the <code>a</code> inside the <code>+</code> call inside the function body&quot; in a format that both the coverage tracer and mutation engine can agree on.</p><p>ClojureStorm addresses this with a path-based coordinate system. Consider this function as a tree:</p><pre><code>(defn foo [a b] (+ a b))
   │
   ├─[0] defn
   ├─[1] foo
   ├─[2] [a b]
   └─[3] (+ a b)
            │
            ├─[3,0] +
            ├─[3,1] a
            └─[3,2] b
</code></pre><p>Each number represents which child to pick at each level. The coordinate <code>&quot;3,2&quot;</code> means &quot;go to child 3 (the function body), then child 2 (the second argument to <code>+</code>)&quot;. That gives us the <code>b</code> symbol.</p><p>This works cleanly for ordered structures like lists and vectors, where children have stable positions. But maps are unordered - <code>{:name &quot;Alice&quot; :age 30}</code> and <code>{:age 30 :name &quot;Alice&quot;}</code> are the same value, so numeric indices would be unstable.</p><p>ClojureStorm solves this by hashing the printed representation of map keys. Instead of <code>&quot;0&quot;</code> for the first entry, a key like <code>:name</code> gets addressed as <code>&quot;K-1925180523&quot;</code>:</p><pre><code>{:name &quot;Alice&quot; :age 30}
   │
   ├─[K-1925180523] :name
   ├─[V-1925180523] &quot;Alice&quot;
   ├─[K-1524292809] :age
   └─[V-1524292809] 30
</code></pre><p>The hash ensures stable addressing regardless of iteration order.</p><p>With this addressing scheme, we can say &quot;test X touched coordinate 3,1 in form 12345&quot; and later ask &quot;which tests touched the expression we&apos;re about to mutate?&quot;</p><h3 id="the-form-location-bridge">The Form-Location Bridge</h3><p>Here&apos;s a problem we discovered during implementation: how do we connect the mutation engine to the coverage data?</p><p>The mutation engine uses <a href="https://github.com/clj-commons/rewrite-clj">rewrite-clj</a> to parse and transform source files. It finds a mutation site at, say, line 42 of <code>src/my/app.clj</code>. But the coverage data is indexed by ClojureStorm&apos;s form-id - an opaque identifier assigned during compilation. We need to translate &quot;file + line&quot; into &quot;form-id&quot;.</p><p>Fortunately, ClojureStorm&apos;s FormRegistry stores the source file and starting line for each compiled form. We build a lookup index:</p><pre><code class="language-clojure">(defn build-form-location-index [forms source-paths]
  (into {}
        (for [[form-id {:keys [form/file form/line]}] forms
              :when (and file line)
              :let [abs-path (resolve-path source-paths file)]
              :when abs-path]
          [[abs-path line] form-id])))
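
;; Sketch of the lookup this index supports. form-id-for-line is a
;; hypothetical name, not taken from Heretic: among the forms of one file,
;; it picks the one whose start line is the largest not exceeding the
;; given line. A linear scan keeps the sketch short.
(defn form-id-for-line [index abs-path line]
  (let [candidates (for [[[path start] form-id] index
                         :when (and (= path abs-path) (&lt;= start line))]
                     [start form-id])]
    (when (seq candidates)
      (second (apply max-key first candidates)))))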
</code></pre><p>When the mutation engine finds a site at line 42, it searches the index for the form in the same file whose start line is the largest value less than or equal to 42 - that is, the top-level form containing the site. This gives us the ClojureStorm form-id, which we use to look up which tests touched that form.</p><p>This bridging layer is what allows Heretic to connect source transformations to runtime coverage, enabling targeted test execution.</p><h3 id="collection-workflow">Collection Workflow</h3><p>Coverage collection runs each test individually and captures what it touches:</p><pre><code class="language-clojure">(defn run-test-with-coverage [test-var]
  (tracer/reset-current-coverage!)
  (try
    (test-var)
    (catch Throwable t
      (println &quot;Test threw exception:&quot; (.getMessage t))))
  {(symbol test-var) (tracer/get-current-coverage)})
</code></pre><p>The result is a map from test symbol to coverage data:</p><pre><code class="language-clojure">{my.app-test/test-addition
  {12345 #{&quot;3&quot; &quot;3,1&quot; &quot;3,2&quot;}    ;; form-id -&gt; coords touched
   12346 #{&quot;1&quot; &quot;2,1&quot;}}
 my.app-test/test-subtraction
  {12345 #{&quot;3&quot; &quot;4&quot;}
   12347 #{&quot;1&quot;}}}
</code></pre><p>This gets persisted to <code>.heretic/coverage/</code> with one file per test namespace, enabling incremental updates. Change a test file? Only that namespace gets recollected.</p><p>At this point we have a complete map: for every test, we know exactly which <code>[form-id coord]</code> pairs it touched. Now we need to generate mutations and look up which tests are relevant for each one.</p><h2 id="phase-2:-the-mutation-engine">Phase 2: The Mutation Engine</h2><p>With coverage data in hand, we need to actually mutate the code. This means:</p><ol start="1"><li>Parsing Clojure source into a navigable structure</li><li>Finding locations where operators apply</li><li>Transforming the source</li><li>Hot-swapping the modified code into the running JVM</li></ol><h3 id="parsing-with-rewrite-clj">Parsing with rewrite-clj</h3><p><a href="https://github.com/clj-commons/rewrite-clj">rewrite-clj</a> gives us a zipper over Clojure source that preserves whitespace and comments - essential for producing readable diffs:</p><pre><code class="language-clojure">(defn parse-file [path]
  (z/of-file path {:track-position? true}))

(defn find-mutation-sites [zloc]
  (-&gt;&gt; (walk-form zloc)
       (remove in-quoted-form?)  ;; Skip &apos;(...) and `(...)
       (mapcat (fn [z]
                 (let [applicable (ops/applicable-operators z)]
                   (map #(make-mutation-site z %) applicable))))))
</code></pre><p>The <code>walk-form</code> function traverses the zipper depth-first. At each node, we check which operators match. An operator is a data map with a matcher predicate:</p><pre><code class="language-clojure">(def swap-plus-minus
  {:id :swap-plus-minus
   :original &apos;+
   :replacement &apos;-
   :description &quot;Replace + with -&quot;
   :matcher (fn [zloc]
              (and (= :token (z/tag zloc))
                   (symbol? (z/sexpr zloc))
                   (= &apos;+ (z/sexpr zloc))))})
</code></pre><p>Each mutation site captures the file, line, column, operator, and - critically - the coordinate path within the form. This coordinate is what connects a mutation to the coverage data from Phase 1.</p><h3 id="coordinate-mapping">Coordinate Mapping</h3><p>The tricky part is converting between rewrite-clj&apos;s zipper positions and ClojureStorm&apos;s coordinate strings. We need bidirectional conversion for the round-trip:</p><pre><code class="language-clojure">(defn coord-&gt;zloc [zloc coord]
  (let [parts (parse-coord coord)]  ;; &quot;3,2,1&quot; -&gt; [3 2 1]
    (reduce
     (fn [z part]
       (when z
         (if (string? part)      ;; Hash-based for maps/sets
           (find-by-hash z part)
           (nth-child z part)))) ;; Integer index for lists/vectors
     zloc
     parts)))

(defn zloc-&gt;coord [zloc]
  (loop [z zloc
         coord []]
    (cond
      (root-form? z) (vec coord)
      (z/up z)
      (let [part (if (is-unordered-collection? z)
                   (compute-hash-coord z)
                   (child-index z))]
        (recur (z/up z) (cons part coord)))
      :else (vec coord))))
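
;; Sketch of the string side of the round trip. parse-coord is called above
;; but not shown, so this body is an assumption; format-coord is a
;; hypothetical inverse that joins the parts zloc-&gt;coord collects.
(require &apos;[clojure.string :as str])

(defn parse-coord [coord]
  (if (empty? coord)
    []
    (mapv (fn [part]
            (if (re-matches #&quot;\d+&quot; part)
              (Long/parseLong part)   ;; positional index for lists/vectors
              part))                  ;; hash part such as K-1925180523
          (str/split coord #&quot;,&quot;))))

(defn format-coord [parts]
  (str/join &quot;,&quot; parts))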
</code></pre><p>The validation requirement is that these must be inverses:</p><pre><code class="language-clojure">(= coord (zloc-&gt;coord (coord-&gt;zloc zloc coord)))
</code></pre><p>With correct coordinate mapping, we can take a mutation at a known location and ask &quot;which tests touched this exact spot?&quot; That query is what makes targeted test execution possible.</p><h3 id="applying-mutations">Applying Mutations</h3><p>Once we find a mutation site and can navigate to it, the actual transformation is straightforward:</p><pre><code class="language-clojure">(defn apply-mutation! [mutation]
  (let [{:keys [file form-id coord operator]} mutation
        operator-def (get ops/operators-by-id operator)
        original-content (slurp file)
        zloc (z/of-string original-content {:track-position? true})
        form-zloc (find-form-by-id zloc form-id)
        target-zloc (coord/coord-&gt;zloc form-zloc coord)
        replacement-str (ops/apply-operator operator-def target-zloc)
        modified-zloc (z/replace target-zloc
                                 (n/token-node (symbol replacement-str)))
        modified-content (z/root-string modified-zloc)]
    (spit file modified-content)
    (assoc mutation :backup original-content)))
</code></pre><h3 id="hot-swapping-with-clj-reload">Hot-Swapping with clj-reload</h3><p>After modifying the source file, we need the JVM to see the change. <a href="https://github.com/tonsky/clj-reload">clj-reload</a> handles this correctly:</p><pre><code class="language-clojure">(ns heretic.reloader
  (:require [clj-reload.core :as reload]))

(defn init! [source-paths]
  (reload/init {:dirs source-paths}))

(defn reload-after-mutation! []
  (reload/reload {:throw false}))
</code></pre><p>Why clj-reload specifically? It solves problems that <code>require :reload</code> doesn&apos;t:</p><ol start="1"><li><strong>Proper unloading</strong>: Calls <code>remove-ns</code> before reloading, preventing protocol/multimethod accumulation</li><li><strong>Dependency ordering</strong>: Topologically sorts namespaces, unloading dependents first</li><li><strong>Transitive closure</strong>: Automatically reloads namespaces that depend on the changed one</li></ol><p>The mutation workflow becomes:</p><pre><code class="language-clojure">(with-mutation [m mutation]
  (reloader/reload-after-mutation!)
  (run-relevant-tests m))
;; Mutation automatically reverted in finally block
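
;; Sketch of the revert-in-finally pattern behind that comment. The helper
;; name is hypothetical and the with-mutation macro in Heretic may differ.
(defn call-with-file-restored [file f]
  (let [backup (slurp file)]
    (try
      (f)
      (finally
        (spit file backup)))))   ;; runs even when a test throws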
</code></pre><p>At this point we have the full pipeline: parse source, find mutation sites, apply a mutation, hot-reload, run targeted tests, restore. But running this once per mutation is still slow for large codebases. Phase 3 addresses that.</p><h3 id="80+-clojure-specific-operators">80+ Clojure-Specific Operators</h3><p>The operator library is where Heretic&apos;s Clojure focus shows. Beyond the standard arithmetic and comparison swaps, we have:</p><p><strong>Threading operators</strong> - catch <code>-&gt;</code>/<code>-&gt;&gt;</code> confusion:</p><pre><code class="language-clojure">(-&gt; data (get :users) first)   ;; Original
(-&gt;&gt; data (get :users) first)  ;; Mutant: wrong arg position
</code></pre><p><strong>Nil-handling operators</strong> - expose nil punning mistakes:</p><pre><code class="language-clojure">(when (seq users) ...)   ;; Original: handles empty list
(when users ...)         ;; Mutant: breaks on empty list (truthy)
</code></pre><p><strong>Lazy/eager operators</strong> - catch chunking and realization bugs:</p><pre><code class="language-clojure">(map process items)    ;; Original: lazy
(mapv process items)   ;; Mutant: eager, different memory profile
</code></pre><p><strong>Destructuring operators</strong> - expose JSON interop issues:</p><pre><code class="language-clojure">{:keys [user-id]}   ;; Original: kebab-case
{:keys [userId]}    ;; Mutant: camelCase from JSON
</code></pre><p>The full set includes <code>first</code>/<code>last</code>, <code>rest</code>/<code>next</code>, <code>filter</code>/<code>remove</code>, <code>conj</code>/<code>disj</code>, <code>some-&gt;</code>/<code>-&gt;</code>, and qualified keyword mutations. These are the mistakes Clojure developers actually make.</p><p>With 80+ operators applied to a real codebase, mutation counts grow quickly. The next phase makes this tractable.</p><h2 id="phase-3:-optimization-techniques">Phase 3: Optimization Techniques</h2><p>A 1000-line project might generate 5000 mutations, and running the full test suite 5000 times is not practical.</p><p>Heretic uses several techniques to make this manageable.</p><h3 id="targeted-test-execution">Targeted Test Execution</h3><p>This is the big one, enabled by Phase 1. Instead of running all tests for every mutation, we query the coverage index:</p><pre><code class="language-clojure">(defn tests-for-mutation [coverage-map mutation]
  (let [form-id (resolve-form-id (:form-location-index coverage-map) mutation)
        coord (:coord mutation)]
    (get-in coverage-map [:coord-to-tests [form-id coord]] #{})))
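
;; Sketch of how the :coord-to-tests index queried above could be derived
;; from the per-test coverage of Phase 1. invert-coverage is a hypothetical
;; name and the shapes are assumed from the examples in this post.
(defn invert-coverage [coverage]
  (reduce (fn [index [test-sym form-id coord]]
            (update index [form-id coord] (fnil conj #{}) test-sym))
          {}
          (for [[test-sym forms] coverage
                [form-id coords] forms
                coord coords]
            [test-sym form-id coord])))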
</code></pre><p>A mutation at <code>(+ a b)</code> might only be covered by 2 tests out of 200. We run those 2 tests in milliseconds instead of the full suite in seconds.</p><p>This is where the Phase 1 coverage investment pays off. But we can go further by reducing the number of mutations we generate in the first place.</p><h3 id="equivalent-mutation-detection">Equivalent Mutation Detection</h3><p>Some mutations produce semantically identical code. Detecting these upfront avoids wasted test runs:</p><pre><code class="language-clojure">;; (* x 0) -&gt; (/ x 0) is NOT equivalent (divide by zero)
;; (* x 1) -&gt; (/ x 1) IS equivalent (both return x)

(def equivalent-patterns
  [{:operator :swap-mult-div
    :context (fn [zloc]
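               ;; Only operands after the first may be 1: (* 1 x) is x, but (/ 1 x) is 1/x
               (let [operands (drop 2 (z/child-sexprs (z/up zloc)))]
                 (and (seq operands) (every? #(= 1 %) operands))))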
    :reason &quot;Multiplying or dividing by one has no effect&quot;}

   {:operator :swap-lte-eq
    :context (fn [zloc]
               (let [[_ left right] (z/child-sexprs (z/up zloc))]
                 (and (= 0 right)
                      (seq? left)   ;; guard: left may be a bare symbol
                      (non-negative-fn? (first left)))))
    :reason &quot;(&lt;= (count x) 0) and (= (count x) 0) agree: count is never negative&quot;}])
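
;; Sketch of applying the patterns as a filter. equivalent-mutation? is a
;; hypothetical helper; it assumes each mutation site carries its operator
;; id and the zipper location of the mutated node.
(defn equivalent-mutation? [patterns operator-id zloc]
  (boolean
   (some (fn [{:keys [operator context]}]
           (and (= operator operator-id)
                (context zloc)))
         patterns)))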
</code></pre><p>The patterns cover boundary comparisons (<code>(&gt;= (count x) 0)</code> is always true), function contracts (<code>(nil? (str x))</code> is always false), and lazy/eager equivalences (<code>(vec (map f xs))</code> equals <code>(vec (mapv f xs))</code>).</p><p>Filtering equivalent mutations prevents false &quot;survived&quot; reports. But we can also skip mutations that would be redundant to test.</p><h3 id="subsumption-analysis">Subsumption Analysis</h3><p>Subsumption identifies when killing one mutation implies another would also be killed. If swapping <code>&lt;</code> to <code>&lt;=</code> is caught by a test, then swapping <code>&lt;</code> to <code>&gt;</code> would likely be caught too.</p><p>Based on the RORG (Relational Operator Replacement with Guard) research, we define subsumption relationships:</p><pre><code class="language-clojure">(def relational-operator-subsumption
  {&apos;&lt;  [:swap-lt-lte :swap-lt-neq :replace-comparison-false]
   &apos;&gt;  [:swap-gt-gte :swap-gt-neq :replace-comparison-false]
   &apos;&lt;= [:swap-lte-lt :swap-lte-eq :replace-comparison-true]
   ;; ...
   })
</code></pre><p>For each comparison operator, we only need to test the minimal set. The research shows this achieves roughly the same fault detection with 40% fewer mutations.</p><p>The subsumption graph also enables intelligent mutation selection:</p><pre><code class="language-clojure">(defn minimal-operator-set [operators]
  (set/difference
   operators
   ;; Remove any operator dominated by another in the set
   (reduce
    (fn [dominated op]
      (into dominated
            (set/intersection (dominated-operators op) operators)))
    #{}
    operators)))
</code></pre><p>These techniques reduce mutation count. The final optimization reduces the cost of each mutation.</p><h3 id="mutant-schemata:-compile-once,-select-at-runtime">Mutant Schemata: Compile Once, Select at Runtime</h3><p>The most sophisticated optimization is mutant schemata. Instead of repeating the apply, reload, test, revert, reload cycle for every mutation, we embed multiple mutations into a single compilation:</p><pre><code class="language-clojure">;; Original
(defn calculate [x] (+ x 1))

;; Schematized (with 3 mutations)
(defn calculate [x]
  (case heretic.schemata/*active-mutant*
    :mut-42-5-plus-minus (- x 1)
    :mut-42-5-1-to-0     (+ x 0)
    :mut-42-5-1-to-2     (+ x 2)
    (+ x 1)))  ;; original (default)
</code></pre><p>We reload once, then switch between mutations by binding a dynamic var:</p><pre><code class="language-clojure">(def ^:dynamic *active-mutant* nil)

(defmacro with-mutant [mutation-id &amp; body]
  `(binding [*active-mutant* ~mutation-id]
     ~@body))
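
;; Usage sketch: a trimmed-down calculate from the schematized example
;; (the mutant id is illustrative). Dynamic bindings are per thread, so this
;; assumes tests run on the binding thread or convey their bindings.
(defn calculate [x]
  (case *active-mutant*
    :mut-42-5-plus-minus (- x 1)
    (+ x 1)))

(calculate 1)                                      ;; returns 2, original path
(with-mutant :mut-42-5-plus-minus (calculate 1))   ;; returns 0, mutant path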
</code></pre><p>The workflow becomes:</p><pre><code class="language-clojure">(defn run-mutation-batch [file mutations test-fn]
  (let [schemata-info (schematize-file! file mutations)]
    (try
      (reload!)  ;; Once!
      (doseq [[id mutation] (:mutation-map schemata-info)]
        (with-mutant id
          (test-fn id mutation)))
      (finally
        (restore-file! schemata-info)
        (reload!)))))  ;; Once!
</code></pre><p>For a file with 50 mutations, this means 2 reloads instead of 100. The overhead of <code>case</code> dispatch at runtime is negligible compared to compilation cost.</p><h3 id="operator-presets">Operator Presets</h3><p>Finally, we offer presets that trade thoroughness for speed:</p><pre><code class="language-clojure">(def presets
  {:fast #{:swap-plus-minus :swap-minus-plus
           :swap-lt-gt :swap-gt-lt
           :swap-and-or :swap-or-and
           :swap-nil-some :swap-some-nil}

   :minimal minimal-preset-operators  ;; Subsumption-aware

   :standard #{;; :fast plus...
               :swap-first-last :swap-rest-next
               :swap-thread-first-last}

   :comprehensive (set (map :id all-operators))})
</code></pre><p>The <code>:fast</code> preset uses ~15 operators that research shows catch roughly 99% of bugs. The <code>:minimal</code> preset uses subsumption analysis to eliminate redundant mutations. Both run much faster than <code>:comprehensive</code> while maintaining detection power.</p><h2 id="putting-it-together">Putting It Together</h2><p>A mutation testing run with Heretic looks like:</p><ol start="1"><li><strong>Collect coverage</strong> (once, cached): Run tests under ClojureStorm instrumentation, build expression-level coverage map</li><li><strong>Generate mutations</strong>: Parse source files, find all applicable operator sites</li><li><strong>Filter</strong>: Remove equivalent mutations, apply subsumption to reduce set</li><li><strong>Group by file</strong>: Prepare for schemata optimization</li><li><strong>For each file</strong>:<ul><li>Build schematized source with all mutations</li><li>Reload once</li><li>For each mutation: bind <code>*active-mutant*</code>, run targeted tests</li><li>Restore and reload</li></ul></li><li><strong>Report</strong>: Mutation score, surviving mutations, test effectiveness</li></ol><p>The result is mutation testing that runs in seconds for typical projects instead of hours.</p><hr /><p>This covers the core implementation. A future post will explore Phase 4: AI-powered semantic mutations and hybrid equivalent detection - using LLMs to generate the subtle, domain-aware mutations that traditional operators miss.</p><p><strong>Previously:</strong> <a href="/2025-12-28-heretic-mutation-testing.html">Part 1 - Heretic: Mutation Testing in Clojure</a></p></div>]]></content>
  </entry>
  <entry>
    <id>https://blog.parenstech.com/2025-12-28-heretic-mutation-testing.html</id>
    <link href="https://blog.parenstech.com/2025-12-28-heretic-mutation-testing.html"/>
    <title>Heretic: Mutation Testing in Clojure</title>
    <updated>2025-12-28T23:59:59+00:00</updated>
    <content type="html"><![CDATA[<div><img src="/img/heretic-logo.webp" alt="Heretic" width="50%"><p>Your tests pass. Your coverage is high. You deploy.</p><p>Three days later, a bug surfaces in a function your tests definitely executed. The coverage report confirms it: that line is green. Your test ran the code. So how did a bug slip through?</p><p>Because coverage measures execution, not verification.</p><pre><code class="language-clojure">(defn apply-discount [price user]
  (if (:premium user)
    (* price 0.8)
    price))

(deftest apply-discount-test
  (is (number? (apply-discount 100 {:premium true})))
  (is (number? (apply-discount 100 {:premium false}))))
</code></pre><p>Coverage: 100%. Every branch executed. Tests: green.</p><p>But swap <code>0.8</code> for <code>1.2</code>? Tests pass. Change <code>*</code> to <code>/</code>? Tests pass. Flip <code>(:premium user)</code> to <code>(not (:premium user))</code>? Tests pass.</p><p>The tests prove <em>some</em> number comes back. They say nothing about whether it&apos;s the right number.</p><h2 id="the-question-nobody&apos;s-asking">The Question Nobody&apos;s Asking</h2><p>Mutation testing asks a harder question: if I introduced a bug, would any test notice?</p><p>The technique is simple. Take your code, introduce a small change (a &quot;mutant&quot;), and run your tests. If a test fails, the mutant is &quot;killed&quot; - your tests caught the bug. If all tests pass, the mutant &quot;survived&quot; - you&apos;ve found a gap in your verification.</p><p>This isn&apos;t new. <a href="https://pitest.org/">PIT</a> does it for Java. <a href="https://stryker-mutator.io/">Stryker</a> does it for JavaScript. <a href="https://mutants.rs/">cargo-mutants</a> does it for Rust.</p><p>Clojure hasn&apos;t had a practical option.</p><p>The only dedicated tool, <a href="https://github.com/jstepien/mutant">jstepien/mutant</a>, was archived this year as &quot;wildly experimental.&quot; You can run PIT on Clojure bytecode, but bytecode mutations bear no relationship to mistakes Clojure developers actually make. You&apos;ll get mutations like &quot;swap IADD for ISUB&quot; when what you want is &quot;swap <code>-&gt;</code> for <code>-&gt;&gt;</code> &quot; or &quot;change <code>:user-id</code> to <code>:userId</code>.&quot;</p><h2 id="why-clojure-makes-this-hard">Why Clojure Makes This Hard</h2><p>Mutation testing has a performance problem everywhere. Run 500 mutations, execute your full test suite for each one, and you&apos;re measuring build times in hours. 
Most developers try it once, watch the clock, and never run it again.</p><p>But Clojure adds unique challenges:</p><p><strong>Homoiconicity cuts both ways.</strong> Code-as-data makes programmatic transformation elegant, but distinguishing &quot;meaningful mutation&quot; from &quot;syntactic noise&quot; gets subtle when everything is just nested lists.</p><p><strong>Macros muddy the waters.</strong> A mutation to macro input might not change the expanded code. A mutation inside a macro definition might break in ways that have nothing to do with your production logic.</p><p><strong>The bugs we make are language-specific.</strong> Threading macro confusion, nil punning traps, destructuring gotchas from JSON interop, keyword naming collisions - these aren&apos;t <code>+</code> becoming <code>-</code>. They&apos;re mistakes that come from thinking in Clojure.</p><h2 id="what-if-it-could-be-fast?">What If It Could Be Fast?</h2><p>The insight that makes <a href="https://github.com/parenstech/heretic">Heretic</a> practical: most mutations only need 2-3 tests.</p><p>When you mutate a single expression, you don&apos;t need your entire test suite. You need only the tests that exercise that expression. Usually that&apos;s a handful of tests, not hundreds.</p><p>The challenge is knowing which ones. Not just which functions they call, but which <em>subexpressions</em> they touch. The <code>+</code> inside <code>(if condition (+ a b) (* a b))</code> might be covered by different tests than the <code>*</code>.</p><p>Heretic builds this map using <a href="https://github.com/flow-storm/clojure">ClojureStorm</a>, the instrumented compiler behind <a href="https://github.com/flow-storm/flow-storm-debugger">FlowStorm</a>. Run your tests once under instrumentation. From then on, each mutation runs only the tests that actually touch that code.</p><p>Instead of running 200 tests per mutation, we run 2. 
Instead of hours, seconds.</p><h2 id="what-if-it-understood-clojure?">What If It Understood Clojure?</h2><p>Generic operators miss the bugs we actually make:</p><pre><code class="language-clojure">;; The mutation you want: threading macro confusion
(-&gt; data (get :users) first)     ; Original
(-&gt;&gt; data (get :users) first)    ; Mutant: wrong arg position, wrong result

;; The mutation you want: nil punning trap
(when (seq users) (map :name users))   ; Original (handles empty)
(when users (map :name users))         ; Mutant (breaks on empty list)

;; The mutation you want: destructuring gotcha
{:keys [user-id name]}           ; Original (kebab-case)
{:keys [userId name]}            ; Mutant (camelCase from JSON)
</code></pre><p>Heretic has 65+ mutation operators designed for Clojure idioms. Swap <code>first</code> for <code>last</code>. Change <code>rest</code> to <code>next</code>. Replace <code>-&gt;</code> with <code>some-&gt;</code>. Mutate qualified keywords. The mutations you see will be the bugs you recognize.</p><h2 id="what-if-it-could-think?">What If It Could Think?</h2><p>Here&apos;s a finding that should worry anyone relying on traditional mutation testing: <a href="https://research.chalmers.se/en/publication/536348">research shows</a> that nearly half of real-world faults have no strongly coupled traditional mutant. The bugs that escape to production aren&apos;t the ones that flip operators. They&apos;re the ones that invert business logic.</p><pre><code class="language-clojure">;; Traditional mutation: swap * for /
(* price 0.8)  --&gt;  (/ price 0.8)     ; Absurd. Nobody writes this bug.

;; Semantic mutation: invert the discount
(* price 0.8)  --&gt;  (* price 1.2)     ; Premium users pay MORE. Plausible bug.
</code></pre><p>A function called <code>apply-discount</code> should never increase the price. That&apos;s the invariant tests should verify. An AI can read function names, docstrings, and context to generate the mutations that <em>test whether your tests understand the code&apos;s purpose</em>.</p><p>This hybrid approach - fast deterministic mutations for the common cases, intelligent semantic mutations for the subtle ones - is where Heretic is heading. <a href="https://engineering.fb.com/2025/02/05/security/revolutionizing-software-testing-llm-powered-bug-catchers-meta-ach/">Meta&apos;s ACH system</a> proved the pattern works at industrial scale.</p><h2 id="why-&quot;heretic&quot;?">Why &quot;Heretic&quot;?</h2><p>Clojure discourages mutation. Values are immutable. State changes through controlled transitions. The design philosophy is that uncontrolled mutation leads to bugs.</p><p>So there&apos;s something a bit ironic about a tool that deliberately introduces mutations to find those bugs. We mutate your code to prove your tests would catch it if it happened accidentally - to verify that the discipline holds.</p><hr /><p>This is the first in a series on building Heretic. Upcoming posts will cover how ClojureStorm enables expression-level coverage mapping, how we use <a href="https://github.com/clj-commons/rewrite-clj">rewrite-clj</a> and <a href="https://github.com/tonsky/clj-reload">clj-reload</a> for hot-swapping mutants, and the optimization techniques that make this practical for real codebases.</p><p>If your coverage is high but bugs still slip through, you&apos;re measuring the wrong thing.</p></div>]]></content>
  </entry>
</feed>
