Transform Tool — Rebuild Specification

1. Confirmation Gate
2. URL Contract
3. Codec Registry
4. Pipeline Execution
5. Serialization
6. Visual Contract
7. Test Suite (URL → Expected Output)
8. Failure Handler
9. Formal Precision (from Dafny review)
10. Non-Functional Requirements
11. URL Parsing Clarification (revised)
12. Instrumentation
13. Spec Review History

Axiom: The URL is the sole source of truth. ?chain=rot13+base64:d&text=Hello fully determines the output. No server state. No session. Pure function from URL → rendered page.

1. Confirmation Gate

Before writing code, the implementing agent MUST state:

How many codecs are in the registry (answer: 24)
The four codec patterns (involution, bijection, parameterized, surjection) and an example of each
What caesar:3:d produces on input HW WX EUXWH (answer: ET TU BRUTE)
Why rot13 does not need encode/decode but caesar:3 does

If any answer is wrong, stop and re-read this spec.

2. URL Contract

2.2. Query Parameters

Param	Required	Format	Example
`chain`	no	`+` delimited codec steps	`rot13+base64:d`
`text`	no	input string (spaces as `+` or `%20`)	`Hello+World`

Default (no params): input = Hello, World!, chain = identity encode.

2.3. Chain Syntax

Each step in the + delimited chain follows:

<codec-id>[:<arg1>[:<arg2>...]][:<direction>]

Direction tokens: d, decode, e, encode. Default: encode. Everything between the codec ID and the direction token is positional args.

Chain token	Codec	Args	Direction
`rot13`	rot13	—	encode (irrelevant: involution)
`base64:d`	base64	—	decode
`caesar:3`	caesar	[3]	encode (shift +3)
`caesar:3:d`	caesar	[3]	decode (shift +23 = -3 mod 26)
`xor:42:d`	xor	[42]	decode
`hex+sort`	hex, sort	—	encode, encode (two steps)

2.4. Chain Grammar

The chain parameter has a formal grammar. The three character classes (codec ID, arg, direction token) are disjoint by construction, which makes parsing unambiguous with no lookahead.

;; In the URL (before decode):
chain   = step ("+" step)*

;; After URL decode (+ becomes space):
chain   = step (" " step)*

;; Per-step grammar (same either way):
step    = id (":" arg)* (":" dir)?
id      = [a-z0-9][a-z0-9-]*    ; lowercase alphanumeric start (covers 1337), then alphanumeric or hyphen
arg     = <any segment that is not "d">
dir     = "d"                    ; decode. absence = encode (the default).

One reserved word: d. A trailing :d always means decode. It can never be an argument. Encode is the default and has no suffix.

That is the entire parsing rule for canonical chains. The last colon-segment is checked:

If it is exactly "d" → direction is decode, remove it from args
Otherwise → direction is encode, it stays as an arg

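The liberal parser (Section 5.2) additionally accepts :e, :encode, and :decode as the last segment and treats them as direction tokens, not args. They are accepted on input for tolerance but never emitted. steps->str produces :d or nothing.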

2.4.1. Why this is unambiguous

"d" is reserved. A codec arg can be anything except the bare string "d". If a future codec needs a literal d as an argument, it must use a different encoding (e.g., 100 for ASCII code, dd as an escape, or a longer form like del). The existing 24 codecs have only numeric args (caesar shift), so this costs nothing today.

2.5. Why `+` works: URL encoding does the parsing for you

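In URL query strings, + encodes a space. Standard query-string parsing (URLSearchParams, url.parse, Ring, etc.) converts + to space before the application sees the value. decodeURIComponent on its own does not: it decodes %20 but leaves + untouched, so it must not be the only decoding step. This means: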

URL:        ?chain=rot13+base64:d+caesar:3:d&text=Hello+World
URL decode: chain = "rot13 base64:d caesar:3:d"
            text  = "Hello World"

After URL decoding, the chain value is a space-separated list of step tokens. Every language's default string split works:

Language	Split expression	Result
Clojure	`(str/split s #"\s+")`	`["rot13" "base64:d" "caesar:3:d"]`
Python	`s.split()`	`["rot13", "base64:d", "caesar:3:d"]`
OCaml	`String.split_on_char ' ' s`	`["rot13"; "base64:d"; "caesar:3:d"]`
Rust	`s.split_whitespace()`	`["rot13", "base64:d", "caesar:3:d"]`
TypeScript	`s.split(/\s+/)`	`["rot13", "base64:d", "caesar:3:d"]`
Racket	`(string-split s)`	`'("rot13" "base64:d" "caesar:3:d")`
Haskell	`words s`	`["rot13", "base64:d", "caesar:3:d"]`

No custom delimiter logic needed. URLSearchParams is safe to use.
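The same behaviour can be checked outside the browser. A minimal Python sketch, assuming the standard library's urllib.parse.parse_qs as the query parser (it applies the same form-decoding rule); read_chain is a hypothetical helper name, and the defaults are the ones stated in 2.2:

```python
from urllib.parse import parse_qs

def read_chain(query: str) -> tuple[list[str], str]:
    """Decode a raw query string and split the chain into step tokens."""
    params = parse_qs(query, keep_blank_values=True)   # '+' and %20 both become ' '
    chain = params.get("chain", ["identity"])[0]        # default chain: identity
    text = params.get("text", ["Hello, World!"])[0]     # default input
    return chain.split(), text                          # plain whitespace split

tokens, text = read_chain("chain=rot13+base64:d+caesar:3:d&text=Hello+World")
print(tokens)  # ['rot13', 'base64:d', 'caesar:3:d']
print(text)    # Hello World
```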

Canonical output: steps->str MUST emit + (which URL-encodes space). This keeps the URL readable: ?chain=rot13+base64:d looks like a pipeline, not ?chain=rot13%20base64%3Ad. URLSearchParams emits + for spaces when serializing, but its form-encoding serializer also percent-encodes : as %3A, so an implementation that wants the literal colon must restore it after serialization or assemble the query string by hand.
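One way to get the readable form out of a form-encoding serializer is to mark the colon as safe. A minimal Python sketch using urllib.parse.urlencode; build_query is a hypothetical helper, not part of any implementation named in this spec:

```python
from urllib.parse import urlencode

def build_query(chain: str, text: str) -> str:
    """Serialize chain/text so spaces become '+' and ':' stays literal."""
    # safe=":" is the only deviation from the default form encoding.
    return urlencode({"chain": chain, "text": text}, safe=":")

print(build_query("rot13 base64:d", "Hello World"))
# chain=rot13+base64:d&text=Hello+World
```

A literal plus sign inside the text still comes out as %2B, so it cannot be confused with the step delimiter.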

3. Codec Registry

24 codecs in four patterns.

3.1. Pattern 1: Involution (`:fn`)

f(f(x)) = x. One function. Direction is irrelevant. Verb: apply.

ID	Label	Notes
`identity`	identity	`f(x) = x`. The unit.
`reverse`	reverse	String reversal. Anti-homomorphism: `f(ab) = f(b)f(a)`.
`rot13`	rot13	Caesar shift 13. The only non-trivial Caesar shift that is its own inverse.
`jump5`	jump5	Digit substitution: 0↔5 1↔6 2↔7 3↔8 4↔9. Non-digits pass through.
`atbash`	atbash	Mirror alphabet: A↔Z B↔Y C↔X. Hebrew cipher.

Registry shape: {:id :rot13 :label "rot13" :fn rot13 :involution? true :inverse? true}
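The five involutions can be sketched as translation tables plus a property check. This is an illustrative Python sketch, not the reference implementation; the INVOLUTIONS dict is a hypothetical stand-in for the registry, and mirroring lowercase letters within lowercase for atbash is an assumption (the table only lists uppercase pairs):

```python
import string

_UP, _LO = string.ascii_uppercase, string.ascii_lowercase

# Each table maps a character set onto itself, so applying it twice is a no-op.
ROT13 = str.maketrans(_UP + _LO, _UP[13:] + _UP[:13] + _LO[13:] + _LO[:13])
ATBASH = str.maketrans(_UP + _LO, _UP[::-1] + _LO[::-1])   # assumption: lowercase mirrors too
JUMP5 = str.maketrans("0123456789", "5678901234")

INVOLUTIONS = {
    "identity": lambda s: s,
    "reverse": lambda s: s[::-1],
    "rot13": lambda s: s.translate(ROT13),
    "atbash": lambda s: s.translate(ATBASH),
    "jump5": lambda s: s.translate(JUMP5),
}

sample = "[VADER] No. I am your father. 0123456789"
for name, f in INVOLUTIONS.items():
    assert f(f(sample)) == sample, name      # f(f(x)) = x

print(INVOLUTIONS["rot13"]("[VADER] No. I am your father."))
# [INQRE] Ab. V nz lbhe sngure.
```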

3.2. Pattern 2: Bijection (`:encode` / `:decode`)

decode(encode(x)) = x but encode ≠ decode. Two separate functions. Verb: encode or decode.

ID	Label	Notes
`1337`	1337	Leet speak. A↔4 E↔3 I↔1 O↔0 S↔5 T↔7 B↔8 G↔6. Uppercase only.
`xor`	xor 0x42	XOR each byte with 0x42. Output is hex string.
`base64`	base64	RFC 4648. Uses `TextEncoder`/`TextDecoder` (NOT `btoa`/`encodeURIComponent`).
`url-encode`	url-encode	Percent-encoding per RFC 3986.
`hex`	hex	Each byte → 2-char hex.
`a1z26`	A1Z26	A=1 … Z=26. Spaces → `/`. Gravity Falls cipher.
`binary`	binary	Each char → 8-bit binary, space-separated.
`char-codes`	char-codes	Each char → decimal ordinal, space-separated.
`morse`	morse	ITU Morse code. Words separated by `/`.
`ipv4-int`	ipv4→int	Dotted-quad ↔ uint32. Domain-restricted.
`hamming`	hamming	Hamming(7,4). Input is hex. Corrects single-bit errors on decode.
`f-to-c`	°F→°C	`(f - 32) * 5/9`.
`c-to-f`	°C→°F	`c * 9/5 + 32`.
`c-to-k`	°C→K	`c + 273.15`.

Registry shape: {:id :base64 :label "base64" :encode encode-fn :decode decode-fn :inverse? true}
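As an example of an encode/decode pair, here is a hedged Python sketch of base64 over UTF-8 bytes. The str.encode / bytes.decode calls play the role that TextEncoder / TextDecoder play in the browser; the function names are hypothetical and strict UTF-8 decoding is an assumption about how failures should surface:

```python
import base64
import binascii

def b64_encode(s: str) -> str:
    # Encode the UTF-8 bytes of the text, then base64 those bytes.
    return base64.b64encode(s.encode("utf-8")).decode("ascii")

def b64_decode(s: str) -> str:
    # validate=True rejects characters outside the RFC 4648 alphabet;
    # the strict UTF-8 decode rejects byte sequences that are not text.
    try:
        return base64.b64decode(s, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError) as exc:
        raise ValueError(f"base64 decode failed: {exc}") from exc

print(b64_encode("Uryyb"))                             # VXJ5eWI=
print(b64_decode("TmV2ZXIgZ29ubmEgZ2l2ZSB5b3UgdXA="))  # Never gonna give you up
```

Going through bytes is what keeps non-ASCII input round-tripping: decode(encode(x)) = x holds for any string, not only ASCII.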

3.3. Pattern 3: Parameterized (`:make-fn`)

A factory function receives [args] dir and returns (string → string). The same arg means different things depending on direction.

ID	Label	Args	Encode	Decode
`caesar`	caesar	shift N (default 13)	`rot-n N`	`rot-n (26-N)`

Registry shape:

{:id :caesar :label "caesar"
 :make-fn (fn [args dir]
            (let [n     (parse (or (first args) "13"))          ; shift N, default 13
                  shift (mod (if (= dir :decode) (- n) n) 26)]  ; decode: -N ≡ 26-N (mod 26)
              #(rot-n shift %)))
 :inverse? true :involution? false}

Note: the shift N is applied mod 26, so every integer is a valid argument and no range check is needed — caesar:0 and caesar:26 are both the identity, caesar:13 is rot13, caesar:-1 equals caesar:25. The examples below cover the cases; the formal domain is left for the reimplementation to synthesize.

Case is preserved: [A-Z] shift within uppercase, [a-z] within lowercase, and every other character (digits, punctuation, spaces, non-ASCII) passes through unchanged. So rot13 applied to [VADER] No. I am your father. yields [INQRE] Ab. V nz lbhe sngure. — brackets, period, spaces, and case all intact.

caesar is the only codec where direction changes the shift amount rather than selecting a different function.
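The factory pattern can be sketched in Python as follows. This is a sketch under the rules stated above (mod 26, case preserved, default shift 13), not the reference implementation; rot_n and make_caesar are illustrative names mirroring rot-n and :make-fn:

```python
from typing import Callable, Sequence

def rot_n(shift: int, s: str) -> str:
    """Shift ASCII letters by `shift` (mod 26), preserving case; pass the rest through."""
    out = []
    for ch in s:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - 65 + shift) % 26 + 65))
        elif "a" <= ch <= "z":
            out.append(chr((ord(ch) - 97 + shift) % 26 + 97))
        else:
            out.append(ch)
    return "".join(out)

def make_caesar(args: Sequence[str], direction: str) -> Callable[[str], str]:
    n = int(args[0]) if args else 13                    # default shift from the table
    shift = (-n if direction == "decode" else n) % 26   # decode: -N mod 26 = 26-N
    return lambda s: rot_n(shift, s)

print(make_caesar(["3"], "encode")("ET TU BRUTE"))   # HW WX EUXWH
print(make_caesar(["3"], "decode")("HW WX EUXWH"))   # ET TU BRUTE
```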

3.4. Pattern 4: Surjection (no inverse)

Information is destroyed. Verb: apply. Decode is nil or cosmetic. The pipeline turns amber (ONE-WAY) at this step; red is reserved for errors.

ID	Label	Notes
`sort`	sort	Sort characters. `listen → eilnst`. No left inverse exists.
`upper`	upper	`toUpperCase`. `decode` maps to `lower` for convenience but `lower(upper("Hello")) = "hello" ≠ "Hello"`.
`lower`	lower	`toLowerCase`. Same caveat.
`corrupt`	corrupt	Flip one random bit. Non-deterministic.

Registry shape: {:id :sort :label "sort" :encode sort-fn :decode nil :inverse? false}
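Why no decode can exist is easiest to see from a collision. A small Python illustration (sort_chars is a hypothetical name for the sort codec's function):

```python
def sort_chars(s: str) -> str:
    return "".join(sorted(s))

# Distinct inputs collapse to one output, so no decode can recover the input.
print(sort_chars("listen"), sort_chars("silent"))   # eilnst eilnst

# upper followed by its cosmetic "decode" (lower) does not restore the original.
print("Hello".upper().lower())                       # hello
```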

4. Pipeline Execution

4.1. Threading Model

Identical to Clojure's (-> input f1 f2 f3).

Start with text param as initial value
For each step left-to-right, resolve the function and apply it
Record the intermediate value after each step (the trace)
Halt on first error

The trace is the core data structure — it drives both the visual overlay and the output.
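The threading loop and its trace can be sketched as below. This is a minimal Python sketch, assuming each step has already been resolved to a (name, function) pair and that a codec signals failure by raising; the TraceEntry field names are illustrative, and the error entry follows the convention in 9.8 (error set, value equal to the unchanged input):

```python
from typing import Callable, Optional, TypedDict

class TraceEntry(TypedDict):
    step: str
    value: str
    error: Optional[str]

def thread(text: str, steps: list[tuple[str, Callable[[str], str]]]) -> list[TraceEntry]:
    """Apply steps left to right, recording every intermediate value."""
    trace: list[TraceEntry] = []
    value = text
    for name, fn in steps:
        try:
            value = fn(value)
            trace.append({"step": name, "value": value, "error": None})
        except Exception as exc:                 # codec signalled failure
            # The entry keeps the unchanged input and execution stops here.
            trace.append({"step": name, "value": value, "error": str(exc)})
            break
    return trace

steps = [("hex", lambda s: s.encode("utf-8").hex()), ("upper", str.upper)]
print([entry["value"] for entry in thread("Hi", steps)])   # ['4869', '4869']
```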

4.2. Function Resolution

Given a step {:id :base64 :direction :encode :args []}:

Look up codec by :id
If codec has :make-fn: call (make-fn args direction) → function
If codec has :fn: use it (involution, direction irrelevant)
Otherwise: (get codec direction) → :encode or :decode function
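The resolution order can be sketched as a single function. A Python sketch with a hypothetical three-entry registry; the dict keys mirror :make-fn, :fn, :encode and :decode, and raising LookupError when no function is found is an assumption, since the spec does not name the failure:

```python
def resolve(registry: dict, step: dict):
    """Pick the callable for one step, checking make_fn, then fn, then direction."""
    codec = registry[step["id"]]                     # look up codec by id
    if "make_fn" in codec:                           # parameterized: build from args + direction
        return codec["make_fn"](step["args"], step["direction"])
    if "fn" in codec:                                # involution: direction is ignored
        return codec["fn"]
    fn = codec.get(step["direction"])                # "encode" or "decode" function
    if fn is None:
        raise LookupError(f"{step['id']} has no {step['direction']} function")
    return fn

REGISTRY = {   # illustrative subset only
    "reverse": {"fn": lambda s: s[::-1]},
    "hex": {"encode": lambda s: s.encode("utf-8").hex(),
            "decode": lambda s: bytes.fromhex(s).decode("utf-8")},
    "sort": {"encode": lambda s: "".join(sorted(s)), "decode": None},
}

step = {"id": "hex", "direction": "decode", "args": []}
print(resolve(REGISTRY, step)("48656c6c6f"))   # Hello
```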

4.3. Reversibility

A step is reversible when:

(and (:inverse? codec)
     (or (:involution? codec)      ; f = f⁻¹
         (:make-fn codec)          ; factory handles both directions
         (some? (get codec (opposite direction)))))  ; partner fn exists

A pipeline is reversible when every step is reversible.
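The same predicate, transcribed into Python as a sketch. The registry below is a hypothetical fragment that carries only the flags; the functions are placeholders because the check is structural:

```python
def opposite(direction: str) -> str:
    return "decode" if direction == "encode" else "encode"

def step_reversible(codec: dict, direction: str) -> bool:
    return bool(codec.get("inverse")) and (
        bool(codec.get("involution"))                    # f = f^-1
        or "make_fn" in codec                            # factory handles both directions
        or codec.get(opposite(direction)) is not None    # partner function exists
    )

def pipeline_reversible(registry: dict, steps: list) -> bool:
    return all(step_reversible(registry[s["id"]], s["direction"]) for s in steps)

noop = lambda s: s   # placeholder: only the flags matter here
REGISTRY = {
    "rot13":  {"fn": noop, "inverse": True, "involution": True},
    "hex":    {"encode": noop, "decode": noop, "inverse": True},
    "caesar": {"make_fn": lambda args, d: noop, "inverse": True, "involution": False},
    "sort":   {"encode": noop, "decode": None, "inverse": False},
    "upper":  {"encode": noop, "decode": noop, "inverse": False},
}

def encode_steps(*ids):
    return [{"id": i, "direction": "encode"} for i in ids]

print(pipeline_reversible(REGISTRY, encode_steps("rot13", "hex")))   # True
print(pipeline_reversible(REGISTRY, encode_steps("hex", "sort")))    # False
```

Note that upper is not reversible even though it has a decode function: the inverse flag is checked first.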

4.4. Reverse Pipeline

To invert a pipeline:

Reverse the step order
Flip each step's direction (:encode ↔ :decode)
Involutions keep :encode (direction is irrelevant — they use :fn)
Surjections keep :encode (no :decode exists — best effort)
Preserve :args — parameterized codecs need their arguments

;; Forward:  [{:id :caesar :direction :encode :args ["3"]}]
;; Reversed: [{:id :caesar :direction :decode :args ["3"]}]
;; The "3" stays: make-fn turns it into shift 23 (= -3 mod 26) when dir=:decode

Implementation note on involutions: In implementations using separate encode/decode fields (OCaml, Rust, TypeScript), an involution stores the same function in both fields. Flipping the direction is harmless, because it still resolves to the same function. But in implementations using a single fn field (Clojure, Racket), the function resolution logic checks :fn first and ignores direction entirely. If the reverse-pipeline code sets direction to :decode on an involution, the resolution logic MUST still find the function. The safest approach: always set involution steps to :encode in the reversed pipeline (matching the canonical serialization).

Forward:   rot13(encode) + base64(encode) + hex(encode)
Reversed:  hex(decode) + base64(decode) + rot13(encode)
                                           ^^^^^^ encode, not decode
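The five rules can be sketched in Python as below. FLAGS is a hypothetical registry fragment holding only the two flags the rules depend on, and steps are plain dicts; neither is taken from an existing implementation:

```python
FLAGS = {   # illustrative subset
    "rot13":  {"inverse": True,  "involution": True},
    "base64": {"inverse": True,  "involution": False},
    "hex":    {"inverse": True,  "involution": False},
    "caesar": {"inverse": True,  "involution": False},
    "sort":   {"inverse": False, "involution": False},
}

def reverse_pipeline(registry: dict, steps: list) -> list:
    reversed_steps = []
    for step in reversed(steps):                         # reverse the step order
        codec = registry[step["id"]]
        if codec["involution"] or not codec["inverse"]:  # involutions and surjections
            direction = "encode"                         # keep :encode
        else:                                            # flip encode <-> decode
            direction = "decode" if step["direction"] == "encode" else "encode"
        reversed_steps.append(
            {"id": step["id"], "direction": direction, "args": list(step["args"])}
        )                                                # args are carried over
    return reversed_steps

forward = [{"id": i, "direction": "encode", "args": []} for i in ("rot13", "base64", "hex")]
print([(s["id"], s["direction"]) for s in reverse_pipeline(FLAGS, forward)])
# [('hex', 'decode'), ('base64', 'decode'), ('rot13', 'encode')]
```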

5. Serialization

5.1. `steps->str` (canonical form)

Pipeline → URL chain param:

Join steps with +
Each step: <id> (encode) or <id>:d (decode)
With args: <id>:<arg1>:<arg2> or <id>:<arg1>:d
Encode direction is the default — the :e suffix is DROPPED. rot13 means encode. rot13:d means decode. There is no rot13:e in canonical output.
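A minimal Python sketch of these serialization rules, assuming steps are dicts with id, direction and a list of string args (an illustrative shape, not a mandated one):

```python
def steps_to_str(steps: list) -> str:
    parts = []
    for step in steps:
        segments = [step["id"], *step.get("args", [])]
        if step["direction"] == "decode":
            segments.append("d")          # encode is the default and gets no suffix
        parts.append(":".join(segments))
    return "+".join(parts)

pipeline = [
    {"id": "rot13", "direction": "encode", "args": []},
    {"id": "caesar", "direction": "decode", "args": ["3"]},
]
print(steps_to_str(pipeline))   # rot13+caesar:3:d
```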

5.2. `str->steps` (liberal parsing)

URL chain param → pipeline. The algorithm has two levels of splitting: an outer split on step delimiters and an inner split on the colon separator within each step.

5.2.1. Algorithm (language-neutral)

FUNCTION str_to_steps(chain_string) → list of steps:

  1. OUTER SPLIT: split chain_string on /[+, ]+/ (plus, comma, or space).
     Each resulting token is one step. Discard empty tokens.

  2. FOR EACH TOKEN:
     a. INNER SPLIT: split token on ":" into segments.
        segments[0]    = codec ID  (always present)
        segments[1..]  = tail      (may be empty)

     b. DIRECTION CHECK: inspect the LAST element of tail.
        If it is one of {"d", "decode", "e", "encode"}:
          → that element is the direction
          → everything between segments[0] and the direction is args
        Otherwise:
          → direction = encode (the default)
          → the entire tail is args

     c. EMIT step: {id, direction, args}

  RETURN list of steps

The direction tokens are: d, decode, e, encode. These four strings are reserved — they cannot be used as codec IDs or argument values. :e and :encode MUST be accepted as explicit encode direction, even though steps->str never emits them.
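The algorithm above translates almost line for line. A Python sketch, with the step represented as a dict (an assumed shape) and DIRECTION_TOKENS as a hypothetical name for the four reserved strings:

```python
import re

DIRECTION_TOKENS = {"d": "decode", "decode": "decode", "e": "encode", "encode": "encode"}

def str_to_steps(chain: str) -> list:
    steps = []
    for token in re.split(r"[+, ]+", chain):         # outer split
        if not token:
            continue                                  # discard empty tokens
        codec_id, *tail = token.split(":")            # inner split
        direction = "encode"                          # the default
        if tail and tail[-1] in DIRECTION_TOKENS:     # only the LAST segment is checked
            direction = DIRECTION_TOKENS[tail.pop()]
        steps.append({"id": codec_id, "direction": direction, "args": tail})
    return steps

print(str_to_steps("caesar:3:d"))
# [{'id': 'caesar', 'direction': 'decode', 'args': ['3']}]
```

The direction check runs inside the per-token loop, which is what keeps a :d on one step from leaking onto its neighbours.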

5.2.2. Worked examples

chain = "rot13+base64:d+hex"

  outer split → ["rot13", "base64:d", "hex"]

  "rot13"     → inner split → ["rot13"]
                tail = []          → no direction token → dir=encode, args=[]
                → {id="rot13", dir=encode, args=[]}

  "base64:d"  → inner split → ["base64", "d"]
                tail = ["d"]       → "d" is a direction token → dir=decode
                args = [] (nothing between id and direction)
                → {id="base64", dir=decode, args=[]}

  "hex"       → inner split → ["hex"]
                → {id="hex", dir=encode, args=[]}

chain = "caesar:3:d"

  outer split → ["caesar:3:d"]

  "caesar:3:d" → inner split → ["caesar", "3", "d"]
                 tail = ["3", "d"]
                 last of tail = "d" → direction token → dir=decode
                 args = ["3"]  (tail minus the last element)
                 → {id="caesar", dir=decode, args=["3"]}

chain = "caesar:3"

  outer split → ["caesar:3"]

  "caesar:3"   → inner split → ["caesar", "3"]
                 tail = ["3"]
                 last of tail = "3" → NOT a direction token → dir=encode
                 args = ["3"]  (entire tail)
                 → {id="caesar", dir=encode, args=["3"]}

chain = "xor:42:d"

  outer split → ["xor:42:d"]

  "xor:42:d"   → inner split → ["xor", "42", "d"]
                 tail = ["42", "d"]
                 last = "d" → direction token → dir=decode
                 args = ["42"]
                 → {id="xor", dir=decode, args=["42"]}

5.2.3. Common implementation errors

Global direction: checking for :d at the end of the entire chain string rather than per-step. This makes rot13+base64:d set decode on all steps, not just base64. (Observed in the OCaml implementation.)
No inner split: splitting only on + without splitting each token on :. This makes caesar:3:d appear as a single opaque string with no args and no direction. (Also OCaml.)
Args as direction: not checking the last segment specifically. If caesar:3 is parsed as {id="caesar", dir=3} instead of {id="caesar", args=["3"]}, the direction token check is wrong.

5.2.4. Cross-implementation reference

The algorithm above matches all four working implementations:

Language	Function	Inner split	Direction check
Clojure	`str->steps`	`split token ":"`	`(contains? direction-tokens last)`
TypeScript	`strToSteps`	`token.split(":")`	`DIRECTION_TOKENS.has(last)`
Rust	`str_to_steps`	`token.split(':')`	`match *last { "d" | "decode" ... }`
Racket	`str->steps`	`string-split ":"`	`(member last '("d" "decode" ...))`

5.3. Canonicalization invariant

str->steps(steps->str(pipeline)) = pipeline for all valid pipelines. The reverse does NOT hold: steps->str(str->steps("rot13:e")) = "rot13" (the explicit :e is normalized away). Implementations MUST accept both forms on input but produce only canonical form on output.

6. Visual Contract

The UI renders:

Threading form: (-> input rot13 base64/decode) — Clojure syntax
Reversibility badge: green REVERSIBLE or amber ONE-WAY
Input: textarea with source buttons (Hello World, Date.now(), etc.)
Chain: vertical pipeline with intermediate values after each arrow
Step node: codec name, direction tag, intermediate value, flip/remove buttons
Add step: dropdown (all 24 codecs) + direction dropdown + blue "add" button
Reverse button: always visible. Green when reversible, amber when not.
Output: dark block with final value, copy button, shareable link button

Direction dropdown shows:

apply: for involutions (direction irrelevant) and surjections (no inverse)
encode / decode: for bijections and parameterized codecs

Surjections: decode option is disabled in the direction dropdown.

7. Test Suite (URL → Expected Output)

These URLs are the acceptance tests. The output is the final value after threading through the chain.

URL chain+text	Expected output
`chain=rot13&text=V+NZ+LBHE+SNGURE`	`I AM YOUR FATHER`
`chain=rot13&text=PNXR+VF+N+YVR`	`CAKE IS A LIE`
`chain=rot13&text=[VADER] No. I am your father.`	`[INQRE] Ab. V nz lbhe sngure.` (case + punctuation preserved)
`chain=caesar:26&text=Hello, World!`	`Hello, World!` (identity, mod 26)
`chain=caesar:0&text=Hello, World!`	`Hello, World!` (identity)
`chain=reverse&text=REDRUM`	`MURDER`
`chain=atbash&text=SVOOL`	`HELLO`
`chain=base64:d&text=TmV2ZXIgZ29ubmEgZ2l2ZSB5b3UgdXA=`	`Never gonna give you up`
`chain=caesar:3:d&text=HW+WX+EUXWH`	`ET TU BRUTE`
`chain=caesar:3&text=ET+TU+BRUTE`	`HW WX EUXWH`
`chain=hex&text=Hello`	`48656c6c6f`
`chain=hex:d&text=48656c6c6f`	`Hello`
`chain=rot13+base64&text=Hello`	`Uryyb` then `VXJ5eWI=`
`chain=upper&text=Hello`	`HELLO`
`chain=upper:d&text=HELLO`	`hello` (lossy: not a true inverse)
`chain=identity&text=anything`	`anything`
(no params)	input: `Hello, World!`, chain: identity

7.1. XOR test vectors

Chain + text	Expected
`chain=xor&text=Hi`	`0a2b` (H=0x48 XOR 0x42=0x0a; i=0x69 XOR 0x42=0x2b)
`chain=xor:d&text=0a2b`	`Hi`
`chain=xor+xor:d&text=Hello`	`Hello` (round-trip)

7.2. Per-step direction test vectors

Chain + text	Expected
`chain=rot13+base64:d&text=VXJ5eWI=`	step 1: rot13 encode on `VXJ5eWI=` → `IKW5rJV=`; step 2: base64 decode on `IKW5rJV=` → error (invalid base64: the decoded bytes are not valid UTF-8)
`chain=hex+rot13&text=Hi`	step 1: hex encode → `4869`; step 2: rot13 on `4869` → `4869` (digits untouched)

7.3. Reversibility Tests

Chain	Reversible?	Why
`rot13`	yes	involution
`base64`	yes	bijection
`caesar:3`	yes	parameterized, `:inverse? true`
`rot13+base64`	yes	all steps reversible
`hex+sort`	no	sort is surjection
`upper`	no	`:inverse? false`
`identity+identity+identity`	yes	all identity

7.4. Round-Trip Tests

For every reversible chain, threading input through the chain and then through the reversed chain MUST return the original input:

(let [steps  (str->steps "caesar:3")
      result (thread "ET TU BRUTE" steps)           ; final value "HW WX EUXWH"
      rev    (reverse-pipeline steps)
      back   (thread (:value (last result)) rev)]   ; feed the forward output back in
  (assert (= "ET TU BRUTE" (:value (last back)))))

8. Failure Handler

If any test in the suite above fails:

STOP — do not proceed to UI implementation
Report: which test, expected vs actual, the codec involved
The most likely failure modes:
- make-fn ignoring direction (caesar decode = encode)
- reverse-pipeline dropping :args (parameterized codecs lose config)
- step-reversible? not recognizing :make-fn pattern
- base64 using btoa(encodeURIComponent(s)) instead of TextEncoder
- Not URL-decoding the chain param before splitting (raw + instead of space)

9. Formal Precision (from Dafny review)

The following clarifications were added after a Dafny agent attempted to formalize this spec as ensures / requires clauses (aygp-dr/transform-tool-dafny).

This spec stays example-driven on purpose: we would rather an implementing team synthesize the requirements from the worked examples (Sections 1 and 7) than read a wall of requires clauses. If a team wants a machine-checked formalism, it belongs in their artifact, in Lean 4 or Dafny, not in this document. Two verified reference formalizations exist to crib from; both are non-normative:

Lean 4 — spec/codecproofs/ (wal.sh repo): Codec / Equiv, the involutions, composition, base64 core.
Dafny — jwalsh/dafny2026-reversible-transducers: codec-as-transducer, reverse involution, Caesar round-trip, chain composition, and the alphabet boundary.

The clarifications below are examples of the gaps that formalizing surfaced, written back as prose + test vectors — not as a requirements language.

9.1. Structural vs semantic reversibility

step-reversible? checks registry flags (:inverse? true), not runtime behavior. A codec could claim :inverse? true while its decode function is broken. Implementations MUST NOT rely on the flag alone for round-trip guarantees. The flag is a UI hint for the badge; the actual round-trip property is a semantic invariant that the spec asserts but does not prove.

9.2. Domain predicates

Several codecs have restricted domains. The spec now requires these predicates for decode direction:

Codec	Domain predicate for decode
`hex`	even-length string of `[0-9a-fA-F]`
`base64`	valid RFC 4648 alphabet + padding
`ipv4-int`	integer string in `[0, 4294967295]`
`hamming`	binary string with length divisible by 7
`binary`	space-separated 8-bit binary strings
`char-codes`	space-separated decimal integers

Attempting to decode invalid input MUST produce an error in the trace, not undefined behavior.
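The predicates in the table can be written as full-string patterns and checked before decode runs. A hedged Python sketch: the regular expressions are one reading of the table, in_decode_domain is a hypothetical helper, and whether the empty string belongs to each domain is an assumption made here, not something the table settles:

```python
import re

DECODE_DOMAIN = {
    "hex": re.compile(r"(?:[0-9a-fA-F]{2})*"),
    "base64": re.compile(
        r"(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?"),
    "hamming": re.compile(r"(?:[01]{7})*"),
    "binary": re.compile(r"[01]{8}(?: [01]{8})*"),
    "char-codes": re.compile(r"[0-9]+(?: [0-9]+)*"),
}

def in_decode_domain(codec_id: str, s: str) -> bool:
    if codec_id == "ipv4-int":                         # numeric range, not a pattern
        return s.isascii() and s.isdigit() and int(s) <= 4294967295
    pattern = DECODE_DOMAIN.get(codec_id)
    # Codecs without a listed predicate accept any string in this sketch.
    return pattern is None or pattern.fullmatch(s) is not None

print(in_decode_domain("hex", "48656C6c6f"))        # True
print(in_decode_domain("hex", "abc"))               # False
print(in_decode_domain("ipv4-int", "4294967296"))   # False
```

A pattern match is necessary but not always sufficient: a string can pass the base64 alphabet check and still decode to bytes that are not valid text, so the codec itself must still be able to fail.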

9.3. Character model

The spec assumes strings are sequences of Unicode code points. Implementations in UTF-16 languages (JS, Java) and UTF-8 languages (Rust, Go) MUST agree on output for ASCII input. Non-ASCII behavior is implementation-defined and should be documented per-codec.

9.4. Caesar fixed points

caesar:N has encode = decode (it is an involution) exactly when N ≡ -N (mod 26), that is 2N ≡ 0 (mod 26), which holds only for N ≡ 0 and N ≡ 13 (mod 26). N ≡ 0 is the identity (caesar:0, caesar:26, caesar:-26 …). N ≡ 13 is rot13. The unique non-trivial fixed point among N ∈ {1,…,25} is N = 13.
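The claim is small enough to confirm by brute force. A Python check, where shift_of is a hypothetical helper computing the effective shift for each direction (N for encode, -N mod 26 for decode):

```python
def shift_of(n: int, direction: str) -> int:
    # Effective shift: N for encode, -N (mod 26) for decode.
    return (-n if direction == "decode" else n) % 26

fixed = [n for n in range(26) if shift_of(n, "encode") == shift_of(n, "decode")]
print(fixed)   # [0, 13]
```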

9.5. XOR codec representation

xor encode and decode are NOT the same function. Although XOR is mathematically self-inverse (x XOR k XOR k = x), the codec's encode and decode operate at different levels:

Encode: XOR each byte of the input with key 0x42, then represent each result byte as a 2-character lowercase hex string. Input: ASCII string. Output: hex string (twice the length).
Decode: Parse the hex string into byte pairs, XOR each with 0x42, and reconstruct the ASCII string. Input: hex string. Output: ASCII string.

This means xor is a bijection (Pattern 2), not an involution (Pattern 1), even though the underlying XOR operation is self-inverse. The hex representation makes encode ≠ decode at the string level.

Reference implementation (Clojure):

(defn- xor-encode [s]
  ;; "Hi" → XOR each char with 0x42 → emit as hex pairs
  ;; H=0x48, 0x48 XOR 0x42 = 0x0a → "0a"
  ;; i=0x69, 0x69 XOR 0x42 = 0x2b → "2b"
  ;; result: "0a2b"
  ...)
(defn- xor-decode [s]
  ;; "0a2b" → parse hex pairs → XOR with 0x42 → chars
  ;; 0x0a XOR 0x42 = 0x48 = H
  ;; 0x2b XOR 0x42 = 0x69 = i
  ;; result: "Hi"
  ...)

Implementations that treat XOR as an involution (same function for both directions, operating on raw bytes) will fail the test vectors because the output representation differs.
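For comparison with the commented Clojure outline, here is a Python sketch of the same two functions. The names are illustrative, and encoding the input as UTF-8 is an assumption that coincides with the spec's ASCII input on the test vectors:

```python
KEY = 0x42

def xor_encode(s: str) -> str:
    # text -> bytes -> XOR each byte with the key -> lowercase hex
    return bytes(b ^ KEY for b in s.encode("utf-8")).hex()

def xor_decode(h: str) -> str:
    # hex -> bytes -> XOR each byte with the key -> text
    return bytes(b ^ KEY for b in bytes.fromhex(h)).decode("utf-8")

print(xor_encode("Hi"))                  # 0a2b
print(xor_decode("0a2b"))                # Hi
print(xor_encode(xor_encode("Hi")))      # not "Hi": encode is not its own inverse
```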

9.6. 1337 (leet speak) substitution scope

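The substitution table A↔4 E↔3 I↔1 O↔0 S↔5 T↔7 B↔8 G↔6 applies to uppercase letters only. Lowercase passes through unchanged. This means case is preserved and the codec round-trips on mixed-case letters. It is a true bijection only on inputs that contain none of the target digits 0 1 3 4 5 6 7 8: encode maps both "A" and "4" to "4", so decode("4") can only return "A".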

encode("BEAST")  → "83457"
encode("beast")  → "beast"    (lowercase untouched)
encode("Beast")  → "8east"    (only the B is uppercase, so only B→8)
decode("83457")  → "BEAST"
decode("beast")  → "beast"
decode("8east")  → "Beast"
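The six vectors above can be reproduced with a pair of translation tables. A Python sketch with illustrative names:

```python
LEET_ENCODE = str.maketrans("AEIOSTBG", "43105786")
LEET_DECODE = str.maketrans("43105786", "AEIOSTBG")

def leet_encode(s: str) -> str:
    return s.translate(LEET_ENCODE)    # uppercase letters only; the rest passes through

def leet_decode(s: str) -> str:
    return s.translate(LEET_DECODE)

print(leet_encode("Beast"))               # 8east
print(leet_decode("8east"))               # Beast
print(leet_decode(leet_encode("A4")))     # AA
```

The last line shows the limit of the round trip: a digit that is already present in the input is indistinguishable from an encoded letter.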

9.7. Hex output case

hex encode MUST produce lowercase hex digits (48656c6c6f, not 48656C6C6F). Decode MUST accept both cases ([0-9a-fA-F]). This is consistent with the test vector: chain=hex&text=Hello → 48656c6c6f.
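A Python sketch of the case rule, with illustrative names. The explicit pattern check is there because bytes.fromhex on its own also tolerates whitespace between byte pairs, which is outside the domain stated for hex decode:

```python
import re

_HEX = re.compile(r"(?:[0-9a-fA-F]{2})*")

def hex_encode(s: str) -> str:
    return s.encode("utf-8").hex()            # bytes.hex() emits lowercase digits

def hex_decode(h: str) -> str:
    if not _HEX.fullmatch(h):                 # even length, hex digits only
        raise ValueError("hex decode: input must be an even-length hex string")
    return bytes.fromhex(h).decode("utf-8")   # fromhex accepts either case

print(hex_encode("Hello"))        # 48656c6c6f
print(hex_decode("48656C6C6F"))   # Hello
```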

9.8. Decode failure convention

When decode receives input outside its domain predicate (e.g., hex:d on an odd-length string), the codec MUST signal failure. The representation of failure is language-specific:

Language	Convention
Clojure	throw `ex-info` (caught by `apply-step`)
TypeScript	throw `Error` (caught by try/catch)
Rust	`Result<String, CodecError>`
OCaml	`string option` (`None` on failure)
Racket	raise `exn:fail:contract`
Dafny	`requires` clause prevents invalid call

The pipeline execution layer MUST catch the failure and produce a trace entry with error set and output equal to the unchanged input. Pipeline execution halts after the first error.

9.9. rot13 character ranges

rot13 shifts ASCII letters and passes everything else through unchanged. The exact ranges:

A-M (0x41-0x4D) → shift +13 → N-Z (0x4E-0x5A)
N-Z (0x4E-0x5A) → shift -13 → A-M (0x41-0x4D)
a-m (0x61-0x6D) → shift +13 → n-z (0x6E-0x7A)
n-z (0x6E-0x7A) → shift -13 → a-m (0x61-0x6D)
Everything else: pass through unchanged

Case is preserved. Digits, punctuation, spaces, and non-ASCII are untouched.

9.10. Non-determinism

corrupt is the only non-deterministic codec. Pipelines containing corrupt violate the purity axiom ("URL → deterministic output"). The spec's axiom applies to all pipelines EXCEPT those containing corrupt. Implementations SHOULD seed the RNG from the input for reproducibility in tests, or exclude corrupt from round-trip assertions.
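One way to follow the seeding suggestion is to derive the seed from a hash of the input. A Python sketch under stated assumptions: it works on bytes rather than strings (a flipped bit can leave the text invalid, and how to render that is not specified here), and the choice of SHA-256 as the seed source is illustrative, not required by the spec:

```python
import hashlib
import random

def corrupt(data: bytes) -> bytes:
    """Flip one bit, choosing the position from an RNG seeded by the input."""
    if not data:
        return data                                   # nothing to flip
    seed = int.from_bytes(hashlib.sha256(data).digest(), "big")
    position = random.Random(seed).randrange(len(data) * 8)
    out = bytearray(data)
    out[position // 8] ^= 1 << (position % 8)         # flip exactly one bit
    return bytes(out)

first, second = corrupt(b"Hello"), corrupt(b"Hello")
print(first == second)   # True: same input, same flipped bit
```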

9.11. Surjection terminology

The spec uses "surjection" loosely. Formally, sort is non-injective (many inputs map to one output) but whether it is surjective depends on the codomain definition. The spec means: information-destroying, no left-inverse exists. The UI label ONE-WAY is more precise than the mathematical term.

10. Non-Functional Requirements

No server: pure client-side. The URL is the API.
No framework: vanilla DOM or a compile-to-JS language (ClojureScript, Elm, etc.)
CSS: monospace font (IBM Plex Mono), circuit-trace aesthetic. Green (#059669) = reversible. Amber (#d97706) = warning. Red (#dc2626) = error.
#tool-root: min-height: 675px to prevent layout shift during hydration.
URL sync: typing updates the URL. Loading a URL hydrates the state. The step delimiter is always written as a literal +, never as %2B.
localStorage: save/load named chains. Keys: transform-chains (JSON object mapping name → chain string), transform-customer-id (UUID v4, created on first visit).

11. URL Parsing Clarification (revised)

+ is space in URL query encoding. This is not overloaded — it is the same thing in both the chain and text params:

?text=Hello+World → text = "Hello World" ← input string
?chain=rot13+base64:d → chain = "rot13 base64:d" ← space-separated steps

Standard URL parsing (URLSearchParams, url.parse, Ring middleware, Rack, etc.) decodes + as space. After decoding, the chain value is a plain space-separated string. Split on whitespace to get step tokens.

Previous guidance (from the TS review on 2026-06-07) said to avoid URLSearchParams and parse the raw query string manually. That advice was wrong — it fought the web platform instead of using it. The + delimiter was always URL-encoded space. Embrace it.

12. Instrumentation

Each of these MUST be observable in the browser console or via the trace:

Input value at each step boundary
Whether each step succeeded or errored
Whether the pipeline is reversible (and which step breaks it if not)
The serialized chain string matching the URL
Round-trip identity: for reversible chains, reverse(forward(x)) = x

13. Spec Review History

Date	Agent	Language	Tests	Issues	Key finding
2026-06-07	builder	TypeScript	40/40	9	wrong test vector (`VXJ5Yg==`)
2026-06-07	lambda	Dafny	—	9	structural ≠ semantic reversibility
2026-06-07	rust-pro	Rust/WASM	37/37	10	UTF-8 vs UTF-16 charCodeAt — "spec lie"
2026-06-07	zero	Racket	33/33	7	contracts enforce round-trip at boundary, not in tests
2026-06-10	builder	OCaml	—	6	chain parser `:d` per-chain not per-step; XOR as involution not bijection; reverse_pipeline flips involutions