Transform Tool — Rebuild Specification
Table of Contents
- 1. Confirmation Gate
- 2. URL Contract
- 3. Codec Registry
- 4. Pipeline Execution
- 5. Serialization
- 6. Visual Contract
- 7. Test Suite (URL → Expected Output)
- 8. Failure Handler
- 9. Formal Precision (from Dafny review)
- 9.1. Structural vs semantic reversibility
- 9.2. Domain predicates
- 9.3. Character model
- 9.4. Caesar fixed points
- 9.5. XOR codec representation
- 9.6. 1337 (leet speak) substitution scope
- 9.7. Hex output case
- 9.8. Decode failure convention
- 9.9. rot13 character ranges
- 9.10. Non-determinism
- 9.11. Surjection terminology
- 10. Non-Functional Requirements
- 11. URL Parsing Clarification (revised)
- 12. Instrumentation
- 13. Spec Review History
Axiom: The URL is the sole source of truth. ?chain=rot13+base64:d&text=Hello
fully determines the output. No server state. No session. Pure function from
URL → rendered page.
1. Confirmation Gate
Before writing code, the implementing agent MUST state:
- How many codecs are in the registry (answer: 24)
- The three codec patterns and an example of each
- What caesar:3:d produces on input HW WX EUXWH (answer: ET TU BRUTE)
- Why rot13 does not need encode/decode but caesar:3 does
If any answer is wrong, stop and re-read this spec.
2. URL Contract
2.1. Base URL
https://wal.sh/tools/transform/
2.2. Query Parameters
| Param | Required | Format | Example |
|---|---|---|---|
| chain | no | + delimited codec steps | rot13+base64:d |
| text | no | input string (spaces as + or %20) | Hello+World |
Default (no params): input = Hello, World!, chain = identity encode.
2.3. Chain Syntax
Each step in the + delimited chain follows:
<codec-id>[:<arg1>[:<arg2>...]][:<direction>]
Direction tokens: d, decode, e, encode. Default: encode.
Everything between the codec ID and the direction token is positional args.
| Chain token | Codec | Args | Direction |
|---|---|---|---|
| rot13 | rot13 | none | encode (irrelevant: involution) |
| base64:d | base64 | none | decode |
| caesar:3 | caesar | [3] | encode (shift +3) |
| caesar:3:d | caesar | [3] | decode (shift +23 = -3 mod 26) |
| xor:42:d | xor | [42] | decode |
| hex+sort | hex, sort | none | encode, encode (two steps) |
2.4. Chain Grammar
The chain parameter has a formal grammar. The three character classes (codec ID, arg, direction token) are disjoint by construction, which makes parsing unambiguous with no lookahead.
;; In the URL (before decode):
chain = step ("+" step)*
;; After URL decode (+ becomes space):
chain = step (" " step)*
;; Per-step grammar (same either way):
step = id (":" arg)* (":d")?
id = [a-z][a-z0-9-]* ; lowercase alpha start, then alphanumeric or hyphen
arg = <any segment that is not "d">
dir = "d" ; decode. absence = encode (the default).
One reserved word in canonical form: d. A trailing :d always means decode. It can
never be an argument. Encode is the default and has no suffix.
That is the entire parsing rule for canonical input. The last colon-segment is checked:
- If it is exactly "d" → direction is decode, remove it from args
- Otherwise → direction is encode, it stays as an arg
:e, :encode, and :decode are also accepted as a trailing direction token on
input, for tolerance (see 5.2.1), and are likewise removed from args. They are
never emitted: steps->str produces :d or nothing.
2.4.1. Why this is unambiguous
"d" is reserved. A codec arg can be anything except the bare string
"d". If a future codec needs a literal d as an argument, it must
use a different encoding (e.g., 100 for ASCII code, dd as an escape,
or a longer form like del). The existing 24 codecs have only numeric
args (caesar shift), so this costs nothing today.
2.5. Why + works: URL encoding does the parsing for you
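In URL query strings, + encodes a space. Standard query-string decoding
(URLSearchParams, url.parse, Ring, etc.) converts + to space before the
application sees the value. (decodeURIComponent on its own does not: it leaves
+ untouched, so it is not a substitute for a query-string parser.) This means: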
URL: ?chain=rot13+base64:d+caesar:3:d&text=Hello+World
URL decode: chain = "rot13 base64:d caesar:3:d"
text = "Hello World"
After URL decoding, the chain value is a space-separated list of step tokens. Every language's default string split works:
| Language | Split expression | Result |
|---|---|---|
| Clojure | (str/split s #"\s+") | ["rot13" "base64:d" "caesar:3:d"] |
| Python | s.split() | ["rot13", "base64:d", "caesar:3:d"] |
| OCaml | String.split_on_char ' ' s | ["rot13"; "base64:d"; "caesar:3:d"] |
| Rust | s.split_whitespace() | ["rot13", "base64:d", "caesar:3:d"] |
| TypeScript | s.split(/\s+/) | ["rot13", "base64:d", "caesar:3:d"] |
| Racket | (string-split s) | '("rot13" "base64:d" "caesar:3:d") |
| Haskell | words s | ["rot13", "base64:d", "caesar:3:d"] |
No custom delimiter logic needed. URLSearchParams is safe to use.
Canonical output: steps->str MUST emit + (which URL-encodes space).
This keeps the URL readable: ?chain=rot13+base64:d looks like a pipeline,
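not ?chain=rot13%20base64%3Ad. Implementations that use URLSearchParams
to construct URLs get + for spaces automatically, but its serializer also
percent-encodes : as %3A, so the : must be restored afterwards (or the query
string built by hand) to keep the canonical form.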
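As an illustration of the decode-then-split rule, here is a minimal Python sketch using the standard library's urllib.parse. The function names parse_query and build_query are hypothetical, and the defaults follow section 2.2. Passing safe=":" is one way to keep the colon literal on output; other languages will need their own equivalent.

```python
from urllib.parse import parse_qs, urlencode

def parse_query(query: str):
    """Decode a raw query string into (step tokens, input text)."""
    params = parse_qs(query, keep_blank_values=True)   # '+' becomes ' ' here
    chain = params.get("chain", ["identity"])[0]       # default chain (section 2.2)
    text = params.get("text", ["Hello, World!"])[0]    # default input (section 2.2)
    return chain.split(), text                          # whitespace split, no custom delimiter

def build_query(step_tokens, text: str) -> str:
    """Re-encode with '+' for spaces while leaving ':' readable."""
    return urlencode({"chain": " ".join(step_tokens), "text": text}, safe=":")

tokens, text = parse_query("chain=rot13+base64:d+caesar:3:d&text=Hello+World")
print(tokens, text)
print(build_query(tokens, text))
```

The round trip through build_query reproduces the original query string for this input, which is the readability property the canonical form is after.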
3. Codec Registry
24 codecs in four patterns: three reversible (involution, bijection, parameterized) and one lossy (surjection).
3.1. Pattern 1: Involution (:fn)
f(f(x)) = x. One function. Direction is irrelevant. Verb: apply.
| ID | Label | Notes |
|---|---|---|
| identity | identity | f(x) = x. The unit. |
| reverse | reverse | String reversal. Anti-homomorphism: f(ab) = f(b)f(a). |
| rot13 | rot13 | Caesar shift 13. The only non-trivial Caesar shift that is its own inverse. |
| jump5 | jump5 | Digit substitution: 0↔5 1↔6 2↔7 3↔8 4↔9. Non-digits pass through. |
| atbash | atbash | Mirror alphabet: A↔Z B↔Y C↔X. Hebrew cipher. |
Registry shape: {:id :rot13 :label "rot13" :fn rot13 :involution? true :inverse? true}
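The involution property f(f(x)) = x can be checked mechanically. The sketch below, in Python for illustration, covers three of the five involutions. The function names are hypothetical, and mirroring lowercase letters within lowercase for atbash is an assumption (the table above only shows uppercase pairs).

```python
import string

_UP, _LOW = string.ascii_uppercase, string.ascii_lowercase

# 0<->5, 1<->6, 2<->7, 3<->8, 4<->9; every other character passes through.
_JUMP5 = str.maketrans("0123456789", "5678901234")

# A<->Z, B<->Y, ...; lowercase handling is an assumption, not stated in the spec.
_ATBASH = str.maketrans(_UP + _LOW, _UP[::-1] + _LOW[::-1])

def reverse(s: str) -> str:
    return s[::-1]

def jump5(s: str) -> str:
    return s.translate(_JUMP5)

def atbash(s: str) -> str:
    return s.translate(_ATBASH)

# Applying any involution twice must give back the input.
for f in (reverse, jump5, atbash):
    sample = "Hello, World! 2024"
    assert f(f(sample)) == sample

print(reverse("REDRUM"), atbash("SVOOL"), jump5("2024"))
```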
3.2. Pattern 2: Bijection (:encode / :decode)
decode(encode(x)) = x but encode ≠ decode. Two separate functions.
Verb: encode or decode.
| ID | Label | Notes |
|---|---|---|
| 1337 | 1337 | Leet speak. A↔4 E↔3 I↔1 O↔0 S↔5 T↔7 B↔8 G↔6. Uppercase only. |
| xor | xor 0x42 | XOR each byte with 0x42. Output is hex string. |
| base64 | base64 | RFC 4648. Uses TextEncoder/TextDecoder (NOT btoa/encodeURIComponent). |
| url-encode | url-encode | Percent-encoding per RFC 3986. |
| hex | hex | Each byte → 2-char hex. |
| a1z26 | A1Z26 | A=1 … Z=26. Spaces → /. Gravity Falls cipher. |
| binary | binary | Each char → 8-bit binary, space-separated. |
| char-codes | char-codes | Each char → decimal ordinal, space-separated. |
| morse | morse | ITU Morse code. Words separated by /. |
| ipv4-int | ipv4→int | Dotted-quad ↔ uint32. Domain-restricted. |
| hamming | hamming | Hamming(7,4). Input is hex. Corrects single-bit errors on decode. |
| f-to-c | °F→°C | (f - 32) * 5/9. |
| c-to-f | °C→°F | c * 9/5 + 32. |
| c-to-k | °C→K | c + 273.15. |
Registry shape: {:id :base64 :label "base64" :encode encode-fn :decode decode-fn :inverse? true}
3.3. Pattern 3: Parameterized (:make-fn)
A factory function receives [args] dir and returns (string → string).
The same arg means different things depending on direction.
| ID | Label | Args | Encode | Decode |
|---|---|---|---|---|
| caesar | caesar | shift N (default 13) | rot-n N | rot-n (26-N) |
Registry shape:
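{:id :caesar :label "caesar"
 :make-fn (fn [args dir]
            (let [n     (if (seq args) (parse (first args)) 13) ; no arg → default shift 13
                  shift (if (= dir :decode) (- 26 n) n)]        ; decode shifts by 26-N
              #(rot-n shift %)))                                ; rot-n applies the shift mod 26
 :inverse? true :involution? false}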
Note: the shift N is applied mod 26, so every integer is a valid argument and
no range check is needed — caesar:0 and caesar:26 are both the identity,
caesar:13 is rot13, caesar:-1 equals caesar:25. The examples below cover
the cases; the formal domain is left for the reimplementation to synthesize.
Case is preserved: [A-Z] shift within uppercase, [a-z] within lowercase,
and every other character (digits, punctuation, spaces, non-ASCII) passes
through unchanged. So rot13 applied to [VADER] No. I am your father. yields
[INQRE] Ab. V nz lbhe sngure. — brackets, period, spaces, and case all intact.
caesar is the only codec where direction changes the shift amount rather
than selecting a different function.
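The factory pattern translates directly to other languages. Below is a minimal Python sketch, assuming string direction values "encode" and "decode". The names rot_n and make_caesar mirror the registry shape above but are otherwise hypothetical.

```python
import string

_UP, _LOW = string.ascii_uppercase, string.ascii_lowercase

def rot_n(shift: int, s: str) -> str:
    """Rotate A-Z within uppercase and a-z within lowercase; pass the rest through."""
    k = shift % 26                      # any integer shift is valid
    table = str.maketrans(_UP + _LOW, _UP[k:] + _UP[:k] + _LOW[k:] + _LOW[:k])
    return s.translate(table)

def make_caesar(args, direction):
    """Factory: the same arg N means shift N on encode and shift 26-N on decode."""
    n = int(args[0]) if args else 13    # default shift is 13
    shift = 26 - n if direction == "decode" else n
    return lambda s: rot_n(shift, s)

encode3 = make_caesar(["3"], "encode")
decode3 = make_caesar(["3"], "decode")
print(encode3("ET TU BRUTE"))
print(decode3(encode3("ET TU BRUTE")))
```

Reducing the shift inside rot_n rather than in the factory keeps negative and oversized arguments (caesar:-1, caesar:26) working without a range check.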
3.4. Pattern 4: Surjection (no inverse)
Information is destroyed. Verb: apply. Decode is nil or cosmetic. Pipeline turns red/amber at this step.
| ID | Label | Notes |
|---|---|---|
| sort | sort | Sort characters. listen → eilnst. No left inverse exists. |
| upper | upper | toUpperCase. decode maps to lower for convenience but lower(upper("Hello")) = "hello" ≠ "Hello". |
| lower | lower | toLowerCase. Same caveat. |
| corrupt | corrupt | Flip one random bit. Non-deterministic. |
Registry shape: {:id :sort :label "sort" :encode sort-fn :decode nil :inverse? false}
4. Pipeline Execution
4.1. Threading Model
Identical to Clojure's (-> input f1 f2 f3).
- Start with text param as initial value
- For each step left-to-right, resolve the function and apply it
- Record the intermediate value after each step (the trace)
- Halt on first error
The trace is the core data structure — it drives both the visual overlay and the output.
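A minimal Python sketch of the threading loop follows. It assumes each step has already been resolved to a (label, function) pair, and the trace entry keys step, value, and error are hypothetical names. On failure the entry carries the unchanged input as its value, following the convention in section 9.8.

```python
def thread(text, steps):
    """Thread text through steps left to right, recording one trace entry per step.

    steps: list of (label, fn) pairs, where fn maps str -> str and may raise.
    """
    trace = []
    value = text
    for label, fn in steps:
        try:
            value = fn(value)
            trace.append({"step": label, "value": value, "error": None})
        except Exception as exc:
            # value is still the input to the failing step
            trace.append({"step": label, "value": value, "error": str(exc)})
            break                                  # halt on first error
    return trace

steps = [
    ("reverse", lambda s: s[::-1]),
    ("hex:d", lambda s: bytes.fromhex(s).decode("utf-8")),   # fails on non-hex input
    ("upper", str.upper),                                     # never reached
]
for entry in thread("Hi", steps):
    print(entry)
```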
4.2. Function Resolution
Given a step {:id :base64 :direction :encode :args []}:
- Look up codec by :id
- If codec has :make-fn: call (make-fn args direction) → function
- If codec has :fn: use it (involution, direction irrelevant)
- Otherwise: (get codec direction) → :encode or :decode function
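The resolution order above can be sketched in Python as follows. REGISTRY here is a hypothetical three-entry stand-in with one codec per branch, and the dictionary keys (make_fn, fn, encode, decode) are Python-flavoured renderings of the registry keywords, not names from any real implementation.

```python
import string

_UP, _LOW = string.ascii_uppercase, string.ascii_lowercase

def _make_caesar(args, direction):
    n = int(args[0]) if args else 13
    k = (26 - n if direction == "decode" else n) % 26
    table = str.maketrans(_UP + _LOW, _UP[k:] + _UP[:k] + _LOW[k:] + _LOW[:k])
    return lambda s: s.translate(table)

# Hypothetical registry: one parameterized codec, one involution, one bijection.
REGISTRY = {
    "caesar": {"make_fn": _make_caesar},
    "reverse": {"fn": lambda s: s[::-1]},
    "hex": {
        "encode": lambda s: s.encode("utf-8").hex(),
        "decode": lambda s: bytes.fromhex(s).decode("utf-8"),
    },
}

def resolve(step, registry=REGISTRY):
    codec = registry[step["id"]]                    # look up by id
    if "make_fn" in codec:                          # factory builds the function
        return codec["make_fn"](step["args"], step["direction"])
    if "fn" in codec:                               # involution: direction ignored
        return codec["fn"]
    return codec[step["direction"]]                 # "encode" or "decode"

step = {"id": "caesar", "direction": "decode", "args": ["3"]}
print(resolve(step)("HW WX EUXWH"))
```

Checking make_fn before fn matters only if a codec ever defines both; the order here simply follows the list above.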
4.3. Reversibility
A step is reversible when:
(and (:inverse? codec)
     (or (:involution? codec)                       ; f = f⁻¹
         (:make-fn codec)                           ; factory handles both directions
         (some? (get codec (opposite direction))))) ; partner fn exists
A pipeline is reversible when every step is reversible.
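The same predicate, sketched in Python under the assumption that the registry is a dict of dicts with boolean flags inverse and involution. FLAGS is a hypothetical excerpt, and the True placeholders stand in for real functions because only their presence is checked.

```python
def step_reversible(codec: dict, direction: str) -> bool:
    """Structural check only: inspects registry flags, never runs the codec."""
    opposite = "decode" if direction == "encode" else "encode"
    return bool(
        codec.get("inverse")
        and (
            codec.get("involution")                # f is its own inverse
            or codec.get("make_fn")                # factory handles both directions
            or codec.get(opposite) is not None     # partner function exists
        )
    )

def pipeline_reversible(steps, registry) -> bool:
    return all(step_reversible(registry[s["id"]], s["direction"]) for s in steps)

# Hypothetical registry excerpt; True marks "a function is present".
FLAGS = {
    "rot13": {"inverse": True, "involution": True, "fn": True},
    "hex": {"inverse": True, "encode": True, "decode": True},
    "caesar": {"inverse": True, "make_fn": True},
    "sort": {"inverse": False, "encode": True, "decode": None},
    "upper": {"inverse": False, "encode": True, "decode": True},
}

chain = [{"id": "hex", "direction": "encode"}, {"id": "sort", "direction": "encode"}]
print(pipeline_reversible(chain, FLAGS))
```

Note that upper is reported as not reversible even though it has a decode function, because the inverse flag is checked first.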
4.4. Reverse Pipeline
To invert a pipeline:
- Reverse the step order
- Flip each step's direction (:encode ↔ :decode)
- Involutions keep :encode (direction is irrelevant: they use :fn)
- Surjections keep :encode (no :decode exists, so this is best effort)
- Preserve :args: parameterized codecs need their arguments
;; Forward:  [{:id :caesar :direction :encode :args ["3"]}]
;; Reversed: [{:id :caesar :direction :decode :args ["3"]}]
;; The "3" stays: make-fn interprets it as a shift of 23 when dir=:decode
Implementation note on involutions: In implementations using separate
encode/decode fields (OCaml, Rust, TypeScript), an involution stores the
same function in both fields. Flipping the direction is harmless — it still
resolves to the same function. But in implementations using a single fn
field (Clojure, Racket), the function resolution logic checks :fn first
and ignores direction entirely. If the reverse-pipeline code sets direction
to :decode on an involution, the resolution logic MUST still find the
function. The safest approach: always set involution steps to :encode in
the reversed pipeline (matching the canonical serialization).
Forward:  rot13(encode) + base64(encode) + hex(encode)
Reversed: hex(decode) + base64(decode) + rot13(encode)
                                               ^^^^^^ encode, not decode
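A possible Python rendering of the inversion rules is shown below. The flag names involution and inverse and the META excerpt are assumptions made for this sketch. Involutions and surjections are pinned to "encode", as the implementation note recommends.

```python
def reverse_pipeline(steps, registry):
    """Return the inverse pipeline: reversed order, flipped directions, args kept."""
    reversed_steps = []
    for step in reversed(steps):
        codec = registry[step["id"]]
        if codec.get("involution") or not codec.get("inverse"):
            direction = "encode"                    # involution or surjection
        else:
            direction = "decode" if step["direction"] == "encode" else "encode"
        reversed_steps.append(
            {"id": step["id"], "direction": direction, "args": list(step["args"])}
        )
    return reversed_steps

# Hypothetical registry excerpt carrying only the flags this function reads.
META = {
    "rot13": {"inverse": True, "involution": True},
    "base64": {"inverse": True},
    "hex": {"inverse": True},
    "caesar": {"inverse": True},
    "sort": {"inverse": False},
}

forward = [
    {"id": "rot13", "direction": "encode", "args": []},
    {"id": "base64", "direction": "encode", "args": []},
    {"id": "hex", "direction": "encode", "args": []},
]
for s in reverse_pipeline(forward, META):
    print(s["id"], s["direction"])
```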
5. Serialization
5.1. steps->str (canonical form)
Pipeline → URL chain param:
- Join steps with +
- Each step: <id> (encode) or <id>:d (decode)
- With args: <id>:<arg1>:<arg2> or <id>:<arg1>:d
- Encode direction is the default: the :e suffix is DROPPED. rot13 means encode. rot13:d means decode. There is no rot13:e in canonical output.
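The serialization rules fit in a few lines. This Python sketch assumes steps are dicts with id, direction, and args keys; steps_to_str is a hypothetical Python spelling of steps->str.

```python
def steps_to_str(steps) -> str:
    """Serialize a pipeline to the canonical chain parameter."""
    tokens = []
    for step in steps:
        parts = [step["id"], *step["args"]]   # id first, then positional args
        if step["direction"] == "decode":
            parts.append("d")                 # encode is the default and gets no suffix
        tokens.append(":".join(parts))
    return "+".join(tokens)

pipeline = [
    {"id": "rot13", "direction": "encode", "args": []},
    {"id": "caesar", "direction": "decode", "args": ["3"]},
]
print(steps_to_str(pipeline))
```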
5.2. str->steps (liberal parsing)
URL chain param → pipeline. The algorithm has two levels of splitting: an outer split on step delimiters and an inner split on the colon separator within each step.
5.2.1. Algorithm (language-neutral)
FUNCTION str_to_steps(chain_string) → list of steps:
1. OUTER SPLIT: split chain_string on /[+, ]+/ (plus, comma, or space).
Each resulting token is one step. Discard empty tokens.
2. FOR EACH TOKEN:
a. INNER SPLIT: split token on ":" into segments.
segments[0] = codec ID (always present)
segments[1..] = tail (may be empty)
b. DIRECTION CHECK: inspect the LAST element of tail.
If it is one of {"d", "decode", "e", "encode"}:
→ that element is the direction
→ everything between segments[0] and the direction is args
Otherwise:
→ direction = encode (the default)
→ the entire tail is args
c. EMIT step: {id, direction, args}
RETURN list of steps
The direction tokens are: d, decode, e, encode. These four
strings are reserved — they cannot be used as codec IDs or argument
values. :e and :encode MUST be accepted as explicit encode direction,
even though steps->str never emits them.
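The algorithm above can be transcribed almost line for line. The following Python sketch is one such transcription; the dict-based step shape and the name str_to_steps are illustrative choices rather than a reference implementation.

```python
import re

# Liberal input: all four direction tokens are accepted and normalized.
DIRECTION_TOKENS = {"d": "decode", "decode": "decode", "e": "encode", "encode": "encode"}

def str_to_steps(chain: str):
    steps = []
    for token in re.split(r"[+, ]+", chain):        # outer split
        if not token:
            continue                                 # discard empty tokens
        codec_id, *tail = token.split(":")           # inner split
        direction = "encode"                         # the default
        if tail and tail[-1] in DIRECTION_TOKENS:    # check only the LAST segment
            direction = DIRECTION_TOKENS[tail.pop()]
        steps.append({"id": codec_id, "direction": direction, "args": tail})
    return steps

for step in str_to_steps("rot13+base64:d+caesar:3:d"):
    print(step)
```

Because the direction check runs inside the per-token loop, a trailing :d on one step cannot leak into its neighbours.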
5.2.2. Worked examples
chain = "rot13+base64:d+hex"
outer split → ["rot13", "base64:d", "hex"]
"rot13" → inner split → ["rot13"]
tail = [] → no direction token → dir=encode, args=[]
→ {id="rot13", dir=encode, args=[]}
"base64:d" → inner split → ["base64", "d"]
tail = ["d"] → "d" is a direction token → dir=decode
args = [] (nothing between id and direction)
→ {id="base64", dir=decode, args=[]}
"hex" → inner split → ["hex"]
→ {id="hex", dir=encode, args=[]}
chain = "caesar:3:d"
outer split → ["caesar:3:d"]
"caesar:3:d" → inner split → ["caesar", "3", "d"]
tail = ["3", "d"]
last of tail = "d" → direction token → dir=decode
args = ["3"] (tail minus the last element)
→ {id="caesar", dir=decode, args=["3"]}
chain = "caesar:3"
outer split → ["caesar:3"]
"caesar:3" → inner split → ["caesar", "3"]
tail = ["3"]
last of tail = "3" → NOT a direction token → dir=encode
args = ["3"] (entire tail)
→ {id="caesar", dir=encode, args=["3"]}
chain = "xor:42:d"
outer split → ["xor:42:d"]
"xor:42:d" → inner split → ["xor", "42", "d"]
tail = ["42", "d"]
last = "d" → direction token → dir=decode
args = ["42"]
→ {id="xor", dir=decode, args=["42"]}
5.2.3. Common implementation errors
- Global direction: checking for :d at the end of the entire chain string rather than per-step. This makes rot13+base64:d set decode on all steps, not just base64. (Observed in the OCaml implementation.)
- No inner split: splitting only on + without splitting each token on :. This makes caesar:3:d appear as a single opaque string with no args and no direction. (Also OCaml.)
- Args as direction: not checking the last segment specifically. If caesar:3 is parsed as {id="caesar", dir=3} instead of {id="caesar", args=["3"]}, the direction token check is wrong.
5.2.4. Cross-implementation reference
The algorithm above matches all four working implementations:
| Language | Function | Inner split | Direction check |
|---|---|---|---|
| Clojure | str->steps | split token ":" | (contains? direction-tokens last) |
| TypeScript | strToSteps | token.split(":") | DIRECTION_TOKENS.has(last) |
| Rust | str_to_steps | token.split(':') | match *last { "d" \| "decode" ... } |
| Racket | str->steps | string-split ":" | (member last '("d" "decode" ...)) |
5.3. Canonicalization invariant
str->steps(steps->str(pipeline)) = pipeline for all valid pipelines.
The reverse does NOT hold: steps->str(str->steps("rot13:e")) = "rot13"
(the explicit :e is normalized away). Implementations MUST accept both
forms on input but produce only canonical form on output.
6. Visual Contract
The UI renders:
- Threading form: (-> input rot13 base64/decode), in Clojure syntax
- Reversibility badge: green REVERSIBLE or amber ONE-WAY
- Input: textarea with source buttons (Hello World, Date.now(), etc.)
- Chain: vertical pipeline with intermediate values after each arrow
- Step node: codec name, direction tag, intermediate value, flip/remove buttons
- Add step: dropdown (all 24 codecs) + direction dropdown + blue "add" button
- Reverse button: always visible. Green when reversible, amber when not.
- Output: dark block with final value, copy button, shareable link button
Direction dropdown shows:
- apply: for involutions (direction irrelevant) and surjections (no inverse)
- encode / decode: for bijections and parameterized codecs
Surjections: decode option is disabled in the direction dropdown.
7. Test Suite (URL → Expected Output)
These URLs are the acceptance tests. The output is the final value after threading through the chain.
| URL chain+text | Expected output |
|---|---|
| chain=rot13&text=V+NZ+LBHE+SNGURE | I AM YOUR FATHER |
| chain=rot13&text=PNXR+VF+N+YVR | CAKE IS A LIE |
| chain=rot13&text=[VADER] No. I am your father. | [INQRE] Ab. V nz lbhe sngure. (case + punctuation preserved) |
| chain=caesar:26&text=Hello, World! | Hello, World! (identity, mod 26) |
| chain=caesar:0&text=Hello, World! | Hello, World! (identity) |
| chain=reverse&text=REDRUM | MURDER |
| chain=atbash&text=SVOOL | HELLO |
| chain=base64:d&text=TmV2ZXIgZ29ubmEgZ2l2ZSB5b3UgdXA= | Never gonna give you up |
| chain=caesar:3:d&text=HW+WX+EUXWH | ET TU BRUTE |
| chain=caesar:3&text=ET+TU+BRUTE | HW WX EUXWH |
| chain=hex&text=Hello | 48656c6c6f |
| chain=hex:d&text=48656c6c6f | Hello |
| chain=rot13+base64&text=Hello | VXJ5eWI= (intermediate after rot13: Uryyb) |
| chain=upper&text=Hello | HELLO |
| chain=upper:d&text=HELLO | hello (lossy: not a true inverse) |
| chain=identity&text=anything | anything |
| (no params) | input: Hello, World!, chain: identity |
7.1. XOR test vectors
| Chain + text | Expected |
|---|---|
| chain=xor&text=Hi | 0a2b (H=0x48 XOR 0x42=0x0a; i=0x69 XOR 0x42=0x2b) |
| chain=xor:d&text=0a2b | Hi |
| chain=xor+xor:d&text=Hello | Hello (round-trip) |
7.2. Per-step direction test vectors
| Chain + text | Expected |
|---|---|
| chain=rot13+base64:d&text=VXJ5eWI= | step 1: rot13 encode on VXJ5eWI= → IKW5rJV=; step 2: base64 decode on IKW5rJV= → error (invalid base64) |
| chain=hex+rot13&text=Hi | step 1: hex encode → 4869; step 2: rot13 on 4869 → 4869 (digits untouched) |
7.3. Reversibility Tests
| Chain | Reversible? | Why |
|---|---|---|
| rot13 | yes | involution |
| base64 | yes | bijection |
| caesar:3 | yes | parameterized, :inverse? true |
| rot13+base64 | yes | all steps reversible |
| hex+sort | no | sort is surjection |
| upper | no | :inverse? false |
| identity+identity+identity | yes | all identity |
7.4. Round-Trip Tests
For every reversible chain, threading input through the chain and then through the reversed chain MUST return the original input:
(let [steps  (str->steps "caesar:3")
      result (thread "ET TU BRUTE" steps)   ; → "HW WX EUXWH"
      rev    (reverse-pipeline steps)
      back   (thread "HW WX EUXWH" rev)]    ; → "ET TU BRUTE"
  (assert (= "ET TU BRUTE" (:value (last back)))))
8. Failure Handler
If any test in the suite above fails:
- STOP — do not proceed to UI implementation
- Report: which test, expected vs actual, the codec involved
- The most likely failure modes:
  - make-fn ignoring direction (caesar decode = encode)
  - reverse-pipeline dropping :args (parameterized codecs lose config)
  - step-reversible? not recognizing :make-fn pattern
  - base64 using btoa(encodeURIComponent(s)) instead of TextEncoder
  - Not URL-decoding the chain param before splitting (raw + instead of space)
9. Formal Precision (from Dafny review)
The following clarifications were added after a Dafny agent attempted to
formalize this spec as ensures / requires clauses
(aygp-dr/transform-tool-dafny).
This spec stays example-driven on purpose: we would rather an implementing
team synthesize the requirements from the worked examples (the
Confirmation Gate in section 1 and the Test Suite in section 7) than read a
wall of requires clauses. If a team wants a machine-checked formalism, it
belongs in their artifact — in Lean 4 or Dafny — not in this document. Two
verified reference formalizations exist to crib from; both are non-normative:
- Lean 4: spec/codecproofs/ (wal.sh repo): Codec/Equiv, the involutions, composition, base64 core.
- Dafny: jwalsh/dafny2026-reversible-transducers: codec-as-transducer, reverse involution, Caesar round-trip, chain composition, and the alphabet boundary.
The clarifications below are examples of the gaps that formalizing surfaced, written back as prose + test vectors — not as a requirements language.
9.1. Structural vs semantic reversibility
step-reversible? checks registry flags (:inverse? true), not runtime
behavior. A codec could claim :inverse? true while its decode function is
broken. Implementations MUST NOT rely on the flag alone for round-trip
guarantees. The flag is a UI hint for the badge; the actual round-trip
property is a semantic invariant that the spec asserts but does not prove.
9.2. Domain predicates
Several codecs have restricted domains. The spec now requires these predicates for decode direction:
| Codec | Domain predicate for decode |
|---|---|
| hex | even-length string of [0-9a-fA-F] |
| base64 | valid RFC 4648 alphabet + padding |
| ipv4-int | integer string in [0, 4294967295] |
| hamming | binary string with length divisible by 7 |
| binary | space-separated 8-bit binary strings |
| char-codes | space-separated decimal integers |
Attempting to decode invalid input MUST produce an error in the trace, not undefined behavior.
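One way to make these predicates executable is a small lookup of regular expressions, sketched here in Python. The names DECODE_DOMAIN and in_decode_domain are hypothetical. Whether the empty string belongs to each domain is not stated in the table, so the choices below (allowed for hex, base64, and hamming; rejected for the others) are assumptions.

```python
import re

# Patterns are applied with fullmatch, so they must cover the whole input.
DECODE_DOMAIN = {
    "hex": re.compile(r"(?:[0-9a-fA-F]{2})*"),
    "base64": re.compile(
        r"(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?"
    ),
    "hamming": re.compile(r"(?:[01]{7})*"),
    "binary": re.compile(r"[01]{8}(?: [01]{8})*"),
    "char-codes": re.compile(r"[0-9]+(?: [0-9]+)*"),
}

def in_decode_domain(codec_id: str, s: str) -> bool:
    """True when s is acceptable decode input for codec_id."""
    if codec_id == "ipv4-int":                       # range check, not just shape
        return s.isascii() and s.isdigit() and int(s) <= 0xFFFFFFFF
    pattern = DECODE_DOMAIN.get(codec_id)
    # Codecs without a listed predicate accept any string.
    return pattern is None or pattern.fullmatch(s) is not None

print(in_decode_domain("hex", "48656c6c6f"), in_decode_domain("hex", "486"))
```

A pipeline can call the predicate before decoding and record an error in the trace when it returns False, instead of relying on the decoder to fail cleanly.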
9.3. Character model
The spec assumes strings are sequences of Unicode code points. Implementations in UTF-16 languages (JS, Java) and UTF-8 languages (Rust, Go) MUST agree on output for ASCII input. Non-ASCII behavior is implementation-defined and should be documented per-codec.
9.4. Caesar fixed points
caesar:N where encode = decode (involution) holds for N ≡ 0 and N ≡ 13
(mod 26). N ≡ 0 is the identity (caesar:0, caesar:26, caesar:-26 …).
N ≡ 13 is rot13. The unique non-trivial fixed point among N ∈ {1,…,25}
is N = 13.
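The claim can be confirmed by brute force: encode (shift N) equals decode (shift 26-N) exactly when 2N ≡ 0 (mod 26). The Python sketch below enumerates N from 0 to 25 over the uppercase alphabet; the helper name shift is hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase

def shift(n: int, s: str) -> str:
    """Caesar shift for uppercase-only input (enough for this check)."""
    return "".join(chr((ord(c) - 65 + n) % 26 + 65) for c in s)

# N is a fixed point when shifting by N and by 26-N agree on every letter.
fixed_points = [n for n in range(26) if shift(n, ALPHABET) == shift(26 - n, ALPHABET)]
print(fixed_points)
```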
9.5. XOR codec representation
xor encode and decode are NOT the same function. Although XOR is
mathematically self-inverse (x XOR k XOR k = x), the codec's encode
and decode operate at different levels:
- Encode: XOR each byte of the input with key 0x42, then represent each result byte as a 2-character lowercase hex string. Input: ASCII string. Output: hex string (twice the length).
- Decode: Parse the hex string into byte pairs, XOR each with 0x42, and reconstruct the ASCII string. Input: hex string. Output: ASCII string.
This means xor is a bijection (Pattern 2), not an involution (Pattern 1),
even though the underlying XOR operation is self-inverse. The hex
representation makes encode ≠ decode at the string level.
Reference implementation (Clojure):
(defn- xor-encode [s]
  ;; "Hi" → XOR each char with 0x42 → emit as hex pairs
  ;; H=0x48, 0x48 XOR 0x42 = 0x0a → "0a"
  ;; i=0x69, 0x69 XOR 0x42 = 0x2b → "2b"
  ;; result: "0a2b"
  ...)

(defn- xor-decode [s]
  ;; "0a2b" → parse hex pairs → XOR with 0x42 → chars
  ;; 0x0a XOR 0x42 = 0x48 = H
  ;; 0x2b XOR 0x42 = 0x69 = i
  ;; result: "Hi"
  ...)
Implementations that treat XOR as an involution (same function for both directions, operating on raw bytes) will fail the test vectors because the output representation differs.
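For comparison, a complete version of the two functions in Python might look like the sketch below. It assumes the fixed key 0x42 and treats the input as UTF-8 bytes, which coincides with ASCII for the test vectors. The names xor_encode and xor_decode are illustrative.

```python
KEY = 0x42

def xor_encode(s: str) -> str:
    """Text in, lowercase hex out: XOR every byte with KEY, then hex-encode."""
    return bytes(b ^ KEY for b in s.encode("utf-8")).hex()

def xor_decode(h: str) -> str:
    """Hex in, text out: parse byte pairs, XOR with KEY, rebuild the string."""
    return bytes(b ^ KEY for b in bytes.fromhex(h)).decode("utf-8")

print(xor_encode("Hi"))
print(xor_decode(xor_encode("Hello")))
```

The two functions have different input and output types at the string level, which is why the codec is registered with separate encode and decode entries.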
9.6. 1337 (leet speak) substitution scope
The substitution table A↔4 E↔3 I↔1 O↔0 S↔5 T↔7 B↔8 G↔6 applies to
uppercase letters only. Lowercase passes through unchanged. This means
case is preserved and the codec is a true bijection on the mixed-case
domain:
encode("BEAST") → "83457"
encode("beast") → "beast" (lowercase untouched)
encode("Beast") → "8east" (only the B is uppercase, so only B→8)
decode("83457") → "BEAST"
decode("beast") → "beast"
decode("8east") → "Beast"
9.7. Hex output case
hex encode MUST produce lowercase hex digits (48656c6c6f, not
48656C6C6F). Decode MUST accept both cases ([0-9a-fA-F]). This is
consistent with the test vector: chain=hex&text=Hello → 48656c6c6f.
9.8. Decode failure convention
When decode receives input outside its domain predicate (e.g., hex:d on
an odd-length string), the codec MUST signal failure. The representation
of failure is language-specific:
| Language | Convention |
|---|---|
| Clojure | throw ex-info (caught by apply-step) |
| TypeScript | throw Error (caught by try/catch) |
| Rust | Result<String, CodecError> |
| OCaml | string option (None on failure) |
| Racket | raise exn:fail:contract |
| Dafny | requires clause prevents invalid call |
The pipeline execution layer MUST catch the failure and produce a trace
entry with error set and output equal to the unchanged input. Pipeline
execution halts after the first error.
9.9. rot13 character ranges
rot13 shifts ASCII letters and passes everything else through unchanged. The exact ranges:
- A-M (0x41-0x4D) → shift +13 → N-Z (0x4E-0x5A)
- N-Z (0x4E-0x5A) → shift -13 → A-M (0x41-0x4D)
- a-m (0x61-0x6D) → shift +13 → n-z (0x6E-0x7A)
- n-z (0x6E-0x7A) → shift -13 → a-m (0x61-0x6D)
- Everything else: pass through unchanged
Case is preserved. Digits, punctuation, spaces, and non-ASCII are untouched.
9.10. Non-determinism
corrupt is the only non-deterministic codec. Pipelines containing
corrupt violate the purity axiom ("URL → deterministic output").
The spec's axiom applies to all pipelines EXCEPT those containing
corrupt. Implementations SHOULD seed the RNG from the input for
reproducibility in tests, or exclude corrupt from round-trip assertions.
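A sketch of the seeding suggestion, in Python: the RNG is seeded from a CRC32 of the input bytes, so the same input always flips the same bit. The choice of CRC32 as the seed and of latin-1 to map the corrupted bytes back to a string are assumptions made for this example, as is the function name.

```python
import random
import zlib

def corrupt(s: str) -> str:
    """Flip exactly one bit, chosen by an RNG seeded from the input itself."""
    data = bytearray(s.encode("utf-8"))
    if not data:
        return s                                  # nothing to flip
    rng = random.Random(zlib.crc32(bytes(data)))  # same input, same seed
    index = rng.randrange(len(data))              # which byte
    bit = rng.randrange(8)                        # which bit within the byte
    data[index] ^= 1 << bit
    # latin-1 maps every byte to one character, so the result is always a string
    return data.decode("latin-1")

print(corrupt("Hello") == corrupt("Hello"))
```

With this seeding the codec is reproducible in tests while still being lossy, so it remains excluded from round-trip assertions.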
9.11. Surjection terminology
The spec uses "surjection" loosely. Formally, sort is non-injective
(many inputs map to one output) but whether it is surjective depends on
the codomain definition. The spec means: information-destroying,
no left-inverse exists. The UI label ONE-WAY is more precise than the
mathematical term.
10. Non-Functional Requirements
- No server: pure client-side. The URL is the API.
- No framework: vanilla DOM or a compile-to-JS language (ClojureScript, Elm, etc.)
- CSS: monospace font (IBM Plex Mono), circuit-trace aesthetic. Green (#059669) = reversible. Amber (#d97706) = warning. Red (#dc2626) = error.
- #tool-root: min-height: 675px to prevent layout shift during hydration.
- URL sync: typing updates the URL. Loading a URL hydrates the state. No %2B.
- localStorage: save/load named chains. Keys: transform-chains (JSON object mapping name → chain string), transform-customer-id (UUID v4, created on first visit).
11. URL Parsing Clarification (revised)
+ is space in URL query encoding. This is not overloaded — it is the
same thing in both the chain and text params:
- ?text=Hello+World → text = "Hello World" ← input string
- ?chain=rot13+base64:d → chain = "rot13 base64:d" ← space-separated steps
Standard URL parsing (URLSearchParams, url.parse, Ring middleware,
Rack, etc.) decodes + as space. After decoding, the chain value is
a plain space-separated string. Split on whitespace to get step tokens.
Previous guidance (from the TS review on 2026-06-07) said to avoid
URLSearchParams and parse the raw query string manually. That advice
was wrong — it fought the web platform instead of using it. The +
delimiter was always URL-encoded space. Embrace it.
12. Instrumentation
Each of these MUST be observable in the browser console or via the trace:
- Input value at each step boundary
- Whether each step succeeded or errored
- Whether the pipeline is reversible (and which step breaks it if not)
- The serialized chain string matching the URL
- Round-trip identity: for reversible chains, reverse(forward(x)) = x
13. Spec Review History
| Date | Agent | Language | Tests | Issues | Key finding |
|---|---|---|---|---|---|
| 2026-06-07 | builder | TypeScript | 40/40 | 9 | wrong test vector (VXJ5Yg==) |
| 2026-06-07 | lambda | Dafny | — | 9 | structural ≠ semantic reversibility |
| 2026-06-07 | rust-pro | Rust/WASM | 37/37 | 10 | UTF-8 vs UTF-16 charCodeAt — "spec lie" |
| 2026-06-07 | zero | Racket | 33/33 | 7 | contracts enforce round-trip at boundary, not in tests |
| 2026-06-10 | builder | OCaml | — | 6 | chain parser `:d` per-chain not per-step; XOR as involution not bijection; reverse_pipeline flips involutions |