commit 95bda73440fff880f2fa770cd8bd5aa5c50251ea
parent a5ec87850ea6e1d5c4ab1d34df162966eab9de9a
Author: Jared Tobin <jared@jtobin.io>
Date: Tue, 10 Feb 2026 14:49:24 +0400
plans: add
Diffstat:
4 files changed, 183 insertions(+), 0 deletions(-)
diff --git a/plans/ARCH5.md b/plans/ARCH5.md
@@ -0,0 +1,43 @@
+# ARCH5: Provenance-Aware Auto-Suppression
+
+## Goal
+
+Reduce "unknown base" false positives by automatically proving more
+bases as public, using local provenance and simple stack-slot tracking.
+This is entirely automatic; no manual intervention required.
+
+## Features
+
+1) **Def-use backtrace**
+- Track last definition of each register within a block (and across
+ blocks when taint is known).
+- If a base register is derived from a public root via simple
+ arithmetic/moves, reclassify it as Public.
+
+2) **Stack slot taint**
+- Track `sp + imm` slots for `str/ldr` with constant offsets.
+- If a slot is written with a Public value, then a later load from the
+ same slot yields Public.
+
+3) **GOT/constant pool address patterns**
+- Recognize `adrp` + `ldr [xN, symbol@GOTPAGEOFF]` (and similar)
+ patterns as public address derivations.
+- Mark the destination register as Public.
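+
+A minimal sketch of feature 2, assuming a three-valued `Taint` and a
+slot map keyed by the constant offset. `storeSlot`, `loadSlot` and
+`invalidateSlots` are hypothetical names. Dropping the map on a write
+to `sp` is an assumption of this sketch, not something this plan
+specifies.
+
+```haskell
+import qualified Data.Map.Strict as M
+
+data Taint = Public | Secret | Unknown deriving (Eq, Show)
+
+-- Slot map keyed by the constant offset in [sp, #imm].
+type StackSlots = M.Map Int Taint
+
+-- str src, [sp, #off]: remember the taint of the stored value.
+storeSlot :: Int -> Taint -> StackSlots -> StackSlots
+storeSlot = M.insert
+
+-- ldr dst, [sp, #off]: a slot never written in this function is Unknown.
+loadSlot :: Int -> StackSlots -> Taint
+loadSlot = M.findWithDefault Unknown
+
+-- Assumption: a write to sp itself (e.g. sub sp, sp, #32) rebases every
+-- offset, so the conservative choice here is to forget all slots.
+invalidateSlots :: StackSlots -> StackSlots
+invalidateSlots = const M.empty
+```
+
+A spill of a Public value followed by a reload from the same offset
+yields Public; any other offset, or any reload after `sp` moves, falls
+back to Unknown.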
+
+## Design
+
+- Extend the taint state with a small auxiliary provenance map:
+ - last-def register source (simple ops only)
+ - stack-slot taint map for `[sp, #imm]`
+- Apply these enhancements during taint transfer, so violations see a
+ more precise taint state without a second pass.
+
+## Conservatism
+
+- Only upgrade to Public on explicit, safe patterns.
+- Unknown upgrades only when a safe pattern proves it; Secret never does.
+
+## Deliverables
+
+- Fewer Unknown base violations on GHC dumps.
+- Optional `--explain` output that shows the provenance chain.
diff --git a/plans/ARCH6.md b/plans/ARCH6.md
@@ -0,0 +1,42 @@
+# ARCH6: Def-Use Provenance for Base Registers
+
+## Goal
+
+Add lightweight def-use provenance so "Unknown base" can be upgraded to
+Public when the base register is provably derived from public roots via
+simple arithmetic/move chains.
+
+## Scope
+
+- Track only simple, local provenance within a function.
+- No symbolic algebra; only safe, explicit patterns.
+- Inter-proc summaries remain taint-only; provenance is local.
+
+## Provenance Model
+
+- Each register can carry an optional provenance tag:
+ - `ProvRoot r` (public root)
+ - `ProvConst` (adr/adrp/literal)
+ - `ProvDerive r` (derived from another reg via safe op)
+- A provenance chain resolves to Public if it ends in a public root or
+ constant tag.
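+
+The resolution rule above can be sketched as follows. `Reg`, `ProvMap`
+and `resolvesPublic` are hypothetical names, and the fuel bound is an
+assumption added here so that a self-referential chain terminates.
+
+```haskell
+import qualified Data.Map.Strict as M
+
+-- Hypothetical register type; the auditor has its own.
+newtype Reg = Reg String deriving (Eq, Ord, Show)
+
+-- The three tags from the model above.
+data Provenance
+  = ProvRoot Reg    -- public root
+  | ProvConst       -- adr/adrp/literal
+  | ProvDerive Reg  -- derived from another register via a safe op
+  deriving (Eq, Show)
+
+type ProvMap = M.Map Reg Provenance
+
+-- Follow ProvDerive links until a root or constant tag is found.
+-- The fuel bound stops self-referential chains (x0 derived from x0).
+resolvesPublic :: ProvMap -> Reg -> Bool
+resolvesPublic pm = go (M.size pm + 1)
+  where
+    go :: Int -> Reg -> Bool
+    go 0 _ = False
+    go n r = case M.lookup r pm of
+      Just (ProvRoot _)   -> True
+      Just ProvConst      -> True
+      Just (ProvDerive s) -> go (n - 1) s
+      Nothing             -> False
+```
+
+Chasing links through the current map is only sound if a `ProvDerive`
+tag is recorded while its source already resolves, and is cleared or
+re-checked when that source is redefined. The sketch leaves that
+bookkeeping to the transfer function.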
+
+## Safe Ops
+
+- mov reg, reg
+- add/sub reg, reg, #imm
+- add reg, reg, reg when both are proven public
+- adrp/adr (constant pool)
+- orr/eor with zero register (value unchanged, provenance preserved)
+- `and` with zero register (result is always zero, i.e. a constant)
+
+## Integration
+
+- Extend taint state with a provenance map.
+- When setting taint to Public via provenance, also record provenance.
+- When provenance is lost/unsafe, clear it.
+
+## Reporting
+
+- No output changes by default.
+- Optional explain mode can show provenance chains for suppressed
+ violations.
diff --git a/plans/IMPL5.md b/plans/IMPL5.md
@@ -0,0 +1,51 @@
+# IMPL5: Implement Provenance-Aware Auto-Suppression
+
+## Summary
+
+Implement automatic provenance tracking to reclassify Unknown bases as
+Public when safe patterns are detected (def-use, stack slots, GOT).
+
+## Steps
+
+1) Extend taint state
+- Add `RegProvenance` map (Reg -> simple origin) and
+ `StackSlots` map (Int offset -> Taint).
+- Keep maps minimal: only track cases needed for auto-suppression.
+
+2) Def-use tracking
+- For simple ops (`mov`, `add/sub` with imm, `adr/adrp`, `orr` with
+ zero, etc.), record which register or constant dst is derived from.
+- When base reg is Unknown, consult provenance: if provenance chain
+ resolves to Public, upgrade taint.
+
+3) Stack slot tracking
+- On `str/strb/strh/stp` to `[sp, #imm]`, store taint of source in slot.
+- On `ldr/ldrb/ldrh/ldp` from `[sp, #imm]`, restore slot taint into dst.
+- Only handle constant offsets; ignore indexed addressing.
+
+4) GOT/constant pool patterns
+- When seeing `adrp r, sym@GOTPAGE` then `ldr r, [r, sym@GOTPAGEOFF]`,
+ mark `r` Public (and record provenance).
+- Same for `adrp` + `add` + `ldr` patterns as needed.
+
+5) Integrate with inter-proc
+- Ensure provenance and stack-slot maps are per-function analysis state.
+- Preserve summaries as taint-only; do not export provenance across
+ function boundaries.
+
+6) Tests
+- Add fixtures for:
+ - register derived from public root via mov/add
+ - stack spill/reload via `[sp, #imm]`
+ - GOTPAGE+GOTPAGEOFF pattern
+- Verify violations are suppressed where expected.
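+
+A minimal sketch of the pattern match in step 4. The `Instr` type is
+hypothetical and models only the two shapes involved. Requiring the
+`adrp` and `ldr` to be adjacent is an assumption of this sketch; a real
+matcher may need to tolerate intervening instructions.
+
+```haskell
+-- Hypothetical register type; the auditor has its own.
+newtype Reg = Reg String deriving (Eq, Show)
+
+-- Hypothetical, heavily reduced instruction type.
+data Instr
+  = AdrpGot Reg String     -- adrp r, sym@GOTPAGE
+  | LdrGot Reg Reg String  -- ldr dst, [base, sym@GOTPAGEOFF]
+  | OtherInstr             -- anything else
+  deriving (Eq, Show)
+
+-- Destination registers of every adjacent adrp/ldr pair that uses the
+-- same base register and the same symbol. These are the registers the
+-- plan would mark Public.
+gotPublicRegs :: [Instr] -> [Reg]
+gotPublicRegs (AdrpGot r sym : rest@(LdrGot dst base sym' : _))
+  | r == base && sym == sym' = dst : gotPublicRegs rest
+gotPublicRegs (_ : rest) = gotPublicRegs rest
+gotPublicRegs [] = []
+```
+
+A pair with mismatched symbols or a different base register is ignored,
+which keeps the rule on the conservative side.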
+
+## Files to Touch
+
+- `lib/Audit/AArch64/Taint.hs`
+- `lib/Audit/AArch64/Check.hs` (if explanation is emitted)
+- `test/`
+
+## Validation
+
+- Re-run on `etc/Curve.s` and compare violation count.
diff --git a/plans/IMPL6.md b/plans/IMPL6.md
@@ -0,0 +1,47 @@
+# IMPL6: Implement Def-Use Provenance
+
+## Summary
+
+Track simple provenance chains for registers and use them to upgrade
+Unknown bases to Public when derived from public roots or constants.
+
+## Steps
+
+1) Extend taint state
+- Add `tsProv :: Map Reg Provenance` to `TaintState`.
+- Define `Provenance` type (Root/Const/Derive/Unknown).
+
+2) Populate provenance
+- `adr/adrp` -> `ProvConst` + Public taint.
+- `mov dst, src` -> copy provenance from src.
+- `add/sub dst, src, #imm` -> copy provenance from src.
+- `add/sub dst, src1, src2` -> keep provenance only if both proven
+ public and compatible; else clear.
+- `orr/eor` with `xzr/wzr` -> preserve provenance.
+- `and` with `xzr/wzr` -> result is always zero (a constant, not a copy).
+- Loads -> clear provenance (unless GOT/stack rule sets Public).
+- Calls -> clear provenance for caller-saved regs (same as taint).
+
+3) Use provenance to upgrade taint
+- When a base reg is Unknown, check its provenance chain;
+ if it resolves to Public, treat the base as Public for address checks.
+- Do not upgrade Secret.
+
+4) Stack map interaction
+- When storing to stack, optionally store provenance alongside taint.
+- When loading from stack slot, restore provenance if known.
+
+5) Tests
+- Add fixtures for simple provenance chains:
+ - adrp/add -> base used in ldr (should be public)
+ - mov/add #imm from public root -> base used in ldr
+ - provenance cleared after load from unknown memory
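+
+Steps 2 and 3 can be sketched as below, assuming the "copy provenance"
+model, where a register has a tag only while it is proven public.
+`tsTaint`, `taintOf`, `adrLike`, `copyLike`, `loadUnknown` and
+`baseTaint` are hypothetical names; only `tsProv` and `TaintState` come
+from step 1. Unknown provenance is modelled as absence from the map.
+
+```haskell
+import qualified Data.Map.Strict as M
+
+newtype Reg = Reg String deriving (Eq, Ord, Show)
+data Taint = Public | Secret | Unknown deriving (Eq, Show)
+
+data Provenance = ProvRoot Reg | ProvConst | ProvDerive Reg
+  deriving (Eq, Show)
+
+data TaintState = TaintState
+  { tsTaint :: M.Map Reg Taint       -- hypothetical field name
+  , tsProv  :: M.Map Reg Provenance  -- field from step 1
+  }
+
+taintOf :: Reg -> TaintState -> Taint
+taintOf r = M.findWithDefault Unknown r . tsTaint
+
+-- adr/adrp dst: ProvConst plus Public taint.
+adrLike :: Reg -> TaintState -> TaintState
+adrLike dst st = st
+  { tsTaint = M.insert dst Public (tsTaint st)
+  , tsProv  = M.insert dst ProvConst (tsProv st)
+  }
+
+-- mov dst, src and add/sub dst, src, #imm: copy taint and provenance.
+copyLike :: Reg -> Reg -> TaintState -> TaintState
+copyLike dst src st = st
+  { tsTaint = M.insert dst (taintOf src st) (tsTaint st)
+  , tsProv  = M.alter (const (M.lookup src (tsProv st))) dst (tsProv st)
+  }
+
+-- Load from unknown memory: Unknown taint, provenance cleared.
+loadUnknown :: Reg -> TaintState -> TaintState
+loadUnknown dst st = st
+  { tsTaint = M.insert dst Unknown (tsTaint st)
+  , tsProv  = M.delete dst (tsProv st)
+  }
+
+-- Taint used for an address check: Unknown is upgraded when a tag is
+-- present; Secret (and Public) are returned unchanged.
+baseTaint :: Reg -> TaintState -> Taint
+baseTaint r st = case taintOf r st of
+  Unknown | M.member r (tsProv st) -> Public
+  t                                -> t
+```
+
+The three fixtures listed in step 5 map onto `adrLike` followed by
+`copyLike`, `copyLike` from a tagged root, and `loadUnknown`.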
+
+## Files to Touch
+
+- `lib/Audit/AArch64/Taint.hs`
+- `lib/Audit/AArch64/Types.hs` (if new types exposed)
+- `test/`
+
+## Validation
+
+- Re-run on `etc/Curve.s`; expect fewer Unknown base hits.