commit 37685ec46605865baab92ed590bd1e97f9ba85b5
Author: Jared Tobin <jared@jtobin.io>
Date: Tue, 10 Feb 2026 12:06:40 +0400
init: scaffold ppad-auditor project
AArch64 constant-time memory access auditor for GHC+LLVM assembly.
Implements static analysis to verify memory accesses use only public
(non-secret-derived) addresses, helping ensure CT properties for
cryptographic code.
Components:
- Parser: megaparsec-based AArch64 assembly parser
- CFG: basic block and control flow graph construction
- Taint: forward dataflow analysis for register publicness
- Check: memory access validation against taint state
- CLI: command-line interface with JSON output
GHC 9.10.3 AArch64 register map:
- X19: Base, X20: Sp, X21: Hp, X28: SpLim (public roots)
- SP, X29, X18, XZR also treated as public
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Diffstat:
| A | AGENTS.md | | | 143 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | LICENSE | | | 20 | ++++++++++++++++++++ |
| A | app/Main.hs | | | 87 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | flake.nix | | | 56 | ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | lib/Audit/AArch64.hs | | | 69 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | lib/Audit/AArch64/CFG.hs | | | 124 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | lib/Audit/AArch64/Check.hs | | | 132 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | lib/Audit/AArch64/Parser.hs | | | 463 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | lib/Audit/AArch64/Taint.hs | | | 215 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | lib/Audit/AArch64/Types.hs | | | 271 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | plans/ARCH0.md | | | 101 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | plans/IMPL0.md | | | 67 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | ppad-auditor.cabal | | | 67 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | test/Main.hs | | | 168 | +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| A | test/fixtures/bad.s | | | 31 | +++++++++++++++++++++++++++++++ |
| A | test/fixtures/good.s | | | 37 | +++++++++++++++++++++++++++++++++++++ |
16 files changed, 2051 insertions(+), 0 deletions(-)
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,143 @@
+# ppad-bolt8
+
+Haskell implementation of BOLT #8 (Lightning Network encrypted transport).
+
+Specification: https://github.com/lightning/bolts/blob/master/08-transport.md
+
+## Project Structure
+
+- `lib/` - library source (Lightning.Protocol.BOLT8)
+- `test/` - tests (tasty + tasty-hunit)
+- `bench/` - benchmarks (criterion for timing, weigh for allocations)
+- `flake.nix` - nix flake for dependency and build management
+- `ppad-bolt8.cabal` - cabal package definition
+- `CLAUDE.md` / `AGENTS.md` - keep these in sync
+
+## Build and Test
+
+Enter devshell and use cabal:
+
+```
+nix develop
+cabal build
+cabal test
+cabal bench
+```
+
+Do not use stack. All dependency and build management via nix.
+
+## Dependencies
+
+### ppad libraries (use freely)
+
+Use ppad libraries (github.com/ppad-tech, git.ppad.tech) liberally.
+Current dependencies: ppad-aead, ppad-hkdf, ppad-secp256k1, ppad-sha256.
+
+### External libraries
+
+Use only minimal external dependencies. Prefer GHC's core/boot libraries
+(base, bytestring, primitive, etc.).
+
+**Ask for explicit confirmation before adding any library outside of:**
+- GHC boot/core libraries
+- ppad-* libraries
+- Test dependencies (tasty, QuickCheck, etc. for test-suite only)
+- Benchmark dependencies (criterion, weigh for benchmark only)
+
+## Code Style
+
+### Performance
+
+- Use strictness annotations (BangPatterns) liberally
+- Prefer UNPACK for strict record fields
+- Use MagicHash, UnboxedTuples, GHC.Exts for hot paths
+- Do not rely on UNBOX pragmas; implement primitives directly with
+ MagicHash and GHC.Exts when needed
+- Use INLINE pragmas for small functions
+- Refer to ppad-sha256 and ppad-fixed for low-level patterns
+
+### Type safety
+
+- Encode invariants into the type system
+- Use newtypes liberally (e.g., Sec, Pub, Session)
+- Use ADTs to make illegal states unrepresentable
+- Prefer smart constructors that validate inputs
+
+### Safety
+
+- Never use partial Prelude functions (head, tail, !!, etc.)
+- Avoid brittle partials in tests too (e.g., unchecked indexing). Prefer
+ bounds checks or total helpers even in test code.
+- Avoid non-exhaustive pattern matches and unsafe behavior; use total
+ helpers and make all constructors explicit.
+- Use Maybe/Either for fallible operations
+- Validate all inputs at system boundaries
+
+### Formatting
+
+- Keep lines under 80 characters
+- Use Haskell2010
+- Module header with copyright, license, maintainer
+- OPTIONS_HADDOCK prune for public modules
+- Haddock examples for exported functions
+
+## Testing
+
+Use tasty to wrap all tests:
+- tasty-hunit for unit tests with known vectors
+- tasty-quickcheck for property-based tests
+- Source test vectors from specifications (RFC, BOLT spec, Wycheproof, etc.)
+
+Property tests should enforce invariants that can't be encoded in types.
+
+## Benchmarking
+
+Always maintain benchmark suites:
+- `bench/Main.hs` - criterion for wall-time benchmarks
+- `bench/Weight.hs` - weigh for allocation tracking
+
+Define NFData instances for types that need benchmarking.
+
+## Git Workflow
+
+- Feature branches for development; commit freely there
+- Logical, atomic commits on feature branches
+- Master should be mostly merge commits
+- Merge to master with `--no-ff` after validation
+- Always build and test before creating a merge commit
+- Write detailed merge commit messages summarising changes
+
+### Worktree flow (for planned work)
+
+When starting work on an implementation plan:
+
+```
+git worktree add ./impl-<desc> -b impl/<desc> master
+# work in that worktree
+# merge to master when complete
+git worktree remove ./impl-<desc>
+```
+
+### Commits
+
+- Higher-level descriptions in merge commits
+- Never update git config
+- Never use destructive git commands (push --force, hard reset) without
+ explicit request
+- Never skip hooks unless explicitly requested
+
+## Planning
+
+When planning work:
+- Highlight which steps can be done independently
+- Consider forking subagents for concurrent work on independent steps
+- Write implementation plans to `plans/IMPL<n>.md` if the project uses
+ this convention
+
+## Flake Structure
+
+The flake.nix follows ppad conventions:
+- Uses ppad-nixpkgs as base
+- Follows references to avoid duplication
+- Supports LLVM backend via cabal flag
+- Provides devShell with ghc, cabal, cc, llvm
diff --git a/LICENSE b/LICENSE
@@ -0,0 +1,20 @@
+Copyright (c) 2025 Jared Tobin
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/app/Main.hs b/app/Main.hs
@@ -0,0 +1,87 @@
+{-# LANGUAGE OverloadedStrings #-}
+
+module Main where
+
+import Audit.AArch64
+import Data.Aeson (encode)
+import qualified Data.ByteString.Lazy.Char8 as BL
+import Data.Text (Text)
+import qualified Data.Text as T
+import qualified Data.Text.IO as TIO
+import Options.Applicative
+import System.Exit (exitFailure, exitSuccess)
+
+data Options = Options
+ { optInput :: !FilePath
+ , optJson :: !Bool
+ , optQuiet :: !Bool
+ } deriving (Eq, Show)
+
+optParser :: Parser Options
+optParser = Options
+ <$> strOption
+ ( long "input"
+ <> short 'i'
+ <> metavar "FILE"
+ <> help "Input assembly file (.s)"
+ )
+ <*> switch
+ ( long "json"
+ <> short 'j'
+ <> help "Output JSON format"
+ )
+ <*> switch
+ ( long "quiet"
+ <> short 'q'
+ <> help "Suppress summary, only show violations"
+ )
+
+optInfo :: ParserInfo Options
+optInfo = info (optParser <**> helper)
+ ( fullDesc
+ <> progDesc "Audit AArch64 assembly for constant-time memory access"
+ <> header "auditor - CT memory access auditor for AArch64"
+ )
+
+main :: IO ()
+main = do
+ opts <- execParser optInfo
+ result <- auditFile (optInput opts)
+ case result of
+ Left err -> do
+ TIO.putStrLn $ "Error: " <> err
+ exitFailure
+ Right ar ->
+ if optJson opts
+ then outputJson ar
+ else outputText opts ar
+
+outputJson :: AuditResult -> IO ()
+outputJson ar = BL.putStrLn (encode (arViolations ar))
+
+outputText :: Options -> AuditResult -> IO ()
+outputText opts ar = do
+ let vs = arViolations ar
+ mapM_ printViolation vs
+ if optQuiet opts
+ then pure ()
+ else do
+ TIO.putStrLn ""
+ TIO.putStrLn $ "Lines checked: " <> T.pack (show (arLinesChecked ar))
+ TIO.putStrLn $ "Memory accesses: " <> T.pack (show (arMemoryAccesses ar))
+ TIO.putStrLn $ "Violations: " <> T.pack (show (length vs))
+ if null vs
+ then exitSuccess
+ else exitFailure
+
+printViolation :: Violation -> IO ()
+printViolation v = TIO.putStrLn $
+ vSymbol v <> ":" <> T.pack (show (vLine v)) <> ": " <> reasonText (vReason v)
+
+reasonText :: ViolationReason -> Text
+reasonText r = case r of
+ SecretBase reg -> "secret base register " <> regName reg
+ SecretIndex reg -> "secret index register " <> regName reg
+ UnknownBase reg -> "unknown base register " <> regName reg
+ UnknownIndex reg -> "unknown index register " <> regName reg
+ NonConstOffset -> "non-constant offset without masking"
diff --git a/flake.nix b/flake.nix
@@ -0,0 +1,56 @@
+{
+ description = "AArch64 constant-time memory access auditor";
+
+ inputs = {
+ ppad-nixpkgs = {
+ type = "git";
+ url = "git://git.ppad.tech/nixpkgs.git";
+ ref = "master";
+ };
+ flake-utils.follows = "ppad-nixpkgs/flake-utils";
+ nixpkgs.follows = "ppad-nixpkgs/nixpkgs";
+ };
+
+ outputs = { self, nixpkgs, flake-utils, ppad-nixpkgs }:
+ flake-utils.lib.eachDefaultSystem (system:
+ let
+ lib = "ppad-auditor";
+
+ pkgs = import nixpkgs { inherit system; };
+ hlib = pkgs.haskell.lib;
+ llvm = pkgs.llvmPackages_19.llvm;
+
+ hpkgs = pkgs.haskell.packages.ghc910.extend (new: old: {
+ ${lib} = old.callCabal2nixWithOptions lib ./. "--enable-profiling" {};
+ });
+
+ cc = pkgs.stdenv.cc;
+ ghc = hpkgs.ghc;
+ cabal = hpkgs.cabal-install;
+ in
+ {
+ packages.default = hpkgs.${lib};
+
+ packages.haddock = hpkgs.${lib}.doc;
+
+ devShells.default = hpkgs.shellFor {
+ packages = p: [
+ p.${lib}
+ ];
+
+ buildInputs = [
+ cabal
+ cc
+ llvm
+ ];
+
+ shellHook = ''
+ PS1="[${lib}] \w$ "
+ echo "entering ${system} shell, using"
+ echo "ghc: $(${ghc}/bin/ghc --version)"
+ echo "cabal: $(${cabal}/bin/cabal --version)"
+ '';
+ };
+ }
+ );
+}
diff --git a/lib/Audit/AArch64.hs b/lib/Audit/AArch64.hs
@@ -0,0 +1,69 @@
+{-# OPTIONS_HADDOCK prune #-}
+{-# LANGUAGE OverloadedStrings #-}
+
+-- |
+-- Module: Audit.AArch64
+-- Copyright: (c) 2025 Jared Tobin
+-- License: MIT
+-- Maintainer: jared@ppad.tech
+--
+-- AArch64 constant-time memory access auditor.
+--
+-- This module provides static analysis for AArch64 GHC+LLVM assembly
+-- to verify that memory accesses use only public (non-secret-derived)
+-- addresses. This helps ensure constant-time properties for
+-- cryptographic code.
+--
+-- Example usage:
+--
+-- @
+-- import Audit.AArch64
+-- import qualified Data.Text.IO as TIO
+--
+-- main = do
+-- src <- TIO.readFile "foo.s"
+-- case audit "foo.s" src of
+-- Left err -> putStrLn $ "Parse error: " ++ show err
+-- Right result -> print result
+-- @
+
+module Audit.AArch64 (
+ -- * Main API
+ audit
+ , auditFile
+
+ -- * Results
+ , AuditResult(..)
+ , Violation(..)
+ , ViolationReason(..)
+
+ -- * Re-exports
+ , ParseError
+ ) where
+
+import Audit.AArch64.CFG
+import Audit.AArch64.Check
+import Audit.AArch64.Parser
+import Audit.AArch64.Types
+import qualified Data.ByteString as BS
+import Data.Text (Text)
+import qualified Data.Text as T
+import Data.Text.Encoding (decodeUtf8')
+
+-- | Audit assembly source for memory access violations.
+audit :: Text -> Text -> Either ParseError AuditResult
+audit name src = do
+ lns <- parseAsm src
+ let cfg = buildCFG lns
+ pure (checkCFG name cfg)
+
+-- | Audit an assembly file.
+auditFile :: FilePath -> IO (Either Text AuditResult)
+auditFile path = do
+ bs <- BS.readFile path
+ case decodeUtf8' bs of
+ Left err -> pure (Left (T.pack (show err)))
+ Right src ->
+ case audit (T.pack path) src of
+ Left err -> pure (Left (T.pack (show err)))
+ Right result -> pure (Right result)
diff --git a/lib/Audit/AArch64/CFG.hs b/lib/Audit/AArch64/CFG.hs
@@ -0,0 +1,124 @@
+{-# OPTIONS_HADDOCK prune #-}
+{-# LANGUAGE OverloadedStrings #-}
+
+-- |
+-- Module: Audit.AArch64.CFG
+-- Copyright: (c) 2025 Jared Tobin
+-- License: MIT
+-- Maintainer: jared@ppad.tech
+--
+-- Control flow graph construction for AArch64 assembly.
+
+module Audit.AArch64.CFG (
+ BasicBlock(..)
+ , CFG(..)
+ , buildCFG
+ , blockLabels
+ ) where
+
+import Audit.AArch64.Types
+import Data.Map.Strict (Map)
+import qualified Data.Map.Strict as Map
+import Data.Set (Set)
+import qualified Data.Set as Set
+import Data.Text (Text)
+
+-- | A basic block: a sequence of instructions with a single entry point.
+data BasicBlock = BasicBlock
+ { bbLabel :: !(Maybe Text) -- ^ Optional label at block start
+ , bbLines :: ![Line] -- ^ Instructions in the block
+ , bbSuccs :: ![Text] -- ^ Successor block labels
+ } deriving (Eq, Show)
+
+-- | Control flow graph.
+data CFG = CFG
+ { cfgBlocks :: ![BasicBlock] -- ^ All basic blocks
+ , cfgLabelMap :: !(Map Text Int) -- ^ Label -> block index
+ , cfgEntry :: !Int -- ^ Entry block index
+ } deriving (Eq, Show)
+
+-- | Get all labels defined in the CFG.
+blockLabels :: CFG -> Set Text
+blockLabels cfg = Set.fromList
+ [ lbl | bb <- cfgBlocks cfg, Just lbl <- [bbLabel bb] ]
+
+-- | Build a CFG from parsed assembly lines.
+buildCFG :: [Line] -> CFG
+buildCFG lns = CFG
+ { cfgBlocks = blocks
+ , cfgLabelMap = labelMap
+ , cfgEntry = 0
+ }
+ where
+ -- Split into basic blocks at labels and control flow instructions
+ blocks = buildBlocks lns
+ labelMap = Map.fromList
+ [ (lbl, idx)
+ | (idx, bb) <- zip [0..] blocks
+ , Just lbl <- [bbLabel bb]
+ ]
+
+-- | Build basic blocks from lines.
+buildBlocks :: [Line] -> [BasicBlock]
+buildBlocks [] = []
+buildBlocks lns = go [] Nothing lns
+ where
+ go acc mLabel [] =
+ if null acc
+ then []
+ else [finishBlock mLabel (reverse acc)]
+
+ go acc mLabel (l:ls) =
+ case (lineLabel l, lineInstr l) of
+ -- New label starts a new block
+ (Just lbl, _) ->
+ let current = if null acc then [] else [finishBlock mLabel (reverse acc)]
+ in current ++ go [l] (Just lbl) ls
+
+ -- Control flow instruction ends a block
+ (Nothing, Just instr) | isTerminator instr ->
+ let block = finishBlock mLabel (reverse (l:acc))
+ in block : go [] Nothing ls
+
+ -- Regular instruction
+ _ -> go (l:acc) mLabel ls
+
+ finishBlock mLabel lns' = BasicBlock
+ { bbLabel = mLabel
+ , bbLines = lns'
+ , bbSuccs = successorLabels (lastInstr lns')
+ }
+
+ lastInstr [] = Nothing
+ lastInstr xs = lineInstr (last xs)
+
+-- | Check if an instruction is a control flow terminator.
+isTerminator :: Instr -> Bool
+isTerminator instr = case instr of
+ B _ -> True
+ BCond _ _ -> True
+ Bl _ -> True
+ Blr _ -> True
+ Br _ -> True
+ Ret _ -> True
+ Cbz _ _ -> True
+ Cbnz _ _ -> True
+ Tbz _ _ _ -> True
+ Tbnz _ _ _ -> True
+ _ -> False
+
+-- | Extract successor labels from a terminating instruction.
+successorLabels :: Maybe Instr -> [Text]
+successorLabels Nothing = []
+successorLabels (Just instr) = case instr of
+ B lbl -> [lbl]
+ BCond _ lbl -> [lbl] -- Conditional also falls through
+ Bl lbl -> [lbl] -- Call returns, so also falls through
+ Blr _ -> [] -- Indirect call
+ Br _ -> [] -- Indirect branch
+ Ret _ -> [] -- Return
+ Cbz _ lbl -> [lbl] -- Plus fallthrough
+ Cbnz _ lbl -> [lbl] -- Plus fallthrough
+ Tbz _ _ lbl -> [lbl] -- Plus fallthrough
+ Tbnz _ _ lbl -> [lbl] -- Plus fallthrough
+ _ -> []
diff --git a/lib/Audit/AArch64/Check.hs b/lib/Audit/AArch64/Check.hs
@@ -0,0 +1,132 @@
+{-# OPTIONS_HADDOCK prune #-}
+{-# LANGUAGE OverloadedStrings #-}
+
+-- |
+-- Module: Audit.AArch64.Check
+-- Copyright: (c) 2025 Jared Tobin
+-- License: MIT
+-- Maintainer: jared@ppad.tech
+--
+-- Memory access validation for constant-time properties.
+--
+-- Checks that all memory accesses use public base registers and
+-- constant (or properly masked) offsets.
+
+module Audit.AArch64.Check (
+ checkLine
+ , checkBlock
+ , checkCFG
+ , AuditResult(..)
+ ) where
+
+import Audit.AArch64.CFG
+import Audit.AArch64.Taint
+import Audit.AArch64.Types
+import Data.Text (Text)
+import qualified Data.Text as T
+
+-- | Result of auditing assembly.
+data AuditResult = AuditResult
+ { arViolations :: ![Violation]
+ , arLinesChecked :: !Int
+ , arMemoryAccesses :: !Int
+ } deriving (Eq, Show)
+
+instance Semigroup AuditResult where
+ a <> b = AuditResult
+ { arViolations = arViolations a ++ arViolations b
+ , arLinesChecked = arLinesChecked a + arLinesChecked b
+ , arMemoryAccesses = arMemoryAccesses a + arMemoryAccesses b
+ }
+
+instance Monoid AuditResult where
+ mempty = AuditResult [] 0 0
+
+-- | Check a single line for memory access violations.
+checkLine :: Text -> TaintState -> Line -> AuditResult
+checkLine sym st l = case lineInstr l of
+ Nothing -> AuditResult [] 1 0
+ Just instr -> case getMemoryAccess instr of
+ Nothing -> AuditResult [] 1 0
+ Just addr ->
+ let violations = checkAddrMode sym (lineNum l) instr addr st
+ in AuditResult violations 1 1
+
+-- | Check a basic block, threading taint state.
+checkBlock :: Text -> TaintState -> [Line] -> (AuditResult, TaintState)
+checkBlock sym st0 lns = go mempty st0 lns
+ where
+ go acc st [] = (acc, st)
+ go acc st (l:ls) =
+ let result = checkLine sym st l
+ st' = analyzeLine l st
+ in go (acc <> result) st' ls
+
+-- | Check entire CFG.
+-- Uses a simple forward pass (no fixpoint iteration for now).
+checkCFG :: Text -> CFG -> AuditResult
+checkCFG sym cfg = mconcat
+ [ fst (checkBlock blockSym initTaintState (bbLines bb))
+ | bb <- cfgBlocks cfg
+ , let blockSym = maybe sym id (bbLabel bb)
+ ]
+
+-- | Extract memory address from a load/store instruction.
+getMemoryAccess :: Instr -> Maybe AddrMode
+getMemoryAccess instr = case instr of
+ Ldr _ addr -> Just addr
+ Ldrb _ addr -> Just addr
+ Ldrh _ addr -> Just addr
+ Ldrsb _ addr -> Just addr
+ Ldrsh _ addr -> Just addr
+ Ldrsw _ addr -> Just addr
+ Str _ addr -> Just addr
+ Strb _ addr -> Just addr
+ Strh _ addr -> Just addr
+ Ldp _ _ addr -> Just addr
+ Stp _ _ addr -> Just addr
+ _ -> Nothing
+
+-- | Check an addressing mode for violations.
+checkAddrMode :: Text -> Int -> Instr -> AddrMode -> TaintState -> [Violation]
+checkAddrMode sym ln instr addr st = case addr of
+ BaseImm base _ ->
+ checkBase sym ln instr base st
+
+ BaseReg base idx ->
+ checkBase sym ln instr base st ++
+ checkIndex sym ln instr idx st
+
+ BaseRegShift base idx _ ->
+ checkBase sym ln instr base st ++
+ checkIndex sym ln instr idx st
+
+ BaseRegExtend base idx _ ->
+ checkBase sym ln instr base st ++
+ checkIndex sym ln instr idx st
+
+ PreIndex base _ ->
+ checkBase sym ln instr base st
+
+ PostIndex base _ ->
+ checkBase sym ln instr base st
+
+ Literal _ ->
+ -- PC-relative literal: public
+ []
+
+-- | Check that base register is public.
+checkBase :: Text -> Int -> Instr -> Reg -> TaintState -> [Violation]
+checkBase sym ln instr base st =
+ case getTaint base st of
+ Public -> []
+ Secret -> [Violation sym ln instr (SecretBase base)]
+ Unknown -> [Violation sym ln instr (UnknownBase base)]
+
+-- | Check that index register is public.
+checkIndex :: Text -> Int -> Instr -> Reg -> TaintState -> [Violation]
+checkIndex sym ln instr idx st =
+ case getTaint idx st of
+ Public -> []
+ Secret -> [Violation sym ln instr (SecretIndex idx)]
+ Unknown -> [Violation sym ln instr (UnknownIndex idx)]
diff --git a/lib/Audit/AArch64/Parser.hs b/lib/Audit/AArch64/Parser.hs
@@ -0,0 +1,463 @@
+{-# OPTIONS_HADDOCK prune #-}
+{-# LANGUAGE OverloadedStrings #-}
+
+-- |
+-- Module: Audit.AArch64.Parser
+-- Copyright: (c) 2025 Jared Tobin
+-- License: MIT
+-- Maintainer: jared@ppad.tech
+--
+-- Parser for AArch64 assembly.
+
+module Audit.AArch64.Parser (
+ parseAsm
+ , ParseError
+ ) where
+
+import Audit.AArch64.Types
+import Control.Monad (void)
+import Data.Char (isAlphaNum, isDigit, isSpace)
+import Data.Text (Text)
+import qualified Data.Text as T
+import Data.Void (Void)
+import Text.Megaparsec
+import Text.Megaparsec.Char
+import qualified Text.Megaparsec.Char.Lexer as L
+
+type Parser = Parsec Void Text
+type ParseError = ParseErrorBundle Text Void
+
+-- | Parse assembly source into a list of lines.
+parseAsm :: Text -> Either ParseError [Line]
+parseAsm = parse (many pLine <* eof) "<asm>"
+
+-- Space consumer (skips whitespace and comments).
+sc :: Parser ()
+sc = L.space
+ space1
+ (L.skipLineComment "//")
+ (L.skipBlockComment "/*" "*/")
+
+-- Lexeme wrapper.
+lexeme :: Parser a -> Parser a
+lexeme = L.lexeme sc
+
+-- Symbol (alphanumeric + underscore + dot + dollar).
+symbol :: Text -> Parser Text
+symbol = L.symbol sc
+
+-- Parse a single line.
+pLine :: Parser Line
+pLine = do
+ pos <- getSourcePos
+ let ln = unPos (sourceLine pos)
+ skipMany (satisfy isSpace)
+ choice
+ [ pDirectiveLine ln
+ , pLabelOrInstrLine ln
+ , pEmptyLine ln
+ ]
+
+pEmptyLine :: Int -> Parser Line
+pEmptyLine ln = do
+ void (optional pComment)
+ void (optional eol)
+ pure (Line ln Nothing Nothing)
+
+pComment :: Parser ()
+pComment = do
+ void (choice [string "//", string ";", string "@"])
+ void (takeWhileP Nothing (/= '\n'))
+
+pDirectiveLine :: Int -> Parser Line
+pDirectiveLine ln = do
+ void (char '.')
+ void (takeWhileP Nothing (\c -> c /= '\n'))
+ void (optional eol)
+ pure (Line ln Nothing Nothing)
+
+pLabelOrInstrLine :: Int -> Parser Line
+pLabelOrInstrLine ln = do
+ mLabel <- optional (try pLabel)
+ skipMany (satisfy isSpace)
+ mInstr <- optional pInstr
+ skipMany (satisfy isSpace)
+ void (optional pComment)
+ void (optional eol)
+ pure (Line ln mLabel mInstr)
+
+pLabel :: Parser Text
+pLabel = lexeme $ do
+ name <- pIdentifier
+ void (char ':')
+ pure name
+
+pIdentifier :: Parser Text
+pIdentifier = do
+ c <- satisfy isIdentStart
+ cs <- takeWhileP Nothing isIdentChar
+ pure (T.cons c cs)
+ where
+ isIdentStart c = c == '_' || c == '.' || c == 'L' ||
+ (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
+ isIdentChar c = isAlphaNum c || c == '_' || c == '.' || c == '$'
+
+pInstr :: Parser Instr
+pInstr = do
+ mnem <- lexeme pMnemonic
+ parseByMnemonic mnem
+
+pMnemonic :: Parser Text
+pMnemonic = do
+ cs <- takeWhile1P Nothing isIdentChar
+ pure (T.toLower cs)
+ where
+ isIdentChar c = isAlphaNum c || c == '.'
+
+parseByMnemonic :: Text -> Parser Instr
+parseByMnemonic m = case m of
+ -- Data movement
+ "mov" -> Mov <$> pReg <*> (pComma *> pOperand)
+ "movz" -> Movz <$> pReg <*> (pComma *> pImm) <*> optional (pComma *> pShift)
+ "movk" -> Movk <$> pReg <*> (pComma *> pImm) <*> optional (pComma *> pShift)
+ "movn" -> Movn <$> pReg <*> (pComma *> pImm) <*> optional (pComma *> pShift)
+
+ -- Arithmetic
+ "add" -> Add <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "sub" -> Sub <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "adds" -> Adds <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "subs" -> Subs <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "adc" -> Adc <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ "sbc" -> Sbc <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ "neg" -> Neg <$> pReg <*> (pComma *> pOperand)
+ "mul" -> Mul <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ "madd" -> Madd <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ <*> (pComma *> pReg)
+ "msub" -> Msub <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ <*> (pComma *> pReg)
+ "umulh" -> Umulh <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ "smulh" -> Smulh <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ "udiv" -> Udiv <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ "sdiv" -> Sdiv <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+
+ -- Logical
+ "and" -> And <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "orr" -> Orr <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "eor" -> Eor <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "bic" -> Bic <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "mvn" -> Mvn <$> pReg <*> (pComma *> pOperand)
+ "tst" -> Tst <$> pReg <*> (pComma *> pOperand)
+
+ -- Shifts
+ "lsl" -> Lsl <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "lsr" -> Lsr <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "asr" -> Asr <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+ "ror" -> Ror <$> pReg <*> (pComma *> pReg) <*> (pComma *> pOperand)
+
+ -- Bit manipulation
+ "ubfx" -> Ubfx <$> pReg <*> (pComma *> pReg) <*> (pComma *> pInt)
+ <*> (pComma *> pInt)
+ "sbfx" -> Sbfx <$> pReg <*> (pComma *> pReg) <*> (pComma *> pInt)
+ <*> (pComma *> pInt)
+ "bfi" -> Bfi <$> pReg <*> (pComma *> pReg) <*> (pComma *> pInt)
+ <*> (pComma *> pInt)
+ "extr" -> Extr <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ <*> (pComma *> pInt)
+
+ -- Address generation
+ "adr" -> Adr <$> pReg <*> (pComma *> pSymbolRef)
+ "adrp" -> Adrp <$> pReg <*> (pComma *> pSymbolRef)
+
+ -- Load/Store
+ "ldr" -> Ldr <$> pReg <*> (pComma *> pAddrMode)
+ "ldrb" -> Ldrb <$> pReg <*> (pComma *> pAddrMode)
+ "ldrh" -> Ldrh <$> pReg <*> (pComma *> pAddrMode)
+ "ldrsb" -> Ldrsb <$> pReg <*> (pComma *> pAddrMode)
+ "ldrsh" -> Ldrsh <$> pReg <*> (pComma *> pAddrMode)
+ "ldrsw" -> Ldrsw <$> pReg <*> (pComma *> pAddrMode)
+ "str" -> Str <$> pReg <*> (pComma *> pAddrMode)
+ "strb" -> Strb <$> pReg <*> (pComma *> pAddrMode)
+ "strh" -> Strh <$> pReg <*> (pComma *> pAddrMode)
+ "ldp" -> Ldp <$> pReg <*> (pComma *> pReg) <*> (pComma *> pAddrMode)
+ "stp" -> Stp <$> pReg <*> (pComma *> pReg) <*> (pComma *> pAddrMode)
+
+ -- Compare and select
+ "cmp" -> Cmp <$> pReg <*> (pComma *> pOperand)
+ "cmn" -> Cmn <$> pReg <*> (pComma *> pOperand)
+ "csel" -> Csel <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ <*> (pComma *> pCondCode)
+ "csinc" -> Csinc <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ <*> (pComma *> pCondCode)
+ "csinv" -> Csinv <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ <*> (pComma *> pCondCode)
+ "csneg" -> Csneg <$> pReg <*> (pComma *> pReg) <*> (pComma *> pReg)
+ <*> (pComma *> pCondCode)
+ "cset" -> Cset <$> pReg <*> (pComma *> pCondCode)
+ "cinc" -> Cinc <$> pReg <*> (pComma *> pReg) <*> (pComma *> pCondCode)
+
+ -- Branches
+ "b" -> B <$> pSymbolRef
+ "b.eq" -> BCond "eq" <$> pSymbolRef
+ "b.ne" -> BCond "ne" <$> pSymbolRef
+ "b.lt" -> BCond "lt" <$> pSymbolRef
+ "b.le" -> BCond "le" <$> pSymbolRef
+ "b.gt" -> BCond "gt" <$> pSymbolRef
+ "b.ge" -> BCond "ge" <$> pSymbolRef
+ "b.hi" -> BCond "hi" <$> pSymbolRef
+ "b.hs" -> BCond "hs" <$> pSymbolRef
+ "b.lo" -> BCond "lo" <$> pSymbolRef
+ "b.ls" -> BCond "ls" <$> pSymbolRef
+ "b.mi" -> BCond "mi" <$> pSymbolRef
+ "b.pl" -> BCond "pl" <$> pSymbolRef
+ "b.vs" -> BCond "vs" <$> pSymbolRef
+ "b.vc" -> BCond "vc" <$> pSymbolRef
+ "b.cc" -> BCond "cc" <$> pSymbolRef
+ "b.cs" -> BCond "cs" <$> pSymbolRef
+ "bl" -> Bl <$> pSymbolRef
+ "blr" -> Blr <$> pReg
+ "br" -> Br <$> pReg
+ "ret" -> Ret <$> optional pReg
+ "cbz" -> Cbz <$> pReg <*> (pComma *> pSymbolRef)
+ "cbnz" -> Cbnz <$> pReg <*> (pComma *> pSymbolRef)
+ "tbz" -> Tbz <$> pReg <*> (pComma *> pInt) <*> (pComma *> pSymbolRef)
+ "tbnz" -> Tbnz <$> pReg <*> (pComma *> pInt) <*> (pComma *> pSymbolRef)
+
+ -- System
+ "nop" -> pure Nop
+ "svc" -> Svc <$> pImm
+
+ -- Fallback: store as Other
+ _ -> Other m <$> pRemainingOperands
+
+pRemainingOperands :: Parser [Text]
+pRemainingOperands = do
+ rest <- takeWhileP Nothing (\c -> c /= '\n' && c /= ';' && c /= '/')
+ let trimmed = T.strip rest
+ if T.null trimmed
+ then pure []
+ else pure (map T.strip (T.splitOn "," trimmed))
+
+pComma :: Parser ()
+pComma = void (lexeme (char ','))
+
+pReg :: Parser Reg
+pReg = lexeme $ choice
+ [ pXReg, pWReg, pDReg, pSReg, pQReg, pSpecialReg ]
+
+pXReg :: Parser Reg
+pXReg = do
+ void (char' 'x')
+ n <- L.decimal
+ case n :: Int of
+ 0 -> pure X0; 1 -> pure X1; 2 -> pure X2; 3 -> pure X3
+ 4 -> pure X4; 5 -> pure X5; 6 -> pure X6; 7 -> pure X7
+ 8 -> pure X8; 9 -> pure X9; 10 -> pure X10; 11 -> pure X11
+ 12 -> pure X12; 13 -> pure X13; 14 -> pure X14; 15 -> pure X15
+ 16 -> pure X16; 17 -> pure X17; 18 -> pure X18; 19 -> pure X19
+ 20 -> pure X20; 21 -> pure X21; 22 -> pure X22; 23 -> pure X23
+ 24 -> pure X24; 25 -> pure X25; 26 -> pure X26; 27 -> pure X27
+ 28 -> pure X28; 29 -> pure X29; 30 -> pure X30
+ _ -> fail "invalid x register number"
+
+pWReg :: Parser Reg
+pWReg = do
+ void (char' 'w')
+ n <- L.decimal
+ case n :: Int of
+ 0 -> pure W0; 1 -> pure W1; 2 -> pure W2; 3 -> pure W3
+ 4 -> pure W4; 5 -> pure W5; 6 -> pure W6; 7 -> pure W7
+ 8 -> pure W8; 9 -> pure W9; 10 -> pure W10; 11 -> pure W11
+ 12 -> pure W12; 13 -> pure W13; 14 -> pure W14; 15 -> pure W15
+ 16 -> pure W16; 17 -> pure W17; 18 -> pure W18; 19 -> pure W19
+ 20 -> pure W20; 21 -> pure W21; 22 -> pure W22; 23 -> pure W23
+ 24 -> pure W24; 25 -> pure W25; 26 -> pure W26; 27 -> pure W27
+ 28 -> pure W28; 29 -> pure W29; 30 -> pure W30
+ _ -> fail "invalid w register number"
+
+pDReg :: Parser Reg
+pDReg = do
+ void (char' 'd')
+ n <- L.decimal
+ case n :: Int of
+ 0 -> pure D0; 1 -> pure D1; 2 -> pure D2; 3 -> pure D3
+ 4 -> pure D4; 5 -> pure D5; 6 -> pure D6; 7 -> pure D7
+ 8 -> pure D8; 9 -> pure D9; 10 -> pure D10; 11 -> pure D11
+ 12 -> pure D12; 13 -> pure D13; 14 -> pure D14; 15 -> pure D15
+ 16 -> pure D16; 17 -> pure D17; 18 -> pure D18; 19 -> pure D19
+ 20 -> pure D20; 21 -> pure D21; 22 -> pure D22; 23 -> pure D23
+ 24 -> pure D24; 25 -> pure D25; 26 -> pure D26; 27 -> pure D27
+ 28 -> pure D28; 29 -> pure D29; 30 -> pure D30; 31 -> pure D31
+ _ -> fail "invalid d register number"
+
+pSReg :: Parser Reg
+pSReg = do
+ void (char' 's')
+ n <- L.decimal
+ case n :: Int of
+ 0 -> pure S0; 1 -> pure S1; 2 -> pure S2; 3 -> pure S3
+ 4 -> pure S4; 5 -> pure S5; 6 -> pure S6; 7 -> pure S7
+ 8 -> pure S8; 9 -> pure S9; 10 -> pure S10; 11 -> pure S11
+ 12 -> pure S12; 13 -> pure S13; 14 -> pure S14; 15 -> pure S15
+ 16 -> pure S16; 17 -> pure S17; 18 -> pure S18; 19 -> pure S19
+ 20 -> pure S20; 21 -> pure S21; 22 -> pure S22; 23 -> pure S23
+ 24 -> pure S24; 25 -> pure S25; 26 -> pure S26; 27 -> pure S27
+ 28 -> pure S28; 29 -> pure S29; 30 -> pure S30; 31 -> pure S31
+ _ -> fail "invalid s register number"
+
+pQReg :: Parser Reg
+pQReg = do
+ void (char' 'q')
+ n <- L.decimal
+ case n :: Int of
+ 0 -> pure Q0; 1 -> pure Q1; 2 -> pure Q2; 3 -> pure Q3
+ 4 -> pure Q4; 5 -> pure Q5; 6 -> pure Q6; 7 -> pure Q7
+ 8 -> pure Q8; 9 -> pure Q9; 10 -> pure Q10; 11 -> pure Q11
+ 12 -> pure Q12; 13 -> pure Q13; 14 -> pure Q14; 15 -> pure Q15
+ 16 -> pure Q16; 17 -> pure Q17; 18 -> pure Q18; 19 -> pure Q19
+ 20 -> pure Q20; 21 -> pure Q21; 22 -> pure Q22; 23 -> pure Q23
+ 24 -> pure Q24; 25 -> pure Q25; 26 -> pure Q26; 27 -> pure Q27
+ 28 -> pure Q28; 29 -> pure Q29; 30 -> pure Q30; 31 -> pure Q31
+ _ -> fail "invalid q register number"
+
+pSpecialReg :: Parser Reg
+pSpecialReg = choice
+ [ SP <$ string' "sp"
+ , XZR <$ string' "xzr"
+ , WZR <$ string' "wzr"
+ , X29 <$ string' "fp" -- fp is alias for x29
+ , X30 <$ string' "lr" -- lr is alias for x30
+ ]
+
+pOperand :: Parser Operand
+pOperand = choice
+ [ try (OpAddr <$> pAddrMode)
+ , try pShiftedRegOp
+ , OpImm <$> pImm
+ , OpReg <$> pReg
+ , OpLabel <$> pSymbolRef
+ ]
+
+pShiftedRegOp :: Parser Operand
+pShiftedRegOp = do
+ r <- pReg
+ pComma
+ sh <- pShift
+ pure (OpShiftedReg r sh)
+
+pImm :: Parser Integer
+pImm = lexeme $ do
+ void (optional (char '#'))
+ signed L.decimal
+
+signed :: Parser Integer -> Parser Integer
+signed p = do
+ neg <- optional (char '-')
+ n <- p
+ case neg of
+ Just _ -> pure (negate n)
+ Nothing -> pure n
+
+pInt :: Parser Int
+pInt = lexeme $ do
+ void (optional (char '#'))
+ fromInteger <$> signed L.decimal
+
+pShift :: Parser Shift
+pShift = lexeme $ choice
+ [ LSL <$> (string' "lsl" *> sc *> pShiftAmt)
+ , LSR <$> (string' "lsr" *> sc *> pShiftAmt)
+ , ASR <$> (string' "asr" *> sc *> pShiftAmt)
+ ]
+
+pShiftAmt :: Parser Int
+pShiftAmt = do
+ void (optional (char '#'))
+ fromInteger <$> L.decimal
+
+pSymbolRef :: Parser Text
+pSymbolRef = lexeme $ do
+ -- Handle =literal, @PAGE, @PAGEOFF suffixes, etc.
+ prefix <- optional (char '=')
+ name <- pIdentifier
+ suffix <- optional pSymbolSuffix
+ let base = maybe name (T.cons '=') prefix <> name
+ pure (maybe base (base <>) suffix)
+
+pSymbolSuffix :: Parser Text
+pSymbolSuffix = do
+ void (char '@')
+ suf <- takeWhile1P Nothing isAlphaNum
+ pure (T.cons '@' suf)
+
+pCondCode :: Parser Text
+pCondCode = lexeme $ takeWhile1P Nothing isAlphaNum
+
+pAddrMode :: Parser AddrMode
+pAddrMode = choice
+ [ try pLiteralAddr
+ , pBracketAddr
+ ]
+
+pLiteralAddr :: Parser AddrMode
+pLiteralAddr = Literal <$> (char '=' *> pIdentifier)
+
+pBracketAddr :: Parser AddrMode
+pBracketAddr = lexeme $ do
+ void (char '[')
+ sc
+ base <- pReg
+ sc
+ mode <- optional (pComma *> pAddrModeRest base)
+ sc
+ void (char ']')
+ -- Check for pre/post index
+ postIdx <- optional pPostIndex
+ case (mode, postIdx) of
+ (Nothing, Nothing) -> pure (BaseImm base 0)
+ (Nothing, Just imm) -> pure (PostIndex base imm)
+ (Just (Left imm, True), Nothing) -> pure (PreIndex base imm)
+ (Just (Left imm, False), Nothing) -> pure (BaseImm base imm)
+ (Just (Right (r, Nothing), _), Nothing) -> pure (BaseReg base r)
+ (Just (Right (r, Just (Left sh)), _), Nothing) -> pure (BaseRegShift base r sh)
+ (Just (Right (r, Just (Right ext)), _), Nothing) -> pure (BaseRegExtend base r ext)
+ _ -> fail "invalid addressing mode"
+
+pAddrModeRest :: Reg -> Parser (Either Integer (Reg, Maybe (Either Shift Extend)), Bool)
+pAddrModeRest _ = do
+ choice
+ [ try pImmOffset
+ , pRegOffset
+ ]
+
+pImmOffset :: Parser (Either Integer (Reg, Maybe (Either Shift Extend)), Bool)
+pImmOffset = do
+ imm <- pImm
+ preIdx <- optional (char '!')
+ pure (Left imm, maybe False (const True) preIdx)
+
+pRegOffset :: Parser (Either Integer (Reg, Maybe (Either Shift Extend)), Bool)
+pRegOffset = do
+ r <- pReg
+ modifier <- optional (pComma *> pShiftOrExtend)
+ pure (Right (r, modifier), False)
+
+pShiftOrExtend :: Parser (Either Shift Extend)
+pShiftOrExtend = choice
+ [ Left <$> pShift
+ , Right <$> pExtend
+ ]
+
+pExtend :: Parser Extend
+pExtend = lexeme $ choice
+ [ UXTW <$ string' "uxtw"
+ , SXTW <$ string' "sxtw"
+ , UXTX <$ string' "uxtx"
+ , SXTX <$ string' "sxtx"
+ ]
+
+pPostIndex :: Parser Integer
+pPostIndex = do
+ sc
+ void (char ',')
+ sc
+ pImm
diff --git a/lib/Audit/AArch64/Taint.hs b/lib/Audit/AArch64/Taint.hs
@@ -0,0 +1,215 @@
+{-# OPTIONS_HADDOCK prune #-}
+{-# LANGUAGE OverloadedStrings #-}
+
+-- |
+-- Module: Audit.AArch64.Taint
+-- Copyright: (c) 2025 Jared Tobin
+-- License: MIT
+-- Maintainer: jared@ppad.tech
+--
+-- Taint analysis for AArch64 registers.
+--
+-- Implements forward dataflow analysis to track register "publicness".
+-- Public registers are those derived from known-safe sources (stack
+-- pointers, heap pointers, constants). Registers with unknown or
+-- secret-derived values are flagged when used in memory addressing.
+
+module Audit.AArch64.Taint (
+ TaintState
+ , initTaintState
+ , analyzeLine
+ , analyzeBlock
+ , getTaint
+ , publicRoots
+ ) where
+
+import Audit.AArch64.Types
+import Data.Map.Strict (Map)
+import qualified Data.Map.Strict as Map
+
+-- | Taint state: maps registers to their publicness.
+type TaintState = Map Reg Taint
+
+-- | GHC 9.10.3 AArch64 public root registers.
+--
+-- These are known to be public (derived from stack/heap pointers or
+-- constants):
+--
+-- - SP: hardware stack pointer
+-- - X29/FP: frame pointer
+-- - X19: GHC Base register
+-- - X20: GHC Sp (STG stack pointer)
+-- - X21: GHC Hp (heap pointer)
+-- - X28: GHC SpLim (stack limit)
+-- - X18: TLS/platform register (Darwin)
+-- - XZR/WZR: zero registers
+publicRoots :: [Reg]
+publicRoots =
+ [ SP, X29 -- Hardware stack/frame pointers
+ , X19 -- GHC Base
+ , X20 -- GHC Sp (STG stack pointer)
+ , X21 -- GHC Hp (heap pointer)
+ , X28 -- GHC SpLim
+ , X18 -- TLS (Darwin platform register)
+ , XZR, WZR -- Zero registers
+ ]
+
+-- | Initial taint state with public roots marked.
+initTaintState :: TaintState
+initTaintState = Map.fromList [(r, Public) | r <- publicRoots]
+
+-- | Get the taint of a register.
+getTaint :: Reg -> TaintState -> Taint
+getTaint r st = Map.findWithDefault Unknown r st
+
+-- | Analyze a single line, updating taint state.
+analyzeLine :: Line -> TaintState -> TaintState
+analyzeLine l st = case lineInstr l of
+ Nothing -> st
+ Just instr -> transfer instr st
+
+-- | Analyze a basic block, threading taint state through.
+analyzeBlock :: [Line] -> TaintState -> TaintState
+analyzeBlock lns st = foldl (flip analyzeLine) st lns
+
+-- | Transfer function for taint analysis.
+--
+-- For each instruction, determine how it affects register taints.
+transfer :: Instr -> TaintState -> TaintState
+transfer instr st = case instr of
+ -- Move: destination gets source taint
+ Mov dst op -> setTaint dst (operandTaint op st) st
+ Movz dst _ _ -> setTaint dst Public st -- Immediate is public
+ Movk dst _ _ -> st -- Keeps existing value, modifies bits
+ Movn dst _ _ -> setTaint dst Public st -- Immediate is public
+
+ -- Arithmetic: result is join of operand taints
+ Add dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Sub dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Adds dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Subs dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Adc dst r1 r2 -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+ Sbc dst r1 r2 -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+ Neg dst op -> setTaint dst (operandTaint op st) st
+ Mul dst r1 r2 -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+ Madd dst r1 r2 r3 -> setTaint dst (join3 (getTaint r1 st) (getTaint r2 st)
+ (getTaint r3 st)) st
+ Msub dst r1 r2 r3 -> setTaint dst (join3 (getTaint r1 st) (getTaint r2 st)
+ (getTaint r3 st)) st
+ Umulh dst r1 r2 -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+ Smulh dst r1 r2 -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+ Udiv dst r1 r2 -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+ Sdiv dst r1 r2 -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+
+ -- Logical: result is join of operand taints
+ And dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Orr dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Eor dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Bic dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Mvn dst op -> setTaint dst (operandTaint op st) st
+ Tst _ _ -> st -- No destination
+
+ -- Shifts: result is join of operand taints
+ Lsl dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Lsr dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Asr dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+ Ror dst r1 op -> setTaint dst (join2 (getTaint r1 st) (operandTaint op st)) st
+
+ -- Bit manipulation
+ Ubfx dst r1 _ _ -> setTaint dst (getTaint r1 st) st
+ Sbfx dst r1 _ _ -> setTaint dst (getTaint r1 st) st
+ Bfi dst r1 _ _ -> setTaint dst (join2 (getTaint dst st) (getTaint r1 st)) st
+ Extr dst r1 r2 _ -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+
+ -- Address generation: result is public (constant pool / PC-relative)
+ Adr dst _ -> setTaint dst Public st
+ Adrp dst _ -> setTaint dst Public st
+
+ -- Loads: destination becomes Unknown (memory content is unknown)
+ -- This is conservative; could refine if we track memory regions
+ Ldr dst _ -> setTaint dst Unknown st
+ Ldrb dst _ -> setTaint dst Unknown st
+ Ldrh dst _ -> setTaint dst Unknown st
+ Ldrsb dst _ -> setTaint dst Unknown st
+ Ldrsh dst _ -> setTaint dst Unknown st
+ Ldrsw dst _ -> setTaint dst Unknown st
+ Ldp dst1 dst2 _ -> setTaint dst1 Unknown (setTaint dst2 Unknown st)
+
+ -- Stores: no destination register change
+ Str _ _ -> st
+ Strb _ _ -> st
+ Strh _ _ -> st
+ Stp _ _ _ -> st
+
+ -- Conditionals: conservative join
+ Cmp _ _ -> st
+ Cmn _ _ -> st
+ Csel dst r1 r2 _ -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+ Csinc dst r1 r2 _ -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+ Csinv dst r1 r2 _ -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+ Csneg dst r1 r2 _ -> setTaint dst (join2 (getTaint r1 st) (getTaint r2 st)) st
+ Cset dst _ -> setTaint dst Public st -- Condition flag derived
+ Cinc dst r1 _ -> setTaint dst (getTaint r1 st) st
+
+ -- Branches: no register change
+ B _ -> st
+ BCond _ _ -> st
+ Bl _ -> invalidateCallerSaved st -- Call may clobber
+ Blr _ -> invalidateCallerSaved st
+ Br _ -> st
+ Ret _ -> st
+ Cbz _ _ -> st
+ Cbnz _ _ -> st
+ Tbz _ _ _ -> st
+ Tbnz _ _ _ -> st
+
+ -- System
+ Nop -> st
+ Svc _ -> invalidateCallerSaved st
+
+ -- Unknown instruction: conservative
+ Other _ _ -> st
+
+-- | Set taint for a register.
+setTaint :: Reg -> Taint -> TaintState -> TaintState
+setTaint = Map.insert
+
+-- | Get taint of an operand.
+operandTaint :: Operand -> TaintState -> Taint
+operandTaint op st = case op of
+ OpReg r -> getTaint r st
+ OpImm _ -> Public
+ OpShiftedReg r _ -> getTaint r st
+ OpExtendedReg r _ -> getTaint r st
+ OpLabel _ -> Public
+ OpAddr addr -> addrBaseTaint addr st
+
+-- | Get taint of address base register.
+addrBaseTaint :: AddrMode -> TaintState -> Taint
+addrBaseTaint addr st = case addr of
+ BaseImm r _ -> getTaint r st
+ BaseReg r _ -> getTaint r st
+ BaseRegShift r _ _ -> getTaint r st
+ BaseRegExtend r _ _ -> getTaint r st
+ PreIndex r _ -> getTaint r st
+ PostIndex r _ -> getTaint r st
+ Literal _ -> Public
+
+-- | Join two taints.
+join2 :: Taint -> Taint -> Taint
+join2 = joinTaint
+
+-- | Join three taints.
+join3 :: Taint -> Taint -> Taint -> Taint
+join3 a b c = joinTaint a (joinTaint b c)
+
+-- | Invalidate caller-saved registers after a call.
+-- Per AArch64 ABI, x0-x17 are caller-saved.
+invalidateCallerSaved :: TaintState -> TaintState
+invalidateCallerSaved st = foldr (\r -> Map.insert r Unknown) st callerSaved
+ where
+ callerSaved =
+ [ X0, X1, X2, X3, X4, X5, X6, X7
+ , X8, X9, X10, X11, X12, X13, X14, X15
+ , X16, X17
+ ]
diff --git a/lib/Audit/AArch64/Types.hs b/lib/Audit/AArch64/Types.hs
@@ -0,0 +1,271 @@
+{-# OPTIONS_HADDOCK prune #-}
+{-# LANGUAGE DeriveGeneric #-}
+{-# LANGUAGE OverloadedStrings #-}
+
+-- |
+-- Module: Audit.AArch64.Types
+-- Copyright: (c) 2025 Jared Tobin
+-- License: MIT
+-- Maintainer: jared@ppad.tech
+--
+-- Core types for AArch64 assembly analysis.
+
+module Audit.AArch64.Types (
+ -- * Registers
+ Reg(..)
+ , regName
+
+ -- * Operands and addressing
+ , Shift(..)
+ , Extend(..)
+ , Operand(..)
+ , AddrMode(..)
+
+ -- * Instructions
+ , Instr(..)
+ , Line(..)
+
+ -- * Taint lattice
+ , Taint(..)
+ , joinTaint
+
+ -- * Analysis results
+ , Violation(..)
+ , ViolationReason(..)
+ ) where
+
+import Data.Aeson (ToJSON(..), (.=), object)
+import Data.Text (Text)
+import qualified Data.Text as T
+import GHC.Generics (Generic)
+
+-- | AArch64 registers (64-bit, 32-bit views, and SIMD).
+data Reg
+ -- 64-bit general purpose
+ = X0 | X1 | X2 | X3 | X4 | X5 | X6 | X7
+ | X8 | X9 | X10 | X11 | X12 | X13 | X14 | X15
+ | X16 | X17 | X18 | X19 | X20 | X21 | X22 | X23
+ | X24 | X25 | X26 | X27 | X28 | X29 | X30
+ -- 32-bit views
+ | W0 | W1 | W2 | W3 | W4 | W5 | W6 | W7
+ | W8 | W9 | W10 | W11 | W12 | W13 | W14 | W15
+ | W16 | W17 | W18 | W19 | W20 | W21 | W22 | W23
+ | W24 | W25 | W26 | W27 | W28 | W29 | W30
+ -- Special
+ | SP -- ^ Stack pointer
+ | XZR -- ^ Zero register (64-bit)
+ | WZR -- ^ Zero register (32-bit)
+ -- SIMD/FP (64-bit double)
+ | D0 | D1 | D2 | D3 | D4 | D5 | D6 | D7
+ | D8 | D9 | D10 | D11 | D12 | D13 | D14 | D15
+ | D16 | D17 | D18 | D19 | D20 | D21 | D22 | D23
+ | D24 | D25 | D26 | D27 | D28 | D29 | D30 | D31
+ -- SIMD/FP (32-bit single)
+ | S0 | S1 | S2 | S3 | S4 | S5 | S6 | S7
+ | S8 | S9 | S10 | S11 | S12 | S13 | S14 | S15
+ | S16 | S17 | S18 | S19 | S20 | S21 | S22 | S23
+ | S24 | S25 | S26 | S27 | S28 | S29 | S30 | S31
+ -- SIMD/FP (128-bit quad)
+ | Q0 | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7
+ | Q8 | Q9 | Q10 | Q11 | Q12 | Q13 | Q14 | Q15
+ | Q16 | Q17 | Q18 | Q19 | Q20 | Q21 | Q22 | Q23
+ | Q24 | Q25 | Q26 | Q27 | Q28 | Q29 | Q30 | Q31
+ deriving (Eq, Ord, Show, Generic)
+
+instance ToJSON Reg where
+ toJSON = toJSON . regName
+
+-- | Canonical lowercase register name.
+regName :: Reg -> Text
+regName r = T.toLower (T.pack (show r))
+
+-- | Shift operations for indexed addressing.
+data Shift
+ = LSL !Int -- ^ Logical shift left
+ | LSR !Int -- ^ Logical shift right
+ | ASR !Int -- ^ Arithmetic shift right
+ deriving (Eq, Show, Generic)
+
+instance ToJSON Shift
+
+-- | Extend operations.
+data Extend
+ = UXTW -- ^ Unsigned extend word
+ | SXTW -- ^ Signed extend word
+ | UXTX -- ^ Unsigned extend doubleword
+ | SXTX -- ^ Signed extend doubleword
+ deriving (Eq, Show, Generic)
+
+instance ToJSON Extend
+
+-- | Instruction operand.
+data Operand
+ = OpReg !Reg -- ^ Register operand
+ | OpImm !Integer -- ^ Immediate value
+ | OpShiftedReg !Reg !Shift -- ^ Shifted register
+ | OpExtendedReg !Reg !Extend -- ^ Extended register
+ | OpLabel !Text -- ^ Label reference
+ | OpAddr !AddrMode -- ^ Memory address
+ deriving (Eq, Show, Generic)
+
+instance ToJSON Operand
+
+-- | Memory addressing modes.
+data AddrMode
+ = BaseImm !Reg !Integer -- ^ [xN, #imm] or [xN]
+ | BaseReg !Reg !Reg -- ^ [xN, xM]
+ | BaseRegShift !Reg !Reg !Shift -- ^ [xN, xM, lsl #k]
+ | BaseRegExtend !Reg !Reg !Extend -- ^ [xN, xM, sxtw]
+ | PreIndex !Reg !Integer -- ^ [xN, #imm]!
+ | PostIndex !Reg !Integer -- ^ [xN], #imm
+ | Literal !Text -- ^ =label or PC-relative
+ deriving (Eq, Show, Generic)
+
+instance ToJSON AddrMode
+
+-- | Parsed assembly instruction.
+data Instr
+ -- Data movement
+ = Mov !Reg !Operand
+ | Movz !Reg !Integer !(Maybe Shift)
+ | Movk !Reg !Integer !(Maybe Shift)
+ | Movn !Reg !Integer !(Maybe Shift)
+ -- Arithmetic
+ | Add !Reg !Reg !Operand
+ | Sub !Reg !Reg !Operand
+ | Adds !Reg !Reg !Operand
+ | Subs !Reg !Reg !Operand
+ | Adc !Reg !Reg !Reg
+ | Sbc !Reg !Reg !Reg
+ | Neg !Reg !Operand
+ | Mul !Reg !Reg !Reg
+ | Madd !Reg !Reg !Reg !Reg
+ | Msub !Reg !Reg !Reg !Reg
+ | Umulh !Reg !Reg !Reg
+ | Smulh !Reg !Reg !Reg
+ | Udiv !Reg !Reg !Reg
+ | Sdiv !Reg !Reg !Reg
+ -- Logical
+ | And !Reg !Reg !Operand
+ | Orr !Reg !Reg !Operand
+ | Eor !Reg !Reg !Operand
+ | Bic !Reg !Reg !Operand
+ | Mvn !Reg !Operand
+ | Tst !Reg !Operand
+ -- Shifts
+ | Lsl !Reg !Reg !Operand
+ | Lsr !Reg !Reg !Operand
+ | Asr !Reg !Reg !Operand
+ | Ror !Reg !Reg !Operand
+ -- Bit manipulation
+ | Ubfx !Reg !Reg !Int !Int
+ | Sbfx !Reg !Reg !Int !Int
+ | Bfi !Reg !Reg !Int !Int
+ | Extr !Reg !Reg !Reg !Int
+ -- Address generation
+ | Adr !Reg !Text
+ | Adrp !Reg !Text
+ -- Load/Store
+ | Ldr !Reg !AddrMode
+ | Ldrb !Reg !AddrMode
+ | Ldrh !Reg !AddrMode
+ | Ldrsb !Reg !AddrMode
+ | Ldrsh !Reg !AddrMode
+ | Ldrsw !Reg !AddrMode
+ | Str !Reg !AddrMode
+ | Strb !Reg !AddrMode
+ | Strh !Reg !AddrMode
+ | Ldp !Reg !Reg !AddrMode
+ | Stp !Reg !Reg !AddrMode
+ -- Compare and select
+ | Cmp !Reg !Operand
+ | Cmn !Reg !Operand
+ | Csel !Reg !Reg !Reg !Text -- condition code
+ | Csinc !Reg !Reg !Reg !Text
+ | Csinv !Reg !Reg !Reg !Text
+ | Csneg !Reg !Reg !Reg !Text
+ | Cset !Reg !Text
+ | Cinc !Reg !Reg !Text
+ -- Branches
+ | B !Text
+ | BCond !Text !Text -- condition, target
+ | Bl !Text
+ | Blr !Reg
+ | Br !Reg
+ | Ret !(Maybe Reg)
+ | Cbz !Reg !Text
+ | Cbnz !Reg !Text
+ | Tbz !Reg !Int !Text
+ | Tbnz !Reg !Int !Text
+ -- System
+ | Nop
+ | Svc !Integer
+ -- Other (unparsed but recorded)
+ | Other !Text ![Text]
+ deriving (Eq, Show, Generic)
+
+instance ToJSON Instr
+
+-- | A line of assembly (with source location).
+data Line = Line
+ { lineNum :: !Int
+ , lineLabel :: !(Maybe Text)
+ , lineInstr :: !(Maybe Instr)
+ } deriving (Eq, Show, Generic)
+
+instance ToJSON Line
+
+-- | Taint lattice for register publicness.
+data Taint
+ = Public -- ^ Known to be public (derived from stack/heap pointers)
+ | Secret -- ^ Known or assumed to be secret
+ | Unknown -- ^ Not yet determined
+ deriving (Eq, Ord, Show, Generic)
+
+instance ToJSON Taint
+
+-- | Join operation for taint lattice (least upper bound).
+-- Unknown < Public < Secret
+joinTaint :: Taint -> Taint -> Taint
+joinTaint Secret _ = Secret
+joinTaint _ Secret = Secret
+joinTaint Public _ = Public
+joinTaint _ Public = Public
+joinTaint Unknown Unknown = Unknown
+
+-- | Reason for a violation.
+data ViolationReason
+ = SecretBase !Reg -- ^ Base register is secret
+ | SecretIndex !Reg -- ^ Index register is secret
+ | UnknownBase !Reg -- ^ Base register taint unknown
+ | UnknownIndex !Reg -- ^ Index register taint unknown
+ | NonConstOffset -- ^ Non-constant offset without masking
+ deriving (Eq, Show, Generic)
+
+instance ToJSON ViolationReason where
+ toJSON (SecretBase r) =
+ object ["type" .= ("secret_base" :: Text), "register" .= r]
+ toJSON (SecretIndex r) =
+ object ["type" .= ("secret_index" :: Text), "register" .= r]
+ toJSON (UnknownBase r) =
+ object ["type" .= ("unknown_base" :: Text), "register" .= r]
+ toJSON (UnknownIndex r) =
+ object ["type" .= ("unknown_index" :: Text), "register" .= r]
+ toJSON NonConstOffset =
+ object ["type" .= ("non_const_offset" :: Text)]
+
+-- | A memory access violation.
+data Violation = Violation
+ { vSymbol :: !Text
+ , vLine :: !Int
+ , vInstr :: !Instr
+ , vReason :: !ViolationReason
+ } deriving (Eq, Show, Generic)
+
+instance ToJSON Violation where
+ toJSON v = object
+ [ "symbol" .= vSymbol v
+ , "line" .= vLine v
+ , "instr" .= show (vInstr v)
+ , "reason" .= vReason v
+ ]
diff --git a/plans/ARCH0.md b/plans/ARCH0.md
@@ -0,0 +1,101 @@
+# ARCH0: AArch64 ASM Constant-Time Memory Access Audit
+
+## Goal
+
+Prototype a static analyzer that ingests GHC+LLVM aarch64 assembly
+and proves a narrow property: all memory accesses are based on public
+registers plus constant offsets (or other whitelisted addressing
+modes), and no loads/stores use secret-derived addresses.
+
+This is an automated sanity check, not a full constant-time proof.
+
+## Scope
+
+- Target: aarch64-darwin GHC output (LLVM backend), no inline asm.
+- Input: text assembly dumps (.s or llc output) with symbols.
+- Property: memory access address derivations are public and bounded.
+- Non-goals: micro-architectural timing proof, full CT branching proof.
+
+## Assumptions
+
+- GHC calling convention for aarch64 is stable enough to whitelist
+ public base registers (e.g. sp, STG stack/heap pointers, RTS regs).
+- LLVM does not introduce secret-dependent memory addressing without
+ explicit arithmetic in the asm.
+- External calls are treated as opaque unless summarized.
+
+## Prototype Architecture
+
+1) Parser
+- Tokenize instructions, labels, and directives from .s.
+- Extract instruction mnemonics and operands.
+- Normalize addressing modes: [xN, #imm], [xN, xM], [xN, xM, lsl #k].
+
+2) CFG and basic blocks
+- Identify basic blocks by labels and control flow instructions.
+- Build a simple CFG for forward dataflow; no SSA needed.
+
+3) Taint lattice (publicness)
+- States per register: Public, Secret, Unknown.
+- Initialize public roots: sp, fp, STG stack/heap pointers, TLS regs,
+ and known constant pools; all others Unknown.
+- Propagate for arithmetic and moves (e.g. add, sub, mov, lsl, orr).
+- Conservative joins across CFG edges.
+
+4) Memory access rules
+- For each load/store:
+ - Base register must be Public.
+ - Index register (if any) must be Public and either constant or
+ masked to a fixed range (configurable).
+ - Offset must be constant immediate unless whitelisted pattern.
+- Flag any instruction that violates the above.
+
+5) Whitelisting and summaries
+- Config file for:
+ - public root registers per platform and GHC version.
+ - allowed addressing modes.
+ - known-safe helper symbols (e.g. RTS/GC stubs).
+
+6) Reporting
+- Emit a report with:
+ - first offending instruction per symbol,
+ - reason (base not public, non-const offset, etc.),
+ - suggested remediation.
+
+## Prototype Rule Set (initial)
+
+- Public roots: sp, fp/x29, x28 (STG stack ptr), x27 (STG heap ptr),
+ x19-x26 when used as GHC-prescribed callee-saves, x18 (TLS).
+- Public constants: literal pools, constant addresses from adrp/add.
+- Allowed addr modes:
+ - [xN, #imm]
+ - [xN, xM] only if xM is Public and range-checked via and/lsr mask
+ - [xN, xM, lsl #k] only if xM is Public and range-checked
+
+## Prototype Flow
+
+- Input: .s file
+- Output: JSON + text summary
+- CLI: arch-audit --ghc-aarch64 --input foo.s --config cfg.toml
+
+## Validation Plan
+
+- Create small, controlled asm fixtures:
+ - known-good (stack and GHC heap accesses only)
+ - known-bad (secret-derived base or index)
+- Compare findings to manual inspection of ppad-fixed and ppad-secp256k1
+ dumps; check for false positives and negatives.
+
+## Risks and Limitations
+
+- Register reuse across calls makes publicness hard to track without
+ interprocedural summaries.
+- LLVM may fold complex addressing in ways that require more semantics.
+- This does not prove constant-time; it only enforces a strict memory
+ access policy.
+
+## Next Steps
+
+- Confirm the exact GHC aarch64 register roles to seed the whitelist.
+- Pick a parsing strategy (simple regex vs. small grammar).
+- Implement a minimal dataflow engine and run on one target dump.
diff --git a/plans/IMPL0.md b/plans/IMPL0.md
@@ -0,0 +1,67 @@
+# IMPL0: AArch64 ASM Constant-Time Memory Access Audit Tool
+
+## Goal
+
+Implement a Haskell (GHC 9.10.3) prototype static analyzer for
+AArch64 GHC+LLVM assembly that enforces a strict memory-access policy
+(public base register + constant offset or whitelisted addressing).
+
+## Inputs and References
+
+- Model project structure, flake layout, and cabal style after
+ `../sha256/flake.nix` and `../sha256/ppad-sha256.cabal`.
+- Figure out the register map to use for GHC 9.10.3 by locating the
+ public AArch64 calling-convention details in GHC headers.
+
+## Implementation Steps
+
+1) Scaffold project layout
+- Mirror ppad project structure (lib/, test/, bench/, etc.).
+- Draft `flake.nix` patterned after `../sha256/flake.nix`, tailored to
+ this tool (GHC 9.10.3, LLVM 19, cabal).
+- Draft the `.cabal` file patterned after `../sha256/ppad-sha256.cabal`.
+
+2) Register map research (GHC 9.10.3)
+- Locate the AArch64 calling convention definitions in GHC source,
+ likely in RTS headers or codegen headers (e.g. `rts/include/` or
+ `compiler/`), and confirm register roles for:
+ - STG stack pointer, heap pointer, TLS, base reg, etc.
+- Produce a concrete register whitelist and document it in code and
+ README.
+
+3) Parser and instruction model
+- Implement a minimal assembly tokenizer/parser for AArch64 `.s`.
+- Normalize addressing modes into a small sum type.
+- Represent labels, instructions, and operands with explicit types.
+
+4) CFG and dataflow engine
+- Build basic blocks and a CFG.
+- Implement a simple forward dataflow analysis with a 3-point lattice:
+ Public | Secret | Unknown.
+- Initialize public roots using the GHC register map.
+
+5) Memory access audit rules
+- Check all loads/stores for base/index publicness and constant offset.
+- Add a configurable whitelist for safe symbols and addressing patterns.
+- Emit structured findings (symbol, line, reason).
+
+6) CLI and reporting
+- Provide a small CLI (input asm path, config path, output format).
+- Emit both a human summary and JSON output for automation.
+
+7) Tests and fixtures
+- Create small assembly fixtures with known-good and known-bad accesses.
+- Add unit tests for parser, CFG builder, and audit rules.
+
+## Deliverables
+
+- Haskell library with core analysis logic in `lib/`.
+- CLI executable in `app/`.
+- Initial config file (TOML or YAML) with register map and whitelist.
+- Tests under `test/` and minimal fixtures under `test/fixtures/`.
+
+## Open Questions
+
+- Exact GHC 9.10.3 AArch64 register map and public-root definitions.
+- Whether to depend on a parsing library (megaparsec) or keep it minimal
+ with hand-rolled parsing to avoid extra deps.
diff --git a/ppad-auditor.cabal b/ppad-auditor.cabal
@@ -0,0 +1,67 @@
+cabal-version: 3.0
+name: ppad-auditor
+version: 0.1.0
+synopsis: AArch64 constant-time memory access auditor
+license: MIT
+license-file: LICENSE
+author: Jared Tobin
+maintainer: jared@ppad.tech
+category: Development
+build-type: Simple
+tested-with: GHC == 9.10.3
+description:
+ Static analyzer for AArch64 GHC+LLVM assembly that enforces a strict
+ memory-access policy (public base register + constant offset) to help
+ verify constant-time properties.
+
+source-repository head
+ type: git
+ location: git.ppad.tech/auditor.git
+
+library
+ default-language: Haskell2010
+ hs-source-dirs: lib
+ ghc-options:
+ -Wall
+ exposed-modules:
+ Audit.AArch64
+ Audit.AArch64.Types
+ Audit.AArch64.Parser
+ Audit.AArch64.CFG
+ Audit.AArch64.Taint
+ Audit.AArch64.Check
+ build-depends:
+ base >= 4.9 && < 5
+ , bytestring >= 0.9 && < 0.13
+ , containers >= 0.6 && < 0.8
+ , megaparsec >= 9.0 && < 10
+ , text >= 1.2 && < 2.2
+ , aeson >= 2.0 && < 2.3
+
+executable auditor
+ default-language: Haskell2010
+ hs-source-dirs: app
+ main-is: Main.hs
+ ghc-options:
+ -Wall -O2 -threaded
+ build-depends:
+ base
+ , bytestring
+ , optparse-applicative >= 0.16 && < 0.19
+ , ppad-auditor
+ , text
+
+test-suite auditor-tests
+ type: exitcode-stdio-1.0
+ default-language: Haskell2010
+ hs-source-dirs: test
+ main-is: Main.hs
+ ghc-options:
+ -rtsopts -Wall -O2
+ build-depends:
+ base
+ , bytestring
+ , ppad-auditor
+ , tasty
+ , tasty-hunit
+ , text
diff --git a/test/Main.hs b/test/Main.hs
@@ -0,0 +1,168 @@
+{-# LANGUAGE OverloadedStrings #-}
+
+module Main where
+
+import Audit.AArch64
+import Audit.AArch64.Parser
+import Audit.AArch64.Taint
+import Audit.AArch64.Types
+import Data.Text (Text)
+import qualified Data.Text as T
+import Test.Tasty
+import Test.Tasty.HUnit
+
+main :: IO ()
+main = defaultMain $ testGroup "ppad-auditor" [
+ parserTests
+ , taintTests
+ , auditTests
+ ]
+
+-- Parser tests
+
+parserTests :: TestTree
+parserTests = testGroup "Parser" [
+ testCase "parse label" $ do
+ let src = "foo:\n"
+ case parseAsm src of
+ Left _ -> assertFailure "parse failed"
+ Right lns -> do
+ assertEqual "line count" 1 (length lns)
+ assertEqual "label" (Just "foo") (lineLabel (head lns))
+
+ , testCase "parse ldr base+imm" $ do
+ let src = "ldr x0, [x20, #8]\n"
+ case parseAsm src of
+ Left _ -> assertFailure "parse failed"
+ Right lns -> case lineInstr (head lns) of
+ Just (Ldr X0 (BaseImm X20 8)) -> pure ()
+ other -> assertFailure $ "unexpected: " ++ show other
+
+ , testCase "parse str base+reg" $ do
+ let src = "str x1, [x21, x2]\n"
+ case parseAsm src of
+ Left _ -> assertFailure "parse failed"
+ Right lns -> case lineInstr (head lns) of
+ Just (Str X1 (BaseReg X21 X2)) -> pure ()
+ other -> assertFailure $ "unexpected: " ++ show other
+
+ , testCase "parse add" $ do
+ let src = "add x0, x1, x2\n"
+ case parseAsm src of
+ Left _ -> assertFailure "parse failed"
+ Right lns -> case lineInstr (head lns) of
+ Just (Add X0 X1 (OpReg X2)) -> pure ()
+ other -> assertFailure $ "unexpected: " ++ show other
+
+ , testCase "parse adrp" $ do
+ let src = "adrp x5, _symbol@PAGE\n"
+ case parseAsm src of
+ Left _ -> assertFailure "parse failed"
+ Right lns -> case lineInstr (head lns) of
+ Just (Adrp X5 _) -> pure ()
+ other -> assertFailure $ "unexpected: " ++ show other
+
+ , testCase "skip directives" $ do
+ let src = ".section __TEXT\n.globl _foo\n_foo:\n ret\n"
+ case parseAsm src of
+ Left e -> assertFailure $ "parse failed: " ++ show e
+ Right lns -> do
+ -- Should have 4 lines, first two are directives (no instr)
+ let instrs = filter ((/= Nothing) . lineInstr) lns
+ assertEqual "instruction count" 1 (length instrs)
+ ]
+
+-- Taint tests
+
+taintTests :: TestTree
+taintTests = testGroup "Taint" [
+ testCase "public roots" $ do
+ let st = initTaintState
+ assertEqual "X20 is public" Public (getTaint X20 st)
+ assertEqual "X21 is public" Public (getTaint X21 st)
+ assertEqual "SP is public" Public (getTaint SP st)
+
+ , testCase "unknown register" $ do
+ let st = initTaintState
+ assertEqual "X0 is unknown" Unknown (getTaint X0 st)
+
+ , testCase "mov propagates taint" $ do
+ let st = initTaintState
+ l = Line 1 Nothing (Just (Mov X0 (OpReg X20)))
+ st' = analyzeLine l st
+ assertEqual "X0 becomes public" Public (getTaint X0 st')
+
+ , testCase "add joins taints" $ do
+ let st = initTaintState
+ -- X0 = X20 (public) + X1 (unknown) -> unknown
+ l = Line 1 Nothing (Just (Add X0 X20 (OpReg X1)))
+ st' = analyzeLine l st
+ assertEqual "result is public" Public (getTaint X0 st')
+
+ , testCase "load makes unknown" $ do
+ let st = initTaintState
+ l = Line 1 Nothing (Just (Ldr X0 (BaseImm X20 0)))
+ st' = analyzeLine l st
+ assertEqual "loaded value is unknown" Unknown (getTaint X0 st')
+ ]
+
+-- Audit tests
+
+auditTests :: TestTree
+auditTests = testGroup "Audit" [
+ testCase "good: public base" $ do
+ let src = T.unlines
+ [ "foo:"
+ , " ldr x0, [x20, #8]"
+ , " str x0, [x21]"
+ , " ret"
+ ]
+ case audit "test" src of
+ Left e -> assertFailure $ "parse failed: " ++ show e
+ Right ar -> assertEqual "no violations" 0 (length (arViolations ar))
+
+ , testCase "bad: unknown base" $ do
+ let src = T.unlines
+ [ "foo:"
+ , " ldr x0, [x0]"
+ , " ret"
+ ]
+ case audit "test" src of
+ Left e -> assertFailure $ "parse failed: " ++ show e
+ Right ar -> do
+ assertEqual "one violation" 1 (length (arViolations ar))
+ case arViolations ar of
+ [v] -> case vReason v of
+ UnknownBase X0 -> pure ()
+ other -> assertFailure $ "wrong reason: " ++ show other
+ _ -> assertFailure "expected one violation"
+
+ , testCase "bad: secret-derived index" $ do
+ let src = T.unlines
+ [ "foo:"
+ , " ldr x0, [x20]" -- x0 = unknown (loaded value)
+ , " ldr x1, [x21, x0]" -- x0 as index -> violation
+ , " ret"
+ ]
+ case audit "test" src of
+ Left e -> assertFailure $ "parse failed: " ++ show e
+ Right ar -> do
+ assertEqual "one violation" 1 (length (arViolations ar))
+ case arViolations ar of
+ [v] -> case vReason v of
+ UnknownIndex X0 -> pure ()
+ other -> assertFailure $ "wrong reason: " ++ show other
+ _ -> assertFailure "expected one violation"
+
+ , testCase "good: adrp+add pattern" $ do
+ let src = T.unlines
+ [ "foo:"
+ , " adrp x8, _const@PAGE"
+ , " add x8, x8, _const@PAGEOFF"
+ , " ldr x0, [x8]"
+ , " ret"
+ ]
+ case audit "test" src of
+ Left e -> assertFailure $ "parse failed: " ++ show e
+ Right ar -> assertEqual "no violations" 0 (length (arViolations ar))
+ ]
diff --git a/test/fixtures/bad.s b/test/fixtures/bad.s
@@ -0,0 +1,31 @@
+; bad.s - Known-bad assembly (secret-derived memory accesses)
+; Should produce violations
+
+.section __TEXT,__text
+.globl _unsafe_function
+.p2align 2
+
+_unsafe_function:
+ ; Stack frame setup
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+
+ ; Load a secret value
+ ldr x0, [x20] ; x0 = secret from memory
+
+ ; BAD: Use secret as base address (cache-timing leak)
+ ldr x1, [x0] ; violation: unknown base x0
+
+ ; BAD: Use secret as index (cache-timing leak)
+ ldr x2, [x21, x0] ; violation: unknown index x0
+
+ ; BAD: Secret-derived address via arithmetic
+ add x3, x0, #256 ; x3 = secret + offset
+ ldr x4, [x3] ; violation: unknown base x3
+
+ ; BAD: Secret used in shifted index
+ ldr x5, [x21, x0, lsl #3] ; violation: unknown index x0
+
+ ; Frame teardown
+ ldp x29, x30, [sp], #16
+ ret
diff --git a/test/fixtures/good.s b/test/fixtures/good.s
@@ -0,0 +1,37 @@
+; good.s - Known-good assembly (stack/heap accesses only)
+; Should produce no violations
+
+.section __TEXT,__text
+.globl _safe_function
+.p2align 2
+
+_safe_function:
+ ; Stack frame setup
+ stp x29, x30, [sp, #-16]!
+ mov x29, sp
+
+ ; Load from GHC stack pointer (x20)
+ ldr x0, [x20, #8]
+ ldr x1, [x20, #16]
+
+ ; Store to GHC heap pointer (x21)
+ str x0, [x21]
+ str x1, [x21, #8]
+
+ ; Load from hardware stack
+ ldr x2, [sp, #24]
+
+ ; PC-relative constant pool
+ adrp x8, _constant@PAGE
+ add x8, x8, _constant@PAGEOFF
+ ldr x3, [x8]
+
+ ; Store pair to stack
+ stp x0, x1, [sp, #-32]!
+
+ ; Load pair from stack
+ ldp x4, x5, [sp], #32
+
+ ; Frame teardown
+ ldp x29, x30, [sp], #16
+ ret