# ARCH3: Benchmark Suite Expansion

## Overview

Expand the benchmark suite to cover the full library surface. Current
benchmarks focus on key derivation, secret generation, fee calculation,
and trimming predicates. Missing coverage for transaction building,
script generation, serialization, parsing, validation, and secret
storage operations.

## Current Coverage

Measured in `bench/Main.hs` and `bench/Weight.hs`:

- Key derivation: `derive_pubkey`, `derive_revocationpubkey`
- Secret generation: `generate_from_seed`
- Fee calculation: `commitment_fee`, `htlc_timeout_fee`, `htlc_success_fee`
- Trimming: `is_trimmed`, `htlc_trim_threshold`

## Proposed Additions

### 1. Transaction Building

Primary entry points for transaction construction. These compose
scripts, compute fees, filter HTLCs, and sort outputs.

Functions:
- `build_commitment_tx`: most complex; involves HTLC filtering, script
  generation, fee deduction, output sorting
- `build_htlc_timeout_tx`: single-input, single-output second-stage tx
- `build_htlc_success_tx`: single-input, single-output second-stage tx
- `build_closing_tx`: cooperative close with BIP69 ordering
- `build_legacy_closing_tx`: legacy cooperative close

Benchmark variants:
- Commitment tx with 0 HTLCs (baseline)
- Commitment tx with 10 HTLCs (realistic)
- Commitment tx with 100 HTLCs (stress)
- With/without option_anchors

### 2. Script Generation

Witness scripts are generated during transaction building but worth
measuring in isolation to identify bottlenecks.

Functions:
- `funding_script`: 2-of-2 multisig
- `to_local_script`: revocation + delayed spend
- `to_remote_script`: P2WPKH or P2WSH depending on anchors
- `anchor_script`: anchor output with CHECKSIG + CSV
- `offered_htlc_script`: offered HTLC with timeout path
- `received_htlc_script`: received HTLC with preimage path

### 3. Serialization

Encoding and decoding are performance-critical for signing workflows
and transaction relay.

Functions:
- `encode_tx`: commitment tx to bytes
- `encode_htlc_tx`: HTLC tx to bytes
- `encode_closing_tx`: closing tx to bytes
- `encode_tx_for_signing`: sighash preimage serialization
- `decode_tx`: parse raw bytes to RawTx

Benchmark variants:
- Encode commitment tx (0, 10, 100 HTLCs)
- Decode commitment tx (0, 10, 100 HTLCs)
- Roundtrip: encode then decode

### 4. Validation

Stateless validation for transaction correctness. Useful to benchmark
as these may be called frequently during channel state updates.

Functions:
- `validate_commitment_tx`
- `validate_htlc_tx`
- `validate_closing_tx`
- `validate_output_ordering`
- `validate_dust_limits`
- `validate_commitment_fee`

### 5. Secret Storage

Per-commitment secret storage uses an efficient tree structure.
Worth measuring insert and derive operations.

Functions:
- `insert_secret`: insert new secret at index
- `derive_old_secret`: derive secret at arbitrary past index

Benchmark variants:
- Insert sequence of 1000 secrets
- Derive secrets at various depths
- Store utilization at near-capacity

### 6. Output Sorting

BIP69 output ordering with CLTV tiebreaker for HTLCs.

Functions:
- `sort_outputs`: sort list of TxOutput

Benchmark variants:
- Sort 10, 100 outputs

## NFData Instances

New types requiring NFData for benchmarking:

- `CommitmentTx`
- `CommitmentContext`
- `CommitmentKeys`
- `HTLCTx`
- `HTLCContext`
- `ClosingTx`
- `ClosingContext`
- `TxOutput`
- `OutputType`
- `Script`
- `Witness`
- `Outpoint`
- `Sequence`
- `Locktime`
- `RawTx`
- `RawInput`
- `RawOutput`
- `SecretStore`
- `SecretEntry`
- Basepoint newtypes (already partially covered)

## Test Fixtures

Realistic fixtures should be defined in a shared `where` block or
helper module. Key fixture components:

1. **Keys**: Valid secp256k1 pubkeys for all roles (local, remote,
   revocation, HTLC, funding). Use test vectors from BOLT #3 appendix.

2. **Funding outpoint**: Fixed txid + vout.

3. **HTLCs**: Lists of 0, 10, 100 HTLCs with varied amounts and expiries.
   Mix of offered/received directions.

4. **Channel parameters**: Realistic values for dust limit (546 sat),
   feerate (5000 sat/kw), to_self_delay (144 blocks).

5. **Commitment context**: Full CommitmentContext with all keys
   populated.

6. **Raw transaction bytes**: Pre-serialized transactions for decode
   benchmarks.

## Benchmark Organization

Organize benchmarks into logical groups matching library modules:

```
main = defaultMain [
    bgroup "key derivation" [ ... ]      -- existing
  , bgroup "secret generation" [ ... ]   -- existing
  , bgroup "secret storage" [ ... ]      -- NEW
  , bgroup "fee calculation" [ ... ]     -- existing
  , bgroup "trimming" [ ... ]            -- existing
  , bgroup "script generation" [ ... ]   -- NEW
  , bgroup "tx building" [ ... ]         -- NEW
  , bgroup "serialization" [ ... ]       -- NEW
  , bgroup "parsing" [ ... ]             -- NEW
  , bgroup "validation" [ ... ]          -- NEW
  , bgroup "output sorting" [ ... ]      -- NEW
  ]
```

## Allocation Tracking

Mirror all criterion benchmarks in `bench/Weight.hs` using weigh.
This helps identify allocation regressions.

## Success Criteria

- All exported transaction building functions benchmarked
- All exported script generation functions benchmarked
- Encode/decode roundtrip for all tx types
- Validation functions with valid and invalid inputs
- Secret storage under realistic load
- NFData instances for all benchmarked types
- No new external dependencies

## Risks

- Large fixture setup may dominate small function benchmarks; use
  `env` to separate setup from measurement
- NFData instances for recursive structures (SecretStore) require care
- Some functions may be too fast to measure reliably; use `whnf` vs
  `nf` appropriately
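
The first and third risks can be illustrated with a minimal criterion sketch. Everything named here is a hypothetical stand-in rather than the library's API: `Output` plays the role of `TxOutput`, `sortOutputs` the role of `sort_outputs`, and `isDust` the role of a small predicate such as `is_trimmed`. The sketch assumes only that `criterion` and `deepseq` are available, as they already are for the existing benchmarks.

```haskell
{-# LANGUAGE DeriveGeneric #-}

module Main (main) where

import Control.DeepSeq (NFData)
import Criterion.Main (bench, bgroup, defaultMain, env, nf, whnf)
import Data.List (sortOn)
import GHC.Generics (Generic)

-- Hypothetical stand-in for the library's TxOutput type.
data Output = Output
  { outValue  :: !Int   -- amount in satoshis
  , outScript :: [Int]  -- placeholder for script bytes
  } deriving (Show, Eq, Generic)

-- Generic-derived NFData instance; criterion needs it both for `env`
-- fixtures and for `nf` results.
instance NFData Output

-- Hypothetical stand-in for sort_outputs: order by value, then script.
sortOutputs :: [Output] -> [Output]
sortOutputs = sortOn (\o -> (outValue o, outScript o))

-- Hypothetical stand-in for a cheap predicate such as is_trimmed.
isDust :: Int -> Output -> Bool
isDust limit o = outValue o < limit

-- Fixture builder: n outputs in descending value order.
mkOutputs :: Int -> [Output]
mkOutputs n = [ Output (n - i) [i, i + 1] | i <- [0 .. n - 1] ]

main :: IO ()
main = defaultMain
  [ -- `env` builds and fully evaluates the fixture once, outside the
    -- timed loop, so fixture construction is not measured.
    env (pure (mkOutputs 100)) $ \outs ->
      bgroup "output sorting"
        [ -- `nf` forces the whole result list, so the full sort is timed.
          bench "sort 100 (nf)" $ nf sortOutputs outs
          -- `whnf` forces only the first list constructor; shown for
          -- contrast, since it can under-measure list-returning code.
        , bench "sort 100 (whnf)" $ whnf sortOutputs outs
        ]
  , bgroup "trimming"
      [ -- A Bool result is fully evaluated at WHNF, so `whnf` suffices.
        bench "isDust" $ whnf (isDust 546) (Output 1000 [0, 1])
      ]
  ]
```

The fixture is bound as a plain variable in the `env` lambda rather than pattern-matched, because criterion requires that function to be lazy in its argument when it enumerates benchmarks.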