poly1305

The Poly1305 message authentication code (docs.ppad.tech/poly1305).
git clone git://git.ppad.tech/poly1305.git

commit c2fb09755dd7213dc2ce5532c164aacb94982ce4
parent 39477dfcbd306f38a42a1bd1981009d912ea12a9
Author: Jared Tobin <jared@jtobin.io>
Date:   Sat, 16 May 2026 13:06:36 -0230

lib: dispatch mac to ARM path when available

Wire 'Crypto.MAC.Poly1305.mac' to the ARM-accelerated 'Arm.mac'
when 'poly1305_arm_available' is True; otherwise fall through to
the existing pure Haskell scalar implementation.  Mirrors the
dispatch pattern in 'Crypto.Hash.SHA256.hs',
'Crypto.Cipher.ChaCha20.hs', and 'Data.ByteString.Base16.hs':

    mac key@(BI.PS _ _ kl) msg
      | kl /= 32  = Nothing
      | Arm.poly1305_arm_available =
          pure $! MAC (Arm.mac key msg)
      | otherwise = ... scalar ...

Length validation stays in the dispatcher.  The Arm wrapper
assumes a 32-byte key.
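
The shape of that dispatcher can be sketched in isolation.  The
following is a minimal illustration only, not the library's code:
'dispatch', 'Backend', 'fastStub' and 'slowStub' are hypothetical
names, the availability flag is passed as an argument so both branches
can be exercised, and the stubs return fixed bytes rather than real
Poly1305 tags.

```haskell
import qualified Data.ByteString as BS

-- Stand-in for the library's MAC wrapper (hypothetical definition).
newtype MAC = MAC BS.ByteString
  deriving (Eq, Show)

-- A backend maps key and message to a 16-byte tag.
type Backend = BS.ByteString -> BS.ByteString -> BS.ByteString

-- Placeholder backends: fixed bytes, NOT real Poly1305 output.
fastStub :: Backend
fastStub _ _ = BS.replicate 16 0xAA

slowStub :: Backend
slowStub _ _ = BS.replicate 16 0x55

-- Validate the key length once, then choose a backend.  Because the
-- length check is the first guard, neither backend ever sees a key
-- that is not exactly 32 bytes.
dispatch
  :: Bool          -- ^ is the accelerated backend available?
  -> Backend       -- ^ accelerated backend
  -> Backend       -- ^ fallback backend
  -> BS.ByteString -- ^ key
  -> BS.ByteString -- ^ message
  -> Maybe MAC
dispatch accel fast slow key msg
  | BS.length key /= 32 = Nothing
  | accel               = pure $! MAC (fast key msg)
  | otherwise           = pure $! MAC (slow key msg)
```

Keeping the length guard first is what lets an accelerated wrapper
assume a well-formed key without re-checking it.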

Performance on the 114-byte RFC 8439 test vector (M4 MacBook Air,
GHC 9.10.3 + LLVM 19, '-fllvm'):

  mac (small key):  124 ns ->  66 ns   (~1.9x, stage 1 alone)
  mac (mid key):    125 ns ->  66 ns
  mac (big key):    124 ns ->  66 ns

Allocation per call drops as well: the scalar Haskell implementation
allocates through 'Wider' / 'Limb' wrappers and assorted
intermediate values; the C path allocates only the 16-byte MAC
output (~256 B per call including bytestring overhead, vs ~640+ B
previously).

All 12 tasty cases (including RFC 8439 A.3 vectors 1-11) pass
through the dispatched path, both under '-fllvm' and under
'-fllvm -fsanitize' (ASan + UBSan over the C kernel — no
diagnostics).
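
A slow 'Integer'-based reference can act as an independent oracle when
comparing backends like this.  The sketch below is an assumption-laden
illustration, not the package's scalar code and not part of its test
suite: 'poly1305Ref' and its helpers are hypothetical names, it is not
constant-time, and it is suitable only for checking outputs.  The
vector is the one from RFC 8439, section 2.5.2.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.List (foldl')
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC

-- Interpret bytes as a little-endian unsigned integer.
leToInteger :: BS.ByteString -> Integer
leToInteger = BS.foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- Serialize the low 128 bits as 16 little-endian bytes.
integerToLE16 :: Integer -> BS.ByteString
integerToLE16 n =
  BS.pack [ fromIntegral ((n `shiftR` (8 * i)) .&. 0xff) | i <- [0 .. 15] ]

-- Split into 16-byte blocks; the last block may be shorter.
chunks :: BS.ByteString -> [BS.ByteString]
chunks bs
  | BS.null bs = []
  | otherwise  = let (h, t) = BS.splitAt 16 bs in h : chunks t

-- Reference Poly1305 per RFC 8439, section 2.5.1 (not constant-time).
poly1305Ref :: BS.ByteString -> BS.ByteString -> Maybe BS.ByteString
poly1305Ref key msg
  | BS.length key /= 32 = Nothing
  | otherwise =
      let (rb, sb) = BS.splitAt 16 key
          r = leToInteger rb .&. 0x0ffffffc0ffffffc0ffffffc0fffffff -- clamp
          s = leToInteger sb
          p = 2 ^ (130 :: Int) - 5
          -- append the 0x01 byte to each block, accumulate, multiply by r
          step acc blk = ((acc + leToInteger (BS.snoc blk 1)) * r) `mod` p
          acc = foldl' step 0 (chunks msg)
      in  Just (integerToLE16 (acc + s))

-- RFC 8439, section 2.5.2 test vector.
rfcKey, rfcMsg, rfcTag :: BS.ByteString
rfcKey = BS.pack
  [ 0x85, 0xd6, 0xbe, 0x78, 0x57, 0x55, 0x6d, 0x33
  , 0x7f, 0x44, 0x52, 0xfe, 0x42, 0xd5, 0x06, 0xa8
  , 0x01, 0x03, 0x80, 0x8a, 0xfb, 0x0d, 0xb2, 0xfd
  , 0x4a, 0xbf, 0xf6, 0xaf, 0x41, 0x49, 0xf5, 0x1b ]
rfcMsg = BC.pack "Cryptographic Forum Research Group"
rfcTag = BS.pack
  [ 0xa8, 0x06, 0x1d, 0xc1, 0x30, 0x51, 0x36, 0xc6
  , 0xc2, 0x2b, 0x8b, 0xaf, 0x0c, 0x01, 0x27, 0xa9 ]
```

Here 'poly1305Ref rfcKey rfcMsg' evaluates to 'Just rfcTag'; any
backend under test should produce the same 16 bytes for the same
inputs.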

The next commit replaces the scalar inner block loop with a NEON
4-way parallel kernel.

Diffstat:
M lib/Crypto/MAC/Poly1305.hs | 3 +++
1 file changed, 3 insertions(+), 0 deletions(-)

diff --git a/lib/Crypto/MAC/Poly1305.hs b/lib/Crypto/MAC/Poly1305.hs
@@ -26,6 +26,7 @@ module Crypto.MAC.Poly1305 (
   , _roll16
   ) where
 
+import qualified Crypto.MAC.Poly1305.Arm as Arm
 import qualified Data.Bits as B
 import qualified Data.ByteString as BS
 import qualified Data.ByteString.Internal as BI
@@ -173,6 +174,8 @@ mac
   -> Maybe MAC -- ^ 128-bit message authentication code
 mac key@(BI.PS _ _ kl) msg
   | kl /= 32  = Nothing
+  | Arm.poly1305_arm_available =
+      pure $! MAC (Arm.mac key msg)
   | otherwise =
       let (clamp . _roll16 -> r, _roll16 -> s) = BS.splitAt 16 key
       in  pure $! (MAC (_poly1305_loop r s msg))