Happy SIMD in Rust (without getting a stroke)
SIMD (Single Instruction, Multiple Data) lets a CPU process several values at once using wide vector registers. In Rust you can reach it through the wide crate, std::simd (portable SIMD), or the low-level core::arch intrinsics. On a tiny vector-add example the speedups are modest, and hand-written intrinsics can silently produce wrong results, as the broken u64 NEON result shows, so benchmarking and correctness checks matter more than "SIMD everywhere." A more realistic win is Ziggurat normal sampling: the fast accept test is highly SIMD-friendly, and only the rare rejected lanes fall back to the expensive scalar fixups.
author=Daniel Boros read=12min views=42
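
To make the vector-add case from the summary concrete, here is a minimal sketch of a core::arch version checked against a scalar reference. It is not the article's benchmark code: the function names add_scalar and add_simd are hypothetical, and the choice of 4-lane f32 SSE on x86_64 (with a scalar fallback on every other target) is an assumption made for illustration.

```rust
/// Scalar reference: element-wise sum of two f32 slices.
/// (Hypothetical helper, used both as fallback and as the correctness baseline.)
pub fn add_scalar(a: &[f32], b: &[f32], out: &mut [f32]) {
    assert!(a.len() == b.len() && a.len() == out.len());
    for ((o, x), y) in out.iter_mut().zip(a).zip(b) {
        *o = x + y;
    }
}

/// SIMD version for x86_64: processes 4 lanes per iteration with SSE,
/// then finishes the remaining 0..=3 elements with the scalar loop.
#[cfg(target_arch = "x86_64")]
pub fn add_simd(a: &[f32], b: &[f32], out: &mut [f32]) {
    use std::arch::x86_64::{_mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};

    assert!(a.len() == b.len() && a.len() == out.len());
    let n = a.len();
    let main = n - n % 4; // largest multiple of 4 that fits

    let mut i = 0;
    while i < main {
        // SAFETY: SSE is part of the x86_64 baseline, and i + 4 <= main <= n,
        // so each unaligned 4-lane load/store stays inside its slice.
        unsafe {
            let va = _mm_loadu_ps(a.as_ptr().add(i));
            let vb = _mm_loadu_ps(b.as_ptr().add(i));
            _mm_storeu_ps(out.as_mut_ptr().add(i), _mm_add_ps(va, vb));
        }
        i += 4;
    }

    // Tail elements that do not fill a whole vector.
    add_scalar(&a[main..], &b[main..], &mut out[main..]);
}

/// Fallback for other targets: same signature, scalar only.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_simd(a: &[f32], b: &[f32], out: &mut [f32]) {
    add_scalar(a, b, out);
}
```

Comparing add_simd against add_scalar on inputs whose length is not a multiple of the lane count is a cheap way to catch the kind of silent lane or tail bug the summary warns about, before any benchmarking.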