Fastest way to count trailing zeros from a SIMD[Bool,32]? - Modular