I would think serial divide vs. SIMD divide would not make a difference.
Log: we have the fast path, which is the normal scalar algorithm. Scalar floating-point divide is [11; 12] cycles ([33; 36] cycles); packed floating-point divide is also [11; 12] cycles. Count parallel units.
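
A rough sketch of what counting parallel units could look like, using the [11; 12] cycle figure quoted for both scalar and packed divide. The lane count of 4 and the helper name `cycles_per_element` are assumptions for illustration only, and the arithmetic ignores pipelining and throughput limits, so treat it as a back-of-the-envelope estimate rather than a measurement.

```python
def cycles_per_element(latency_range, lanes):
    """Amortized divide latency per element when one instruction handles `lanes` values."""
    low, high = latency_range
    return (low / lanes, high / lanes)


DIVIDE_LATENCY = (11, 12)  # cycles, the [11; 12] figure quoted for scalar and packed divide
ASSUMED_LANES = 4          # assumption: four values per packed register (hypothetical)

scalar = cycles_per_element(DIVIDE_LATENCY, 1)              # one value per instruction
packed = cycles_per_element(DIVIDE_LATENCY, ASSUMED_LANES)  # several values per instruction

print("scalar:", scalar)
print("packed:", packed)
```

If the latency really is the same for both forms, the per-instruction cost does not change, but the per-element cost shrinks with the number of lanes actually filled.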