# `vectorize` changes the result of float operations

I just discovered that `vectorize` can change the result of floating-point operations. As @Maxim rightly pointed out to me, with floating-point operations you cannot expect `(a + b) + c == a + (b + c)`.
```mojo
from algorithm import vectorize
from sys.info import simdwidthof  # import was missing in the original snippet

alias dtype = DType.float32
alias SIMD_WIDTH = 2 * simdwidthof[dtype]()
alias NUM = 32


fn main():
    var v = DTypePointer[dtype].alloc(NUM)
    for i in range(NUM):
        v[i] = i * 0.2932

    # Plain sequential sum, left to right.
    fn f1() -> Float32:
        var val: Float32 = 0.0
        for i in range(NUM):
            val += v[i]
        return val

    # Vectorized, but each chunk is still summed element by element.
    fn f2() -> Float32:
        var val: Float32 = 0.0

        @parameter
        fn _op[width: Int](iv: Int):
            for j in range(width):
                val += v[iv + j]

        vectorize[_op, SIMD_WIDTH](size=NUM)
        return val

    # Vectorized, each chunk summed in reverse element order.
    fn f3() -> Float32:
        var val: Float32 = 0.0

        @parameter
        fn _op[width: Int](iv: Int):
            for j in range(width):
                val += v[iv + width - j - 1]

        vectorize[_op, SIMD_WIDTH](size=NUM)
        return val

    # Plain sequential sum, right to left.
    fn f4() -> Float32:
        var val: Float32 = 0.0
        for i in range(NUM):
            val += v[NUM - i - 1]
        return val

    # Vectorized with a SIMD load and horizontal reduce_add per chunk.
    fn f5() -> Float32:
        var val: Float32 = 0.0

        @parameter
        fn _op[width: Int](iv: Int):
            val += v.load[width=width](iv).reduce_add[1]()

        vectorize[_op, SIMD_WIDTH](size=NUM)
        return val

    print("f1:", f1())
    print("f2:", f2(), "\n")
    print("f3:", f3(), "\n")
    print("f4:", f4())
    print("f5:", f5())
```
Output:
```
f1: 145.42720031738281
f2: 145.42720031738281
f3: 145.42721557617188
f4: 145.42718505859375
f5: 145.42718505859375
```
So we have three different results from operations that should, in exact arithmetic, produce the same value. Now I wonder how to deal with this; these small drifts can of course have significant effects. (In my case, with llm.mojo, it produces different text than llm.c.)
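The same effect can be reproduced outside Mojo. Here is a sketch in Python with NumPy (an illustration of the same idea, not the original Mojo code) comparing a sequential sum, a SIMD-style chunked reduction, and a reverse-order sum over the same data:

```python
# Python/NumPy sketch: three summation orders over the same float32 data.
# Different groupings round differently, so the results can disagree in
# the last bits even though they are mathematically identical.
import numpy as np

NUM = 32
v = (np.arange(NUM) * 0.2932).astype(np.float32)

# Sequential left-to-right sum, like f1.
seq = np.float32(0.0)
for x in v:
    seq += x

# Width-8 SIMD-style reduction, like f5: per-lane partial sums are
# accumulated first, then combined; the grouping differs from f1.
width = 8
lanes = v.reshape(-1, width).sum(axis=0, dtype=np.float32)
chunked = np.float32(lanes.sum(dtype=np.float32))

# Reverse-order sum, like f4.
rev = np.float32(0.0)
for x in v[::-1]:
    rev += x

print(seq, chunked, rev)  # all near 145.427, but not necessarily equal
```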
Good question! Floating-point numbers in Mojo comply with IEEE 754-2008, which means addition is not associative (as you pointed out), and the same holds for every language that implements that standard. So float values shouldn't be compared for exact equality; use approximate comparison with a tolerance. You can use `assert_almost_equal`: https://docs.modular.com/mojo/stdlib/testing/testing#assert_almost_equal
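To illustrate the tolerance idea with the f1/f4 values from the post above, here is a Python sketch; Mojo's `testing.assert_almost_equal` follows the same principle (comparison within an absolute/relative tolerance):

```python
# Exact equality fails on results that differ only by accumulated rounding;
# a relative-tolerance comparison treats them as equal.
import math

f1 = 145.42720031738281  # sequential sum from the post
f4 = 145.42718505859375  # reverse-order sum from the post

print(f1 == f4)                            # False: last bits differ
print(math.isclose(f1, f4, rel_tol=1e-6))  # True: equal within tolerance
```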
If you're interested, there's another aspect of floats, such as `0.1 + 0.2 != 0.3`. Have a look at https://0.30000000000000004.com/ and also check out this important paper: https://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf
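That classic example shows up in any IEEE 754 language; in Python:

```python
# 0.1 and 0.2 have no exact binary representation, so their sum carries
# rounding error that an exact-equality check exposes.
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False
```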
Martin Dudek (4mo ago)
Thanks Ehsaan. What surprises me in the demo is that f4, which sums the numbers in reverse order, gets the same result as the standard vectorized version f5. My expectation would have been that f1 would match instead. So it seems as if `vectorize` internally reverses the calculation.
Maxim (4mo ago)
It could be a coincidence based on the numbers (step) you chose. To be sure, I think you need to run multiple tests with random numbers.
Martin Dudek (4mo ago)
Good point, you are right: with 42 as the step, the reverse-order sum matches the standard loop, and the vectorized one differs from all the others. Trying different numbers makes it clear that there is no obvious rule for when two versions compute the same result.
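Maxim's suggestion can be sketched in Python with NumPy (again an illustration, not the original Mojo code): run many trials with random data and count how often the orderings happen to agree. The counts vary with the data, which matches the observation that there is no fixed rule.

```python
# Repeat the forward / reverse / chunked comparison over random float32
# data and count coincidental agreements.
import numpy as np

rng = np.random.default_rng(0)
trials = 1000
forward_eq_reverse = 0
forward_eq_chunked = 0

for _ in range(trials):
    v = rng.random(32).astype(np.float32)

    fwd = np.float32(0.0)
    for x in v:
        fwd += x

    rev = np.float32(0.0)
    for x in v[::-1]:
        rev += x

    # Width-8 SIMD-style reduction: per-lane partials, then combine.
    lanes = v.reshape(-1, 8).sum(axis=0, dtype=np.float32)
    chunked = np.float32(lanes.sum(dtype=np.float32))

    forward_eq_reverse += int(fwd == rev)
    forward_eq_chunked += int(fwd == chunked)

print(forward_eq_reverse, "/", trials, "trials: forward == reverse")
print(forward_eq_chunked, "/", trials, "trials: forward == chunked")
```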