Integer division performance

I have created a small test to investigate integer division:

class Main {
    static function main() {
        var total = 0;
        for (i in 0...1000) {
            total += Std.int(i/3 + i*i/7);
        }
        trace(total);
    }
}

I then compiled it to cpp. The generated code for the integer division was:

total = (total + ::Std_obj::_hx_int(((( (Float)(i) ) / ( (Float)(3) )) + (( (Float)((i * i)) ) / ( (Float)(7) )))));

This is far from optimal (a cast to float, then a cast back to int). I thought that wrapping the whole expression in Std.int would prevent such casts (as stated in the documentation).

I did the same with HashLink compiled to C code, and it gives the same result:

	r3 = r2;
	++r2;
	r6 = (double)r3;
	r7 = 3.;
	r6 = r6 / r7;
	r5 = r3 * r3;
	r7 = (double)r5;
	r8 = 7.;
	r7 = r7 / r8;
	r6 = r6 + r7;
	r5 = (int)r6;
	r4 = r0 + r5;
	r0 = r4;

Is there a way to get a simple and proper integer division? Note that the integer multiplication is correct, with no casting involved.

You might need to wrap each division in its own Std.int for hxcpp’s optimization to happen.
total += Std.int(i/3) + i*Std.int(i/7);
Otherwise that’s not integer division but two float divisions whose sum is later cast to int.
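One thing worth checking before swapping the expressions: truncating each division separately does not, in general, give the same number as truncating the sum once, and the line above also replaces i*i/7 with i*Std.int(i/7). The C sketch below is only an illustration, assuming C int and double stand in for Haxe Int and Float; the function names are made up for this example and are not part of any Haxe output.

```c
/* Mirrors Std.int(i/3 + i*i/7): two float divisions, summed,
   then truncated once. */
int original_form(int i) {
    return (int)((double)i / 3.0 + (double)(i * i) / 7.0);
}

/* Mirrors Std.int(i/3) + i*Std.int(i/7): each division is truncated
   on its own, and the multiplication happens after the truncation. */
int suggested_form(int i) {
    return i / 3 + i * (i / 7);
}

/* Mirrors Std.int(i/3) + Std.int(i*i/7): keeps i*i/7 but still
   truncates each quotient separately. */
int per_division_form(int i) {
    return i / 3 + (i * i) / 7;
}
```

For i = 5 the three functions return 5, 1 and 4 respectively, so whichever form is used should be picked for the value actually wanted, not only for the code it generates.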

Thanks Valentin.

So I tried your solution, and with the cpp output, the issue is still there:

int total1 = ::Std_obj::_hx_int((( (Float)(i) ) / ( (Float)(3) )));
total = (total + (total1 + (i * ::Std_obj::_hx_int((( (Float)(i) ) / ( (Float)(7) ))))));

However, with HashLink compiled to C, the optimisation is properly applied (all registers r0 to r8 are int):

	r3 = r2;
	++r2;
	r6 = 3;
	r5 = r6 == 0 ? 0 : r3 / r6;
	r8 = 7;
	r7 = r8 == 0 ? 0 : r3 / r8;
	r6 = r3 * r7;
	r5 = r5 + r6;
	r4 = r0 + r5;
	r0 = r4;

Note that there’s still room for optimisation on the fourth line of the listing: the comparison of r6 with 0 is useless, as r6 was set to 3 on the line before. But I guess that the C compiler’s optimizer will catch it and remove it.
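The guarded pattern in that listing can be isolated as a small C helper to make the point concrete. This is a sketch only: guarded_idiv and div_by_three are hypothetical names for this example, not functions from the HashLink runtime.

```c
/* Same shape as the generated `r6 == 0 ? 0 : r3 / r6`:
   a zero divisor yields 0 instead of trapping. */
int guarded_idiv(int a, int b) {
    return b == 0 ? 0 : a / b;
}

/* With a constant divisor the guard is decidable at compile time,
   so an optimizing compiler can typically reduce this to a plain
   division by 3 (an assumption about the optimizer, not a guarantee). */
int div_by_three(int a) {
    return guarded_idiv(a, 3);
}
```

Comparing the assembly of div_by_three at -O0 and -O2 is a quick way to confirm whether the comparison really disappears with a given compiler.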

Using cpp.NativeMath.idiv seems to trigger the optimisation for hxcpp. Std.int seems like it should do the same, as the idiv function is just an inline function which wraps the division in Std.int.

Edit: I read the file wrong. The wrapping in Std.int only occurs on non-cpp platforms; otherwise the extern function is used.

Note that the Std.int(Int / Int) pattern cannot be optimized like that because of NaN semantics. This would require some sort of fast-math switch.
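To see where the two kinds of division part ways, consider a zero divisor: the float division produces an infinity or NaN, while a C integer division by zero is undefined. The sketch below assumes an IEEE 754 platform and uses C double in place of Haxe Float; the function names are hypothetical.

```c
#include <math.h>

/* Float division, as Haxe performs for Int / Int (the result is a Float). */
double float_div(int a, int b) {
    return (double)a / (double)b;
}

/* Returns nonzero when the float quotient is finite, i.e. when the
   zero-divisor case that blocks a blind rewrite to a / b does not arise. */
int has_finite_quotient(int a, int b) {
    return isfinite(float_div(a, b)) ? 1 : 0;
}
```

Here float_div(1, 0) is an infinity and float_div(0, 0) is NaN, so a compiler that rewrote the pattern as a bare integer division would change what happens for those inputs unless it is told that such cases can be ignored.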