That change assumes that surface->stride is measured in pixels, but it's measured in bytes. For example, for 8-bit to 8-bit surfaces the pointers are byte pointers, so the strides are added directly:
    if ((surface->bpp == 8) && (src.surface->bpp == 8))
    {
        uint8_t *srcptr=(uint8_t*)src.surface->data;
        uint8_t *dstptr=(uint8_t*)surface->data;
        ...
        for (int y = area.height(); y != 0; --y)
        {
            ...
            srcptr += src.surface->stride;
            dstptr += surface->stride;
        }
And for 32-bit to 32-bit surfaces, srcptr and dstptr are converted to byte pointers before adding the strides:
    else if ((surface->bpp == 32) && (src.surface->bpp==32))
    {
        uint32_t *srcptr=(uint32_t*)src.surface->data;
        uint32_t *dstptr=(uint32_t*)surface->data;
        ...
        for (int y = area.height(); y != 0; --y)
        {
            ...
            srcptr = (uint32_t*)((uint8_t*)srcptr + src.surface->stride);
            dstptr = (uint32_t*)((uint8_t*)dstptr + surface->stride);
        }
So what's needed is to decide whether the threshold is measured in bytes or in pixels, and to do the arithmetic accordingly. A comment stating the threshold's units would probably be a good idea, too.
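To make the units question concrete, here is a rough sketch of computing the size of a blit area in both units, with the threshold's unit spelled out in its name. The Surface struct, the helper names and the threshold value are all made up for illustration; they are not the actual enigma2 types or numbers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the real surface type; only the fields used here.
struct Surface {
    int bpp;      // bits per pixel (8 or 32 in the cases above)
    int stride;   // distance between the starts of consecutive rows, in BYTES
    void *data;
};

// Size of a blit area in pixels.
inline std::size_t areaPixels(int width, int height)
{
    return static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
}

// Size of the same area in bytes of pixel data (row padding not counted).
inline std::size_t areaBytes(const Surface &s, int width, int height)
{
    return areaPixels(width, height) * static_cast<std::size_t>(s.bpp / 8);
}

// Advance a typed row pointer by a stride that is given in bytes.
template <typename T>
inline T *nextRow(T *row, int strideBytes)
{
    return reinterpret_cast<T *>(reinterpret_cast<std::uint8_t *>(row) + strideBytes);
}

// Placeholder value, NOT a measured one. The unit (bytes) is part of the name.
constexpr std::size_t kAccelThresholdBytes = 8000;

// True if the area is large enough, measured in bytes, to hand to the hardware.
inline bool worthAccelerating(const Surface &dst, int width, int height)
{
    return areaBytes(dst, width, height) >= kAccelThresholdBytes;
}
```

With a byte threshold, a 100x20 area passes the test on a 32-bit surface (8000 bytes) but not on an 8-bit one (2000 bytes). With a pixel threshold the two would be treated the same.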
For blitAlphaTest and particularly blitAlphaBlend, it's not at all clear a priori whether the measure should be bytes or pixels.
In the software implementation of blitAlphaTest, there is an if/then/else in the inner loop, and in blitAlphaBlend there are 4 multiplies, 8 adds/subtracts and 4 shifts in the inner loop. There can also be a colour lookup table access in the inner loop for 8-bit source -> 32-bit dest operations.
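As a rough illustration of where those operation counts come from, here is a sketch of a per-pixel blend and a per-pixel alpha test. It assumes the common "dst + (((src - dst) * alpha) >> 8)" blend form and a zero/non-zero alpha test. The Pixel struct and function names are hypothetical, and this is not claimed to be the exact enigma2 code.

```cpp
#include <cstdint>

// Hypothetical 32-bit pixel layout, for illustration only.
struct Pixel {
    std::uint8_t b, g, r, a;
};

// One channel: one subtract, one multiply, one shift, one add.
// Note: (src - dst) can be negative; right-shifting a negative int is an
// arithmetic shift on the usual compilers (and guaranteed from C++20).
inline std::uint8_t blendChannel(int src, int dst, int alpha)
{
    return static_cast<std::uint8_t>(dst + (((src - dst) * alpha) >> 8));
}

// Four channels: 4 multiplies, 8 adds/subtracts and 4 shifts per pixel.
inline Pixel blendPixel(const Pixel &src, const Pixel &dst)
{
    Pixel out;
    out.b = blendChannel(src.b, dst.b, src.a);
    out.g = blendChannel(src.g, dst.g, src.a);
    out.r = blendChannel(src.r, dst.r, src.a);
    out.a = blendChannel(src.a, dst.a, src.a);
    return out;
}

// Alpha test: a single branch per pixel, no arithmetic on the channels.
inline Pixel alphaTestPixel(const Pixel &src, const Pixel &dst)
{
    if (src.a != 0)
        return src;   // source pixel is visible: copy it
    else
        return dst;   // fully transparent source: keep the destination
}
```

The per-pixel cost here is the same whether the pixel came from an 8-bit or a 32-bit source, which is one argument for a pixel-based threshold for these operations. Memory traffic still scales with bytes, so measurement would have to settle it.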
The blit acceleration threshold and its units probably need to be set by experimentation.
One of the changes in the Beyonwiz firmware in this area was to completely avoid acceleration for blitAlphaTest for the Beyonwiz U4 (--with-boxtype=et1300), because its blit hardware acceleration implements blitAlphaBlend when blitAlphaTest is requested.
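A minimal sketch of what such a per-box exclusion could look like, assuming a build-time macro set by the box type. The macro name HYPOTHETICAL_NO_ALPHATEST_ACCEL, the flag values and the function name are invented for illustration and are not taken from the Beyonwiz or enigma2 sources.

```cpp
// Flag names follow the text; the numeric values are assumptions.
enum BlitFlags {
    blitAlphaTest  = 1,
    blitAlphaBlend = 2
};

// Hypothetical macro that a build for an affected box type would define.
#ifdef HYPOTHETICAL_NO_ALPHATEST_ACCEL
constexpr bool kHwAlphaTestUsable = false;
#else
constexpr bool kHwAlphaTestUsable = true;
#endif

// Decide whether a blit with the given flags may go to the hardware.
// hwAlphaTestUsable is a parameter so the policy can be exercised directly.
inline bool mayAccelerate(int flags, bool hwAlphaTestUsable = kHwAlphaTestUsable)
{
    // If the hardware would blend when asked to alpha-test, fall back to
    // software for every alpha-test blit, whatever its size.
    if ((flags & blitAlphaTest) && !hwAlphaTestUsable)
        return false;
    return true;
}
```

The size threshold check would then only apply to blits for which mayAccelerate() returns true.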