by cersoft » Dec 8, 2002 @ 2:39am
Cheers for that info.
I have been doing some more tests concerning cache misses. My testbed uses the 'Simple' sample with the rotating iPAQ removed, and the screen orientation is GDDISPMODE_ROTATE90CCW. My test data is an animating sprite of 64x48 pixels with 14 frames.
I am drawing 200 sprites, each one cycling through the frames (sprite 0 = frame 0, sprite 1 = frame 1, and so on) so as to totally thrash the cache.
Using bltfast (i.e. MFC sample) 59.4ms (iPAQ 3970)
Using my own keycol blitter 59.1ms
Using keycol span skipper 51.3ms
Now, this span skipper method just finds the portions of the sprites that are not keycol'd out (i.e. not magenta) and skips the rest. So what happens if we move all of the drawable span data into a single array with no gaps?
Using datacomp span skipper 45.1ms
Here we can see that bunching all of the data into a single array means fewer cache misses, and hence a noticeably better draw time (45.1ms against 51.3ms, roughly 12% faster).
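For anyone wanting to try the packed-span idea, here is a rough sketch in plain C of how it could look. It is not my actual blitter: the names (Span, PackedSprite, pack_sprite, draw_packed) are made up for illustration, the pixel format is assumed to be 16-bit RGB565 with magenta (0xF81F) as the keycol, the destination pitch is assumed to be in pixels, and there is no clipping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define KEYCOL 0xF81Fu  /* magenta in RGB565 (assumed pixel format) */

/* One run of opaque pixels on a single sprite row. */
typedef struct {
    uint16_t y;    /* row within the sprite */
    uint16_t x;    /* first opaque column of the run */
    uint16_t len;  /* number of opaque pixels in the run */
} Span;

typedef struct {
    Span     *spans;    /* spans in row order */
    size_t    nspans;
    uint16_t *pixels;   /* every opaque pixel, packed back to back */
    size_t    npixels;
} PackedSprite;

/* Scan a w*h source image once and build the span list plus the
 * gap-free pixel array. Returns 0 on success, -1 if allocation fails. */
int pack_sprite(const uint16_t *src, int w, int h, PackedSprite *out)
{
    assert(src && out && w > 0 && h > 0);

    /* Worst case is alternating opaque/transparent pixels on each row. */
    size_t max_spans = (size_t)h * (((size_t)w + 1) / 2);
    out->spans  = malloc(max_spans * sizeof *out->spans);
    out->pixels = malloc((size_t)w * (size_t)h * sizeof *out->pixels);
    out->nspans = 0;
    out->npixels = 0;
    if (!out->spans || !out->pixels) {
        free(out->spans);
        free(out->pixels);
        return -1;
    }

    for (int y = 0; y < h; y++) {
        const uint16_t *row = src + (size_t)y * w;
        int x = 0;
        while (x < w) {
            while (x < w && row[x] == KEYCOL)   /* skip transparent run */
                x++;
            int start = x;
            while (x < w && row[x] != KEYCOL) { /* copy opaque run */
                out->pixels[out->npixels] = row[x];
                out->npixels++;
                x++;
            }
            if (x > start) {
                Span s = { (uint16_t)y, (uint16_t)start,
                           (uint16_t)(x - start) };
                out->spans[out->nspans++] = s;
            }
        }
    }
    return 0;
}

/* Draw at (dx, dy). Source pixels are read strictly sequentially, which
 * is the point of packing them. dst_pitch is in pixels; no clipping. */
void draw_packed(const PackedSprite *sp, uint16_t *dst, int dst_pitch,
                 int dx, int dy)
{
    const uint16_t *p = sp->pixels;
    for (size_t i = 0; i < sp->nspans; i++) {
        const Span *s = &sp->spans[i];
        uint16_t *d = dst + (size_t)(dy + s->y) * dst_pitch + dx + s->x;
        memcpy(d, p, s->len * sizeof *p);
        p += s->len;
    }
}
```

You would call pack_sprite once per frame at load time and draw_packed every time the sprite is drawn. The draw loop never tests a pixel against the keycol and never touches a transparent source pixel, so the source side is one forward walk through memory.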
Of course, the problem is that we cannot bltfast this compressed data, and it only works when drawing from the data row-wise.
Still, quite an improvement. :P
Colin