Heya,
If you can stand the thought of me shuffling your code right into GapiDraw then please upload it!
It's about time GapiDraw supported more image formats, and right now I really could use any help I can get (I am reading through the entire blit core assembler output two or three more times just to make sure every operation performs in an optimal way). I have actually rewritten basically every blit operation from scratch, and expanded each function as many times as possible so that every case gets optimized.
For example, the standard BltFast operation today has more than 20 different internal paths depending on what options are set (memcpy as one block if no options are set, the target is not a windowed display and the surfaces are of matching size; memcpy as rows if no options are set; 8 fast blt operations; 8 blt fx operations)... It takes a while to go through all this code, but in the end it will be as fast as it can possibly get.
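To give a rough idea of what that kind of path selection looks like, here is a minimal C++ sketch. It is not the actual GapiDraw code: Surface, BlitPath, kKeySrc, choosePath and bltFastSketch are all made-up names for illustration, the surface is assumed to be 16-bit with pitch equal to width, and the 8 fast blt and 8 blt fx paths are collapsed into a single per-pixel loop with a color key as a stand-in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 16-bit surface (NOT GapiDraw's real surface class).
struct Surface {
    int width;
    int height;
    std::vector<uint16_t> pixels;  // row-major, pitch == width (assumption)
    bool isWindowedDisplay;
};

// Hypothetical option flag: skip source pixels equal to the color key.
const uint32_t kKeySrc = 1u;

enum class BlitPath { BlockCopy, RowCopy, PerPixel };

// Pick the cheapest path that is valid for the given options.
BlitPath choosePath(const Surface& dst, const Surface& src, uint32_t flags) {
    if (flags != 0) return BlitPath::PerPixel;  // any option forces the slow path
    if (!dst.isWindowedDisplay &&
        dst.width == src.width && dst.height == src.height)
        return BlitPath::BlockCopy;             // one memcpy for everything
    return BlitPath::RowCopy;                   // one memcpy per row
}

// Copies src into dst at (x, y). No clipping: src must fit inside dst.
BlitPath bltFastSketch(Surface& dst, int x, int y, const Surface& src,
                       uint32_t flags, uint16_t colorKey = 0) {
    assert(x >= 0 && y >= 0);
    assert(x + src.width <= dst.width && y + src.height <= dst.height);

    const BlitPath path = choosePath(dst, src, flags);
    switch (path) {
    case BlitPath::BlockCopy:
        // Sizes match, so (x, y) is (0, 0) and the buffers line up exactly.
        std::memcpy(dst.pixels.data(), src.pixels.data(),
                    src.pixels.size() * sizeof(uint16_t));
        break;
    case BlitPath::RowCopy:
        for (int row = 0; row < src.height; ++row) {
            std::memcpy(&dst.pixels[(y + row) * dst.width + x],
                        &src.pixels[row * src.width],
                        src.width * sizeof(uint16_t));
        }
        break;
    case BlitPath::PerPixel:
        for (int row = 0; row < src.height; ++row) {
            for (int col = 0; col < src.width; ++col) {
                const uint16_t p = src.pixels[row * src.width + col];
                if ((flags & kKeySrc) && p == colorKey) continue;  // transparent
                dst.pixels[(y + row) * dst.width + (x + col)] = p;
            }
        }
        break;
    }
    return path;
}
```

The point of splitting it up like this is that the decision is made once per call, so the inner loops carry no option checks at all in the two memcpy cases.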
/Johan