
CGapiSurface::BltFast


Postby Johan » Mar 6, 2003 @ 1:24am

Unfortunately my company owns the full rights to the source code. It was my only choice to keep working on GapiDraw - working on late evenings and weekends suddenly became too much work...

There has been a slight delay in releases lately (the past 6 months actually), but things will improve. I really appreciate the work done by warmi and will see if I can integrate it into the core..
Johan
pm Member
 
Posts: 1843
Joined: Jan 12, 2002 @ 12:38pm
Location: Sweden


GapiDraw

Postby warmi » Mar 6, 2003 @ 1:30am

In this case there won't be really much to gain by incorporating the asm code.

The C code is already very fast and definitely much cleaner.

I think his point was that having people look over the source code could only benefit the project, but since, as you mentioned, the code is owned by your company, there is not much that can be done here.
In any case, keep up the good work - GapiDraw is already the best game API for Pocket PC devices.
warmi
pm Insider
 
Posts: 518
Joined: Aug 24, 2002 @ 8:07am
Location: Chicago USA


GapiDraw

Postby Sm!rk » Mar 6, 2003 @ 11:07pm

Sm!rk
pm Member
 
Posts: 172
Joined: Dec 16, 2002 @ 4:40pm


Postby Johan » Apr 22, 2003 @ 3:03pm

Just thought I should mention that I haven't given up on this topic yet...

Performance right now: 262 sprites each frame (on the IPAQ)... And I still have a few tricks up my sleeve.. Everything is still 100% C++. The performance difference to Walter's ASM code is less than 11%. GapiDraw also maintains this performance increase with all flags such as opacity included.



Fast Blit

Postby warmi » Apr 22, 2003 @ 3:12pm

Hehe.

Well, I am not going to stand still - I am working on creating an RLE-encoded version. :-)

Basically, skipping over transparent areas is currently done, in the best possible case, two pixels at a time (an entire register of data is loaded and compared with the value of two transparent pixels),
but with RLE that could be extended to any number of pixels. On some sprites that would result in superior performance.
But I am sure you are working on that as well :-)
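
To illustrate the RLE idea described above, here is a minimal sketch in C++. It is not warmi's or GapiDraw's code: the `Run` and `RleRow` types and the `EncodeRow` / `BlitRow` functions are hypothetical names, and it assumes 16-bit pixels with a single color key. Each row is stored as (skip, count) runs, so a transparent gap of any length costs one pointer add instead of a per-pixel test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical run record: skip `skip` transparent pixels, then copy `count` opaque ones.
struct Run {
    uint16_t skip;
    uint16_t count;
};

// One encoded sprite row: the runs plus the opaque pixels only, in order.
struct RleRow {
    std::vector<Run> runs;
    std::vector<uint16_t> pixels;
};

// Encode one row of 16-bit pixels, treating `colorKey` as transparent.
RleRow EncodeRow(const uint16_t* src, size_t width, uint16_t colorKey) {
    assert(width <= 0xFFFF);  // run lengths are stored in 16 bits
    RleRow row;
    size_t x = 0;
    while (x < width) {
        size_t start = x;
        while (x < width && src[x] == colorKey) ++x;  // measure transparent gap
        const size_t skip = x - start;
        start = x;
        while (x < width && src[x] != colorKey) ++x;  // measure opaque span
        const size_t count = x - start;
        if (count == 0) break;  // trailing transparent pixels need no run
        row.runs.push_back({static_cast<uint16_t>(skip), static_cast<uint16_t>(count)});
        row.pixels.insert(row.pixels.end(), src + start, src + x);
    }
    return row;
}

// Draw an encoded row onto `dst`; transparent gaps are skipped without any test.
void BlitRow(const RleRow& row, uint16_t* dst) {
    const uint16_t* p = row.pixels.data();
    for (const Run& r : row.runs) {
        dst += r.skip;
        std::memcpy(dst, p, r.count * sizeof(uint16_t));
        dst += r.count;
        p += r.count;
    }
}
```

The encoding cost is paid once when the sprite is loaded, so it only pays off for sprites that are drawn many times and have long transparent stretches.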


Postby Johan » Apr 22, 2003 @ 3:24pm

;)

I sure am.. :) My first implementation was an ugly one, however, that only compacted using "for each row, create RLE-strip from left to right". Adding display rotations to the equation resulted in non-cache-optimized writes.. I am now rewriting the compression code for the RLE strips to always go in the same direction as the display.. Should be interesting...

I think I'll create a small GUI overlay to use as a demo application.. I'll post it later this week so we can both try out our code with it (I guess you'll continue to stick with ASM, and the performance difference could be cool to evaluate)...


warmi

Postby warmi » Apr 22, 2003 @ 5:02pm

Actually, I am only using my ASM on another platform.
I continue to stick with GapiDraw for the PocketPC code - I have an entire framework built on top of GapiDraw, and I don't have the code to handle all the rotations, workarounds for GAPI bugs, etc.
So, in the end, it is easier for me to use GapiDraw.


Postby Johan » Apr 23, 2003 @ 12:35am

warmi: HAH! 270 sprites.. :D In plain C++ (well, template metaprogramming C++, but almost)!! And that includes all the verifications etc. I do for each coordinate + more... I would say that the performance improvement is up to 40% (if both surfaces are aligned) compared to the previous code.

Right now I also have three subloops:
(1) Both surfaces are aligned, or both are unaligned
Read source as DWORD. Check source + each pixel for mask. Copy one pixel as WORD or both pixels as DWORD, depending on mask.
NOTE: If opacity is enabled and both pixels should be copied, a really fast 32-bit 3-multiplications-for-two-pixels alpha blend will be performed (or a quick 50/50 if opacity is set to 128).
(2) Destination is unaligned
Read source as DWORD. Check source + each pixel for mask. Copy each pixel as WORD.
(3) Source is unaligned
Read source as WORD. Check each pixel for mask. Copy each pixel as WORD. Write destination as DWORD.
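
As a rough illustration of subloop (1), the following C++ sketch reads two 16-bit pixels as one 32-bit value, tests them against the mask color, and writes either both pixels at once or a single pixel. This is only a sketch of the technique as described in the post, not the actual GapiDraw source; the function name `KeyedCopyAligned` and its parameters are made up for the example, and opacity handling is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `pairs` pixel pairs (2 * pairs pixels) from src to dst, skipping
// pixels equal to `key`. The 32-bit accesses go through memcpy, which
// compilers typically turn into a single load/store when the pointers are
// 4-byte aligned; the real fast path would rely on that alignment.
void KeyedCopyAligned(const uint16_t* src, uint16_t* dst, size_t pairs, uint16_t key) {
    assert(src != nullptr && dst != nullptr);
    // The key repeated in both halves, so the comparison is endian-independent.
    const uint32_t key2 = (static_cast<uint32_t>(key) << 16) | key;
    for (size_t i = 0; i < pairs; ++i, src += 2, dst += 2) {
        uint32_t two;
        std::memcpy(&two, src, sizeof two);      // read source as DWORD
        if (two == key2) continue;               // both pixels transparent
        const uint16_t p0 = src[0];
        const uint16_t p1 = src[1];
        if (p0 != key && p1 != key) {
            std::memcpy(dst, &two, sizeof two);  // copy both pixels as DWORD
        } else if (p0 != key) {
            dst[0] = p0;                         // copy first pixel as WORD
        } else {
            dst[1] = p1;                         // copy second pixel as WORD
        }
    }
}
```

Subloops (2) and (3) would follow the same pattern but fall back to 16-bit accesses on whichever side is not 4-byte aligned.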

The performance is around 7% slower than the optimized ASM routines posted earlier. Right now I am kind of satisfied with the performance. I have boosted the AlphaBltFast as well, and will do some benchmarks tomorrow...
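
For the quick 50/50 case mentioned in the note under subloop (1), one widely used trick blends two packed pixels with no multiplications at all. The sketch below assumes RGB565 pixels (5 bits red, 6 green, 5 blue) packed two per 32-bit word; whether GapiDraw uses exactly this form is an assumption, and the name `Blend5050x2` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Averages two pairs of RGB565 pixels, each pair packed into one 32-bit word.
// The mask clears the lowest bit of every color channel in both pixels, so the
// shift cannot leak a bit into the neighboring channel and the add cannot carry.
uint32_t Blend5050x2(uint32_t a, uint32_t b) {
    const uint32_t mask = 0xF7DEF7DEu;  // 0xF7DE = 11110 111110 11110 per pixel
    return ((a & mask) >> 1) + ((b & mask) >> 1);
}
```

The result is rounded down per channel, so blending white with black gives 0x7BEF per pixel rather than an exact midpoint.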


