 by Guest » Feb 11, 2003 @ 5:41pm
			
Yeah, precisely. In fact, I have three kinds of "subsections" for each blit: one for when destination and source are either both aligned or both unaligned (the fastest case), and two others for when only the source or only the destination is aligned.
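A rough sketch in C of how the three cases could be told apart, assuming 16-bit pixels and 32-bit word copies. The names `blit_case` and `pick_blit_case` are hypothetical and only illustrate the idea; they are not taken from the actual blit code.

```c
#include <stdint.h>

/* Hypothetical names for the three cases described above. */
enum blit_case {
    BLIT_SAME_PHASE,      /* both word-aligned, or both off by one pixel */
    BLIT_SOURCE_ALIGNED,  /* only the source is word-aligned */
    BLIT_DEST_ALIGNED     /* only the destination is word-aligned */
};

/* Assumption: 16-bit pixels, so a pixel pointer is always 2-byte aligned
   and bit 1 of the address tells whether it is also 4-byte aligned. */
static enum blit_case pick_blit_case(const void *src, const void *dst)
{
    unsigned src_off = (unsigned)((uintptr_t)src & 2u);
    unsigned dst_off = (unsigned)((uintptr_t)dst & 2u);

    if (src_off == dst_off)
        return BLIT_SAME_PHASE;
    return (src_off == 0) ? BLIT_SOURCE_ALIGNED : BLIT_DEST_ALIGNED;
}
```

When both pointers are off by the same amount, a single leading pixel copy brings them both onto a word boundary, which is presumably why that case can be treated like the fully aligned one.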
The inner loop does require some ORing and shifting. It can get ugly, as in the example below (an inner loop for a source-aligned, non-keyed blit), but in the end it is much faster.
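blitNormalSourceAligned_octcopy:
	ldmia	r1!, {r4-r11}		@ load 8 words (16 pixels) from the aligned source

	strh	r4, [r0], #2		@ store the first pixel; r0 is now word-aligned

	@ shift each word down by one pixel and pull in the low pixel of the
	@ next word (lr is used as a scratch register here)
	mov	lr, r5, lsl #16
	orr	r4, lr, r4, lsr #16
	mov	lr, r6, lsl #16
	orr	r5, lr, r5, lsr #16
	mov	lr, r7, lsl #16
	orr	r6, lr, r6, lsr #16
	mov	lr, r8, lsl #16
	orr	r7, lr, r7, lsr #16
	mov	lr, r9, lsl #16
	orr	r8, lr, r8, lsr #16
	mov	lr, r10, lsl #16
	orr	r9, lr, r9, lsr #16
	mov	lr, r11, lsl #16
	orr	r10, lr, r10, lsr #16

	stmia	r0!, {r4-r10}		@ store the 7 merged words (14 pixels)
	mov	r11, r11, lsr #16
	strh	r11, [r0], #2		@ store the last pixel
	subs	r3, r3, #1		@ one 16-pixel block done
	bne	blitNormalSourceAligned_octcopy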
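For readers who do not speak ARM assembly, the loop above can be modelled in C roughly like this. It is only a sketch, assuming a little-endian machine and 16-bit pixels; the function name `copy16_source_aligned` is made up for illustration, and a C compiler will not necessarily generate the same instructions.

```c
#include <stdint.h>
#include <string.h>

/* Models one pass of the loop: 16 pixels (8 words) go from a word-aligned
   source to a destination that is only halfword-aligned.
   Assumptions: little-endian byte order, 16-bit pixels. */
static void copy16_source_aligned(uint16_t *dst, const uint32_t *src)
{
    uint32_t w[8];
    uint32_t out[7];
    int i;

    memcpy(w, src, sizeof w);             /* ldmia r1!, {r4-r11} */

    dst[0] = (uint16_t)(w[0] & 0xFFFFu);  /* strh r4, [r0], #2 */

    for (i = 0; i < 7; i++)               /* the seven mov/orr pairs */
        out[i] = (w[i + 1] << 16) | (w[i] >> 16);

    memcpy(dst + 1, out, sizeof out);     /* stmia r0!, {r4-r10} */
    dst[15] = (uint16_t)(w[7] >> 16);     /* mov + strh r11, [r0], #2 */
}
```

Each merged word takes the high pixel of one source word and the low pixel of the next, so the destination ends up with the same 16 pixels in the same order, written as one halfword, seven words, and one more halfword.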
Right now I am cleaning up my code (basically translating it from GAS (GNU assembler) syntax to EVC++-style assembly), and then I will send you the example blits working with the current GAPI interface, sometime by the end of this week.