
What's wrong with XScales after all?

PostPosted: Nov 11, 2002 @ 7:22pm
by ToTTenTranz
Why do the 400MHz of the PXA250 seem so disappointing?!
How is it possible that a CPU running at 400MHz with MMX and SSE extensions actually ends up slower than one running at 206MHz?


Is it only a software flaw, like the lack of optimization for that cpu, or are there also hardware issues?

Even without much L1 cache or a floating point unit, the XScale #should# at least have the performance of a Celeron at 300MHz, right?

But a Celeron at that speed is perfectly able to play Tomb Raider and Quake 1 at decent (playable) framerates, emulate Game Boy without slowdowns, play DivX movies at 240*320 without glitches, emulate GBA, SNES, Genesis etc... none of which my Pocket LOOX is capable of!

So what's wrong with XScales? Why aren't they twice as fast as StrongARMs? They should be even faster than that on multimedia applications because of the added extensions!


Another question:
Is there any hope yet? I mean, will Microsoft release a new version of PocketPC or GAPI that takes full advantage of the XScale architecture?

Or will the XScale be no more than a StrongARM with less power consumption?

PostPosted: Nov 11, 2002 @ 9:06pm
by James S
The main issue with XScale is lack of OS optimization. We could see nearly a 50% increase in speed over 206MHz if optimizations were made.

But XScale does not have MMX or SSE extensions, at least I'm pretty sure it doesn't. I don't know where you heard that... I do recall Intel saying something about adding MMX extensions to a new crop of portable processors, but that's a later generation, not ARMv4 or ARMv5TE (StrongARM and XScale, respectively).

There is a small hardware problem. Memory writes and moves are VERY slow, about a third of the speed of the older ARM processor. This can be worked around somewhat, but it is a major issue. Even with this, however, there is the possibility for XScale to run faster than it currently does, and slightly faster than its older brother StrongARM.

Software optimizations are the major problem, though. The current batch of emulators are just so unoptimized that it's not even funny. They could easily, with optimizations, run on the SH3 processor at 133MHz. A lot of the things that are emulated run on ARM processors natively so it should be quite easy to make a lot of these emulators.
But ARM processors are very slow at floating point operations, as is every mobile processor. Floating point coprocessors use a lot of power and aren't really needed for average PDA use, so there's no point in having them.

The 400MHz XScale will never be twice as fast as a StrongARM at 206MHz. But it could potentially be faster, with OS optimizations and better hardware design. Microsoft has no plans to support XScale with PocketPC2002, and no plans for an update. The WindowsCE.NET OS has support for XScale, but that OS isn't available in PocketPC format and there is no known timetable for its release.
The XScale was designed with power consumption in mind, and performance was simply a consequence of this. The XScale IS the best mobile processor because of how well it works at low power. The XScale IS NOT a performance chip. It fills a gap that desperately needed filling. Its performance is by no means lackluster, either, as it beats the 206MHz StrongARM processor in nearly every category.

PostPosted: Nov 11, 2002 @ 9:43pm
by R0B

PostPosted: Nov 11, 2002 @ 10:15pm
by Dan East

PostPosted: Nov 11, 2002 @ 10:15pm
by ToTTenTranz

PostPosted: Nov 11, 2002 @ 10:25pm
by James S

PostPosted: Nov 11, 2002 @ 10:32pm
by ToTTenTranz

PostPosted: Nov 11, 2002 @ 10:33pm
by R0B

PostPosted: Nov 12, 2002 @ 12:28am
by James S

PostPosted: Nov 13, 2002 @ 12:01am
by Deje
Dan, I am completely with you.
I don't believe it's just a software issue. I am sure the OS is involved, but it does not play the major role.

Anyway, look for a similar discussion.

PostPosted: Nov 13, 2002 @ 12:05am
by sponge
If I'm not mistaken, doesn't the XScale perform poorly on multi-word loads and stores in non-optimized apps?

PostPosted: Nov 13, 2002 @ 12:17am
by Deje

PostPosted: Nov 13, 2002 @ 12:47am
by Quantum

PostPosted: Nov 13, 2002 @ 3:28am
by Digby
Guys, I've done a lot of work in narrowing this down. There is something broken in the current rev of XScale chips with regard to memory access and I've written the benchmarks that prove it.

No OS mods, app tweaking, or new ARM5 compiler is going to fix this unless by chance you end up performing fewer reads/writes to memory.

The 16-bit data path that people keep babbling about with the Toshiba e740 is the memory bus width to read/write the onboard ATI Imageon's video memory. The path to system memory is 32 bits, just as it is on the SA1110.
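
For anyone who wants to check the memory access claim on their own device, here is a minimal sketch of that kind of benchmark. It is not Digby's actual code, which isn't shown in this thread. The names (`fill_words`, `sum_words`, `run_membench`), the 4 MB buffer size and the pass count are all assumptions picked for illustration. The idea is just to time plain word writes, word reads and a block copy over a buffer much larger than the data cache, then compare the MB/s figures between a StrongARM and an XScale device.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Assumed sizes: the buffer must be far larger than the data cache so the
   loops exercise system memory rather than cache. */
#define BUF_WORDS (1u << 20)   /* 1M 32-bit words = 4 MB */
#define PASSES    8

/* Write `value` to every word. volatile keeps the stores from being elided. */
static uint32_t fill_words(volatile uint32_t *dst, size_t n, uint32_t value)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = value;
    return value;
}

/* Read every word and sum them so the loads cannot be optimized away. */
static uint32_t sum_words(const volatile uint32_t *src, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += src[i];
    return sum;
}

/* Convert a byte count and a clock() interval to MB/s (0 if too fast to time). */
static double mb_per_sec(size_t bytes, clock_t start, clock_t end)
{
    double secs = (double)(end - start) / CLOCKS_PER_SEC;
    return secs > 0.0 ? (bytes / (1024.0 * 1024.0)) / secs : 0.0;
}

/* Times write, read and copy passes; returns 0 on success, -1 on failure. */
int run_membench(void)
{
    size_t buf_bytes = (size_t)BUF_WORDS * sizeof(uint32_t);
    size_t total = buf_bytes * PASSES;
    uint32_t *a = malloc(buf_bytes);
    uint32_t *b = malloc(buf_bytes);
    volatile uint32_t sink = 0;   /* consumes results so work is not removed */
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return -1;
    }

    clock_t t0 = clock();
    for (uint32_t p = 0; p < PASSES; p++)
        fill_words(a, BUF_WORDS, p + 1);
    clock_t t1 = clock();
    for (uint32_t p = 0; p < PASSES; p++)
        sink += sum_words(a, BUF_WORDS);
    clock_t t2 = clock();
    for (uint32_t p = 0; p < PASSES; p++) {
        memcpy(b, a, buf_bytes);
        sink ^= b[BUF_WORDS - 1];
    }
    clock_t t3 = clock();

    printf("write: %.1f MB/s\n", mb_per_sec(total, t0, t1));
    printf("read:  %.1f MB/s\n", mb_per_sec(total, t1, t2));
    printf("copy:  %.1f MB/s\n", mb_per_sec(total, t2, t3));

    int ok = (memcmp(a, b, buf_bytes) == 0);   /* sanity check on the copy */
    free(a);
    free(b);
    return ok ? 0 : -1;
}

#ifdef MEMBENCH_STANDALONE
int main(void)
{
    return run_membench() == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
#endif
```

Build with `-DMEMBENCH_STANDALONE` to get a standalone program. `clock()` is coarse on many platforms, so raise `PASSES` if the numbers come out as 0. Only the ratio between two devices running the same binary says anything; the absolute figures depend on the compiler and the C library's `memcpy`.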

PostPosted: Nov 13, 2002 @ 3:55am
by sponge