Author Topic: Md 68k and hu6280 comparison (Read 10264 times)

soop · « **Reply #90 on:** May 31, 2013, 02:43:55 AM »

Gah, this is turning from a really interesting thread into a dick measuring contest.

EvilEvoIX, please shut up, you're embarrassing yourself. This is like a child walking into a lab where two geneticists are comparing the genetics of rabbits and hares, and declaring "I like bunnies better because they're softer".

NOT THE POINT.

nodtveidt · « **Reply #91 on:** May 31, 2013, 03:28:34 AM »

EvilEvoIX, you don't know your ass from a hole in the ground. Here's a hint... one talks shit, the other is dug with a shovel. Come back when you have some actual coding and/or hardware experience, eh? In the meantime, thanks for the laughs, but I'm afraid you're just too stupid for this thread.

Tatsujin · « **Reply #92 on:** May 31, 2013, 03:36:21 AM »

Tatsujin · « **Reply #93 on:** May 31, 2013, 03:40:54 AM »

Opethian · « **Reply #94 on:** May 31, 2013, 03:46:22 AM »

all fart and no shit....

Arkhan · « **Reply #95 on:** May 31, 2013, 04:21:02 AM »

Quote from: EvilEvoIX on May 30, 2013, 06:02:29 PM

Read my quote above, some good info in there from someone smarter then us

That post was from me, who you previously disagreed with and insulted in this thread.

So, maybe you are bad at reading.

The whole thing boils down to this: the 6280 is a worthy competitor that is on par with or better than the 68k (outside of lots of 16-bit operations), but it requires a bit more finesse to get it to perform that way.

Also, the truth of the matter is, we are discussing the MD's CPU. It's a nice 68000 in the sense that it's a 68000, but.. it's no 68060. That's for sure.

Bonknuts · « **Reply #96 on:** May 31, 2013, 06:15:09 AM »

Quote

The notion that the PCE CPU was entirely new was misleading as it is a revised version of the 65SC02 which was used in many other devices and computers. It was originally designed to compete against the Z80.

I'll give you props for using 65SC02 (also known as R65C02S and 65C02S). That was the last revision of the CPU (the one WDC was working on, the 65CE02, was abandoned), and it was not done by WDC. Rockwell made a custom core with new instructions. It wasn't until later that WDC incorporated those instructions into the 65C02 core. The 6280 is a branch of what was, at the time, an unofficial 65x core. The 6280 shares an interesting trait with the Mitsubishi e740 (a heavily customized 65x core): the T flag. The T flag is much more usable on the e740, though, since it doesn't get cleared after every instruction. The 6280 wasn't meant to compete with the Z80; the 2 MHz 65C02 (not even the Rockwell version needed) competed just fine. And the 6809 even more so. I think by the time the Z80 got into the 7-8 MHz range, WDC was already moving on to the '816.

Anyway, do you understand what it means to 'revise' a hardwired CPU? The original 6502 design was fast because everything was hardwired by hand. It was also fast and cheap because it didn't have microcoded instructions (the Z80 and 68k did). That means fewer transistors and faster instructions (not unlike the early RISC processors). Simply 'revising' the CPU means adding new instructions by *hand*, and that takes an engineer capable in this field (processors, specifically). And I'm completely disregarding the sound IC that's on the same CPU chip; that's just a consolidation into a single package to save on cost (not the same thing as adding new support and instructions). A lot was added to the R65C02S core to make the 6280 (including slightly more relaxed bus timings). That's a hell of a lot more expensive than simply using an off-the-shelf processor (they also had to have these fabricated, which means more cost as well).

I do know the 68000 wasn't cheap in the '80s. And Motorola didn't provide a freely licensable core until some time in the '90s (for embedding into packages). The cost of the 6280 (they had to license the original core, hire an engineer/team to build this thing, and also fabricate it) was probably close to, if not on par with, the cost of the 68k. If you put a 68k in the PCE, nothing would really change. And I've already stated that a 68k would actually be slower for hsync effects due to its interrupt latency. What you get with a 6280 is a very fast processor at 7.16 MHz (no other 65x ran that fast at the time) that put it in the league of higher-end machines (arcades/computers), but with an instruction set that was familiar to anyone who coded on the 65x, mainly the dominating Famicom at the time. Developers could port game code very easily, both forwards and backwards. Or just transition to the new platform with little downtime. The power of the 16-bit arcade systems, the familiarity of the older, well-known 65x.

Quote from: touko on May 30, 2013, 09:16:51 PM

Code: [Select]
movem.l (a0)+,d0-d7/a2-a6    / 124
movem.l d0-d7/a2-a6,-(a0)    / 120
add.l #56, %a0               / 14
258 cycles for 56 bytes, an average of 4.6 cycles / byte...
or
move.l (a0)+,(a1)+           / 20
you take 20 cycles for 4 bytes, an average of 5 cycles / byte, a transfer rate of 1.53 MB/s.
It takes these values, regardless of the number to be copied, and makes a simple multiplication .

A jump table with a code list of "move.l (a0)+,(a1)+" would be the most practical of all of them. Though you could do the same with the one I provided for 4.62 cycles per byte as well, with a little more bloat to it (jump table and code list instead of a loop). Yeah, I don't see him getting a realistic 4.3 or 4.4 cycles per byte transfer, unless they're fills.
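For anyone who wants to check the arithmetic in the quote above, here is a small Python sketch. It takes the quoted cycle counts and the 56-byte figure at face value (nothing is re-derived from the 68000 timing tables), and it assumes a 7.67 MHz clock for the MD's 68000, which the quote does not state. The function names are made up for illustration.

```python
# Assumed MD 68000 clock; the quoted post does not state it.
MD_CLOCK_HZ = 7_670_000


def cycles_per_byte(cycles, nbytes):
    """Average cost of moving one byte."""
    return cycles / nbytes


def bytes_per_second(cycles, nbytes, clock_hz=MD_CLOCK_HZ):
    """Sustained transfer rate if the sequence is repeated back to back."""
    return clock_hz * nbytes / cycles


# movem load + movem store + pointer adjust, as quoted.
movem_cycles = 124 + 120 + 14
movem_cpb = cycles_per_byte(movem_cycles, 56)

# One move.l (a0)+,(a1)+ copies 4 bytes in 20 cycles, as quoted.
move_cpb = cycles_per_byte(20, 4)
move_rate = bytes_per_second(20, 4)

print(f"movem:  {movem_cycles} cycles, {movem_cpb:.2f} cycles/byte")
print(f"move.l: {move_cpb:.2f} cycles/byte, {move_rate / 1e6:.2f} MB/s")
```

Under those assumptions the movem sequence comes out at roughly 4.6 cycles per byte and the unrolled move.l list at 5 cycles per byte, about 1.53 MB/s, which matches the figures in the quote.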

I had done an object-to-object collision detection routine that I wrote for both the 68k and the 6280 (I was working on porting my code to the 68k MD at the time). The routine wasn't anything fancy (it was a simple x1,x2,y1,y2 compare check against another object). I had an object list that I would parse; if the object was active, I'd jump to the collision routine. The 68k and 6280 routines themselves were about the same (the 6280 was either a few cycles faster or a few cycles slower), but the 68k one ended up being slower overall because of the JSR/RTS overhead. I never understood why those two instructions were so slow. It's not like anything gets pushed or popped from the stack by these instructions.

Another interesting fact, though it doesn't have any real bearing on speed, was the MIPS of both processors. Not the max capable MIPS, but what games were actually doing per 1/60-second frame. Exophase told me that his emulator calculated about ~1.8 MIPS for the average PCE game, and that his friend who was doing an MD emulator said games tended to be about ~0.75 MIPS. That means the average instruction time for PCE games was about 3.9 cycles (a little bit lower than I had predicted) and the average instruction time on the MD was 10.22 cycles. Just a reminder of how different these two processor architectures are.
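The conversion from MIPS to average cycles per instruction is just clock rate divided by instruction rate. A rough Python sketch follows, using the 7.16 MHz PCE clock mentioned earlier in the thread and assuming 7.67 MHz for the MD (the post doesn't give the MD clock, but that value reproduces the 10.22 figure). The function name is hypothetical.

```python
PCE_CLOCK_HZ = 7_160_000  # 6280 clock, as stated earlier in the thread
MD_CLOCK_HZ = 7_670_000   # assumed MD 68000 clock; not stated in the post


def avg_cycles_per_instruction(clock_hz, mips):
    """Clock cycles per second divided by instructions per second."""
    return clock_hz / (mips * 1_000_000)


pce_cpi = avg_cycles_per_instruction(PCE_CLOCK_HZ, 1.8)
md_cpi = avg_cycles_per_instruction(MD_CLOCK_HZ, 0.75)

print(f"PCE: {pce_cpi:.2f} cycles per instruction")
print(f"MD:  {md_cpi:.2f} cycles per instruction")
```

With those inputs the PCE lands just under 4 cycles per instruction and the MD a little over 10, in line with the numbers above. Both MIPS values are approximate ("~"), so the results are ballpark figures only.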

spenoza · « **Reply #97 on:** May 31, 2013, 06:30:04 AM »

Quote from: touko on May 30, 2013, 09:48:01 PM

Look at this :

yes, this is on the SNES with its "crappy CPU"; when this CPU is programmed by a master of the 65xx, there is no slowdown, a lot of sprites on screen,
lots of action, no sprite flicker...
Technically, this shooter is better than any MD one.
Can you imagine what he would do with a CPU clocked at 7 MHz?
You can also see his last game on the C64, Enforcer:

Only with a 6510 @ 0.9 MHz, yes, less THAN 1 MHZ...

Hrm... That programmer loves putting lots of asteroids in his shooters : ) While both examples you list are technically impressive, from a fun standpoint I'm not really keen on shooting tons of asteroids for an entire level.

And what's really funny about both examples, and most of the back and forth here, is that while the CPU is handling collision code, the actual blitting of BG and sprite objects is based on the VDP hardware and has a lot less to do with the CPU. Look at the Neo Geo vs the MD. Both are driven by the M68k CPU, but the Neo Geo had an insane amount of graphics and audio hardware in it. What made the Neo Geo special was not the CPU but all that custom hardware. Its ability to push sprites around and its flexible, if odd, audio setup were what defined the system, not the CPU.

Evo:
It's like Tom has said repeatedly throughout the thread, the M68k is a "modern" CPU design. It wasn't necessarily valued for speed so much as ease of development and general robustness. The arcade boards that used it were successful for the same reason the Neo Geo was successful: they had killer video and audio capabilities. The CPU just sat in the background doing collision detection and keeping the show running. It was the video and audio hardware that pushed everything around on-screen and kept the tunes jamming. I imagine the M68k was as popular as it was because the clock speed scaled up better than some of the older CPU designs and because they could hack out some C programming for the main loop and collision detection and not have to get their hands dirty with tons of assembly code optimization.

If anyone wants to claim the M68k is a better CPU than the 6280, go ahead, but do it for the right reasons. The M68k is easy to develop for, is good at multiplication, has good support for the kinds of features an OS-based system needs, and supports higher clock speeds than previous chips. But when you get down to the metal and start comparing memory access latency, the clock cycles needed to perform core integer maths, etc... the older 8-bit CPUs hold up really well. Their biggest flaw was simply that they could be a real pain to get good performance from. The 6280 is a very capable CPU, but it takes more knowledge and more work to extract performance from it. The M68k has lots of low-hanging fruit, programming-wise. The 6280 makes you work for the performance. I would contend that that's a strong argument for the M68k being a "better" CPU, for some definitions of better, but when you look at some of the more striking examples of good code, it is clear that the 6280 is no performance slouch and can hold its own.

Tom has already pointed out that M68k code doesn't really optimize dramatically, because C code is already pretty optimal. The 6280 has lots of tricks and weirdness that can be exploited for speed. So crap code by novice programmers (like I hope to be someday) is probably going to run better on the M68k, but optimized code by experienced programmers will probably look much more similar from a performance profile. And that's what this discussion is about. It's not about whether the Genesis or the PCE is a superior platform. This thread is about theoretical optimized performance levels between two specific CPUs in the hands of experienced and capable programmers, which you (Evo) are not. I also am not, but because I'm a CPU/hardware wonk I can interpret, minimally, the discussion going on in here.

Tom, if I've summarized your statements incorrectly, please feel free to rip me a new one. Anyone else can sit on a tack : ) I trust Tom not to be an ass when correcting me.

Bonknuts · « **Reply #98 on:** May 31, 2013, 06:35:34 AM »

Quote from: spenoza on May 31, 2013, 06:30:04 AM

Tom, if I've summarized your statements incorrectly, please feel free to rip me a new one. Anyone else can sit on a tack : ) I trust Tom not to be an ass when correcting me.

Dude, you were spot on.

Tatsujin · « **Reply #99 on:** May 31, 2013, 06:59:53 AM »

Lol, I have R2. It's a technically amazing game, especially considering all the other stuff done on the SFC so far. Manfred Trenz is a coder legend.

Arkhan · « **Reply #100 on:** May 31, 2013, 07:10:13 AM »

To be fair, Tom's not the only one who has said that stuff, repeatedly. lol.

spenoza · « **Reply #101 on:** May 31, 2013, 07:19:55 AM »

Quote from: Arkhan on May 31, 2013, 07:10:13 AM

To be fair, Tom's not the only one who has said that stuff, repeatedly. lol.

True, you said some as well, and Touko, and Old Man, but Tom presented in the manner best structured and most easily digested by my brain.

Also, I wanted to have some specific reference points, and he's posted the most info in this thread.

Sorry if you guys feel dissed. I didn't mean to discount your contributions or knowledge. I just can parse Tom's stuff better mentally.

Arkhan · « **Reply #102 on:** May 31, 2013, 07:21:07 AM »

Right, but the question is, can the person the stuff is really directed at (Evo)?

Bonknuts · « **Reply #103 on:** May 31, 2013, 08:01:21 AM »

This is from a discussion between Steve Snake, me, Chilly Willy, and Exophase. We were pretty much putting popular claims to the test.

This was a segment of example code Steve Snake wrote (note: before he made the famous Kega/Fusion emulator, he was a programmer for the MD and other platforms). It's a velocity update routine for an object (both X and Y directions):

Code: [Select]

68k:
4   lea address.w,a0    ;8/12
2   bsr                 ;18

2   move.l (a0)+,d0     ;12
2   add.l  d0,(a0)      ;20
2   move.l (a1)+,d0     ;12
2   add.l  d0,(a1)      ;20
2   rts                 ;18. 64+36=100+8=108(112)
16/7

Code: [Select]

  ;6280 object

2   ldx #$xx              ;2
3   jsr AddVelocity       ;7        

AddVelocity:    
3   lda x_float,x         ;5
1   clc                   ;2
3   adc <x_float_inc,x    ;4
3   sta x_float,x         ;5
3   lda x_whole.l,x       ;5
3   adc <x_whole_inc,x    ;4
3   sta x_whole.l,x       ;5
3   lda x_whole.h,x       ;5
2   adc #$00              ;2
3   sta x_whole.h,x       ;5 = 42
    
3   lda y_float,x         ;5
3   adc <y_float_inc,x    ;4
3   sta y_float,x         ;5
3   lda y_whole.l,x       ;5
3   adc <y_whole_inc,x    ;4
3   sta y_whole.l,x       ;5
3   lda y_whole.h,x       ;5
2   adc #$00              ;2
3   sta y_whole.h,x       ;5 = 40
    
1   rts                   ;7

62/22   
                        ; 82+14 = 96+2=98 (102)
                        
                        16.8 + 8.8 -> 16.8

These examples were trying to be in a game-logic context, but the prep part is actually unrealistic. I wouldn't be loading an immediate for X; it'd come from an object table (maybe adding 10 cycles or so more; the 68k one would be more than 10 cycles for the same). But I did that because his (Steve Snake's) fixed address for loading into A0 was a bit unrealistic as well (using LEA abs,a0 is basically a faster way to load an immediate into an address register than using move).

The 68k one is 108 cycles and the 6280 one is 98 cycles. While this isn't a straight apples-to-apples comparison, relative to what needs to be accomplished I think they are directly comparable. The difference between the two is this: the 68k is using signed numbers (so you don't need to have four sets of routines), while the PCE version uses unsigned numbers and needs a jump table depending on which of the four directions the object is moving. The 68k one is using 32-bit math; 16:16 fixed point. So 16:16 + 16:16 -> 32-bit. I consider this complete overkill. One, a whole-number part larger than 8 bits means you might not even see the object move on screen (it could skip the screen entirely if aligned right); it's not needed. Two, 1/65536 of a pixel of movement is overkill to me. Hell, even 1/256 is a little bit overkill. But... it's done for reasons of convenience and speed.
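To make the totals easy to re-check, here is a quick tally in Python using the per-instruction cycle counts from the two listings above. The clock figures used for the wall-clock estimate are assumptions on my part (7.67 MHz for the MD's 68000, 7.16 MHz for the 6280), and the variable names are just for illustration.

```python
# Per-instruction cycle counts copied from the two listings above.
M68K_CYCLES = [8, 18, 12, 20, 12, 20, 18]       # lea, bsr, 2x(move.l, add.l), rts
HU6280_SETUP = [2, 7]                            # ldx #imm, jsr
HU6280_X = [5, 2, 4, 5, 5, 4, 5, 5, 2, 5]        # X update, including clc
HU6280_Y = [5, 4, 5, 5, 4, 5, 5, 2, 5]           # Y update
HU6280_RTS = [7]

m68k_total = sum(M68K_CYCLES)
hu6280_total = sum(HU6280_SETUP + HU6280_X + HU6280_Y + HU6280_RTS)

# Assumed clocks, used only to turn cycles into microseconds.
MD_CLOCK_MHZ = 7.67
PCE_CLOCK_MHZ = 7.16

m68k_us = m68k_total / MD_CLOCK_MHZ
hu6280_us = hu6280_total / PCE_CLOCK_MHZ

print(f"68k:  {m68k_total} cycles, about {m68k_us:.1f} us")
print(f"6280: {hu6280_total} cycles, about {hu6280_us:.1f} us")
```

Under those clock assumptions the two routines land within about half a microsecond of each other, so the cycle difference roughly carries over to real time.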

So the 6280 one has a 16:8 (24bit) fixed point position for both X and Y. The scalar/speed is 8:8 (16bit). 24bit + 16bit -> 24bit. If I did a straight 32bit conversion of his code, then it'd be slower on the 6280. So it's adapted to what is needed, since the original is overkill. You could technically do an 8:8 fixed point position for X and Y (say for a clipped horizontal shootie or a vertical shootie) and speed it up, but I wanted a more realistic conversion of his code.
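Here is a minimal Python sketch of that 16:8 + 8:8 -> 16:8 add, done one byte at a time with a carry so it mirrors the lda/adc/sta chain in the 6280 listing. It is an illustration only, not the original routine: the function names are hypothetical, and it handles a single axis with unsigned values.

```python
def add_velocity(frac, lo, hi, inc_frac, inc_whole):
    """Add an 8:8 velocity to a 16:8 position, byte by byte with carry."""
    total = frac + inc_frac            # clc / adc on the fraction byte
    frac, carry = total & 0xFF, total >> 8
    total = lo + inc_whole + carry     # adc on the low whole byte
    lo, carry = total & 0xFF, total >> 8
    hi = (hi + carry) & 0xFF           # adc #$00 on the high whole byte
    return frac, lo, hi


def to_pixels(frac, lo, hi):
    """Convert the three position bytes to a pixel coordinate."""
    return ((hi << 16) | (lo << 8) | frac) / 256


# Position 100.5 px plus velocity 1.75 px per frame.
moved = add_velocity(0x80, 0x64, 0x00, 0xC0, 0x01)
print(to_pixels(*moved))

# Carry rippling all the way into the high byte.
wrapped = add_velocity(0xFF, 0xFF, 0x00, 0x01, 0x00)
print(to_pixels(*wrapped))
```

The three position bytes correspond to the split tables in the listing (x_float, x_whole.l, x_whole.h), and the two velocity bytes to x_float_inc and x_whole_inc.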

For reference, the '816 version was 80 cycles (not the SNES CPU version; since that has wait states on RAM, it'd be closer to ~90 cycles) and used full 32-bit variables like the 68k version (it was faster to do it that way on the '816 because of the lack of byte access opcodes).

Edit: There's nothing fancy or clever about my code, either. Sure, it uses split tables - but that's a given for any 65x array access that's larger than one byte width. No voodoo code there.

EvilEvoIX · « **Reply #104 on:** May 31, 2013, 08:23:36 AM »

Quote from: TheOldMan on May 30, 2013, 09:40:42 PM

Presumably, a faster clock speed would allow the chip to do more - but that's not always true (anyone else own a 486?)
Given equal clock speeds, I think the 6502 core would outperform the 68000, based on the cycle counts Tom posted.

You think or you know? Ask your wife again, maybe she knows. The issue once again you brilliantly danced around is that toe to toe, head to head, the M68K handles more than the hu6280. No need to get upset or bring your wife into this, just saying is all…

Quote from: touko on May 30, 2013, 09:48:01 PM

Quote from: EvilEvoIX on May 30, 2013, 09:34:23 PM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Must be one of those vile Amiga Fan Bois tarting up the 68K again....

what's your problem , the fact than a 8 bit processor, can compete with 68k ??
There is no troll on it, it's only the reality of code ..

We just took the case of game consoles, not in general use .

Look at this : http://youtu.be/2liTcrxOESA
yes, this is on the SNES with its "crappy CPU"; when this CPU is programmed by a master of the 65xx, there is no slowdown, a lot of sprites on screen,
lots of action, no sprite flicker...
Technically, this shooter is better than any MD one.
Can you imagine what he would do with a CPU clocked at 7 MHz?
You can also see his last game on the C64, Enforcer:

Only with a 6510 @ 0.9 MHz, yes, less THAN 1 MHZ...

I got no problem with you, some people here think the M68K was over rated and some called this an act by Amiga lovers. I know the HU6280 can compete but as you posted earlier the 68K gets more things done doing 16bit operations, it’s just what it does.

Bonknuts seems to think you start propaganda since you like the Amiga, he has a real problem with you guys.

Quote from: Black Tiger on May 31, 2013, 01:50:42 AM

The only difference between you and EvilEvoIX in discussions like this is that he is championing the MD instead of the SNES and is much more polite.

I'm going to go ahead and take this as a compliment in a sea of insults. Again, I am not an MD fan boi but more of an M68K fan boi. The only reason I have so much Sega stuff is that it is by far the cheapest to collect for. Games are like a dollar loose, and the most I paid for one in the box is like $15. Turbo and Neo stuff I pay the most for by far.

Quote from: soop on May 31, 2013, 02:43:55 AM

Gah, this is turning from a really interesting thread into a dick measuring contest.

EvilEvoIX, please shut up, you're embarrassing yourself. This is like a child walking into a lab where two geneticists are comparing the genetics of rabbits and hares, and declaring "I like bunnies better because they're softer".

NOT THE POINT.

Explain how, when the title of this thread is "Md 68k and hu6280 comparison" and I post "The M68K is faster and gets more done in less time", that is anything but the truth. Truth be told, you all put an MD face on this, but I coulda brought up the Sega CD or Neo Geo. The thread only asks about two specific chips and everyone assumed PCE vs MD. Granted, I go into rants about how a LOT of PCE games (to me, for the love of god, to me only, in my opinion, your results may vary) look 8-bit-ish...

That said I just dumped some serious cash in the system and play it daily so who's to complain am I right?
