Author Topic: 68k and HuC6280 comparison  (Read 1068 times)

Bonknuts

68k and HuC6280 comparison
« on: June 03, 2013, 02:05:49 PM »
These two processor architectures are polar opposites, not to mention the 'bitness' labels attached to them (32-bit, 16-bit, and 8-bit). This thread is for a more detailed look at both CPUs, in relation to the MD and PCE, beyond general assumptions and popular beliefs. This is not a system vs. system comparison. If a CPU offers an advantage with the specific hardware it supports, then that's valid. Discussion and examples of similar CPUs that share either of the two architectures are also valid. And while game examples are valid, I'd rather they not be the primary source of comparison; use them more to reinforce a point. The primary discussion should focus on code-related examples, if possible.

 So, 68000 vs. HuC6280.


Please, no trolling. And if the trolls do come a knocking, please don't feed them. Best to ignore them.

touko

Re: 68k and HuC6280 comparison
« Reply #1 on: June 03, 2013, 08:15:41 PM »
One strange thing: 68k coders often don't seem to account for register initialization in their comparisons.
I know they are used to working with registers, but values do not appear by magic.
« Last Edit: June 03, 2013, 11:11:11 PM by touko »

Tatsujin

Re: 68k and HuC6280 comparison
« Reply #2 on: June 03, 2013, 09:17:15 PM »
Maybe a bit OT, but were there other CPUs that were often used during the 16-bit era?
In arcade hardware, for example.

touko

Re: 68k and HuC6280 comparison
« Reply #3 on: June 03, 2013, 10:49:17 PM »
I don't know if the ARM2 can be considered part of the 16-bit era!

soop

Re: 68k and HuC6280 comparison
« Reply #4 on: June 03, 2013, 11:01:40 PM »
Quote
One strange thing: 68k coders often don't seem to account for register initialization in their comparisons.
I know they are used to working with registers, but values do not appear by magic.

Isn't that address registers rather than data registers? Caveat: the only 68k I coded assembler for was the 68020, many years back.

touko

Re: 68k and HuC6280 comparison
« Reply #5 on: June 03, 2013, 11:15:35 PM »
Quote
Isn't that address registers rather than data registers? Caveat: the only 68k I coded assembler for was the 68020, many years back.
Same thing, because addresses must be loaded into a register before use.
I think that for this era, the big advantage of the 6280 over the 68k is really its 8-bit architecture and its fast RAM read/write.
With this you can easily speed up 16-bit operations by changing only the LSB or the MSB.

For example: adding 256 to a 16-bit variable.
On the 6280 you can do:
inc var + 1  ; 5 cycles max, 4 cycles min (if var is in zero page); more efficient than on the 68k.
In a game this case is very common.
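
To make the arithmetic behind that trick concrete, here is a small Python model (a sketch only, not 6280 code). It assumes a little-endian 16-bit variable stored in a byte array; the names `ram`, `VAR`, `inc_high_byte` and `read_word` are made up for the illustration. It says nothing about cycle counts, it only shows that bumping the byte at `var + 1` is the same as adding 256 to the word.

```python
def inc_high_byte(mem, addr):
    """Model `inc var + 1`: increment only the high byte of a little-endian word."""
    mem[addr + 1] = (mem[addr + 1] + 1) & 0xFF  # 8-bit wraparound, like INC


def read_word(mem, addr):
    """Read a 16-bit little-endian value (low byte first, as on the 65xx family)."""
    return mem[addr] | (mem[addr + 1] << 8)


ram = bytearray(8192)   # hypothetical 8k of work RAM
VAR = 0x10              # hypothetical zero-page offset of the variable

ram[VAR], ram[VAR + 1] = 0x34, 0x12   # var = $1234
inc_high_byte(ram, VAR)
print(hex(read_word(ram, VAR)))       # 0x1334, i.e. $1234 + 256
```

The low byte is never touched, which is why no carry handling is needed for this particular step size.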
« Last Edit: June 03, 2013, 11:23:18 PM by touko »

touko

Re: 68k and HuC6280 comparison
« Reply #6 on: June 04, 2013, 10:28:50 PM »
Another strange thing: why do so many 6280 instructions take more cycles than on the 65xx?
I don't understand why at least zero page accesses aren't the same  :-k

ccovell

Re: 68k and HuC6280 comparison
« Reply #7 on: June 04, 2013, 11:20:23 PM »
Possibly because 6280 Stack and ZP go through an extra layer of indirection by the memory mapper...?

touko

Re: 68k and HuC6280 comparison
« Reply #8 on: June 05, 2013, 12:41:29 AM »
Aaah, maybe.
Passing through the MMU may cause cycle penalties; that seems logical to me.

Chris, I know being a father is a difficult task, but would you be interested in doing a Bad Apple demo on the PCE?  :mrgreen:
« Last Edit: June 05, 2013, 12:45:27 AM by touko »

Punch

Re: 68k and HuC6280 comparison
« Reply #9 on: June 05, 2013, 03:27:22 PM »
Why not make something like HuVideo, but fullscreen instead?

By the way, how many tiles can you upload from ROM/Syscard RAM to the video processor's internal memory per VBlank? Is it theoretically possible to create a "system card" with special hardware that generates a fullscreen video frame in its RAM, ready to be uploaded?

Say there's a 3D processor or something that calculates matrices, vectors, etc. and generates a fullscreen render, to be uploaded to VRAM every frame... would fullscreen be possible, or only part of the screen, due to bulk transfer speeds (TIx instructions)?

Bonknuts

Re: 68k and HuC6280 comparison
« Reply #10 on: June 05, 2013, 04:38:03 PM »
Quote
Why not make something like HuVideo, but fullscreen instead?

By the way, how many tiles can you upload from ROM/Syscard RAM to the video processor's internal memory per VBlank? Is it theoretically possible to create a "system card" with special hardware that generates a fullscreen video frame in its RAM, ready to be uploaded?

Say there's a 3D processor or something that calculates matrices, vectors, etc. and generates a fullscreen render, to be uploaded to VRAM every frame... would fullscreen be possible, or only part of the screen, due to bulk transfer speeds (TIx instructions)?


 If you embed the graphics as ST1/ST2 instructions, that's the fastest method to transfer to VRAM. Normally, TIA is 6 cycles per byte, but since the VDC sits in the first 2k of the hardware bank's address space, it gets the +1 cycle penalty per memory access. (That's a mystery, because the VDC never asserts RDY for that access. It does assert /RDY, but that's for memory slot alignment during active display, and it only lasts a fraction of a CPU cycle; /RDY works in master clock steps, not CPU cycles.) So TIA to the VDC is 7 cycles a byte. ST1/ST2 is 5 cycles a byte (4 + 1 penalty = 5). There are ~119436 CPU cycles in a 1/60 s frame, so the "theoretical" max transfer is 23.887k per frame. Not enough to do 60fps FULL screen, but definitely enough to do 30fps full screen. Of course, you can compress, or save 5 cycles out of 10 in a word transfer if its LSB is the same as the LSB of the previous WORD (VDC latch trick: just use an ST2 and write only the MSB; the old LSB value is still kept in the latch). So technically, it's higher than 24k per frame. It all depends on redundant LSBs.
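
 The numbers above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the cycle figures are the ones quoted in this post, while the 256x224 4bpp screen size (tile data only, no BAT) and all function names are assumptions made for the illustration.

```python
CYCLES_PER_FRAME = 119436   # CPU cycles per 1/60 s frame, as quoted above
ST_CYCLES = 5               # ST1/ST2: 4 cycles + 1 penalty, one byte each
TIA_CYCLES = 7              # TIA: 6 cycles + 1 penalty per byte


def bytes_per_frame(cycles_per_byte, budget=CYCLES_PER_FRAME):
    """Upper bound on bytes moved if the CPU did nothing else all frame."""
    return budget // cycles_per_byte


def stream_cycles(words):
    """Cycles to send 16-bit words as ST1 (LSB) + ST2 (MSB).

    Models the latch trick: ST1 is skipped when the LSB equals the
    previous word's LSB, since the VDC still holds the old value.
    """
    total = 0
    prev_lsb = None
    for w in words:
        lsb = w & 0xFF
        if lsb != prev_lsb:
            total += ST_CYCLES      # ST1 writes a new LSB into the latch
        total += ST_CYCLES          # ST2 always writes the MSB
        prev_lsb = lsb
    return total


# Assumption: a 256x224 screen at 4 bits per pixel, tile data only.
FULL_SCREEN_BYTES = 256 * 224 * 4 // 8

print(bytes_per_frame(ST_CYCLES))    # 23887
print(bytes_per_frame(TIA_CYCLES))   # 17062
print(FULL_SCREEN_BYTES)             # 28672
print(stream_cycles([0x1200, 0x3400, 0x5601]))  # 25 instead of 30
```

 Under those assumptions one frame's worth of ST1/ST2 writes falls short of a full screen, while two frames' worth (30fps) covers it with room to spare.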

You could build hardware support that would basically insert those opcodes as every other byte (alternating between them). Though I've done it without hardware. I made a transparency demo that uses them: http://www.pcedev.net/demos/transparency/test0.zip .

 The transparency is realtime (Amiga planar style). That is to say, it uses the VDC's 4-bit planar mode to do hardware-assisted transparency effects. The demo was never polished and finished, but you can still see the results. I had this crazy idea: instead of using fixed entry points, as code lists ending in an RTS instruction, for the video embedded in ST1/ST2 opcodes, one would use the hsync interrupt or the TIMER interrupt to set a time limit on the transfer. You push the return address onto a special stack or some place in RAM, jump to the address, and let the CPU run hog wild. When the interrupt counter completes, you manually overwrite the return address on the real stack with the special/saved address and return to that instead. Of course, you'll need a buffer in VRAM for overspill, since neither the hsync interrupt nor the timer interrupt is fine enough cycle-wise to stop accurately on an exact instruction. Crazy? Yup, but it's totally doable.

 Anyway, the reason HuVideo is not full screen is the CDROM hardware. Sure, the transfer rate from the CDROM itself is limiting, but the interface to the CDROM is very limiting too. The CPU must constantly poll ports and such. Bytes have to be copied manually, and not at full speed: you can't read out a byte from a selected sector faster than ~24 cycles. There's not a lot of free time, and what little there is gets spent either sending audio packets to the ADPCM memory port (which is pretty damn slow, but at least you can write to ADPCM memory while it's playing) or writing frame updates to the VDC. I guess if you want low-res full screen, you could use an hsync interrupt to repeat every other scanline (an early version of HuVideo used in John Madden does this for the opening stadium intro).
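
 A rough budget shows how little time that leaves. The sketch below uses the ~24 cycles per byte and ~119436 cycles per frame quoted in this thread; the 150 KiB/s single-speed CD data rate and every name in the code are assumptions added for the illustration, and real polling overhead would only make things worse.

```python
CYCLES_PER_FRAME = 119436        # CPU cycles per 1/60 s frame (thread figure)
CD_READ_CYCLES = 24              # best-case cycles to read one sector byte
CD_BYTES_PER_SEC = 150 * 1024    # assumption: single-speed CD-ROM data rate


def cd_bytes_per_frame(fps=60):
    """Bytes the drive can deliver per 1/fps period at the assumed rate."""
    return CD_BYTES_PER_SEC // fps


def copy_cycles(num_bytes):
    """Best-case CPU cycles spent just reading that many bytes off the CD port."""
    return num_bytes * CD_READ_CYCLES


per_frame = cd_bytes_per_frame()           # 2560 bytes per 1/60 s
cost = copy_cycles(per_frame)              # 61440 cycles
share = cost / CYCLES_PER_FRAME            # a little over half the frame

print(per_frame, cost, round(share, 3))
print(CYCLES_PER_FRAME // CD_READ_CYCLES)  # 4976: ceiling if the CPU only copied
```

 So under these assumptions, simply keeping up with the drive eats about half of every frame before any ADPCM or VDC writes happen.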
« Last Edit: June 05, 2013, 04:44:30 PM by Bonknuts »

Bonknuts

Re: 68k and HuC6280 comparison
« Reply #11 on: June 05, 2013, 04:48:14 PM »
Quote
Aaah, maybe.
Passing through the MMU may cause cycle penalties; that seems logical to me.

Chris, I know being a father is a difficult task, but would you be interested in doing a Bad Apple demo on the PCE?  :mrgreen:

 The 6502 and 65c02 have penalty cycles for crossing a 256-byte page boundary. If an instruction operand lands on the start of a new boundary, there's a penalty cycle. Indexing might have one as well. On the 6280, there are no penalty cycles; instead, all cycles are included in the opcode decoding. Kinda sucks, because you can make some really nice optimizations on the 65x, but it also simplifies cycle counting on the 6280. This used to bother me a little bit, but then I realized that the clock speed of the 6280 is greater than that of any 65x. It's not like you're going to be comparing to a 7.16 MHz 65x anyway. But yeah...

Quote
Possibly because 6280 Stack and ZP go through an extra layer of indirection by the memory mapper...?
LDA abs,x and similar instructions also take 1 more cycle than on the 65x, and that ABS location can be anywhere in the CPU's 16-bit logical address range. The '816 has cycle penalties if you use the DP register to point to a non-default bank, but the 6280 doesn't seem to care. I've mapped all of system card RAM to $2000, tested, and found no difference. Though in my NES2PCE stuff, I did find something interesting. I mirror RAM to the $0000 range as well. If you use the base 64k of CDRAM for this, it will corrupt itself. If you use the $f7 bank or any of the 192k of SCD RAM, then it's completely fine. Even my Duo and my SuperCDROM^2 with the built-in system card had this issue. Weird stuff.
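
For anyone who wants to see the page-crossing rule and the clock-speed point as numbers, here is a small Python sketch. The 6502 timing for LDA abs,X (4 cycles, +1 when base + X crosses a page) is the standard documented behaviour; the flat 5 cycles for the 6280 follows from the "1 more cycle" statement above; the 1.79 MHz clock is just an example 65x speed chosen for the comparison, and the function names are made up.

```python
def page_crossed(base, index):
    """True if base + index lands in a different 256-byte page than base."""
    return ((base + index) & 0xFFFF) >> 8 != (base & 0xFFFF) >> 8


def lda_abs_x_cycles_6502(base, x):
    """NMOS 6502: 4 cycles, plus 1 if the indexed address crosses a page."""
    return 4 + (1 if page_crossed(base, x) else 0)


def lda_abs_x_cycles_6280(base, x):
    """6280: flat cost, the extra cycle is always paid (per this thread)."""
    return 5


def microseconds(cycles, mhz):
    """Wall-clock time for a cycle count at a given clock in MHz."""
    return cycles / mhz


print(lda_abs_x_cycles_6502(0x20F0, 0x05))  # 4, stays inside page $20
print(lda_abs_x_cycles_6502(0x20F0, 0x20))  # 5, crosses into page $21
print(lda_abs_x_cycles_6280(0x20F0, 0x20))  # 5 either way

# Example clocks: 1.79 MHz for a typical 65x (assumption), 7.16 MHz for the 6280.
print(round(microseconds(4, 1.79), 2), round(microseconds(5, 7.16), 2))
```

Even with the extra cycle always charged, the 6280's access finishes in well under half the time of the best-case 65x access at the example clock.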
« Last Edit: June 05, 2013, 04:54:23 PM by Bonknuts »

touko

Re: 68k and HuC6280 comparison
« Reply #12 on: June 05, 2013, 08:22:57 PM »
Tom, have I already said that I really like your explanations?  ;-)

And this transparency effect is awesome.
Some opcodes like SMB/RMB take 2 more cycles on the 6280.

To use ST1/ST2 for transferring data, you need self-modifying code, no?
And you lose the benefit of ST0/ST1 transfer (maybe I have not understood the trick)!

I have read a topic where you and Charles discussed packed sprites, if I'm right.
And one of you talked about this technique.
« Last Edit: June 05, 2013, 09:09:28 PM by touko »

Bonknuts

Re: 68k and HuC6280 comparison
« Reply #13 on: June 11, 2013, 02:52:46 PM »

Quote
Some opcodes like SMB/RMB take 2 more cycles on the 6280.

 Probably because of the address decoding and the included page-penalty cycle.
Quote
To use ST1/ST2 for transferring data, you need self-modifying code, no?
And you lose the benefit of ST0/ST1 transfer (maybe I have not understood the trick)!
I didn't use self-modifying code, but you could (if you have enough RAM; 8k isn't enough). My demo is hardcoded/embedded. I wrote a util that took data files and created embedded ASM files for use with PCEAS. But I've done a lot more than just transparency with that method.

 

Quote
I have read a topic where you and Charles discussed packed sprites, if I'm right.
And one of you talked about this technique.

 Ehh? I don't remember. Charles and I talked about many awesome things, but I tend to forget things if I don't save them :/

BigusSchmuck

Re: 68k and HuC6280 comparison
« Reply #14 on: June 11, 2013, 04:23:23 PM »
Pardon if this has been asked before, but how hard would it be to port a 68k game to the HuC6280, and vice versa?