Why not make something like HuVideo, but fullscreen instead?
By the way, how many tiles can you upload from ROM/Syscard RAM to the video processor's internal memory per VBlank? Is it theoretically possible to create a "system card" with special hardware that generates a fullscreen video frame in its RAM, ready to be uploaded?
Say there's a 3D processor or something that would calculate matrices, vectors, etc. and generate a fullscreen render to be uploaded to VRAM every frame... would it be possible to do it fullscreen, or only for part of the screen, because of bulk transfer speeds (tix instructions)?
If you embed the graphics as ST1/ST2 instructions, that's the fast way to transfer to VRAM. Normally TIA is 6 cycles per byte, but since the destination is in the hardware bank's first 2k of address space, it gets the +1 cycle penalty per memory access. (It's a mystery, because the VDC never asserts /RDY for that access. It does assert /RDY, but that's for memory slot alignment during active display, and it only lasts a fraction of a CPU cycle; /RDY works in master clock cycle steps, not CPU cycles.) So TIA to the VDC is 7 cycles a byte, while ST1/ST2 is 5 cycles a byte (4 + 1 penalty = 5). There are ~119,436 CPU cycles in a 1/60 s frame, so the "theoretical" max transfer is about 23.887k per frame. That's not enough to do 60fps FULL screen, but definitely enough to do 30fps full screen. Of course, you can compress, or save 5 cycles out of 10 in a word transfer if the LSB is the same as the LSB of the previous WORD (VDC latch trick: just use an ST2 and only write the MSB/latch; the old LSB value is still kept). So technically it can be higher than 24k per frame. It all depends on redundant LSBs.
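To put numbers on that, here is a rough back-of-the-envelope sketch of the budget. The cycle counts are the ones quoted above; the 256x224, 4 bits per pixel screen size is an assumption for illustration only, and the whole frame's CPU time is assumed to go to the transfer.

```python
# Figures quoted in the post
CYCLES_PER_FRAME = 119_436   # CPU cycles in one 1/60 s frame
ST_CYCLES = 5                # ST1/ST2: 4 cycles + 1 penalty per byte
TIA_CYCLES = 7               # TIA: 6 cycles + 1 penalty per byte

# ASSUMPTION (not from the post): a 256x224 screen at 4 bits per pixel
SCREEN_BYTES = 256 * 224 * 4 // 8
SCREEN_WORDS = SCREEN_BYTES // 2


def bytes_per_frame(cycles_per_byte):
    """Bytes that fit in one frame if the CPU does nothing else."""
    return CYCLES_PER_FRAME // cycles_per_byte


def frames_needed(cycles_per_byte):
    """Whole frames needed to push one full screen (ceiling division)."""
    return -(-SCREEN_BYTES // bytes_per_frame(cycles_per_byte))


def min_redundant_words_for_60fps():
    """Words that must reuse the previous LSB (saving 5 cycles each)
    for a full screen to fit into a single frame."""
    full_cost = SCREEN_WORDS * 2 * ST_CYCLES
    excess = full_cost - CYCLES_PER_FRAME
    return max(0, -(-excess // ST_CYCLES))


print(bytes_per_frame(ST_CYCLES))        # 23887
print(bytes_per_frame(TIA_CYCLES))       # 17062
print(SCREEN_BYTES)                      # 28672
print(frames_needed(ST_CYCLES))          # 2, i.e. 30fps
print(min_redundant_words_for_60fps())   # 4785 of 14336 words
```

Under those assumptions, about a third of the words would need a repeated LSB before a full screen squeezes into one frame, and that's with zero cycles left over for anything else.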
You could add hardware support that would basically insert those opcodes every other byte (switching between them). Though I've done it without hardware. I made a transparency demo that used them:
http://www.pcedev.net/demos/transparency/test0.zip .
The transparency is realtime (Amiga planar style). That is to say, it uses the VDC's 4bit planar mode to do hardware-assisted transparency effects. The demo was never polished and finished, but you can still see the results. I had this crazy idea: instead of using fixed entry points into code lists that end with an RTS instruction for the video embedded in ST1/ST2 opcodes, one would use the hsync interrupt or the TIMER interrupt to set a time limit on the transfer. You push the return address on a special stack or some place in RAM, jump to the address and let the CPU run hog wild. When the interrupt counter completes, you manually replace the return address on the real stack with the special/saved address and return to that instead. Of course, you'll need a buffer in VRAM for overspill, since neither the hsync interrupt nor the timer interrupt is fine enough cycle-wise to stop accurately on an exact instruction. Crazy? Yup, but it's totally doable.
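For anyone wanting to try the embedded ST1/ST2 code lists, here is a minimal offline sketch of how such a list could be generated, including the latch trick of skipping the ST1 when the LSB repeats. The function name and layout are hypothetical, not taken from the demo. It assumes the VDC has already been pointed at the VRAM data register and the write address set before the list is called, and it uses the HuC6280 opcode values as I understand them (ST1 = $13, ST2 = $23, RTS = $60).

```python
# HuC6280 opcodes (stated as an assumption; check against your assembler)
ST1, ST2, RTS = 0x13, 0x23, 0x60
ST_CYCLES = 5  # per ST1/ST2, figure from the post (4 + 1 penalty)


def encode_words(words):
    """Turn 16-bit VRAM words into an ST1/ST2 code list ending in RTS.

    Returns (machine_code, cycles), where cycles counts only the
    ST1/ST2 instructions, not the JSR/RTS overhead.
    """
    code = bytearray()
    cycles = 0
    prev_lsb = None  # nothing in the latch yet, so the first ST1 is always emitted
    for word in words:
        lsb, msb = word & 0xFF, (word >> 8) & 0xFF
        if lsb != prev_lsb:
            code += bytes([ST1, lsb])  # ST1 #lsb: load the low-byte latch
            cycles += ST_CYCLES
            prev_lsb = lsb
        code += bytes([ST2, msb])      # ST2 #msb: write high byte, commits the word
        cycles += ST_CYCLES
    code.append(RTS)
    return bytes(code), cycles


code, cycles = encode_words([0x1234, 0x5634, 0x0001])
print(code.hex(" "))  # 13 34 23 12 23 56 13 01 23 00 60
print(cycles)         # 25 instead of 30 without the latch trick
```

Each word costs 4 bytes of code for 2 bytes of graphics (or 2 bytes when the LSB repeats), so the speed is paid for in ROM/RAM space.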
Anyway, the reason why HuVideo is not full screen is the CDROM hardware. I mean, sure, the transfer rate from the CDROM itself is limiting, but the interface to the CDROM is very limiting too. The CPU must constantly poll ports and such. Bytes have to be copied manually, and not at full speed: you can't read out a byte from a selected sector faster than ~24 cycles. There's not a lot of free time, and what little there is gets spent either sending audio packets to the ADPCM memory port (which is pretty damn slow, but at least you can write to ADPCM memory while it's playing) or writing frame updates to the VDC. I guess if you want low-res full screen, you could use an hsync interrupt to double up the scanlines so that every other scanline is a repeat (an early version of HuVideo used in John Madden does this for the opening stadium intro).
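A rough sketch of what that does to the per-frame budget. The ~24 cycles per byte and the cycles-per-frame figure come from the posts above. The single-speed data rate of 75 sectors/s at 2048 bytes each and the 256x224, 4 bits per pixel screen are assumptions added for illustration, and polling overhead beyond the 24 cycles is ignored.

```python
CYCLES_PER_FRAME = 119_436        # CPU cycles per 1/60 s (from the post)
CD_READ_CYCLES_PER_BYTE = 24      # fastest per-byte sector read (from the post)

# ASSUMPTIONS (not from the post)
CD_BYTES_PER_SECOND = 75 * 2048   # single-speed drive, 2048-byte data sectors
SCREEN_BYTES = 256 * 224 * 4 // 8 # full 4bpp screen


def cd_budget():
    """Return (bytes arriving per 1/60 s, CPU cycles spent reading them,
    fraction of the frame's CPU time that reading consumes)."""
    bytes_in = CD_BYTES_PER_SECOND // 60
    read_cycles = bytes_in * CD_READ_CYCLES_PER_BYTE
    return bytes_in, read_cycles, read_cycles / CYCLES_PER_FRAME


bytes_in, read_cycles, share = cd_budget()
print(bytes_in)                   # 2560
print(read_cycles)                # 61440
print(round(share * 100, 1))      # 51.4 (percent of the frame)
print(SCREEN_BYTES / bytes_in)    # 11.2 frames of streaming per raw full screen
```

So under these assumptions, roughly half the CPU goes to just pulling bytes off the drive, and an uncompressed full screen takes over eleven 1/60 s intervals to arrive, before any audio or VDC writes.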