Wikipedia: 'a CD-ROM sector contains 2,352 bytes of user data, composed of 98 frames, each consisting of 33-bytes'
That's just a case of too-much-information ... it's talking about the extremely-clever very-low-level scheme that the CD format uses to encode the data and provide error-correction on a piece of fast-spinning-plastic. Only burning programs (like ImgBurn) need to care about those low-level details.
A CD-AUDIO sector contains 2,352 bytes of audio (as 16-bit sample data).
A CD-ROM sector contains 2,048 bytes of whatever-you-like data, plus a whole bunch of extra error-correction data that only the burning programs need to bother with ... you never see anything more than the 2,048 bytes-per-sector on the PCE, and LBAs (logical block addresses) are calculated in 2,048-byte increments.
Later on, there came the CD-ROM-XA format used by the PS1 and others, but again, you don't need to care about that on the PCE.
I'm trying to understand what's going on here. Basically I did a tiny PCE CD example file for myself a while ago, but I cheated: instead of trying to understand what goes on with the CD routines, I just did a .org $4070 because someone told me it starts there, and that's it.
Since I absolutely HATE to have a random magic file (ipl.bin) without knowing how it works, just told to include it, I'm trying to figure out a bunch of stuff here. One of the questions has to do with CDROM'ing in general.
The ".org $4070" is a HuC decision, and PCEAS automatically sets up the CD IPL for HuC programs.
There are no overrides for ASM developers (there probably should be).
As it is ... when you really start putting together a serious project, you'll soon outgrow pure-pceas and either modify the isolink.c source for your own needs, or write something a bit more sophisticated (if you're not already).
The PCE System Card itself doesn't really care where you load and run your program, you just have to follow the rules for formatting the IPL, which tells the BIOS where you want to load stuff.
The System Card loads the 2 IPL sectors at $2800-$37FF, and you can load/run your program as low as $3000 if you wish, with no problems.
The LoX games actually load their data at $2800, which works, but if you load lower than $3000, then your load MUST NOT overwrite $DFE0-$DFFF or the IPL code will crash.
You can load as low as $2680, and even $22D0 if you never use the System Card PSG Player, or Squirrel.
If you don't use the System Card PSG Player, then never overwrite ZP $E6 or $E7, because the System Card IRQ routines look at those to decide when to call the PSG code.
QUESTIONS:
2. Why does the IPL data start at offset 0x0800 in the data track of the game?
Because the IPL program code (and credits text) start at offset 0x0000.
0x0000 + 2048 -> 0x0800
The complete IPL is the first 2 sectors in the first .iso track.
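To make that layout concrete, here's a minimal Python sketch that slices the two IPL sectors out of the start of a data track. The function name (split_ipl) and the fake track buffer are made up for illustration, and it assumes a plain 2,048-bytes-per-sector .iso image.

```python
SECTOR_SIZE = 2048  # bytes per sector in a plain .iso data track

def split_ipl(track: bytes):
    """Return (sector 0, sector 1) from the start of a data track.

    Sector 0 (offset 0x0000) holds the IPL program code and credits text;
    sector 1 (offset 0x0800) holds the IPL data block.
    """
    if len(track) < 2 * SECTOR_SIZE:
        raise ValueError("track is shorter than the 2 IPL sectors")
    ipl_code = track[0:SECTOR_SIZE]
    ipl_data = track[SECTOR_SIZE:2 * SECTOR_SIZE]
    return ipl_code, ipl_data

# Fake track for demonstration only: sector 0 is all 0xAA, sector 1 is all 0xBB.
track = bytes([0xAA]) * SECTOR_SIZE + bytes([0xBB]) * SECTOR_SIZE
ipl_code, ipl_data = split_ipl(track)
print(hex(SECTOR_SIZE), len(ipl_code), len(ipl_data))
```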
3. I know that the string "PC Engine CD-ROM SYSTEM Copyright HUDSON SOFT / NEC Home Electronics,Ltd." is for trying to make copyright law work as their license enforcer (ayy lmao) but what else is checked before executing code?
There's a bunch of text at the beginning of the data track/ipl.bin, followed by some "junk" bytes. I zeroed them out to test and all I got was a READ ERROR trying to execute it. Is that also checked or is there any relevant program code in it?
IIRC, that string isn't even checked.
But the content of almost-all of the 1st CD sector is checked byte-for-byte to make sure that it matches the one stored in the System Card BIOS.
That 1st sector is loaded at $2800 and contains code that is executed to load your game-program, using the data that you set up in the IPL Data Block in the 2nd sector.
That's the copyright and license hook ... every CD game must include Hudson's copyrighted IPL code or else the System Card won't run it.
Now do you understand why you can't just overwrite stuff in there?
4. The IPL 24-bit data pointer (0x0800). The official doc simply states "IPLBLK H/M/L - Load start record no. of CD", is that related to the Min-Sec-Frame format I keep hearing about? How does that work? In my compiled mini sample the first byte of code is located exactly 4096 bytes relative to the start of the "track", and my IPLBLK pointer reads 00 00 02. 1 frame = 2048 bytes?
It's an LBA address ... i.e. the offset from the start of the .iso track in terms of CD-ROM (2048 byte) sectors.
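As a quick sanity check on the numbers in the question, here's a small Python sketch of the arithmetic. The function names are made up, and it assumes the three IPLBLK bytes are stored high/middle/low, which matches the "00 00 02" reading above.

```python
SECTOR_SIZE = 2048  # bytes per CD-ROM data sector as seen on the PCE

def iplblk_to_lba(h: int, m: int, l: int) -> int:
    """Combine the IPLBLK H/M/L bytes into one 24-bit LBA (assumed H, M, L order)."""
    return (h << 16) | (m << 8) | l

def lba_to_offset(lba: int) -> int:
    """Byte offset from the start of the .iso track for a given LBA."""
    return lba * SECTOR_SIZE

lba = iplblk_to_lba(0x00, 0x00, 0x02)
offset = lba_to_offset(lba)
print(lba, offset, hex(offset))  # 2 4096 0x1000
```

So an IPLBLK of 00 00 02 points at sector 2, which is 4096 bytes into the track, right after the 2 IPL sectors.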
5. The IPL "No. of Records to read" byte. It's set to 16 (0x10), but 16 what exactly? 16 8192 byte blocks? Why 128kb (out of 256 available?), I only specified ONE bank in my .asm file! How does the assembler know which bank goes where on a big project, which ones get loaded, etc?
16 * 2048 byte sectors -> 32KB.
Get used to that 2KB size, and LBA addresses, they're everywhere.
The IPL will ONLY load into the original 64KB of RAM (banks $80..$87).
You then have to check to see if you are running on the Super System Card and have more RAM, and do your own load of extra data into banks $68..$7F.
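If it helps to see the sizes side by side, here's a rough Python sketch of the arithmetic. The helper names are made up, and it assumes the usual 8KB PCE bank size.

```python
SECTOR_SIZE = 2048  # bytes per CD-ROM sector
BANK_SIZE = 8192    # bytes per PCE memory bank (assumed 8KB)

def load_size(records: int) -> int:
    """Bytes loaded for a given IPL 'No. of Records to read' value."""
    return records * SECTOR_SIZE

def bank_range_size(first_bank: int, last_bank: int) -> int:
    """Total bytes covered by an inclusive range of banks."""
    return (last_bank - first_bank + 1) * BANK_SIZE

size = load_size(0x10)
print(size // 1024, size // BANK_SIZE)        # 32 4   (32KB, i.e. 4 banks)
print(bank_range_size(0x80, 0x87) // 1024)    # 64     (original CD RAM)
print(bank_range_size(0x68, 0x7F) // 1024)    # 192    (extra Super System Card RAM)
```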
6. Program LOAD/EXECUTE addresses. Why $4000, why $4070? Is this configurable? Also this implies that it's only a 8kb bank that gets loaded so how do I make sense of (5)?
It's a HuC thing. If you're programming in ASM, you can load/run wherever you like.
Same with the bank layout IPLMPR2..IPLMPR6, in ASM, you can set them to whatever you like.
EXCEPT, please note that they're stored as values 0..7, corresponding to banks $80..$87.
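In other words, the stored value is just the bank number minus $80. A tiny Python sketch of that mapping, with made-up helper names:

```python
FIRST_RAM_BANK = 0x80  # bank that a stored IPLMPR value of 0 corresponds to

def iplmpr_to_bank(value: int) -> int:
    """Convert a stored IPLMPR value (0..7) to the bank number ($80..$87)."""
    if not 0 <= value <= 7:
        raise ValueError("IPLMPR values must be in the range 0..7")
    return FIRST_RAM_BANK + value

def bank_to_iplmpr(bank: int) -> int:
    """Convert a bank number ($80..$87) to the value stored in the IPL."""
    if not 0x80 <= bank <= 0x87:
        raise ValueError("only banks $80..$87 can be stored in IPLMPR2..6")
    return bank - FIRST_RAM_BANK

print(hex(iplmpr_to_bank(0)), hex(iplmpr_to_bank(7)))  # 0x80 0x87
print(bank_to_iplmpr(0x83))                            # 3
```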
7. Mostly useless but what's with the 6-byte space after the program title? Bonanza Bros. has "ASK" written on it, so I had to.
Errrr .... because! If you're looking at the docs, you'll see how much space is allocated for the name, and that there are those extra 6 bytes.
It's probably either a "licensed product code" or "licensed developer code".
It's not checked by the System Card.