Not sure I'm following your protest here. I just forwarded a solution BurntLasagna found for those limited cases where a low preamp level wasn't enough to solve the clipping problem.
Crazy or not, I'm guessing he tried several effects options in Audacity, but simply put, when he was finished he told me that a low preamp level wasn't enough in some cases and more fiddling around was needed. Now I was surprised at that myself because I thought we solved the problem with just low preamp/amplitude levels but yeah... He said the high-pass filter effect makes the clip sound a bit grainier, but it did the job to stop the nasty hardware clipping.
I'm not dismissing you or your advice ... what I am doing is trying to understand how that "solution" works from a mathematical point-of-view.
All ADPCM codecs just apply a simple (but different) set of steps to an input waveform to produce an output waveform.
I just don't understand why adding a high-pass filter would affect the high-frequency peaks and troughs that seem to be the cause of the ADPCM overshoots that are causing the wrapping/overflow on the MSM5205 ... unless you're feeding the compressor a "slinky" wave that hasn't already been normalized.
If you can find a better solution for those rare cases, I'm all ears, his was but one by simply messing around in Audacity. That's it. I would see what Audacity exactly does when you use it, but I'm just stating what I know. You'll have to ask him exactly what values did the trick to fix these tougher cases.
My point is that what you've got is basically "empirical evidence" of how to practically solve the problem when using SOX, but you're relying on steps that degrade the quality of the end result, and you're not addressing the root cause of the actual problem.
My view in 2012 when I started work with BL was that SOX is a seasoned piece of software and David was one guy that knew less about the codec than all of the SOX team. The only reason he wrote his ADPCM codec back in 2004 was because we didn't know about SOX at the time, so he in effect wasted his time when there was already something out there that could encode/decode the format. If we knew about SOX in 2004, he would've done something else.
SOX is a great piece of software, and I recommend it to anyone.
The problem here isn't with SOX. The algorithm that SOX implements for OKI ADPCM is the correct one for the OKI MSM6585 and MSM5218 ... but it gets the mathematics wrong for the way that the MSM5205 (in the PCE) works. That causes the "slinky" wave when you use it for decompression, and it also causes the ugly clipping/overflow/distortion when you use it for compression.
The difference between the old (MSM5205) and new (MSM6585) codec is tiny (1 or 2 lines of C code), but the effect is substantial.
It's like putting diesel fuel into a gasoline-engined car ... you're putting perfectly good fuel into a perfectly good car, but they're incompatible.
To take this to a mathematical level (assuming a waveform range of 0..4095) ...
Say that you have two samples in the waveform that you're compressing, 4000 and 4060.
ADPCM has an adaptive step-size that changes dynamically depending upon the previous samples, so let's imagine that the current step-size is 100. Note that you must always add or subtract one-or-more steps ... there is no "zero-change".
To go from one sample to the next, the algorithm adds 100 to 4000 to get 4100, but 4100 is outside the 12-bit range of 0..4095.
Now SOX understands that the later OKI chips implement overflow-protection, and the chip recognizes this overflow and clamps the output to 4095. This is what is supposed to happen.
But the MSM5205 in the PC Engine doesn't implement overflow-protection, and so it actually wraps the result around to 4 (4100 & 4095), which is at the complete-opposite end of the range ... and that causes a "click".
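To make the difference concrete, here is a minimal C sketch of that single update step. The function names `step_clamped` and `step_wrapped` are made up for illustration, and the way a real chip derives the delta from a 4-bit code and a step table is left out, so treat this as a model of the arithmetic above rather than actual codec source.

```c
/* Hypothetical helper: behaviour of the later OKI chips as described
 * above, where an out-of-range result is clamped to the 12-bit range. */
int step_clamped(int sample, int delta)
{
    int next = sample + delta;
    if (next > 4095) next = 4095;
    if (next < 0)    next = 0;
    return next;
}

/* Hypothetical helper: MSM5205-style behaviour as described above,
 * where only the low 12 bits are kept, so the result wraps around. */
int step_wrapped(int sample, int delta)
{
    return (int)((unsigned)(sample + delta) & 4095u);
}
```

With the numbers from the example, `step_clamped(4000, 100)` gives 4095 and `step_wrapped(4000, 100)` gives 4.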
The point is ... when you understand the fact that the MSM5205 hardware works like that, it is always possible to avoid generating an ADPCM code that causes that "click". In this case, you just generate an ADPCM code to subtract 100 instead of adding it, and so you get 3900.
Now that's not a perfect result, you've introduced an "error" of 160 (4060-3900), but that's one heck of a lot better than letting it wrap, which gives you an error of 4056 (4060-4).
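That choice can be sketched in C as below. This is only an illustration under the simplified "one step up or one step down" model used in this post: `wrap12` and `pick_next` are hypothetical names, and a real encoder would compare all of the available ADPCM codes against the step table rather than just two candidates.

```c
#include <stdlib.h>

/* Model of the MSM5205 behaviour described above: keep only the low
 * 12 bits, so out-of-range values wrap instead of clamping. */
int wrap12(int value)
{
    return (int)((unsigned)value & 4095u);
}

/* Hypothetical helper: of the two reachable values (current + step and
 * current - step), return the one that lands closest to the target
 * AFTER the hardware wrap has been applied. */
int pick_next(int current, int target, int step)
{
    int up   = wrap12(current + step);
    int down = wrap12(current - step);
    return (abs(target - up) <= abs(target - down)) ? up : down;
}
```

With the numbers above, `pick_next(4000, 4060, 100)` compares an error of 4056 (going up and wrapping to 4) against an error of 160 (going down to 3900), and so returns 3900.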
Anyway, I put together David's source code for you if you want a look should it be helpful for what you're gonna do. As I have all the raw, clean dubbing waves for Ys IV, I could reencode them again should you produce something superior for all potential PCE dubbing projects in the future, if, as you seem to think, SOX should never be used again.
Thanks! I'll take a look at what he's doing in there.
BTW ... SOX definitely has its place in the conversion process, just not in the final stage of conversion from a 16kHz .WAV into a 16kHz .VOC (unless you custom-compile a version with a modified algorithm).
<edits for typos and to hopefully make things clearer>