I'd rather go for a smaller ring buffer like 4k, even for a CD project, if decompressing to a port.
Sometimes I feel like Morpheus in "The Matrix".
I can only say ...
To paraphrase ... "There Is No Ring Buffer".
The ring-buffer concept is an artifact of Haruhiko Okumura's need to find a way for his LZSS.EXE program to LZSS compress floppy-disk sized files within the confines of a 64KB-or-less computer.
There is no need for the concept within the actual LZSS algorithm, or any of the derivative compressors/decompressors that don't share the same limitations.
It is just a way to access the last 4KB of decompressed data ... the "window".
If you're decompressing directly into RAM, then you automatically have that "window" available to you without the need for any ring-buffer processing.
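To make that concrete, here's a rough C sketch of a decompressor that uses the output buffer itself as the "window". The token layout (flag byte read LSB-first, 12-bit distance, 4-bit length) and the name lzss_unpack_to_ram are made up for illustration. This is not Okumura's exact format, and it is not my SWD format either.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical token format (an assumption for this sketch only):
 *   flag byte, consumed LSB first: 1 = literal byte, 0 = match
 *   match = 2 bytes:
 *     b0 = low 8 bits of (distance - 1)
 *     b1 = high 4 bits of (distance - 1) in bits 7..4,
 *          (length - 3) in bits 3..0
 * Returns the number of bytes written to dst.
 */
size_t lzss_unpack_to_ram(const uint8_t *src, size_t src_len,
                          uint8_t *dst, size_t dst_len)
{
    size_t in = 0, out = 0;
    unsigned flags = 0, bits = 0;

    while (out < dst_len) {
        if (bits == 0) {                 /* fetch the next 8 flags */
            if (in >= src_len) break;
            flags = src[in++];
            bits = 8;
        }
        if (flags & 1) {                 /* literal: copy one byte */
            if (in >= src_len) break;
            dst[out++] = src[in++];
        } else {                         /* match: copy from earlier output */
            size_t dist, len;
            const uint8_t *from;
            if (in + 2 > src_len) break;
            dist = (size_t)(src[in] | ((src[in + 1] & 0xF0) << 4)) + 1;
            len  = (size_t)(src[in + 1] & 0x0F) + 3;
            in += 2;
            if (dist > out) break;       /* corrupt stream: points before start */
            /* The "window" is simply the data already written to dst. */
            from = dst + out - dist;
            while (len-- && out < dst_len)
                dst[out++] = *from++;
        }
        flags >>= 1;
        bits--;
    }
    return out;
}
```

Note that there is no separate 4KB array anywhere, and no index wrapping. A match is just a pointer back into what has already been decompressed.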
****************
My SWD code is neither unique nor particularly smart.
I've followed the same kind of logic paths that lead to PuCrunch ... how to more-efficiently store the basic LZSS repeat-count & offset pairs.
You can just decompress directly into the destination buffer ... no extra memory is needed, and in particular there's no need for that 4KB ring-buffer.
The only exception that I can see is when decompressing to VRAM/ADPCM RAM, and then, IMHO, it's trivially easy (and much faster) to decompress directly into RAM and then copy the data to VRAM afterwards.
Since the LZSS "window" is usually only 4KB anyway, it doesn't really hurt you much to just split your data-to-be-compressed into 8KB-maximum chunks, and to process them separately. That also means that the output will fit within a single bank if it starts at the beginning of the bank, or 2 banks if you want a "stream" of decompressed data.
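On the build-tool side, the chunk splitting could look something like this C sketch. The names chunk_t, split_into_chunks and CHUNK_MAX are just placeholders that I've made up here, and each chunk would then be handed to the compressor on its own.

```c
#include <stddef.h>

#define CHUNK_MAX 8192u   /* 8KB maximum, so one chunk fits in one bank */

/* One independently-compressed piece of the source data. */
typedef struct {
    size_t offset;   /* where the chunk starts in the source data */
    size_t length;   /* 1..CHUNK_MAX bytes */
} chunk_t;

/* Fill chunks[] with up to max_chunks entries covering total_len bytes.
 * Returns the number of entries written. */
size_t split_into_chunks(size_t total_len, chunk_t *chunks, size_t max_chunks)
{
    size_t count = 0, pos = 0;

    while (pos < total_len && count < max_chunks) {
        size_t len = total_len - pos;
        if (len > CHUNK_MAX)
            len = CHUNK_MAX;
        chunks[count].offset = pos;
        chunks[count].length = len;
        pos += len;
        count++;
    }
    return count;
}
```

Because every chunk is compressed separately, no match ever reaches back across a chunk boundary, so each one can be decompressed on its own.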
That's the approach that Falcom take in Xanadu 1 and Xanadu 2 ... and I agree with them.
The LZSS window size has only a limited effect upon the compression (statistically), which is why longer offset codes are allocated to matches that are further away, and shorter offset codes are allocated to closer matches.
The question is whether you even bother processing RLE sequences at all.
Since an RLE sequence is both rather rare, and easily compressed within the regular LZ77 scheme, I decided "no".
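For anyone wondering how a run gets handled "within the regular LZ77 scheme", here's a small C sketch (lz_copy_match is a made-up name for illustration). A match with a distance of 1 keeps re-reading the byte that it has just written, so one literal plus one match reproduces a whole run.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy len bytes from dist bytes back in the output, one byte at a time.
 * The byte-by-byte order matters: source and destination may overlap,
 * so this must NOT be replaced with memcpy() or memmove(). */
void lz_copy_match(uint8_t *dst, size_t *out, size_t dist, size_t len)
{
    while (len--) {
        dst[*out] = dst[*out - dist];
        (*out)++;
    }
}
```

So a run of eight identical bytes is just one literal followed by a match of distance 1 and length 7 ... no special RLE token is needed.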