Black Tiger had made the comment that if HuCards had enough storage, they could have used streaming voices for cinemas using ADPCM and 10-bit paired-channel output.
That got me thinking: is that really feasible? And the answer is yes, yes it is. I put out a demo playing two songs. One was 20kHz and the other was 33kHz, but neither was interrupt driven. So what kind of acceptable ADPCM playback can I get from timed interrupts? What kind of resources am I looking at? Storage-wise, ADPCM is 4 bits per sample. 4 bits for a 13-bit output is pretty decent IMO (clipped to 10-bit for the paired channels).
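To make the 13-bit to 10-bit paired-channel idea concrete, here is a rough sketch in Python rather than 6280 assembly. It rests on two assumptions of mine: that "clipped" means dropping the low three bits, and that the two channels are set up so one carries 1/32 of the other's weight. The function name is made up for illustration.

```python
def split_for_paired_channels(sample13):
    """Turn a signed 13-bit sample into two unsigned 5-bit channel values.

    Assumption: the coarse channel plays at full weight and the fine
    channel at 1/32 of it, so together they behave like one 10-bit DAC.
    """
    if not -4096 <= sample13 <= 4095:
        raise ValueError("expected a signed 13-bit sample")
    unsigned13 = sample13 + 4096   # shift to 0..8191
    value10 = unsigned13 >> 3      # drop the low 3 bits -> 0..1023
    coarse = value10 >> 5          # upper 5 bits
    fine = value10 & 0x1F          # lower 5 bits
    return coarse, fine
```

Silence (a sample of 0) lands at mid-scale on the coarse channel with the fine channel at zero.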
The Mednafen author wrote the decompressor, and I’ve modified it slightly with a few case optimizations, but otherwise it’s pretty fast. So for 15.3kHz (not 15.7kHz) output, I’m looking at 50-55% of the CPU. And that’s the normal, non self-modifying version of the code. Everything is contained within the VDC interrupt routine, so it’s self-managing. That’s always nice, because the other option is buffer fill and buffer read, and that gets tricky with timing.
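For anyone who hasn't seen what a 4-bit ADPCM decompressor does per sample, here is a minimal sketch of a generic OKI/Dialogic-style decoder in Python. It is not the Mednafen routine or my modified version of it, just the textbook form of the algorithm. The 12-bit clamp range is an assumption, and the names are made up for illustration.

```python
# Step sizes grow by roughly 10% per entry (49 entries).
STEP_TABLE = [
    16, 17, 19, 21, 23, 25, 28, 31, 34, 37, 41, 45, 50, 55, 60, 66,
    73, 80, 88, 97, 107, 118, 130, 143, 157, 173, 190, 209, 230, 253,
    279, 307, 337, 371, 408, 449, 494, 544, 598, 658, 724, 796, 876,
    963, 1060, 1166, 1282, 1411, 1552,
]
# How far the step index moves, keyed by the nibble's 3 magnitude bits.
INDEX_ADJUST = [-1, -1, -1, -1, 2, 4, 6, 8]


def decode_adpcm(nibbles, sample=0, index=0, lo=-2048, hi=2047):
    """Decode 4-bit codes into signed samples (clamp range is an assumption)."""
    out = []
    for nib in nibbles:
        step = STEP_TABLE[index]
        diff = step >> 3
        if nib & 4:
            diff += step
        if nib & 2:
            diff += step >> 1
        if nib & 1:
            diff += step >> 2
        if nib & 8:                # top bit is the sign
            diff = -diff
        sample = max(lo, min(hi, sample + diff))   # saturate, don't wrap
        index = max(0, min(len(STEP_TABLE) - 1, index + INDEX_ADJUST[nib & 7]))
        out.append(sample)
    return out
```

The per-sample work is a table lookup, a few shifts and adds, and two clamps, which is why it fits inside an interrupt routine at all.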
So this soft playback ADPCM streaming sounds great at 20kHz, but what does it sound like at 15kHz? Hopefully pretty decent. From what I’ve heard in comparison to ADPCM on the CD unit itself, this soft playback routine seems to sound better. It might have to do with how the original ADPCM chip in the PCE CD unit is 10-bit output too, but it can clip and overflow rather than saturate at the positive or negative limit (i.e. does it clip at 10-bit, or at 12-bit but output 10-bit?). Or maybe it’s something else, like a filtering effect of the PCE audio circuit compared to the ADPCM output circuit of the CD unit.
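The difference between saturating and overflowing is easy to show with a couple of lines of Python. This is only an illustration of the two behaviours at a given bit width, not a claim about what the CD unit's chip actually does. Both function names are made up.

```python
def saturate(value, bits):
    """Pin an out-of-range value to the nearest signed limit."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def wrap(value, bits):
    """Keep only the low bits, as a plain two's complement overflow would."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value
```

A peak of 600 on a 10-bit output saturates to 511 but wraps to -424, and that sign flip is the kind of thing that would be heard as a harsh crackle.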
Typically, CD games use 8kHz ADPCM output for sound FX, and sometimes for streaming.
So where is all this going? Well, I have an SF2 mapper and a flash card, and if I reserve 2048k just for streaming audio, I can do a small demo (shmup) with streaming music. I only have 274 seconds to work with if I reserve the lower 512k for the game/demo itself. 274 seconds isn’t a lot, but I can loop tracks. At a minimum, I would need two level tracks and a boss track. Optimally though, I would want a fourth ending track. So something like three 70-second tracks and one 64-second track. Or whatever. How it’s divided up isn’t really an issue.
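The 274-second figure falls out of simple arithmetic, sketched here in Python. I'm assuming "k" means 1024 bytes and that the playback rate is exactly 15300 Hz. The function name is made up.

```python
def seconds_of_adpcm(rom_kib, sample_rate_hz, bits_per_sample=4):
    """How many seconds of audio fit in rom_kib kilobytes (1k = 1024 bytes)."""
    total_bytes = rom_kib * 1024
    bytes_per_second = sample_rate_hz * bits_per_sample / 8
    return total_bytes / bytes_per_second


# 2048k of 4-bit ADPCM at 15.3kHz
print(int(seconds_of_adpcm(2048, 15300)))
```

At 15.3kHz the stream eats 7650 bytes per second, so 2048k lasts a little over 274 seconds.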
I spent yesterday reworking the ADPCM routine into a VDC interrupt routine. I also picked out two levels from two shooter games on other consoles. The demo is going to be a simple vertical shmup/shooter. I was toying with the idea of the canyon level of Musha and the 3D fire level of Axelay, with the Axelay level preceding the Musha level (kinda makes sense). The graphics won’t be exact, but the effects will be similar. I plan to rip other enemy sprites from verty shmups too, and probably do a different boss for the Musha stage. I have 512k to work with for graphic assets. For both the Axelay 3D level and the Musha canyon stage, I spent quite a bit of time doing calculations for the effects as well as redesigning the approach to those effects (with 60fps in mind). It’ll be kinda tight, but I’ve worked with worse.
As for the PCM engines, I did some work on those as well. The first XM player is done and I’ll probably release a very simple demo for it, and then one with a song demo afterwards.
But back to Black Tiger’s ponderings: if you did 7kHz ADPCM for voice, then that’s 3.5k per second. If you reserved 512k of ROM for ADPCM, that gives roughly 150 seconds of speech or audio. If you used PSG/chip for music and some sound FX, you could easily put together cinema audio tracks. The silence between speaking or other audio parts doesn’t need to be stored. Cinemas don’t take a whole lot of resources; I could even do realtime linear interpolation of that 7kHz up to a 15kHz output.
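The linear interpolation part is cheap, as this small Python sketch shows. It treats the output rate as exactly double the source rate, which is a simplification of the 7kHz to 15kHz case, and the function name is made up.

```python
def upsample_2x(samples):
    """Double the sample rate by inserting the midpoint between neighbours."""
    out = []
    for i, cur in enumerate(samples):
        # The last sample has no neighbour, so it is simply held.
        nxt = samples[i + 1] if i + 1 < len(samples) else cur
        out.append(cur)
        out.append((cur + nxt) >> 1)   # one add and one shift per extra sample
    return out
```

On the real hardware this would mean decoding one ADPCM sample every other interrupt and outputting the averaged value on the interrupts in between.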
But all this talk about compression makes me wonder how some of the other compression schemes out there would sound. Maybe something that needs less CPU than ADPCM. Something like range-encoded delta PCM in block segments (kinda like the SNES).
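To give a feel for what I mean, here is a hypothetical block decoder in Python. It is not the SNES format and not something I've built yet, just a sketch under my own assumptions: each block carries one shift value (the range) followed by 4-bit signed deltas, and the output is clamped to 16 bits.

```python
def decode_block(shift, nibbles, prev=0):
    """Decode one block of 4-bit deltas that share a single range shift."""
    out = []
    for nib in nibbles:
        delta = nib - 16 if nib >= 8 else nib    # sign-extend the 4-bit value
        prev += delta << shift                   # scale by the block's range
        prev = max(-32768, min(32767, prev))     # saturate to 16 bits
        out.append(prev)
    return out
```

Compared to ADPCM there is no step table and no per-sample index update, only a shift and an add, and the range only changes once per block.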