Emulator audio differences..

In light of recent tests, I decided to take a closer look at different emulator outputs – relative to the real system.

There are a few major things an emulator needs to get right:
– Volume levels. I doubt any emulator is using linear volume levels, but it is important to get the “correct” non linear volume curve.
– Sampling of the audio reg writes. In a perfect world, every write to an audio reg would register immediately. After all, it does on the real system.
– How to handle frequencies above the Nyquist frequency when down sampling.
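
To make the first point concrete, here is a rough sketch of what a non linear (logarithmic) volume table looks like next to a linear one. The 32 levels and the 1.5 dB per step are my assumptions for illustration, as is treating level 0 as silence. The function name is made up, and this is not taken from any of the emulators above.

```python
def volume_table(levels=32, step_db=1.5):
    """Build a logarithmic volume table (assumed curve, for illustration).

    The top level is full scale (1.0); each level below it is attenuated
    by a further step_db decibels. Level 0 is treated as silence here,
    which is also an assumption.
    """
    table = [0.0]
    for level in range(1, levels):
        atten_db = (levels - 1 - level) * step_db
        table.append(10 ** (-atten_db / 20))
    return table


table = volume_table()
linear = [level / 31 for level in range(32)]

# Halfway down the register range, a log table is far quieter than linear.
print(f"level 16: log={table[16]:.3f} linear={linear[16]:.3f}")
```

An emulator that gets this curve wrong will still play the right notes, but anything that mixes channels at different volume settings comes out with the wrong balance.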

From the tests I’ve done:
Volume. Mednafen, Blargg Music player, Yame, TE, and Ootake seem to get this right. Mednafen, Blargg player, and TE definitely do. Magic Engine has some issues (as do a lot of other 3rd party HES players).
Register sampling. I was only able to test mednafen, yame, te, and ootake. Well, I tested ME, but didn’t record it (it’s just what you’d expect). For low frequency stuff, the less accurate ones do fairly fine, relatively speaking. But for higher rates of register updating, only Mednafen and TE were correct. Mednafen supposedly caps pretty high (I don’t remember the rate, but you’ll never hit it). I think TE caps at about 32khz. That is, reg updates are pulled twice a scanline (probably splitting the scanline evenly in half: once at the beginning, once at the end).

Down sampling. All the emulators appear to have some sort of method to deal with this. Blargg uses band limited step synthesis (and mednafen uses Blargg’s blip sound engine IIRC). Yame appears to be doing something similar. Ootake is doing it too, but it looks too exaggerated. That is, the throws seem too extreme. TE’s throw is a bit higher too, like Ootake’s, but not as extreme. It’s still more than mednafen/blargg, ME, and yame.

Now, this is the first test: Here (wave file).
(right click and save. Do not normal click and “stream”. Direct streaming doesn’t work).

This is the Jackie Chan HES file. The part in question is a 7khz 5bit sample being played (IIRC track #89 or 90). In order from first to last: Blargg player, Ootake, Blargg down sampled to 4bit (not 5bit), mednafen, Turbo Engine.
A pic here: Here.
Right away, you can see all but one of them have the same symmetrical shape. The positive and negative swings of the waveform are about proportional. Now, not all audio will look like that, but most will. What we are interested in is that Ootake is different from the others. This is key. This means Ootake is suppressing the audio on one side of the swing/throw (from peak to trough). This is a repeating pattern with Ootake. Now, it doesn’t look like much. But when playing the Ootake recorded part against the 4bit blargg converted part, the Ootake recording sounds even more static-y/noisy than the 4bit conversion. That shouldn’t be. That definitely shouldn’t be.

Two pics comparing the band limited step synthesis of Ootake and Blargg/mednafen:
Here and here.
Top one is Ootake. I circled the spots in red, so you can see what I’m talking about.
Here’s some reading about it:
The second link shows how to implement it. It costs too much to calculate this in realtime (or it’s just wasteful), so you build out pre-calculated “steps”. The whole reason for this is to keep frequencies above what your sound card output can represent (from the PSG, or any audio that’s generated at a higher rate) from folding/inverting back down into the lower frequency band. Nyquist frequency artifacts.
While I doubt it’s a big deal that Ootake seems to have such extreme steps for such low res sample changes, it is interesting to note. If I were to take a logical guess, I would say Ootake is using fewer/coarser steps.
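
For anyone who hasn’t seen pre-calculated steps before, here is a minimal sketch of the idea: integrate a windowed sinc once, then store it as a small table with one row per sub-sample position of the edge. This is not Blargg’s code or any emulator’s code. The Hann window, the table sizes, and the function name are all my own choices for illustration.

```python
import math


def build_step_table(phases=8, half_width=8):
    """Pre-calculate band limited unit steps (illustrative sketch only).

    Returns `phases` rows of 2 * half_width values. Each row is the same
    band limited step, sampled at a different sub-sample offset.
    """
    n = 2 * half_width * phases
    impulse = []
    for i in range(n):
        # Time in output samples; the +0.5 keeps t away from exactly zero.
        t = (i - n / 2 + 0.5) / phases
        sinc = math.sin(math.pi * t) / (math.pi * t)
        hann = 0.5 + 0.5 * math.cos(math.pi * t / half_width)
        impulse.append(sinc * hann)

    # Running sum of the impulse gives the step, normalized to end at 1.0.
    total = sum(impulse)
    step = []
    acc = 0.0
    for value in impulse:
        acc += value / total
        step.append(acc)

    # De-interleave into one row per sub-sample phase.
    return [
        [step[i * phases + p] for i in range(2 * half_width)]
        for p in range(phases)
    ]


table = build_step_table()
print(len(table), "phases x", len(table[0]), "samples per step")
```

At playback time, every register change adds a scaled copy of the nearest row into the output buffer instead of a hard edge. Fewer rows means each edge gets snapped to a coarser position, which is the kind of thing I mean by fewer/coarser steps.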

Back to the waveform itself. At 7khz, it shouldn’t be much of a problem to capture those reg updates (writes to $806). You’d have to sample at about 14khz to capture all of them. 15.7khz is the scanline frequency, and I would think most emulators capture the reg updates once per scanline (giving 15.7khz sampling). That’s more than enough for 7khz TIMER sample playback.
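
A quick way to sanity check that: count how many register writes never get seen if an emulator only looks at the register at a fixed polling rate. This is a toy model with made-up names. It assumes evenly spaced writes and polls, rounds the scanline rate to 15700 Hz, and uses 31400 Hz for the twice per scanline case.

```python
def missed_writes(write_hz, poll_hz, seconds=1):
    """Count writes that are overwritten before any poll sees them.

    Toy model: writes and polls are evenly spaced and both start at t=0.
    Each poll only sees the most recent write at or before it.
    """
    total_writes = write_hz * seconds
    total_polls = poll_hz * seconds
    seen = set()
    for j in range(total_polls):
        # Index of the latest write at or before poll j (integer math).
        seen.add(j * write_hz // poll_hz)
    return total_writes - len(seen)


print(missed_writes(7000, 15700))   # 7khz sample, once per scanline -> 0
print(missed_writes(33000, 15700))  # 33khz writes, once per scanline -> 17300
print(missed_writes(33000, 31400))  # 33khz writes, twice per scanline -> 1600
```

So in this model the 7khz case is safe at one poll per scanline, but a faster writer loses more than half of its writes at that rate.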

Now to the second recording example. This time, both volume and reg sample rate are extremely important. This is from a rom, so blargg player couldn’t be used this time. But mednafen substitutes fine. So from first to last: mednafen, ootake, yame, turbo engine.
A little info on this rom. It’s a software ADPCM decoder that outputs a 12bit sample cut down to 10bit (the last two bits are dropped). Two DACs paired provide the 10bit linear range. The output rate is a little bit above 33khz. An odd number, I know, but it was based on cpu cycle timing.
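
Here is a rough sketch of how pairing two 5bit DACs can give a 10bit linear range. One channel carries the top 5 bits at full level and the other carries the bottom 5 bits at 1/32 of that level. The exact split and levels are my assumption for illustration, the sample is treated as unsigned to keep it simple, and the function names are made up.

```python
def reduce_12_to_10(sample12):
    """Drop the last two bits of an unsigned 12bit sample."""
    return (sample12 & 0xFFF) >> 2


def split_10bit(sample10):
    """Split a 10bit sample into coarse and fine 5bit DAC values."""
    coarse = (sample10 >> 5) & 0x1F
    fine = sample10 & 0x1F
    return coarse, fine


def mixed_output(coarse, fine):
    """Model the mix: coarse channel at full level, fine channel at 1/32.

    Scaled by 32 so the result stays an integer in the range 0..1023.
    """
    return coarse * 32 + fine


coarse, fine = split_10bit(reduce_12_to_10(0xABC))
print(coarse, fine, mixed_output(coarse, fine))
```

This is also why volume accuracy matters so much for this rom: if the 1/32 ratio between the two channels is off in the emulator, the fine bits land in the wrong place and the output stops being linear.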
Wave file here. (again, right click and save as…)
And the rom file: http://alexandria66.2mhost.com/~pcengine//sound/lonely_soldier_boy.pce
And finally, a pic: here.

Again, we can see the same trend in Ootake in this recording. The waveforms are pretty loud this time, so you can see the effect even more now. Ootake is definitely crippling one side of the waveform (doesn’t matter what channel you look at, left or right). Ootake’s audio actually appears to be inverted relative to all the other emulators. But this isn’t a problem (it sounds the same). As long as both channels are inverted, it’s the same as non inverted. It’s only when one channel is inverted relative to the other that you get cancellation of the audio frequencies.
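
The inversion point is easy to show with a few numbers. This sketch uses a plain mono sum as a stand-in for what happens when the two channels mix. The helper names and sample values are made up.

```python
def invert(signal):
    """Flip the polarity of every sample."""
    return [-s for s in signal]


def mono_sum(left, right):
    """Sum two channels sample by sample."""
    return [l + r for l, r in zip(left, right)]


signal = [3, -1, 4, -1, -5]

# Both channels inverted: same waveform, just flipped. Nothing is lost.
print(mono_sum(invert(signal), invert(signal)))

# Only one channel inverted: the two channels cancel each other out.
print(mono_sum(signal, invert(signal)))
```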

When you zoom in, you can see strange artifacts in Ootake’s recording. Oddly enough, you can see the same in Yame’s. Both play static-y because of this, but Yame is a little clearer because it outputs an uncrippled waveform (relatively speaking). But zooming in also reveals something else. It seems all emulators get the volume correct: looking between the artifacts in the recordings, the waveform output is really high and almost correct (if it were placed in its proper place relative to the rest of the waveform). Not quite correct, but on that scale, damn near enough.

Here’s a zoomed in pic of each part right at the beginning: here.
Top is Ootake, middle is mednafen, bottom is TE.

Looking at the top one, you can see the parts I circled to point out the incorrect position of the waveform. The weird thing is, if you slide the parts in between these artifacts up or down, the waveform will be in the correct spot. So it’s more than just missed sample writes (though missed sample writes probably account for some of the coarser parts in that pic for Ootake). This artifact is also present, almost identically, in Yame.

What does the ADPCM rom test tell us? It shows a continuing trend in Ootake of incorrect waveform output, but also shows that Yame and Ootake can’t handle higher rate reg updates. You might think, “well, that’s not really a problem since I’m not doing 32khz+ sample output”. But this effect trickles down. It might not be very audible for 7khz output, but it does affect the “phase” of normal channels. That is, if a specific phasing of two channels is required to get a specific sound (whether this was done on purpose with exact timing, or by accident but you kept it because it sounded good), it can vary depending on when the reg of the second (or third or fourth) channel gets written to start the output of that channel. This is what I was running into in the XM player. Certain phasing effects sounded either completely off or non existent on anything other than the real system or mednafen. To safeguard against this, cache your channel updates so they ALL happen at the same time, separate from the parser. I have yet to do this for the XM player, but I will. I most definitely will. Strangely enough, the PCM driver is cached, but that’s to reduce jitter as well as allow VDC INT to operate at the same time 🙂
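
The caching idea, roughly sketched: the parser only queues register values, and a separate flush pushes them all out back to back. A real driver would be assembly on the console. This sketch only shows the structure, and the class name, the `write_reg` callback, and the register numbers are all hypothetical.

```python
class ChannelUpdateCache:
    """Queue channel register writes and apply them in one burst."""

    def __init__(self, write_reg):
        # write_reg(channel, reg, value) is a hypothetical hardware hook.
        self._write_reg = write_reg
        self._pending = {}

    def queue(self, channel, reg, value):
        """Called by the parser; a later value for the same reg wins."""
        self._pending[(channel, reg)] = value

    def flush(self):
        """Apply every queued write back to back, in a fixed order."""
        for (channel, reg), value in sorted(self._pending.items()):
            self._write_reg(channel, reg, value)
        count = len(self._pending)
        self._pending.clear()
        return count


log = []
cache = ChannelUpdateCache(lambda ch, reg, val: log.append((ch, reg, val)))
cache.queue(1, 2, 0x40)   # parser work can take a variable amount of time
cache.queue(0, 2, 0x80)
cache.flush()             # all channels get hit together, in order
print(log)
```

Because the time between the channel writes no longer depends on how long the parser took, the relative phase of the channels stays the same from one update to the next.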

Conclusion: Mednafen is king. Blargg player is king too. YAME will never be updated and is only used as a reference between these emulators (though it still holds its own compared to Ootake). And Ootake needs some more audio work before I can recommend it. TE, well, it has a few bugs in weird areas of audio, but most of it is spot on. ME is ok for games and simplistic HES files I guess, and it doesn’t have the major audio issue I encountered in Ootake, but I can’t really recommend it. It’s not as bad as nesticle was for the NES emu scene by comparison, but it’s close (relatively speaking).

If anyone has any different results with Ootake, please post them in the comments section. I’d love to be able to count on Ootake (the more accurate emus, the merrier) for sound listening (HES or custom roms). I really wasn’t expecting the output I got, so I’m thinking it might have something to do with my setup. If someone can record that ADPCM demo rom from their setup of Ootake, compare it, and post whether the problem is present or not, that would be great. Thanks 🙂


One Response to “Emulator audio differences..”

  1. I want to comment on non DDA playback of Ootake. Buffer sample mode appears to play fine on Ootake from the few tests I’ve done. It’s just any DDA writes/stuff.
