- Length: ~30 seconds, Size: 170 KB
- Length: <5 seconds, Size: 17 KB
Please note that the original voice files included some lip-sync data or similar, which I had to strip out. There may still be some metadata left in them, which would explain why the first file is considerably smaller relative to its length but still sounds much better.
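To put rough numbers on the size comparison, here is a small sketch that turns size and length into an average bitrate. It assumes 1 KB = 1024 bytes and uses the approximate lengths listed above; `average_kbps` is just a helper name made up for this post, and since the second clip is only known to be shorter than 5 seconds, its figure is a lower bound.

```python
def average_kbps(size_kb: float, seconds: float) -> float:
    """Average bitrate in kbit/s, assuming 1 KB = 1024 bytes."""
    return size_kb * 1024 * 8 / seconds / 1000


# First clip: about 30 seconds, 170 KB.
first = average_kbps(170, 30)

# Second clip: shorter than 5 seconds, 17 KB. Using the 5 second upper
# bound for the length gives a lower bound for the bitrate.
second_min = average_kbps(17, 5)

print(f"first clip:  about {first:.1f} kbit/s")
print(f"second clip: at least {second_min:.1f} kbit/s")
```

With these numbers the two clips have the same average bitrate if the second one is about 3 seconds long, so which file is "relatively" smaller depends on how short the second clip really is.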
The second file sounds pretty bad. If I listen to it via "xWMAPlay" (which can be found at http://www.din.or.jp/~ch3/down_e.html), it sounds like an 11 kHz WAV audio file from 1995. If I convert it to WAV and play it with the normal Windows Media Player (using http://www.nazizombies.com/phpBB3/viewtopic.php?f=11&t=2511#p21960), it sounds way better. You can still hear the compression, but it sounds brighter: now it's just people hissing, not people hissing into tin cans.

Why is this? As far as I can see, xWMAPlay makes direct use of the DirectX API, whereas Windows Media Player most probably uses other techniques.
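One thing that might help narrow this down is checking what the converted WAV file actually contains, for example whether it really is 11 kHz or something higher. The following is only a sketch using Python's standard `wave` module. It assumes the converted file is plain PCM WAV (the module does not read compressed WAV variants), and `describe_wav` is a hypothetical helper name, not part of any of the tools mentioned above.

```python
import sys
import wave


def describe_wav(path: str) -> dict:
    """Return the basic PCM parameters stored in a WAV file's header."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            # Guard against a zero sample rate in a damaged header.
            "duration_s": frames / rate if rate else 0.0,
        }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python describe_wav.py converted.wav
    print(describe_wav(sys.argv[1]))
```

If the reported sample rate is well above 11 kHz, the dull sound in xWMAPlay would point to a difference in playback rather than to the data in the file itself.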
So ... what now?
