WAV and soft-boiled eggs

From YobiWiki


We've seen in BMP_PCM_polyglot that we can merge a BMP and a WAV in a single file.
Here we will merge two 16-bit WAV files in a single 32-bit file.
Depending on which endianness you use to interpret the file, you will hear one WAV or the other: 16-bit audio has a maximal dynamic range of about 96 dB, so whatever sits in the 16 least significant bits of each 32-bit sample is below that level and cannot be heard.
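To make the idea concrete, here is a minimal sketch of a single 32-bit frame carrying two 16-bit samples. The sample values and variable names are made up for the illustration; only the byte layout matters.

```python
import math
import struct

# Dynamic range of n-bit linear PCM: 20 * log10(2 ** n) dB.
def dynamic_range_db(bits):
    return 20 * math.log10(2 ** bits)

# Two hypothetical 16-bit samples sharing one 32-bit frame.
loud_le = 0x1234   # dominant when the frame is read little-endian
loud_be = 0x0567   # dominant when the frame is read big-endian

# Bytes on disk: the big-endian sample first, then the little-endian one.
frame = struct.pack('>h', loud_be) + struct.pack('<h', loud_le)

as_le, = struct.unpack('<i', frame)   # 0x12346705
as_be, = struct.unpack('>i', frame)   # 0x05673412

print(round(dynamic_range_db(16), 1))        # 96.3
print(hex(as_le >> 16), hex(as_be >> 16))    # 0x1234 0x567
```

In each interpretation the other sample ends up in the 16 least significant bits, roughly 96 dB below full scale.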


Tribute_rogers_stamos.wav is such a WAV.
The sound quality is not great, but that comes from the original source, not from our manipulations.

Let's display its spectrogram and discover a first easter-egg:

sox Tribute_rogers_stamos.wav -n spectrogram

Tribute rogers stamos spectrogram1.png

Now if we change the endianness of the interpretation:

sox -t raw -r 44100 -c 1 -B -e signed -b 32 result.wav -n spectrogram -z 30 -Z -35

To do so, we have to interpret the WAV as a raw file and manually provide its characteristics (44.1 kHz, mono, 32-bit signed and, most importantly, big-endian).
The options -z and -Z are there to adjust the displayed dynamic range and get a more pleasant result.
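For readers who want to look at the samples without sox, the same raw big-endian reading can be sketched in Python. The helper name read_raw_be32 is mine, and the sketch simply decodes the whole file, header included, so the first few values are header bytes rather than audio.

```python
import struct

def read_raw_be32(path):
    """Return every complete 4-byte group of the file as a signed big-endian int."""
    with open(path, 'rb') as f:
        raw = f.read()
    count = len(raw) // 4
    # Trailing bytes that do not fill a 4-byte group are ignored.
    return list(struct.unpack('>%di' % count, raw[:count * 4]))
```

Calling read_raw_be32('result.wav') would then give the sample values that the big-endian spectrogram is built from, assuming the audio data starts at an offset that is a multiple of 4 bytes.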

Tribute rogers stamos spectrogram2.png


I used a short sequence from a TV news report I found on YouTube, about NSA Director Mike Rogers vs. Yahoo! on Encryption Back Doors:

youtube-dl "https://www.youtube.com/watch?v=s5GN1heBRLg"

I extracted the audio sequence starting at 1:50 and lasting 14 seconds:

mplayer -quiet Yahoo\,\ NSA\ battle\ over\ encryption\ access-s5GN1heBRLg.mp4 -ao pcm:fast:file=interview.wav -vc dummy -vo null -channels 2 -ss 1:50 -endpos 0:14

I made it mono, 44kHz, 16-bit signed and I normalized it:

sox interview.wav -r 44100 -c 1 -e signed -b 16 interview16.wav norm

Its spectrogram:
Tribute rogers stamos spectrogram orig.png
Even though the video contained 44 kHz audio, the spectrogram shows that the original source was only 22 kHz. It looks like we have plenty of room to draw, but remember that whatever you put below 18-20 kHz (depending on the listener and their age) can be heard, so we can only use the very highest frequencies.

I prepared a simple banner in GIMP, with grey text on a black background. Grey means a weaker audio signal, so less chance of it being audible.

Tribute rogers stamos you cant.png

I again used AudioPaint to convert it to a sound in the high frequencies, above most humans' hearing range:

  • open AudioPaint
  • File/Import picture... Tribute_rogers_stamos_you_cant.png
  • Audio/Audio settings... L/R=brightness/none minfreq=19000 maxfreq=22050 scale=linear duration=14
  • Audio/Generate...
  • File/Export Sound... Tribute_rogers_stamos_you_cant.wav
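The settings above can be illustrated with a naive additive-synthesis sketch: each picture row drives one sine wave, rows are spread linearly between minfreq and maxfreq, and brightness sets the amplitude. This is only my guess at the general principle, not AudioPaint's actual algorithm, and the function name and parameters are hypothetical.

```python
import math

def picture_to_samples(rows, duration=14.0, rate=44100,
                       min_freq=19000.0, max_freq=22050.0):
    """rows[0] is the top of the picture; each row lists brightness values in 0..1."""
    height, width = len(rows), len(rows[0])
    total = int(duration * rate)
    samples = []
    for n in range(total):
        t = n / rate
        col = min(n * width // total, width - 1)   # time runs left to right
        value = 0.0
        for r, row in enumerate(rows):
            # Linear scale: top row -> max_freq, bottom row -> min_freq.
            frac = 1.0 - r / (height - 1) if height > 1 else 0.0
            freq = min_freq + frac * (max_freq - min_freq)
            value += row[col] * math.sin(2 * math.pi * freq * t)
        samples.append(value / height)   # keep the sum within -1..1
    return samples
```

Note that in this naive version a row placed exactly at half the sample rate (22050 Hz at 44100 Hz) produces silence, since the sine is sampled at its zero crossings. Pure Python is also slow for a full 14 s picture, so try it with a short duration first.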

AudioPaint always generates stereo files but we used only the left channel, so we need to isolate it:

sox Tribute_rogers_stamos_you_cant.wav -c 1 Tribute_rogers_stamos_you_cant1.wav remix 1
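What remix 1 does here amounts to keeping the first sample of every interleaved frame. A minimal sketch on raw PCM bytes, assuming 16-bit stereo data with the header already stripped (the function name is mine):

```python
def left_channel(frames, sample_width=2, channels=2):
    """Keep the first channel of interleaved PCM frames (bytes in, bytes out)."""
    frame_size = sample_width * channels
    out = bytearray()
    # Walk over complete frames only; a truncated last frame is dropped.
    for start in range(0, len(frames) - frame_size + 1, frame_size):
        out += frames[start:start + sample_width]
    return bytes(out)
```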

Here is its spectrogram:
Tribute rogers stamos spectrogram msg.png

Now we can mix the interview WAV with the high-frequency one and encode the result on 32 bits:

sox -m interview16.wav Tribute_rogers_stamos_you_cant1.wav -b 32 result1.wav
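Conceptually, this step averages the two 16-bit signals and stores the result in the upper part of a 32-bit sample. The sketch below assumes a plain equal-weight average and no dithering; sox's exact scaling may differ, and the function name is hypothetical.

```python
def mix_to_32bit(a, b):
    """Mix two lists of 16-bit samples into 32-bit samples (equal-weight average)."""
    length = max(len(a), len(b))
    a = a + [0] * (length - len(a))   # pad the shorter input with silence
    b = b + [0] * (length - len(b))
    # (x + y) / 2 scaled by 2 ** 16 is (x + y) * 2 ** 15.
    return [(x + y) << 15 for x, y in zip(a, b)]
```

The useful signal lives in the most significant bits of each sample, which is why the 16 least significant bits can later be overwritten without audibly changing this mix.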

To hide the second spectrogram, choose a suitable image, convert it to a WAV of 14s with AudioPaint, now with no minfreq, and save the result as result2.wav.
Finally, we can merge the left channel of 16-bit stereo result2.wav into the 32-bit mono result1.wav with a few lines of Python:

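#!/usr/bin/env python3
from struct import unpack, pack
wav_in1 = 'result1.wav'  # 32-bit mono: interview + high-frequency banner
wav_in2 = 'result2.wav'  # 16-bit stereo: second picture, left channel used
wav_out = 'result.wav'

def split_wav(raw):
    # Assumes a plain RIFF/WAVE file: samples start 8 bytes after b'data'
    start = raw.index(b'data') + 8
    return raw[:start], raw[start:]

with open(wav_in1, 'rb') as wav1:
    header, data = split_wav(wav1.read())
with open(wav_in2, 'rb') as wav2:
    _, data2 = split_wav(wav2.read())
out = bytearray(data)
for i in range(len(data) // 4):
    if i * 4 < len(data2):
        # Swap spectrogram to Big-Endian
        left, = unpack('<h', data2[i*4:i*4+2])
        out[i*4:i*4+2] = pack('>h', left)  # overwrite the 16 low bits
with open(wav_out, 'wb') as wavout:
    wavout.write(header + out)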

And voilà!