WAV and soft-boiled eggs

From YobiWiki
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

Intro

We've seen in BMP_PCM_polyglot that we can merge a BMP and a WAV in a single file.
Here we will merge two 16-bit WAV files in a single 32-bit file.
Depending how you interpret the endianness of the file you'll hear one or the other WAV, considering that with a maximal dynamic range of about 96dB (16-bit resolution), whatever below that level cannot be heard.

Example

Tribute_rogers_stamos.wav is such a WAV.
Sound quality is not great but that's from the original source, not due to our manipulations.

Let's display its spectrogram and discover a first easter-egg:

sox Tribute_rogers_stamos.wav -n spectrogram

Tribute rogers stamos spectrogram1.png

Now if we change the endianness of the interpretation:

sox -t raw -r 44100 -c 1 -B -e signed -b 32 result.wav -n spectrogram -z 30 -Z -35

To do se you can see we've to interpret the WAV as a RAW file and provide manually its characteristics (44kHz, mono, 32-bit signed and, most important, big-endian).
The options -z and -Z are there to adjust the displayed dynamic range and get a more pleasant result.

Tribute rogers stamos spectrogram2.png

Making-of

I used a short sequence featured in a TV news I found on Youtube about NSA Director Mike Rogers vs. Yahoo! on Encryption Back Doors:

youtube-dl "https://www.youtube.com/watch?v=s5GN1heBRLg"

I extracted the audio sequence starting at 1:50 and lasting 14 seconds:

mplayer -quiet Yahoo\,\ NSA\ battle\ over\ encryption\ access-s5GN1heBRLg.mp4 -ao pcm:fast:file=interview.wav -vc dummy -vo null -channels 2 -ss 1:50 -endpos 0:14

I made it mono, 44kHz, 16-bit signed and I normalized it:

sox interview.wav -r 44100 -c 1 -e signed -b 16 interview16.wav norm

Its spectrogram:
Tribute rogers stamos spectrogram orig.png
Even if the video contained a 44kHz audio, we see that the original source was only 22kHz. It looks like we've plenty of room to draw but remember that whatever you put below 18-20kHz (depending on people and their age...) can be heard so we can only use the very high frequencies.

I prepared a simple banner in Gimp, with text in grey on black background. Grey means reduced audio signal, less chance for it to be audible.

Tribute rogers stamos you cant.png

I used again AudioPaint to convert it to a sound in the high frequencies, above most humans hearing range:

  • open AudioPaint
  • File/Import picture... Tribute_rogers_stamos_you_cant.png
  • Audio/Audio settings... L/R=brightness/none minfreq=19000 maxfreq=22050 scale=linear duration=14
  • Audio/Generate...
  • File/Export Sound... Tribute_rogers_stamos_you_cant.wav

AudioPaint always generates stereo files but we used only the left channel, so we need to isolate it:

sox Tribute_rogers_stamos_you_cant.wav -c 1 Tribute_rogers_stamos_you_cant1.wav remix 1

Here is its spectrogram:
Tribute rogers stamos spectrogram msg.png

Now we can merge the interview WAV with the high frequencies one and code it over 32 bits:

sox -m interview16.wav Tribute_rogers_stamos_you_cant1.wav -b 32 result1.wav

To hide the second spectrogram, choose a suitable image, convert it to a WAV of 14s with AudioPaint, now with no minfreq, and save the result as result2.wav.
Finally, we can merge the left channel of 16-bit stereo result2.wav into the 32-bit mono result1.wav with a few lines of Python:

#!/usr/bin/env python
 
from struct import unpack, pack
 
wav_in1 ='result1.wav'
wav_in2 ='result2.wav'
wav_out ='result.wav'

with open(wav_in1, 'rb') as wav1:
    w1=wav1.read()
header=w1[:w1.index('data')+8]
data=w1[w1.index('data')+8:]
with open(wav_in2, 'rb') as wav2:
    w2=wav2.read()
data2=w2[w2.index('data')+8:]
outdata=""
for i in range(len(data)/4):
    if i*4<len(data2):
        # Swap spectrogram to Big-Endian
        outdata+=data2[(i*4)+1:(i*4)+2]+data2[(i*4):(i*4)+1]
    else:
        outdata+='\x00\x00'
    outdata+=data[(i*4)+2:(i*4)+4]
with open(wav_out, 'wb') as wavout:
    wavout.write(header+outdata)

And voilà!