One of the great strengths of the Unix/Linux system is its ability to process many files with a single command using the shell (usually the BASH shell).
For example, every time I buy a CD, I copy it onto my system using cdparanoia, so that I have a backup of the CD, and so that I can listen to it on my computer. After doing this for a few years, I discovered flac, the Free Lossless Audio Codec, which compresses WAV files pretty well -- generally to about 70% of the original size. At first that doesn’t seem like too terribly much, but when you have ~100GB of data, getting ~30% of that space back is pretty exciting.
To flac all the WAV files in a single directory, all you have to do is:
flac *.wav
And even better, to flac all the WAV files in an entire directory tree (i.e. including subdirectories), it’s not much harder:
find /music/cds/ -type f -iname '*.wav' -exec flac "{}" \;
But what if, in addition to flac-ing all those WAVs, we’d like to make MP3s from them too? We run into a problem here because the best MP3 encoder there is ("lame") does not allow you to just say "*.wav" to encode all the files in a given directory. Instead, you must specify both the input filename (the WAV file) and the output filename (the MP3 file) for each operation. Since the find command uses "{}" to mean "the name of the current file," we can just do:
find /music/cds/ -type f -iname '*.wav' -exec lame "{}" "{}".mp3 \;
But since every "{}" will be something like foo.wav, that means our output MP3 files will all be named something like foo.wav.mp3, which is ugly and not ideal. What we’d really like to do is remove the .wav before adding the .mp3. That’s trivial using sed:
ls *.wav
foo1.wav
foo2.wav
foo3.wav
...

ls *.wav | sed 's!\.wav$!!'
foo1
foo2
foo3
...
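As an aside, BASH can also strip a suffix by itself using parameter expansion, with no sed involved. A minimal sketch, where foo1.wav is just a placeholder name:

```shell
# Strip a trailing ".wav" with the shell's own parameter expansion.
# "foo1.wav" is only a placeholder filename.
song="foo1.wav"
base="${song%.wav}"    # remove the shortest trailing match of ".wav"
echo "$base"           # prints foo1
echo "$base.mp3"       # prints foo1.mp3
```

This only helps once the filename is already sitting in a shell variable, which is not the case inside a plain find -exec.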
But now we have another problem: if we pass that sed command to the find command using -exec, the output is a STDOUT stream instead of a nice variable like "{}" that we can use and re-use within a single command. The obvious solution is to use a BASH for-loop (broken onto multiple lines here for readability):
for song in $(find /music/cds/ -type f -iname '*.wav' | sed 's!\.wav$!!'); do
    lame "$song.wav" "$song.mp3"
done
That’s a bit more work than "lame *.wav" but I think it’s still pretty straightforward at this point: you just have to remember for (list of songs); do (something); done. However, we now have another problem. The output of the command substitution "$()" above (which is equivalent to using backticks: `find...`) arrives in an inconvenient format: because it is unquoted, the shell splits it into words rather than into lines, so we get it one word at a time instead of one line at a time.
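To see the splitting in action without touching any real music, here is a small sketch. The list_songs function and both paths are made up for illustration; it just stands in for find and prints two lines, one of which contains spaces. It assumes $IFS still has its default value.

```shell
# Stand-in for find: prints two made-up paths, one per line.
list_songs() {
    printf '%s\n' '/music/cds/Some Album/track 01' '/music/cds/other/track02'
}

count=0
for song in $(list_songs); do
    count=$((count + 1))
done
echo "$count"    # prints 4, not 2: the first path was split into three words
```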
If none of your files or folders have spaces in their names, then this isn’t a problem, but that is increasingly unlikely nowadays. If your songs do have spaces in their filenames, then the solution is to change how the shell splits that output. This is controlled by the variable $IFS, which is set to spaces, tabs, and newlines by default. We’d like it set to just newlines, so we do:
export IFS=$'\n'
Now $(find...) will give us output the way we really want it, and we now have our complete solution for encoding multiple files in multiple directories from WAV to MP3 format (broken onto multiple lines here for readability):
export IFS=$'\n'
for song in $(find /music/cds/ -type f -iname '*.wav' | sed 's!\.wav$!!'); do
    lame "$song.wav" "$song.mp3"
done
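For comparison, here is one possible alternative that leaves $IFS alone: let find hand each filename to a tiny inner shell, which strips the extension with parameter expansion. This is only a sketch; the function name wav_to_mp3 and its argument order are my own invention, and the encoder is passed in as an argument so that options can be added later. One side benefit is that "${wav%.*}" also handles files ending in .WAV, which -iname matches but the case-sensitive sed pattern does not strip.

```shell
# wav_to_mp3 DIR ENCODER [OPTIONS...]
# Runs, for every WAV file under DIR:  ENCODER OPTIONS file.wav file.mp3
# (wav_to_mp3 is a hypothetical helper name for this sketch.)
wav_to_mp3() {
    dir=$1
    shift
    # find substitutes each path for {}; the inner shell receives it as $1
    # and the encoder command as the remaining arguments.
    find "$dir" -type f -iname '*.wav' -exec sh -c '
        wav=$1
        shift
        "$@" "$wav" "${wav%.*}.mp3"
    ' sh {} "$@" \;
}

# Example: wav_to_mp3 /music/cds/ lame
```

Because each filename travels as a single argument, spaces in names need no special handling here.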
A few notes in closing: the command "export IFS=$'\n'" affects only the current BASH instance (and any programs started from it), so if you’re running it in an Xterm within your GUI, you don’t have to worry about it affecting anything else. But if you’d like to reset it to the default even for just the current shell, use "export IFS=$' \t\n'" to set it back to space, tab, and newline.
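If you would rather not remember the default value, one common pattern is to save $IFS before changing it and put it back afterwards. A minimal sketch, using printf with two made-up paths in place of find:

```shell
old_ifs=$IFS     # remember the current value
IFS=$'\n'        # split on newlines only

n=0
for song in $(printf '%s\n' 'Some Album/track 01' 'other/track02'); do
    n=$((n + 1))
done

IFS=$old_ifs     # put the previous value back
echo "$n"        # prints 2: one iteration per line
```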
Note that both flac and lame leave the original WAV file intact and unchanged. For making MP3s this is probably what you want, since MP3 is a lossy compression format so you’d like to keep the full-quality WAV file around. But since flac is a totally lossless format -- that is, you can flac a WAV file and then use flac -d to decompress it and get the exact same original WAV file back -- you may want to delete the original WAV file once you’ve got the flac’d version of it. To do that, use the --delete-input-file option on the flac command line.
In fact, in all the flac and lame examples above, I omitted any options for the sake of simplicity. But normally I use the following options:
flac --best --replay-gain --delete-input-file foo.wav
This uses the best compression possible, enables the replay-gain feature to normalize the loudness of songs from different albums, and deletes the WAV file when finished creating the FLAC file. And for lame I use:
lame --nohist --preset standard foo.wav foo.mp3
This prevents lame from displaying the frequency histogram while encoding, and uses the "standard" preset to set the output file’s quality & size (uses VBR and gives high quality with only moderately high file sizes).
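Putting the two option sets together for a single directory might look something like the sketch below. The function name encode_here is hypothetical, and the ordering is my own choice: lame runs first, and flac runs only if lame succeeded, because --delete-input-file removes the WAV that lame needs.

```shell
# encode_here: make an MP3 and a FLAC from every WAV in the current directory.
# (encode_here is a made-up name for this sketch.)
encode_here() {
    for wav in *.wav; do
        # Skip the literal pattern "*.wav" when nothing matches.
        [ -e "$wav" ] || continue
        # MP3 first; flac only runs if lame succeeded, since it deletes the WAV.
        lame --nohist --preset standard "$wav" "${wav%.wav}.mp3" &&
            flac --best --replay-gain --delete-input-file "$wav"
    done
}

# Example: cd into an album directory, then run encode_here
```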