Find, For-loops, and Spaces in BASH

# Filed on May 8, 2006 by Anthony 10 replies

One of the great strengths of the Unix/Linux system is its ability to process multiple files at the same time using the shell (usually the BASH shell).

For example, every time I buy a CD, I copy it onto my system using cdparanoia, so that I have a backup of the CD, and so that I can listen to it on my computer.  After doing this for a few years, I discovered flac, the Free Lossless Audio Codec, which compresses WAV files pretty well -- generally to about 70% of the original size.  At first that doesn’t seem like too terribly much, but when you have ~100GB of data, getting ~30% of that space back is pretty exciting.

To flac all the WAV files in a single directory, all you have to do is:

flac *.wav

And even better, to flac all the WAV files in an entire directory tree (i.e. including subdirectories), it’s not much harder:

find /music/cds/ -type f -iname '*.wav' -exec flac "{}" \;

But what if, in addition to flac-ing all those WAVs, we’d like to make MP3s from them too?  We run into a problem here because the best MP3 encoder there is ("lame") does not allow you to just say "*.wav" to encode all the files in a given directory.  Instead, you must specify both the input filename (the WAV file) and the output filename (the MP3 file) for each operation.  Since the find command uses "{}" to mean "the name of the current file," we can just do:

find /music/cds/ -type f -iname '*.wav' -exec lame "{}" "{}".mp3 \;

But since every "{}" will be something like foo.wav, that means our output MP3 files will all be named something like foo.wav.mp3, which is ugly and not ideal.  What we’d really like to do is remove the .wav before adding the .mp3.  That’s trivial using sed:

ls *.wav

foo1.wav
foo2.wav
foo3.wav
...

ls *.wav |sed 's!\.wav$!!'

foo1
foo2
foo3
...

But now we have another problem: if we pass that sed command to the find command using -exec, the output is a STDOUT stream instead of a nice variable like "{}" that we can use and re-use within a single command.  The obvious solution is to use a BASH for-loop (broken onto multiple lines here for readability):

for song in $(find /music/cds/ -type f -iname '*.wav' |sed 's!\.wav$!!');
do lame "$song.wav" "$song.mp3";
done

That’s a bit more work than "lame *.wav" but I think it’s still pretty straightforward at this point: you just have to remember for (list of songs); do (something); done.  However, we now have another problem.  The output of the command substitution "$()" above (which is equivalent to using backticks: `find...`) arrives in an inconvenient format: because it is unquoted, the shell splits it into words rather than into lines, so the loop gets one word at a time instead of one line at a time.

If none of your files or folders have spaces in their names, then this isn’t a problem, but that is increasingly unlikely nowadays.  If your songs do have spaces in their filenames, then the solution is to change how the shell splits that output.  This is controlled by the variable $IFS, which is set to spaces, tabs, and newlines by default.  We’d like it set to just newlines, so we do:

export IFS=$'\n'
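
To see the difference, here is a small sketch that counts how many items a for-loop receives from the same two-line input under each setting.  The file names and the count_items helper are made up for illustration.

```shell
#!/bin/bash
# Two hypothetical file names, one per line, each containing spaces.
list=$'01 First Song.wav\n02 Second Song.wav'

# count_items is an illustrative helper: it prints how many items
# an unquoted expansion of $list is split into under the current IFS.
count_items() {
    local n=0 item
    for item in $list; do
        n=$((n + 1))
    done
    echo "$n"
}

IFS=$' \t\n'    # default: split on spaces, tabs, and newlines
count_items     # prints 6
IFS=$'\n'       # split on newlines only
count_items     # prints 2
IFS=$' \t\n'    # put the default back
```

With the default setting the loop sees six separate words; with IFS set to a newline it sees the two file names intact.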

Now $(find...) will give us output the way we really want it, and we now have our complete solution for encoding multiple files in multiple directories from WAV to MP3 format (broken onto multiple lines here for readability):

export IFS=$'\n';
for song in $(find /music/cds/ -type f -iname '*.wav' |sed 's!\.wav$!!');
do lame "$song.wav" "$song.mp3";
done
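
If you would rather not change IFS at all, one alternative (not the approach used above) is to have find separate names with NUL bytes and read them one at a time.  This is only a sketch: encode_tree is a hypothetical helper name, and it assumes a find that supports -print0 and a BASH whose read accepts -d ''.

```shell
#!/bin/bash
# Sketch: convert every WAV under a directory without touching IFS.
# encode_tree is an illustrative name; the first argument is the directory,
# the remaining arguments are the encoder command (for example: lame).
encode_tree() {
    local dir=$1; shift
    local song
    # -print0 ends each name with a NUL byte; read -d '' reads up to the
    # next NUL, so spaces (and even newlines) in names are preserved.
    find "$dir" -type f -iname '*.wav' -print0 |
    while IFS= read -r -d '' song; do
        # ${song%.*} strips the final extension, whatever its case.
        # stdin is redirected so the encoder cannot swallow find's output.
        "$@" "$song" "${song%.*}.mp3" < /dev/null
    done
}

# Usage: encode_tree /music/cds/ lame
```

The "IFS=" in front of read applies only to that one command, so the rest of the shell keeps its normal word splitting.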

A few notes in closing: the command "export IFS=$'\n'" only affects the current BASH instance and any programs started from it, so if you’re running it in an Xterm within your GUI, you don’t have to worry about it affecting anything else.  But if you’d like to reset it to the default even for just the current shell, use "export IFS=$' \t\n'" to set it back to space, tab, and newline.
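
Another way to avoid resetting IFS by hand is to make the change inside a subshell, so it disappears when the subshell exits.  A minimal sketch, where list_songs is a hypothetical helper that prints each WAV path without its extension:

```shell
#!/bin/bash
# Sketch: the function body is wrapped in ( ) instead of { }, so it runs
# in a subshell and the IFS change never reaches the calling shell.
list_songs() (
    IFS=$'\n'    # affects only this subshell
    for song in $(find "$1" -type f -iname '*.wav' | sed 's!\.wav$!!'); do
        printf '%s\n' "$song"
    done
)

# Usage: list_songs /music/cds/
```

After list_songs returns, IFS in your interactive shell is whatever it was before the call.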

Note that both flac and lame leave the original WAV file intact and unchanged.  For making MP3s this is probably what you want, since MP3 is a lossy compression format so you’d like to keep the full-quality WAV file around.  But since flac is a totally lossless format -- that is, you can flac a WAV file and then use flac -d to decompress it and get the exact same original WAV file back -- you may want to delete the original WAV file once you’ve got the flac’d version of it.  To do that, use the --delete-input-file option on the flac command line.

In fact, in all the flac and lame examples above, I omitted any options for the sake of simplicity.  But normally I use the following options:

flac --best --replay-gain --delete-input-file foo.wav

This uses the best compression possible, enables the replay-gain feature to normalize the loudness of songs from different albums, and deletes the WAV file when finished creating the FLAC file.  And for lame I use:

lame --nohist --preset standard foo.wav foo.mp3

This prevents lame from displaying the frequency histogram while encoding, and uses the "standard" preset to set the output file’s quality & size (uses VBR and gives high quality with only moderately high file sizes).

Comments:

01. Jul 15, 2006 at 11:02pm by Anonymous Coward:

Great script.  Using it myself to convert shn’s to flac’s (many thanks).  One tip, instead of using sed to strip the extension, use ’basename’.

02. Jul 16, 2006 at 12:11am by Anthony:

Thanks for the tip!  I’m familiar with basename for removing the path from an item, but I didn’t realize that it could remove the extension too.

03. Dec 7, 2006 at 03:09pm by Tom:

Thanks:)

04. Mar 16, 2007 at 04:45pm by Frans Hondeman:

Another nice way to get separate filenames from their extensions can be done in bash itself, removing the need for some external program:

filename="test.file.jpg"
...
EXT=${filename##*.} # gives "jpg"
BASE=${filename%.*} # gives "test.file"

05. Mar 28, 2007 at 09:15am by faruk:

hi,
i am new with bash script.
I want some help from u.

i want to sequencialy two fields from a file like that
for a in $1; b in $2 filename
do
echo $a, $b
done

my question is how can i use two ’in’ option in a for loop?

thanks ur above information

06. Jan 30, 2008 at 07:22pm by peter:

great script !!! :-)
But how do i remove the old wav file ?
something rm  , but i am new to scripting....
thanks Peter

07. Feb 1, 2008 at 04:50pm by AnthonyDiSante:

I think simply rm "$song.wav" should do it.

08. Feb 17, 2008 at 06:49pm by Dan:

You should read the man page on xargs and find, and look at the -0 and -print0 options respectively (also the -d option in xargs). Also, you don’t need sed to change the file extension; basename can be used for this too:  "$(basename "$f" .wav).mp3"

09. Jul 10, 2008 at 05:29pm by farnon:

could you explain the $'' quoting in the line

export IFS=$'\n'

I noticed export IFS="\n" doesn’t work, that sets the letter "n" as the delimiter. I’m a little confused.

10. Aug 16, 2008 at 07:19am by dunno:

Thank you very much. Even though my scripting is very limited i got it running! but i’d like to add for other noobs

at least on debian-systems...
TO CREATE A .sh-FILE FROM IT:
- copy and paste the 7th code-box into a text-file which you name convert.sh
- delete the directory IN the script "/music/cds/"
- goto the directory of the file in the console and type "chmod +x convert.sh" and press enter
- now move your convert.sh to any directory with .wav-files and run the script from console with "sh convert.sh"

this did it for me!

but this was the - by far - easier part i think ;-)  so thanks again!!
