There is a lot of power on the command line, but it can be difficult to master. I hope to give you a taste of that power with a survey of cool things you can do with the find
utility on most Unix-like systems.
Bash Power Commands for Mere Mortals
When you provide the -exec
switch to find you can process multiple files at once, in any way you like: resize images, rename mp3s, add entire directory trees to your workgroup. You name it and find -exec
can do it - well, name something you can do with bash :)
Actually Finding Stuff
NOTE: If you already know the basics of the common Linux command line find
utility, you can skip to the next section.
Find, simply put, finds files. If you ever forgot where you put your 1994 tax return, you could use find to find it. Like most commands in bash, all of its options are laid out for you in the manual page (see man find
). Also like most commands in bash, the options are quite arcane and so plentiful that the uninitiated are left rather bewildered. Let me help sort that out for you.
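First off, you’ll need to remember that the first thing you tell find is where to begin looking. If you don’t tell it where to begin the search, how can you ever expect to find anything in an efficient manner? (GNU find will quietly assume the current directory if you leave this out, but other versions insist on it, so it’s a good habit either way.) So, very simply, you could use find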
as a sort of alternate form of ls
like this
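As a rough sketch of that most basic form, here is find with nothing but a starting point. The sandbox directory and the file names are made up for illustration; they are not from the original post.

```shell
# Build a throwaway sandbox so the example is safe to run anywhere.
demo=$(mktemp -d)
mkdir -p "$demo/music" "$demo/taxes"
touch "$demo/music/foo.mp3" "$demo/taxes/1994.doc"

cd "$demo"
# "." means "start in the current directory"; find lists it and everything below.
listing=$(find . | sort)
printf '%s\n' "$listing"
```

Note that the folders themselves show up in the listing alongside the files.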
And find will dutifully list out all the files and folders it can get its paws on. This is the absolute most basic of basic ways to use find.
By Name
Let’s try something a little more useful. Next up, we have the -name
switch which will allow us to use glob patterns to search for files. You’re probably familiar with glob patterns from DOS and Bash already, things like ls *.jpg
to list all JPEG files. Well, let’s see what the equivalent looks like with find.
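A minimal sketch of a -name search, assuming a sandbox with a couple of made-up mp3 files. The one thing worth copying is the quoting: without the quotes, the shell may expand the glob itself before find ever sees it.

```shell
demo=$(mktemp -d)
touch "$demo/foo.mp3" "$demo/bar.mp3" "$demo/notes.txt"

# Quote the pattern so the shell hands it to find untouched.
mp3s=$(find "$demo" -name '*.mp3' | sort)
printf '%s\n' "$mp3s"
```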
This will only list out files that end with .mp3
. If you have other file types then you might want to try one of these:
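Here are a few more glob patterns you might try, again against made-up file names. These are illustrative guesses at useful patterns, not the commands from the original post.

```shell
demo=$(mktemp -d)
touch "$demo/a.jpg" "$demo/b.jpeg" "$demo/c.png" \
      "$demo/track01.mp3" "$demo/track02.mp3"

# *.jp*g matches both .jpg and .jpeg
jpegs=$(find "$demo" -name '*.jp*g' | sort)
# A bracket expression matches one character from a range.
tracks=$(find "$demo" -name 'track0[1-5].mp3' | sort)
# -o means "or"; the escaped parentheses group the two tests.
either=$(find "$demo" \( -name '*.png' -o -name '*.mp3' \) | sort)

printf '%s\n' "$jpegs" "$tracks" "$either"
```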
By Regular Expression
As you can see, the possibilities for file name matching are only limited by your imagination - well, your imagination and the limits of glob patterns. I personally find glob patterns to be rather limiting, and some people forget that glob patterns are not regular expressions because they look so similar. Luckily for us, find lets us use regular expressions as well. We can do the same searches as above, but with the -regex
switch instead of the -name
switch. One catch: -regex is matched against the whole path, not just the file name, so patterns usually need to begin with .* like so:
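A small sketch of the -regex switch, assuming a find that supports it (GNU and BSD find both do; it is not in older POSIX). The file names are invented. It also shows the most common surprise: the pattern has to match the entire path.

```shell
demo=$(mktemp -d)
mkdir "$demo/music"
touch "$demo/music/foo.mp3" "$demo/music/bar.mp3" "$demo/music/cover.jpg"

# The regex is tested against the whole path, so a bare file name finds nothing.
bare=$(find "$demo" -regex 'foo\.mp3')
# A leading .* soaks up the directory part of the path.
mp3s=$(find "$demo" -regex '.*\.mp3' | sort)

printf 'bare: [%s]\n' "$bare"
printf '%s\n' "$mp3s"
```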
Being Insensitive
Ok that’s nice, but seriously, I can’t remember the file name, so a case sensitive match doesn’t work for me! No problem, just use the case insensitive version -iname
. So if you wanted to find that 1994 tax return and you had some vague idea of what it was probably named, you might try something like this.
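Something along these lines would do for the tax return hunt. The file names are hypothetical, and -iname is assumed to be available (it is on GNU and BSD find).

```shell
demo=$(mktemp -d)
touch "$demo/1994 - State Tax Return.doc" "$demo/TAXES-1995.PDF" "$demo/holiday.jpg"

# -iname works like -name but ignores upper/lower case.
hits=$(find "$demo" -iname '*tax*' | sort)
printf '%s\n' "$hits"
```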
and -iregex
works about the same way
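One way to write such a search, as a sketch rather than the original command: two -iregex tests joined with -o, so that "tax" may come before or after "1994". Splitting it in two avoids depending on any particular regex dialect for alternation.

```shell
demo=$(mktemp -d)
touch "$demo/1994 - State Tax Return.doc" \
      "$demo/l33t hax0R taXeS from 1994.pdf" \
      "$demo/1995 receipts.txt"

cd "$demo"
# Each regex must match the whole path (./name), hence the .* on both ends.
hits=$(find . -type f \( -iregex '.*1994.*tax.*' -o -iregex '.*tax.*1994.*' \) | sort)
printf '%s\n' "$hits"
```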
This regex should find examples such as “1994 - State Tax Return.doc” or “l33t hax0R taXeS from 1994.pdf”
Doing Stuff With What You Find
First off, think of find
as a tool not for “finding” but for “choosing” what you want to work on. The power of the find command becomes more obvious when we start applying commands to each file. This means we can do things in batches: we could rename a bunch of files from .txt
to .doc
or we could use ImageMagick to resize a bunch of pictures, for example.
A Case Scenario
Let’s say we have some mp3s of Bobby McFerrin’s album “Don’t Worry Be Happy” and we want to add his name to the beginning of each mp3 file so that we’re more organized. If you do this through a GUI you’re going to be sitting there for a long time typing the same thing over and over again: clicking, waiting, typing, over and over. If you do this with the shell, you will merely have to think for a small moment to write the proper command, and then everything is done for you. If you don’t like the result, it’s just as easy to change things back to the way they were: you can either run the reverse command, or you can just delete the copy you were working on. You, um, were working on a copy… right? :P
I’m going to pretend that I have a folder named “Don’t Worry Be Happy” and that in it, I have several mp3 files from Bobby McFerrin. In my case I just have some example files named foo.mp3
and bar.mp3
etc. Now the first thing to do is to get find to find the right files. Let’s write an example find command and check its output to be sure that we’ll be operating on exactly the files we want to change. I will run something like this:
That seems reasonable, right? I think so too, so let’s look at the output.
Trimming Unwanted Folders
Uh oh, look at that, the find command is including the actual folder in the list of files it found! We don’t want to change the folder name, so let’s update this command to only find files.
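Here is a sketch of the difference, using a sandbox copy of the folder with the two example files. The exact commands from the original post are not preserved, so treat these as a plausible reconstruction.

```shell
demo=$(mktemp -d)
mkdir "$demo/Don't Worry Be Happy"
touch "$demo/Don't Worry Be Happy/foo.mp3" "$demo/Don't Worry Be Happy/bar.mp3"
cd "$demo"

# No tests at all: the starting folder itself is part of the result.
everything=$(find "Don't Worry Be Happy" | sort)
# -type f keeps regular files only; -type d keeps directories only.
files_only=$(find "Don't Worry Be Happy" -type f | sort)
dirs_only=$(find "Don't Worry Be Happy" -type d)

printf '%s\n' "$everything" "--" "$files_only" "--" "$dirs_only"
```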
And now we get
That’s much better. Of course, we could have also used a glob pattern with -name
or a regular expression, but I wanted to illustrate another handy feature of the find command, which is that you can tell it to only select files or only select directories. This can be handy for normalizing permissions, as you’ll see later.
Running Some Test Commands
Now let’s start getting something done with these files. Before we do anything real we’ll run a test command just so we can double check that our command is running the way we want it to.
To run a command on each file found, we use the -exec
switch.
NOTE: There is a small quirk about the -exec
switch: you must tell -exec
where the command ends. Most languages do this with a semi-colon ;
but bash will gobble up your semi-colon before giving it to -exec
because bash is greedy and bash thinks that you mean you want to run some other command after running find. To prevent bash from swallowing our semi-colon, we need to escape it with the backslash like this \;
and then the semi-colon will be given to -exec
and everyone is happy.
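A harmless test command might look like the sketch below, where echo just prints a label and the file name. The "Found:" label is my own invention.

```shell
demo=$(mktemp -d)
mkdir "$demo/Don't Worry Be Happy"
touch "$demo/Don't Worry Be Happy/foo.mp3" "$demo/Don't Worry Be Happy/bar.mp3"
cd "$demo"

# {} is replaced by each file name in turn; \; marks the end of the command.
out=$(find "Don't Worry Be Happy" -type f -exec echo Found: {} \; | sort)
printf '%s\n' "$out"
```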
and we get
As you can see, still not much is happening yet, but we are getting there. The main thing to notice is that {}
is the placeholder for the current file name. It means “put the current file name right here as if I had just typed it out”. Let’s do something a little more meaningful next: let’s actually put Bobby McFerrin’s name in front of each file.
Changing The File Name
NOTE: most people usually use forward slashes with sed for substitution, but I prefer to use #
by default because then you don’t need to escape your slashes. This is especially useful when dealing with file names.
Uh oh, what’s this?
Running Multiple Commands
Well, yes we do need to escape the semi-colon because of bash, but once we start getting a little more complicated, things break down again. The basic solution is to feed the entire string of commands to bash directly. Bash can be invoked with the -c
switch which does much the same thing as -exec
, namely it runs a command (“c” for “command”, get it?).
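The bash -c pattern can be sketched like this, assuming bash is installed and reusing the sandbox folder from before. The sed expression is a guess at the first attempt: it anchors on ^, which is the start of the whole path.

```shell
demo=$(mktemp -d)
mkdir "$demo/Don't Worry Be Happy"
touch "$demo/Don't Worry Be Happy/foo.mp3" "$demo/Don't Worry Be Happy/bar.mp3"
cd "$demo"

# The single-quoted script is one argument to bash -c.
# "_" fills the $0 slot, so the file name from {} arrives as $1.
out=$(find "Don't Worry Be Happy" -type f \
        -exec bash -c 'echo "$1" | sed "s#^#Bobby McFerrin - #"' _ {} \; | sort)
printf '%s\n' "$out"
```

Because the pipe now lives inside the quoted script, the outer shell never sees it and find gets a complete command to run.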
Doing this is certainly not as pretty, and instead of using {}
in our list of commands, we use bash as we would if we were writing a script, in other words we pull our current file name from the first argument provided to bash which is always stored in the $1
variable. We then provide {}
as the first argument to bash (the _
takes the place of the script filename since there isn’t one). Notice that the entire script is now wrapped in single quotes. We do this so that the script reaches bash -c
as a single argument, with $1 left untouched by the outer shell, and so that everything after it is handed to the script as its arguments.
Ok so what does that command give us?
Hmm, not quite what we wanted. If we had wanted to rename the entire directory we could have done that, no problem and that would have only taken a single command. The filename in this case is the entire file path, starting with the directory name, so simply using ^
in the regular expression to replace the beginning of the file name is not going to work, we’ll need to be more specific. Luckily, this is a very simple and easy change because I’m already using #
signs for the substitution so no escaping is needed, just change the ^
caret to a /
forward slash.
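With the caret swapped for a slash, a sketch of the test command looks like this. One assumption worth stating: sed replaces only the first slash on the line, so this works because the files sit exactly one level below the starting folder.

```shell
demo=$(mktemp -d)
mkdir "$demo/Don't Worry Be Happy"
touch "$demo/Don't Worry Be Happy/foo.mp3" "$demo/Don't Worry Be Happy/bar.mp3"
cd "$demo"

# Replace the first "/" with "/Bobby McFerrin - ", leaving the folder name alone.
out=$(find "Don't Worry Be Happy" -type f \
        -exec bash -c 'echo "$1" | sed "s#/#/Bobby McFerrin - #"' _ {} \; | sort)
printf '%s\n' "$out"
```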
Now this command gives us the following output
Awesome, much better, now to write up a full command to actually rename these files.
Testing The Real Thing
And that gives us
Refactoring
Let’s refactor just a touch, so our script is more friendly and easier to work with. Using single quotes to feed the command to bash means that whatever we put between those single quotes may span as many lines as we wish. Also, we’re getting happy with the quotes. Bash likes quotes: always quote more, not less, especially when you’re dealing with files that have spaces in their names.
now we get
Renaming For Real
Cool, let’s try this thing without a safety net. Go ahead and add in the move command - don’t forget your quotes. We’ll also put in an ls
before and after, so we can double check our results.
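Put together, the real rename could look something like the sketch below. It runs in a sandbox copy, the variable names old and new are my own, and the guard that skips already-prefixed files is an addition of mine so that a second run does no harm.

```shell
demo=$(mktemp -d)
mkdir "$demo/Don't Worry Be Happy"
touch "$demo/Don't Worry Be Happy/foo.mp3" "$demo/Don't Worry Be Happy/bar.mp3"
cd "$demo"

ls "Don't Worry Be Happy"          # before

find "Don't Worry Be Happy" -type f -exec bash -c '
  old="$1"
  # Skip files that already carry the prefix, so a rerun is harmless.
  case "${old##*/}" in "Bobby McFerrin - "*) exit 0 ;; esac
  new=$(echo "$old" | sed "s#/#/Bobby McFerrin - #")
  mv "$old" "$new"
' _ {} \;

ls "Don't Worry Be Happy"          # after
```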
And that gives us
A Better Alternative
Awesome! Er… kind of… so you mean I have to type out all that gook every time I want to rename some files? Luckily, no. There is a Perl utility called, oddly enough, rename and it makes renaming files like this a breeze. If you’re on Debian or Ubuntu, then you may already have rename
, which lives in /usr/bin/rename
. Older releases shipped it with the perl package, while newer ones put it in a separate package named rename. Be aware that the rename from the util-linux
package is a different, simpler tool with its own syntax. If the Perl version isn’t available you’ll need to build it yourself or download one of the many prebuilt versions you can find on the net.
Command Glue
There is a small but important choice that you have with -exec
, and that is whether to run separate commands or one big long command. If you use the semi-colon with -exec, a separate command will be run for each file. If you use the plus sign, one big command will be run with all the files being provided on one line to the command.
Separate
For example if you wanted to delete some files, you could do something like this
and if there were three files, foo, bar and baz, then behind the scenes it would look something like this
Together
If, however, you use a plus sign instead of the semi-colon (and you don’t need to escape the plus sign by the way)
then find will put the list of files all on one line and run only a single command, so behind the scenes it would look something like this
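To see the two forms side by side without deleting anything, here is a sketch that puts echo in front of rm, so each invocation prints the command line it would have run. The three file names are the ones from the text.

```shell
demo=$(mktemp -d)
touch "$demo/foo" "$demo/bar" "$demo/baz"

# With \; the command runs once per file: three "rm ..." lines.
separate=$(find "$demo" -type f -exec echo rm {} \;)
# With + the file names are batched: a single "rm ..." line listing all three.
together=$(find "$demo" -type f -exec echo rm {} +)

printf '%s\n' "$separate" "--" "$together"
```

Drop the echo once the printed commands look right.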
This only works if your list of files is the last argument. You can’t use this with the move command in its usual form, for example, because
don’t do this, this is bad mmmmkay?
just isn’t correct. So if you want to use +
then {}
must be the last argument to your command.
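If you still want the batching of + with a command like mv, one common workaround is to wrap it in a tiny sh -c script so that {} can stay last. This is a sketch under my own assumptions: the src and dest folders and the dest variable are made up.

```shell
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dest"
touch "$demo/src/foo" "$demo/src/bar"

# The script takes the destination as $1, shifts it away, and moves the rest.
# {} + stays at the very end, which is what find requires.
# (GNU mv also has "mv -t DEST", which allows: -exec mv -t "$demo/dest" {} +)
find "$demo/src" -type f \
  -exec sh -c 'dest=$1; shift; mv "$@" "$dest"' _ "$demo/dest" {} +

ls "$demo/dest"
```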
The main consideration here is that if you have a lot of files, it will probably be faster to run the command once, giving it all the file names, than it will be to run the command on every file individually.
Some Common Useful Stuff
So renaming was kind of pointless, except to learn the glorious nirvana of find, now what more can we ask for?
Normalize file and folder permissions.
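A sketch of what "relatively sane defaults" might mean: 755 for folders and 644 for files. Those two modes are my assumption, as is the made-up site folder, so pick whatever suits your setup.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/site/img"
touch "$demo/site/index.html" "$demo/site/img/logo.png"
chmod 777 "$demo/site/img"          # simulate wacky permissions
chmod 600 "$demo/site/index.html"

# Folders need the execute bit to be entered; plain files usually do not.
find "$demo/site" -type d -exec chmod 755 {} +
find "$demo/site" -type f -exec chmod 644 {} +

dir_mode=$(ls -ld "$demo/site/img" | cut -c1-10)
file_mode=$(ls -l "$demo/site/index.html" | cut -c1-10)
printf '%s\n%s\n' "$dir_mode" "$file_mode"
```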
Sometimes permissions get all wacky and if you want to set them back to relatively sane defaults you could use something like this.
Giving Ownership to Your Team.
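If you wanted to, for example, add a folder to your workgroup so that all members have and will continue to have access to the files in it, you may wish to set the setgid bit on its folders. This is often loosely called the sticky bit, but the real sticky bit only restricts who may delete files; it is setgid that makes new files inherit the folder’s group.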
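Here is a sketch of the whole recipe in a sandbox. It uses g+s (the setgid bit) on folders, which is the bit that makes new files inherit the folder's group. Your own primary group stands in for the team's group, and the project folder is invented.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/project/docs"
touch "$demo/project/docs/plan.txt"

team=$(id -gn)                      # stand-in for your real workgroup name

chgrp -R "$team" "$demo/project"    # hand the whole tree to the group
chmod -R g+w "$demo/project"        # let group members write
# Only folders get g+s, so new files created inside inherit the group.
find "$demo/project" -type d -exec chmod g+s {} +

ls -ld "$demo/project" "$demo/project/docs"
```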
The setgid bit (chmod g+s) is what provides this group inheritance, and it only behaves this way on folders, so we use the find command to specify only folders with -type d
. Granting write permissions is easily done with chmod
by itself using the -R
switch meaning recursive.
Resize a bunch of JPEGs
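As a dry-run sketch, the command below only prints the ImageMagick commands it would run, so it is safe to try even without ImageMagick installed. The 800x600 size and the file names are arbitrary choices of mine. Keep in mind that mogrify overwrites files in place, so work on a copy.

```shell
demo=$(mktemp -d)
touch "$demo/beach.jpg" "$demo/city.JPG" "$demo/notes.txt"

# echo turns this into a preview; remove it to really resize the images.
plan=$(find "$demo" -type f -iname '*.jpg' \
         -exec echo mogrify -resize 800x600 -strip {} \;)
printf '%s\n' "$plan"
```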
You can easily and quickly strip out identifying information like date, time or GPS coordinates from your pictures to protect your personal privacy with the -strip
switch.
Swap out sets of vhosts
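The switch could be sketched like this in a sandbox that imitates a sites-available / sites-enabled layout. The folder names, the "direct" prefix for the non-proxied vhosts and the blog.conf files are all hypothetical; only the "proxied" prefix comes from the text.

```shell
demo=$(mktemp -d)
avail="$demo/sites-available"
enabled="$demo/sites-enabled"
mkdir "$avail" "$enabled"
touch "$avail/proxied-blog.conf" "$avail/direct-blog.conf"
ln -s "$avail/proxied-blog.conf" "$enabled/"   # start in "proxied" mode

# 1. Remove the symlinks that point at proxied vhosts.
find "$enabled" -type l -name 'proxied*' -exec rm {} \;
# 2. Symlink every non-proxied vhost into the enabled folder.
find "$avail" -type f -name 'direct*' -exec ln -s {} "$enabled/" \;

ls -l "$enabled"
```

Swapping the two name patterns switches back the other way.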
Let’s say you’d like to be able to switch back and forth between Apache HTTPD as your main web server, and putting Apache behind a reverse proxy with nginx while you learn about reverse proxies and nginx and get your configuration set up just the way you want it. If you make two sets of vhosts and have the names for each type begin with some unique identifier like “proxied” then you can easily use find to switch back and forth between the two different vhost configurations.
To switch between the proxied and non-proxied styles of Apache all you need to do is delete the symlinks to the proxied vhosts and re-symlink the non-proxied vhosts. This is what the above example accomplishes.
Cover Photo
Credit: Markus Spiske Iar