Regular Expressions

Perl Compatible Regular Expressions provide flexible and powerful pattern matching. You can get started with a few basic concepts. As an example, let's say you have many photos of a family reunion with filenames like "2010-April-30 Family Reunion Granny Smith.jpg" and you want to create two custom tags for "subject" and "people". This was a very large reunion that ran from April 28 to May 2, so the filenames contain several different dates.

The following examples use quotes around text to be taken literally. Do not include the quotes when entering text in a cell or the "Find what:" or "Replace with:" text boxes.

The first step is to select and copy all the filenames to the "people" column. The "subject" column is easy, as every cell will be "Family Reunion 2010". Just enter "Family Reunion 2010" in the top cell, copy it, select the remaining cells in the column, and paste.

The "people" column is more difficult. The filename contains the names you want in the tag, but you only want the name of the person in the picture, not the date or anything else in the filename. Two simple search-and-replace operations could replace ".jpg" and "Family Reunion" with nothing, but the date is more complicated as there are several different dates.

Looking at all the filenames reveals a pattern: <date><" Family Reunion "><name><.jpg>. Regular Expressions simplify matching such patterns.

After checking the "Expression" box in the "Replace" dialog, "." becomes a wildcard matching any single character (letter, digit, punctuation, etc.). A "+" following another character means match one or more of the previous character. Searching for ".+" by itself would match the entire filename. Fortunately, the "+" character is a team player: it gives back characters as needed so that whatever comes next in the expression can also match. Searching for ".+ Family Reunion " therefore matches everything up to the name we want to keep. If you want to search for an actual ".", precede it with "\", sometimes called an escape character.
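The matching behaviour described above can be tried outside the dialog. The following is a minimal sketch using Python's "re" module, on the assumption that it treats ".", "+" and "\." the same way as the Perl-compatible engine behind the "Replace" dialog (true for these basic constructs, but not guaranteed for every feature).

```python
import re

# Sample filename taken from the example above.
filename = "2010-April-30 Family Reunion Granny Smith.jpg"

# ".+" by itself matches the entire filename.
whole = re.search(r".+", filename)

# Followed by literal text, ".+" gives back characters until the
# rest of the expression can match as well.
prefix = re.search(r".+ Family Reunion ", filename)

# Replacing the matched prefix with nothing leaves name and extension.
remainder = re.sub(r".+ Family Reunion ", "", filename)

# An escaped "\." matches only a literal period.
without_ext = re.sub(r"\.jpg", "", remainder)

print(whole.group(0))
print(prefix.group(0))
print(remainder)
print(without_ext)
```

The last two steps mirror two successive "Replace" operations with an empty "Replace with:" box.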

Square brackets "[" and "]" enclose a set of characters, any one of which will match a single character in the text you're searching. "[a0%]" would match either one "a", one "0" or one "%". Use "-" to indicate a range. "[A-Z]" matches any one upper-case letter. "[A-Z0-9\-]" matches any one upper-case letter, digit, or "-". Note the "\" before the "-", which means search for an actual "-" character rather than a range. Search for "[ \-_]" to match common word separators. "[A-Z][a-z]+" would match an entire word that starts with an upper-case letter and continues with lower-case letters.
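The character sets above can be checked the same way. This is a small sketch in Python's "re" module, again assuming its bracket syntax behaves like the Perl-compatible engine for these cases; the sample strings other than the filename are made up for illustration.

```python
import re

filename = "2010-April-30 Family Reunion Granny Smith.jpg"

# "[a0%]" matches one "a", one "0" or one "%" at a time.
any_of_three = re.findall(r"[a0%]", "a10% cab")

# "[A-Z]" matches any one upper-case letter.
capitals = re.findall(r"[A-Z]", "Granny Smith")

# "[A-Z0-9\-]" matches one upper-case letter, digit, or literal "-".
date_chars = re.findall(r"[A-Z0-9\-]", "2010-April-30")

# "[ \-_]" matches a common word separator; here it is used to split.
# "Family_Reunion" is a hypothetical variant with an underscore.
words = re.split(r"[ \-_]", "2010-April-30 Family_Reunion")

# "[A-Z][a-z]+" matches whole words that start with a capital letter.
capitalized = re.findall(r"[A-Z][a-z]+", filename)

print(any_of_three, capitals, date_chars, words, capitalized, sep="\n")
```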


In the previous example, we saw how to remove unwanted text and keep the rest. Regular Expressions also allow choosing what to keep and removing the rest. Enclosing text in parentheses "(" and ")" makes everything inside a group. Now we can search for ".+ Family Reunion (.+)\.jpg". Note the "\." to specify we want the actual "." character before the "jpg" file extension. The name we want to keep is now grouped inside the parentheses. Since it's the first group, it is group 1. In the "Replace with:" text box, enter "\1" to indicate the first group. The entire filename is then replaced with just the name.
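The group-and-replace step can be sketched as follows in Python's "re" module, assuming its "\1" back-reference works like the one in the "Replace with:" box. Only the "Granny Smith" filename comes from the example; the other filenames are hypothetical.

```python
import re

# Everything before the name is matched but not kept; the name is group 1.
pattern = r".+ Family Reunion (.+)\.jpg"

# Hypothetical filenames covering several reunion dates.
filenames = [
    "2010-April-28 Family Reunion Uncle Bob.jpg",
    "2010-April-30 Family Reunion Granny Smith.jpg",
    "2010-May-2 Family Reunion Cousin Ann.jpg",
]

# Replace each whole filename with the contents of group 1.
people = [re.sub(pattern, r"\1", name) for name in filenames]

for person in people:
    print(person)
```

Because the expression matches the whole filename regardless of which date it starts with, a single replace handles every file.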

You can easily experiment with Regular Expressions using the "Find" button to highlight matched text. If you're satisfied with the match, click "Replace".

There's much more in the Perl Compatible Regular Expressions documentation.