Data-driven Categories

Data-driven categories use metadata and variables to automatically organize your files by things like camera model, lens, photographer, country, location, persons shown, band, author, ...

You may know similar concepts from other software under names like Smart Collections or Intelligent Collections.
But IMatch data-driven categories go way beyond that.


The unique data-driven category concept of IMatch allows you to categorize your files automatically based on EXIF, IPTC, XMP, ID3 and virtually all other metadata and Attributes IMatch maintains in the database. Some typical uses for data-driven categories are:

How Data-driven Categories Work

When you set up a data-driven category for a metadata field (tag), IMatch looks at all the different values stored for this tag in your database. For each unique value found, IMatch creates a child category under the data-driven category and assigns the files having that value to this category.

Data-driven categories are by default dynamic. If you add or remove files or you change metadata all you need to do is to refresh the data-driven category to make it represent the current contents of your database.
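The grouping step described above can be pictured with a short Python sketch. It only illustrates the idea and is not how IMatch implements it; the `files` list and the `group_by_tag` helper are hypothetical stand-ins for the database.

```python
from collections import defaultdict

def group_by_tag(files, tag):
    """Map each distinct value of `tag` to the names of the files that carry it.

    Files without a value for the tag are skipped (no 'Other' bucket here).
    """
    groups = defaultdict(list)
    for f in files:
        value = f["tags"].get(tag)
        if value:
            groups[value].append(f["name"])
    return dict(groups)

# Hypothetical sample data standing in for the database.
files = [
    {"name": "a.jpg", "tags": {"Make": "Canon"}},
    {"name": "b.jpg", "tags": {"Make": "Nikon"}},
    {"name": "c.jpg", "tags": {"Make": "Canon"}},
    {"name": "d.jpg", "tags": {}},
]

categories = group_by_tag(files, "Make")
for make, names in sorted(categories.items()):
    print(make, names)
```

Refreshing the category corresponds to running the grouping again on the current data.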

Example

Suppose you manage an image collection with images taken with a wide variety of cameras from different vendors. You want to set up a category hierarchy with one category for each brand, something like this:

Brand and Model
 |-- Canon
 |-- Nikon
...

If you set up a data-driven category for this, IMatch does the following:

Multi-level Data-driven Categories

IMatch supports data-driven categories with up to 6 levels, which offers you a great deal of flexibility. A multi-level data-driven category is based on two or more different metadata values. Each value produces one level in the resulting category hierarchy.

Extending our example from above by including a second level, with the camera models used:

Brand and Model
 |-- Canon
   |-- Model 1
   |-- Model 2
 |-- Nikon
   |-- Model 1
   |-- Model 2
   |-- Model 3
...

If you set up a data-driven category for this, IMatch does the following:

More Examples

Another example would be a category hierarchy for your MP3 music collection. On the top-level you could use the Artist tag, below that the Album tag and below that the Title tag. The resulting category hierarchy gives you direct access to each MP3 file by artist, album and track title.

Or, if you fill out the XMP metadata Country, City and Location tags for your files, you can set up a data-driven category with country on level one, and below that the cities for each country and below that the locations within each city.
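A multi-level category can be thought of as nested grouping, one level per tag. The following Python sketch illustrates this under simplifying assumptions: the sample `files` and the `build_tree` helper are hypothetical, and files missing a value on any level are simply left out.

```python
def build_tree(files, tags):
    """Build a nested dict with one level per tag; leaves are lists of file names."""
    tree = {}
    for f in files:
        values = [f["tags"].get(t) for t in tags]
        if not all(values):
            continue  # a file missing any level is left out in this sketch
        node = tree
        for v in values[:-1]:
            node = node.setdefault(v, {})
        node.setdefault(values[-1], []).append(f["name"])
    return tree

# Hypothetical sample data.
files = [
    {"name": "1.jpg", "tags": {"Country": "USA", "City": "Daytona", "Location": "Beach"}},
    {"name": "2.jpg", "tags": {"Country": "USA", "City": "Daytona", "Location": "Speedway"}},
    {"name": "3.jpg", "tags": {"Country": "UK", "City": "London", "Location": "Tower Bridge"}},
    {"name": "4.jpg", "tags": {"Country": "UK", "City": "London"}},
]

tree = build_tree(files, ["Country", "City", "Location"])
print(tree)
```

The order of the tags determines the order of the levels, so the same helper would also cover an Artist, Album, Title hierarchy.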

Creating a Data-Driven Category

Performance Tip:
IMatch has to update categories visible in the Category View or the Category Panel often. This may reduce performance if you create many complex data-driven categories.
Create all your data-driven categories under a common parent category. You can then collapse the parent category, hiding all data-driven categories to avoid unnecessary updates.

For this example, we create a data-driven category for camera make and camera model. The resulting category hierarchy will have two levels.

Start by creating a new category and give it a meaningful name. For our example, we add the category directly below @All and name it Make and Model.

Click on the category to select it and make sure the property panel (below the tree) is visible. In the property panel, click on the row labeled Data-driven and then click on the ... button to open the dialog box for the data-driven category configuration.

Alternatively, use the corresponding command from the context menu of the category: Advanced Commands > Data-driven Properties...

Configure the First Level: "Camera Make"

In the Edit Data-driven Category dialog, click on the row labeled Tag and click the ... button to open the Tag Selector dialog box.

We use the Make tag from the Standard tab for this example:

You can also use non-standard tags by using the search function in the tag selector dialog box.

Double-click on this tag to add it to the result list and close the Tag Selector with OK. The selected tag will be inserted in the Tag row of the data-driven category dialog box:

Make sure the Enabled option is set to Yes to make this level active.

The OK and Preview buttons will become available when at least one level is enabled.

Using the Preview

This is all you have to do to define the first level of your data-driven category. To see the results of this level, click the Preview... button. This opens a new window which retrieves the data from the database and outlines it in a tree control. This is a great way to check the results and the settings you have made.

The small database used for this example contains images taken with cameras from 7 different vendors. Your results will vary.

Close the Preview dialog box to return to the Data-driven Category editor.

Configuring the Second Level: "Camera Model"

Data-driven categories support up to six levels. You can select the level you want to configure via the Level drop-down control at the top of the dialog. A level is only used for the data-driven category if you set the Enabled property to Yes. This property is the first in the property grid.

You can temporarily disable a level by setting the Enabled property to No. The level retains its settings so you can re-enable it later at any time.
There must always be at least one enabled level. You cannot close the dialog with OK when all levels are disabled.

In the Level drop-down control at the top of the dialog, switch to Level 2.

Enable the level by setting the Enabled property to Yes. This level will use the camera model metadata tag as the data source. Using the same steps as above, select the Model tag via the tag selector, again from the Standard tag list.

To see the result of the settings so far, click on the Preview... button again. The result for our sample database looks like this:

Your results will vary. Our sample database contains images from different photographers using different types of equipment. The Preview dialog is a great way to quickly see the results of your data-driven category, without forcing IMatch to actually create the categories in the database.

If your database contains many files (50,000 or more), generating the preview may take a few seconds.
You can close the Preview dialog at any time via the Close button even if IMatch is still loading and processing data.

Final Steps

The data-driven category is now complete. Click OK to close the Data-driven Category editor.

IMatch now stores the data-driven definition in the category Make and Model and then performs the database operations to produce the actual category hierarchy. Depending on the size of your database and the performance of your computer this may take a few seconds. When the operation has completed, the data-driven category will become visible in the Category View. It should look similar to this:

The Make and Model category has two levels of child categories, one for each level in the data-driven definition you've just created.

Sample Categories Shipped With IMatch

When you create a database, IMatch automatically adds a set of sample data-driven categories. You find these under the IMatch Sample Categories category:

The sample categories shipped with IMatch. Expanded is the data-driven category Location, with child levels for Country, City and Location. This data is extracted directly from the GPS location info in your image files. See the Map Panel for more info.

There are sample categories for images (ISO, Lens, Location, Make and Model), MP3 files (Artist, Album) and PDF documents (Author, Title). Use these sample categories to get some ideas about what can be done with data-driven categories in IMatch.

What is 'Other' and Why Is It Important?

The special Other element is important when you work with incomplete or partially missing metadata.

Back to our sample Make and Model category. What happens if an image has no make or model? Where do these files show up in the data-driven category?

You may want to enable the Other element on this or the 1st level to handle files without Make or Model information. The Other element will gather these files and assign them to a child category named 'Other' (or any name you choose). See the information on the Other property below for details. Other is basically a bucket which collects all files without a Make.

You should enable Other when you expect that not all files have a value on that level and you want to include these files in the data-driven category. If there is no Other, files without a value will not be included in the category. If you have 1000 image files, but only 200 have an entry in Make, the data-driven category will show only 200 files, unless you use Other to collect all files without a Make. Which may be useful, or not.

When You Must Use Other

Consider the following example: You set up a data-driven category with three levels:

Country
  City
    Location

IMatch retrieves the country, city and location metadata tag for each file in the database and assigns the files to their respective categories. To do this, it starts at the lowest level (Location) and works its way upwards to the Country level. A file with the values Country:USA, City:Daytona, Location:Beach can be mapped easily:

USA
  Daytona
    Beach

But what happens for a file which has no location value? Consider a file with Country:UK, City:London but no Location. IMatch cannot create a node on the Location level for this file because it has no location. This file will thus not be included in the data-driven category!

But if you enable Other for the Location level, IMatch puts the file into that category. From there it can go up to the next higher level (City) and from that to Country. The file will thus end up in

UK
  London
    Other

This Other element 'collects' all files without a value on the Location level. And then these files will be rolled up to the top-level based on their City and Country values. If a file has no value for City, it will be assigned to the 'Other' element on the City level and so on, up to the top level.

The general rule is:

If you expect that some of your files have no value for one of the levels in your data-driven category, enable the option to use an 'Other' element. Unless you want to explicitly suppress these files.
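The effect of the rule can be illustrated with a small Python sketch. It is a simplified model, not the IMatch implementation: the `category_path` helper is hypothetical, and it walks the levels from top to bottom rather than bottom-up, which yields the same path for a single file.

```python
def category_path(file_tags, levels, other_levels=(), other_name="Other"):
    """Return the category path for one file, or None if the file is dropped.

    `levels` is the ordered list of tags; `other_levels` names the levels
    on which the 'Other' element is enabled.
    """
    path = []
    for tag in levels:
        value = file_tags.get(tag)
        if not value:
            if tag not in other_levels:
                return None  # no value and no 'Other': the file is not included
            value = other_name
        path.append(value)
    return path

levels = ["Country", "City", "Location"]
london = {"Country": "UK", "City": "London"}  # no Location value

print(category_path(london, levels))                             # file is dropped
print(category_path(london, levels, other_levels={"Location"}))  # file is kept
```

With Other enabled on the Location level, the London file ends up under UK, London, Other; without it, the file does not appear in the category at all.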

Refreshing Data-driven Categories

When you make changes to your database, a data-driven category may become pending. This means it may no longer reflect the actual contents of your database and needs to be refreshed. Pending data-driven categories use a special icon:

A stale data-driven category

To refresh the category, select the category and press Shift+F5 or choose Refresh Data from the context menu. To refresh all data-driven categories at once press Shift+Ctrl+F5 or use the corresponding command from the context menu of any data-driven category.

When enabled under Edit > Preferences > Background Processing, IMatch automatically refreshes pending data-driven categories in the background, when it is idle.

Filtering

Data-driven categories may need to process a massive amount of data. Consider a database with 100,000 files. To find all files for a given camera Make, IMatch needs to load the metadata for 100,000 files, look at each Make value, create a list of all unique values for Make (Nikon, Canon, Sony, Panasonic, ...) and which files have that value. Then it has to repeat all that for the camera Model level. That's a lot of data to move around.

Especially for large databases with hundreds of thousands of files, reducing the number of files used for a data-driven category can be a real performance boost.

File Formats

In many cases you can reduce the number of files to process by restricting the category to a certain file format (or several). For example, when you are creating a data-driven category which uses MP3 tags like Artist and Album, you can reduce the files to process by setting the file formats filter to include only files with the .mp3 extension:

Limiting a data-driven category to selected file formats.

If you use this filter, IMatch calculates the contents of the level like for all other data-driven categories, but considers only files matching one of the given file extensions. This is very useful if you only want to analyze certain files, e.g. PDF documents, MP3 files, or images taken with a certain camera.

If you create data-driven categories which analyze specific maker note tags or other metadata that exists only in specific files (e.g. your RAW files), use the file formats filter to restrict the category only to files in that format.
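As a rough illustration of the file formats filter, the Python sketch below parses a semicolon-separated extension list and keeps only matching files. The helper names are hypothetical, and the case-insensitive comparison is an assumption, not documented IMatch behavior.

```python
from pathlib import PurePath

def parse_formats(spec):
    """Turn a list like '.jpg;.tif' into a set of lower-case extensions."""
    return {ext.strip().lower() for ext in spec.split(";") if ext.strip()}

def filter_by_format(file_names, spec):
    """Keep only files whose extension is listed; an empty spec keeps all files."""
    wanted = parse_formats(spec)
    if not wanted:
        return list(file_names)
    # Assumption: extensions are compared case-insensitively.
    return [n for n in file_names if PurePath(n).suffix.lower() in wanted]

names = ["song.mp3", "photo.JPG", "scan.tif", "doc.pdf"]
print(filter_by_format(names, ".mp3"))
print(filter_by_format(names, ".jpg;.tif"))
```

Filtering first means the grouping only has to look at the remaining files, which is where the performance gain comes from.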

Category Filter

Even more useful is the Category Filter. It allows you to limit the data-driven categories to files assigned to one or more other categories. For example, you may want to create a data-driven category based on the Title tag, but only for files assigned to your Family category or its children. This gives you a data-driven category which groups your images based on Title, but only includes family photos.

Another typical example would be to limit a data-driven category to files with one or more specific keywords. Or to files created for a customer. Or files in one or more motive categories...

The Category Filter takes one or more category names/paths or regular expressions, separated by semicolons. The supported syntax is the same as used in @Category formulas, which gives you a great deal of flexibility and control. For example:

@All|Family

All files in the Family category and its children.

Claudia

All files from all categories containing the word 'Claudia'.

If you use more than one category in this filter (or your filter matches more than one category), the files in these categories are combined with the Boolean OR operator. This means that the result contains all files from all matching categories.

See the description of the @Category formula for more information about regular expressions and how to address specific categories in your category hierarchy.
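The OR combination can be sketched in Python as a union of file sets. This is a simplified model with hypothetical names: it matches each pattern anywhere in the category path and does not model how @Category formulas treat the | path separator or child categories.

```python
import re

def files_matching(categories, filter_spec):
    """Return the union of the files in every category whose path matches any pattern."""
    patterns = [p for p in filter_spec.split(";") if p]
    result = set()
    for path, files in categories.items():
        if any(re.search(p, path) for p in patterns):
            result |= files  # Boolean OR: add the files of each matching category
    return result

# Hypothetical category paths and their assigned files.
categories = {
    "Family|Claudia": {"1.jpg", "2.jpg"},
    "Friends|Claudia B": {"2.jpg", "3.jpg"},
    "Vacation|2019": {"4.jpg"},
    "Work": {"5.jpg"},
}

print(sorted(files_matching(categories, "Claudia")))
print(sorted(files_matching(categories, "Claudia;Vacation")))
```

A file assigned to several matching categories appears only once in the result.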

Data-driven Categories Based on Variables

When you switch the Based on property from Tag to Variable, you can enter a variable expression into the Variables property.

See Variables for information about IMatch variables.

Using variables as the basis for data-driven categories gives you an extra set of tools to work with. For example you can create data-driven categories based on IMatch Attributes, or using data not directly accessible via metadata tags.

A word of caution: Data-driven categories based on variables can be very slow to process. IMatch can create metadata-based data-driven categories very efficiently on the database level. Parsing variables is a much more complex process and thus much slower. Variable-based data-driven categories may be 10-100 times slower than normal data-driven categories.

You may want to switch automatic updates off for variable-based categories. And maybe use a File Format filter or Category Filter to reduce the number of files to process.

When To Use Variables

Despite the performance penalty, variable-based data-driven categories may be useful to solve specific problems. You can create data-driven categories based on Attributes, for example. Or combine the values of multiple metadata tags to form the expression used to calculate the data-driven category. Sometimes using a variable together with one or more formatting functions is what you need to automatically group your files.

If you switch the automatic update off for data-driven categories you seldom need, they don't affect performance of IMatch at all.

Example: Data-driven category based on Attributes

In the Attributes help topic we use an Attribute Set for tracking submissions (of images to photo agencies). Each image that has been submitted to one or more agencies has an attribute record with the information about the client, the date and some other data.

To create a data-driven category which shows us the files submitted to each client, we use the variable corresponding to the Client Attribute:

We switch the Based on option to Variable and use the ... button in the Variable row to select the variable we want to use. This opens the standard Variable Selector dialog:

We switch the Automatic Update option to Off. This category is now not automatically updated in the background. We don't need it all the time and we can always refresh it with Shift+F5 when we want to update it.

After refreshing the category manually, we get this result for our sample database:

For each client used in our Attributes, IMatch has created a matching category, with all files submitted to that client. Cool. We can now see immediately which files have been submitted, and to which of our clients.

The Other category created in this example shows us all files which have not been submitted. If you don't need this information, you can improve the performance by switching the Use 'Other' element option to No.

Advanced Variable Features

Data-driven categories support variable expressions consisting of multiple variables, free text and also all variable formatting functions available in IMatch. See the Variables help topic for more information.

Use the Var Toy App to try out your variable expressions.

Properties of Data-driven Categories

You can configure the data-driven category via the Properties panel in the Category View like other categories. Some properties are not available for data-driven categories, e.g., you cannot set a formula.

Child categories of a data-driven category inherit the color settings of their parent category. If you color the data-driven category, all dynamically created child categories will use this color too by default.

Automatic Name Cleaning

Some characters are reserved and cannot be used in category names. This includes the " | @ characters. When creating category names from data values IMatch automatically replaces unsupported characters with an underscore _

Use the Replace Mask feature to replace invalid characters with a character of your liking. You can also use this feature to provide more easily understood names in place of abbreviated names, e.g., 'Motorola' instead of 'MB885'.
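A minimal Python sketch of the automatic name cleaning might look like this. It assumes that only the three characters named above are reserved (the real list may be longer), and the `clean_name` helper is hypothetical.

```python
# Assumption: only the characters mentioned in the text are treated as reserved.
RESERVED = '"|@'

def clean_name(value, replacement="_"):
    """Replace each reserved character in a data value with `replacement`."""
    return "".join(replacement if ch in RESERVED else ch for ch in value)

print(clean_name('user@home|"x"'))
```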

Advanced Data-driven Category Features and Options

So far we have covered the most basic features of data-driven categories. These are already sufficient for many cases. But there is much more to discover. Read on.

IMatch data-driven categories support a range of advanced features for data processing, transformation and cleanup. These features can be configured via the Data-driven Category Editor. Expand all property groups in the grid to see all available features and settings.

Chances are that you'll never need many of the advanced features explained below.
But if you have to deal with mixed quality metadata or your files come from different sources, you'll find that the cleanup and filter functions are indispensable.

Basic Settings

Enabled

If this is set to Yes, the level is enabled and participates in the data-driven category generation.
You must have at least one enabled level in your data-driven category.

Automatic Update

If this is On, the category is kept up-to-date automatically. IMatch refreshes it in the background when needed, during idle time.
Categories are only refreshed in the background when the corresponding option is on in Edit > Preferences > Background Processing. If this is Off, you need to manually refresh the category using Shift+F5 or the corresponding command from the context menu.

The properties panel in the Category View displays the status of this option. This way you can always tell if a data-driven category is automatically updated or not.

Based on

This option controls whether the category is based on a Metadata Tag or a Variable.

Tag

The metadata tag which is used to fill the category level.
To retrieve the data for the category, IMatch looks at all unique (distinct) values for that tag in the database.

Some metadata fields can have thousands or even tens of thousands (!) of different values.
Creating data-driven categories with thousands of child categories may cause a slower performance of IMatch.
Use the Preview dialog to check the data stored for the tag before you let IMatch generate the actual categories.

Variable

Enter the variable expression to use for this category. See Data-driven Categories Based on Variables above for more info.

File Formats

This option enables you to restrict the contents of the data-driven category to files in selected formats. By default, data-driven categories use all files in the database as input. To limit the input to one or more file formats, specify the extensions of these file formats, separated by ; For example, using .mp3 in this field limits the data-driven category to MP3 files only. With .jpg;.tif the data-driven category considers only JPEG and TIFF files.

You can use this feature if you are creating data-driven categories using metadata only available for certain formats, for instance when you use metadata tags like MP3 artist on the first level. If you process all files and enable the Other option, Other on the first level includes not only all MP3 files without an artist tag, but also all other files in the database. Restricting the category to MP3 files solves this problem.

Category Filter

This filter allows you to limit the files to consider for this category based on one or more other categories.
See Category Filter above for more information.

Language

Some metadata tags support multiple languages. If you want to use only data for one language as input for your data-driven category, you can choose that language here. IMatch lists only the languages in use in your database.

Keep empty categories

This option is only available for the special @Keywords category. A data-driven category contains one child category for each unique value in the corresponding tag. If the tag value no longer exists, the corresponding category is removed as well when the data-driven category is refreshed.

The @Keywords category has one child category for each keyword used by at least one file in your database. If you remove a keyword from all files, the corresponding category is also removed. This may not be what you expect, because you will no longer be able to add the keyword to files by assigning files to that category. You can just add the keyword to a file in the Keyword Panel and the category will re-appear. But it is often easier to keep empty child categories of @Keywords around. This option allows you to control that.
When you toggle auto grouping on and off, this option can cause duplicate child categories. In this case, temporarily disable this option, refresh the category, then re-enable the option.

Formatting

Data Type

Usually the Automatic setting is best. IMatch uses the metabase to determine the data type of a tag and perform the proper formatting to convert it into a category name. Under some conditions you may want to force IMatch to treat a value as Text, Integer or Real data type. See also Numeric Ranges below.

Use RAW value

By default, this setting is set to 'No'. In this case IMatch uses the formatted metadata value as returned by ExifTool. This is the correct setting for most tags. For some special purposes, and for tags which have a RAW value, you can set this option to 'Yes'. IMatch then uses the RAW value as contained in the file, without the additional formatting or mapping provided by ExifTool.

Values like EXIF orientation or shutter speed are numerical values. The formatted values like "top-left" or "1/300" are calculated and provided by ExifTool. A formatted shutter speed of "1/125" has a RAW value of 0.008, for example. It sometimes may be useful to use the RAW values directly.
If you enable this, you should set Data Type to Text to avoid any additional formatting and to keep the RAW value 'as-is'.

Trim

If you enable this option, IMatch removes blanks (spaces) and tab characters from the beginning and end of each value.

Character Mapping

If this is enabled, IMatch maps characters containing diacritics to their base characters. For example: 'hôtel' is mapped to 'hotel', 'canapé' to 'canape' and 'educación' to 'educacion'.
If you have data with variant spellings for the same word, this makes it possible to fold all files into one category. Instead of one category for hôtel and another one for hotel, you now can have only one category for both terms.
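One common way to map characters with diacritics to their base characters is Unicode decomposition. The Python sketch below shows that technique; it is an illustration only, and IMatch may use a different mapping internally.

```python
import unicodedata

def strip_diacritics(text):
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

for word in ["hôtel", "canapé", "educación"]:
    print(word, "=>", strip_diacritics(word))
```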
Use 'Other' element

If you work with multi-level data-driven categories, IMatch can generate so-called 'Other' categories to collect files which have no value on a level.

In our Make and Model example, you may have images without data in the Make tag. Not all cameras write this tag or the metadata may have been stripped from the file. If your database has 1000 files, only 700 may have information in the Make tag. So, on the Make level you will see only 700 files, split into categories named like "Nikon", "Canon", "Hasselblad" etc. The remaining 300 files are not accounted for because they don't have data in the Make tag.

If you enable the "Other" option, IMatch creates an additional category named "Other" which holds the missing 300 files without Make metadata. The name of this 'Other' category can be set with the option explained below.

You may also run into this in multi-level data-driven categories. If you have Make on Level 1 and Model on Level 2, there may be files without a Model on level 2. Enabling Other for this level will include these files as well.

Name for 'Other'

This property allows you to set the name IMatch uses for the 'Other' category. In our above example you could change this from the generic 'Other' to something with more meaning, e.g. "Images without Make info".

If the name you choose here is part of the actual data, IMatch will take care that the name is unique by adding a number.

Part of Value

If you want to use only a part of the value found in the database, enable this option and use Start and Length to specify the first character to use and the number of characters to use from each value.

Start and Length

Two numeric values, separated with a comma. The first number is the index of the first character to use (starting at 1) and the second number specifies the number of characters to use. Set the second number to 0 to indicate "all remaining characters".

Example

1,5

Start with the first character and use up to 5 characters of each value.
Beach => Beach
Motorway => Motor
Vacation => Vacat

6,0

Start with the 6th character and use all remaining characters.
_DSC_20100923 => 20100923
Motorway => way
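
The Start and Length rule can be expressed as a small Python function. This is a sketch with a hypothetical helper name; it reproduces the examples above.

```python
def part_of_value(value, start, length):
    """Return `length` characters beginning at the 1-based index `start`.

    A length of 0 means "all remaining characters".
    """
    begin = start - 1  # convert the 1-based index to Python's 0-based index
    if length == 0:
        return value[begin:]
    return value[begin:begin + length]

for value, start, length in [("Motorway", 1, 5), ("_DSC_20100923", 6, 0)]:
    print(value, "=>", part_of_value(value, start, length))
```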
Unify

Unify Spelling

If you want to unify the spelling of the data, use one of the options offered by this property.
You can change all values to lower-case, upper-case or first letter upper-case.

Word Boundaries

If you choose the option first letter upper-case, this property controls how IMatch detects word boundaries. Usually this is a space character, sometimes a tab character. Sometimes even a carriage return and linefeed - depending on the type of data you work with. To use multiple word boundary characters, separate them with ;

In order to specify "non-printable" characters, IMatch supports some special keywords for this property:
{tab} for the tab character (0x09)
{cr} for carriage return (0x0D, decimal 13)
{lf} for line feed (0x0A, decimal 10)

Example
{tab}; ;-;{lf}
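
To show how such a boundary list could drive the first letter upper-case option, here is a Python sketch. The helper names are hypothetical, and lower-casing the remaining letters of each word is an assumption about the behavior.

```python
# Keywords for non-printable characters, as listed above.
KEYWORDS = {"{tab}": "\t", "{cr}": "\r", "{lf}": "\n"}

def parse_boundaries(spec):
    """Turn a list like '{tab}; ;-;{lf}' into a set of boundary characters."""
    return {KEYWORDS.get(part, part) for part in spec.split(";") if part}

def capitalize_words(value, boundaries):
    """Upper-case the first letter after each boundary, lower-case the rest."""
    out = []
    at_start = True
    for ch in value:
        out.append(ch.upper() if at_start else ch.lower())
        at_start = ch in boundaries
    return "".join(out)

boundaries = parse_boundaries("{tab}; ;-;{lf}")
print(capitalize_words("new york-city", boundaries))
```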
Add Auto-group

Enabled

Enable this option if you want to include an automatic grouping level in your data-driven category.
Usually this option is used when you have only one level in your category and this level has hundreds or even thousands of values.

If auto-grouping is enabled, IMatch uses a part of the value (specified by Start and Length below) to add another level above the level created using the data from the database.

Example

You want to create a data-driven category for keywords (XMP dc:subject tag). Since you have used hundreds of different keywords in your database, you want to group them by the first character for better handling. You enable auto-grouping and use a start and length value of 1,1.
The output produced by IMatch then looks like this:

Your Data-driven Category
|-A
| |-Above
| |-Active
|-B
| |-Baby
| |-Banana
...
|-Z
| |-Zoom

The level above the actual keyword categories is automatically added by IMatch.
Using the first character of each keyword, IMatch creates the categories "A", "B", ... "Z" etc. and below that creates the categories for each keyword starting with that letter.

Adding such an extra level is a great help to further structure data-driven categories with hundreds or even thousands of child-categories.

Start and Length

Specify the index of the start character (use 1 for the first character) and the number of characters to use for the autogroup.
A good value is usually 1,1 because it limits the number of categories in the autogroup to about 25.
But you can also use something like 1,2 or even 1,3 to autogroup by the first two or three characters in each value.

Use Preview to try different combinations. You can close the Preview dialog box even when IMatch is still retrieving data.
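The auto-group level can be pictured as one more grouping step on a part of each value. The following Python sketch uses a hypothetical `auto_group` helper and the keywords from the example above.

```python
from collections import defaultdict

def auto_group(values, start=1, length=1):
    """Group values by the substring given by 1-based `start` and `length`."""
    groups = defaultdict(list)
    for v in sorted(values):
        key = v[start - 1:start - 1 + length]
        groups[key].append(v)
    return dict(groups)

keywords = ["Above", "Active", "Baby", "Banana", "Zoom"]
print(auto_group(keywords, 1, 1))
print(auto_group(keywords, 1, 2))
```

A longer key produces more, smaller groups, which is why 1,1 is usually a good starting point.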

Value Splitting

Enabled

Enable this option if the values contain more than one "element". This option instructs IMatch to split the source value into multiple values using a given separator character. The actual child categories are then created from these multiple values.

Example

The keywords in your images are written in multiple languages, and you have used a comma to separate them:

Beach, Strand
Mountain, Berg
Female, Weiblich

This or a similar scheme is often used with IPTC because IPTC has no support for multiple languages.

If you enable the value splitting and set the comma (,) as the separator, IMatch will split the above values into

Beach
Strand
Mountain
Berg
Female
Weiblich

and create the child categories from these values. A file with the keyword "Beach, Strand" will be assigned to both the "Beach" and the "Strand" category.

Repeatable Values

Another use for value splitting are values containing lists of elements, e.g. variables like {File.Persons.Label}:
Tom;Paula;Frank;Susan

Variables with repeatable values return the values separated with a semicolon ;. To split them, you need to enter ~; as the splitting character.

Because the semicolon is used to separate multiple splitting characters, it must be escaped with ~ to use it literally.

Separators

Enter the separator used in your values. Separate multiple separators with a semicolon (;).
If you have used ; as the separator in your data, enter ~; to indicate that.
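
The separator list and its ~; escape can be modeled with the Python sketch below. The helper names are hypothetical, and trimming whitespace around each piece is an assumption made so the output matches the example above.

```python
import re

def parse_separators(spec):
    """Split a separator list on ';', treating '~;' as a literal semicolon."""
    parts = re.split(r"(?<!~);", spec)
    return [p.replace("~;", ";") for p in parts if p]

def split_value(value, separators):
    """Split `value` on any of the separators and drop empty pieces."""
    if not separators:
        return [value]
    pattern = "|".join(re.escape(s) for s in separators)
    # Assumption: surrounding whitespace is trimmed from each piece.
    return [piece.strip() for piece in re.split(pattern, value) if piece.strip()]

print(split_value("Beach, Strand", parse_separators(",")))
print(split_value("Tom;Paula;Frank;Susan", parse_separators("~;")))
```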
Hierarchy

Detect Hierarchies

Enable this option if the metadata values contain hierarchical information and you want to use this to create additional levels in your categories.

For example, if you have used the IMatch 3 script which writes IMatch categories into IPTC keywords, the metadata in your files contains entries like:

Location.Beach.Daytona
Location.Beach
Vehicle.Car

The script writes the full path of each category into the IPTC data and separates the levels with a dot ("dot-notation").

If you use Adobe Lightroom or Adobe Bridge in your workflow, your XMP metadata may contain a proprietary Adobe namespace with (among other data) a dc:hierarchicalKeywords metadata tag. This tag contains the hierarchical tags you have used in these products. These entries look similar to this:

SUBJECT|Location|Beach|Daytona
SUBJECT|Location|Beach
SUBJECT|Transport|Car

The full hierarchy of each tag is included in the entries, using | as the separator.

Use the Preview... button in the Data-driven Category Editor to see what your files contain and which separator was used.

IMatch allows you to use such hierarchical information when you create a data-driven category. IMatch will automatically re-create the hierarchy contained in metadata values, producing as many levels as needed under the data-driven category.

The above examples will produce IMatch category trees like this:

[Your Data-driven Category]
|-Location
  |-Beach
    |-Daytona
|-Vehicle
  |-Car

or for the Adobe dc:hierarchicalKeywords example:

[Your Data-driven Category]
|-SUBJECT
  |-Location
    |-Beach
      |-Daytona
  |-Transport
    |-Car
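
How such entries turn into a tree can be sketched in Python. This is an illustration under stated assumptions, not the algorithm IMatch uses: the function names build_tree and render are hypothetical, and the tree is modelled as nested dictionaries.

```python
def split_levels(value: str, separators: list[str]) -> list[str]:
    """Split one metadata value into its hierarchy levels."""
    levels = [value]
    for sep in separators:
        levels = [part for level in levels for part in level.split(sep)]
    return [level for level in levels if level]


def build_tree(values: list[str], separators: list[str]) -> dict:
    """Merge all values into one nested dict; each key is a category name."""
    tree: dict = {}
    for value in values:
        node = tree
        for level in split_levels(value, separators):
            # Reuse an existing child or create it, then descend one level.
            node = node.setdefault(level, {})
    return tree


def render(tree: dict, indent: int = 0) -> list[str]:
    """Format the tree in the same style as the listings above."""
    lines = []
    for name, children in tree.items():
        lines.append(" " * indent + "|-" + name)
        lines.extend(render(children, indent + 2))
    return lines


entries = [
    "SUBJECT|Location|Beach|Daytona",
    "SUBJECT|Location|Beach",
    "SUBJECT|Transport|Car",
]
print("\n".join(render(build_tree(entries, ["|"]))))
```

Because "SUBJECT|Location|Beach" is a prefix of the first entry, it adds no new node. Only distinct paths produce additional categories.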

Other software products use different schemas to embed hierarchical keywords or tags in files. Most of these schemas can be handled easily with the hierarchy detection implemented in IMatch.

Usually you don't need to create your own categories for hierarchical keywords. This is handled automatically by IMatch via the special @Keywords category.

Separators: Enter the separator used in the values to separate levels in the hierarchy. Use ; to separate multiple separators.
Replace and Filter
The Replace and Filter features allow you to clean up the values found in the database and limit the data-driven category to values matching a pattern or numerical range.
In addition, you can perform replacements to exchange parts of values or entire values, e.g. to remove unwanted characters from the input values or to correct misspellings.

Both the Filter and Replace features rely on regular expressions. See Regular Expressions for detailed information.

Replace Mask: This mask allows you to replace text with other text. A mask consists of the source pattern, a comma, and the replacement text. Both the source pattern and the replacement text are specified using a regular expression.

Examples

Bech,Beach
replaces all occurrences of "Bech" with "Beach". Ideal to correct typos in the metadata.
You can also use a mask to shorten values before creating categories from them. For example:
^Central Processing Unit,CPU
This mask replaces all occurrences of "Central Processing Unit" at the start of a value with "CPU".

You can also process multiple replacement masks. Just separate them with a semicolon (;). If a ; is part of your mask, use ~; instead.

\.,-;^_DSC,

This mask has two parts. The first part replaces all occurrences of "." with "-". The dot is escaped as \. because an unescaped . matches any character in a regular expression.
The second part removes "_DSC" at the start of a value by replacing it with nothing. If your values contain digital camera file names, this is a great mask to clean them up.
It is important that you don't add unwanted blanks (spaces) before or after the ; because these will be considered part of the mask. Just add the ; to separate the masks, without any blanks.

To replace a string with nothing, use an empty replacement:
_DSC,
replaces all occurrences of _DSC with an empty string, so "_DSC00001.RAW" will become "00001.RAW".
Case-sensitive: This property controls whether the replacements are case-sensitive.
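
To make the mask format concrete, here is a minimal Python sketch. It is not IMatch's implementation: parse_masks and apply_masks are hypothetical names, Python's re module stands in for the regular expression engine IMatch uses (the dialects may differ), and splitting each mask at its first comma is an assumption.

```python
import re


def parse_masks(spec: str) -> list[tuple[str, str]]:
    """Split 'pattern,replacement;pattern,replacement' into pairs."""
    placeholder = "\x00"  # protects escaped semicolons ('~;') while splitting
    masks = []
    for part in spec.replace("~;", placeholder).split(";"):
        if not part:
            continue
        part = part.replace(placeholder, ";")
        # Assumption: the first comma separates pattern and replacement.
        pattern, _, replacement = part.partition(",")
        masks.append((pattern, replacement))
    return masks


def apply_masks(value: str, spec: str, case_sensitive: bool = True) -> str:
    """Apply every mask in order to one metadata value."""
    flags = 0 if case_sensitive else re.IGNORECASE
    for pattern, replacement in parse_masks(spec):
        value = re.sub(pattern, replacement, value, flags=flags)
    return value


print(apply_masks("Bech at sunset", "Bech,Beach"))
print(apply_masks("_DSC00001.RAW", "_DSC,"))
# The dot is escaped here because Python's re treats a bare '.' as "any character".
print(apply_masks("_DSC00001.RAW", r"\.,-;^_DSC,"))
```

The masks are applied one after another, so a later mask sees the output of the earlier ones.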
Filter Pattern: With a filter pattern you can restrict the input values to values matching the filter pattern. This allows you to control which values will be included in your data-driven category.

Examples
Nikon;Canon
This pattern lets only values pass that contain the text "Nikon" or "Canon".


^Arch.*
This pattern filters out all values that do not start with "Arch".
Case-sensitive: This property controls whether the filters are case-sensitive.
Invert Filter: Set this to Yes to invert the result of the Filter Pattern. Only values not matching the Filter Pattern will be used to produce the data-driven category.
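
The interplay of Filter Pattern, Case-sensitive and Invert Filter can be sketched as follows. This is a hedged illustration, not IMatch code: passes_filter is a hypothetical name, Python's re module is used in place of IMatch's regular expression engine, and treating ;-separated patterns as alternatives follows from the "Nikon;Canon" example.

```python
import re


def passes_filter(value: str, patterns: str,
                  case_sensitive: bool = True, invert: bool = False) -> bool:
    """Return True if the value should be used for the data-driven category."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # A value matches if any of the ;-separated patterns is found in it.
    matched = any(re.search(p, value, flags) for p in patterns.split(";") if p)
    # Invert Filter flips the decision: only non-matching values pass.
    return matched != invert


for value in ["Nikon D850", "Canon EOS R5", "Sony A7"]:
    print(value, passes_filter(value, "Nikon;Canon"))
```

Calling passes_filter("Sony A7", "Nikon;Canon", invert=True) returns True, which corresponds to setting Invert Filter to Yes.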
Numeric Ranges: If you force the data type for the values to a numeric value (see Data Type above), you can specify one or more numerical ranges to include only certain values. The format is: lower,upper. You can specify multiple ranges using ; as a separator.

1,10
Lets only values pass which are in the range 1 to 10.

-20,-10
Filters out all values below -20 or above -10.

1,100;400,1000
Filters out all values that are neither between 1 and 100 nor between 400 and 1000.
Invert Ranges: Set this option to Yes to invert the numeric range. Only values which do not match the numeric range specified above will pass the filter.
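
A short Python sketch of the range check, for illustration only. The names parse_ranges and in_ranges are hypothetical, and treating both bounds as inclusive is an assumption; the text above does not state how IMatch handles values exactly on a boundary.

```python
def parse_ranges(spec: str) -> list[tuple[float, float]]:
    """Parse 'lower,upper;lower,upper' into a list of (lower, upper) pairs."""
    ranges = []
    for part in spec.split(";"):
        if not part.strip():
            continue
        lower, upper = part.split(",")
        ranges.append((float(lower), float(upper)))
    return ranges


def in_ranges(value: float, spec: str, invert: bool = False) -> bool:
    """Return True if the value passes the numeric range filter."""
    # Assumption: both bounds are inclusive.
    inside = any(lower <= value <= upper for lower, upper in parse_ranges(spec))
    # Invert Ranges flips the decision: only values outside all ranges pass.
    return inside != invert


for value in [50, 200, 500]:
    print(value, in_ranges(value, "1,100;400,1000"))
```

With the ranges 1,100;400,1000 the value 200 is filtered out, while it passes when invert=True is given.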

Converting Data-driven Categories

You can convert a data-driven category into a normal category using the Convert to normal category command available in the Advanced context menu. This command brings up a dialog box where you can choose how to convert the category: