
Use Google’s Gemma 3 Model for Best AutoTagger Results

Last week, Google released a new AI model named Gemma 3. The AI community quickly made the new model available for Ollama and LM Studio—both of which are supported by IMatch AutoTagger.

What Is IMatch AutoTagger?

IMatch AutoTagger enables you to automatically create descriptions, keywords, and traits for your image files with the help of AI. Fully integrated into IMatch, AutoTagger allows you to use existing data like persons or location data to provide context and guide the AI—often resulting in significantly improved descriptions and keywords.

For a list of supported AIs in AutoTagger, ranging from Google to Microsoft to Mistral and OpenAI, see AI Service Providers in the IMatch help system. AutoTagger supports many classic and LLM-based AIs and doesn’t force you to use a specific product or vendor. It also supports the AI runners Ollama and LM Studio, which allow you to run powerful AI models on your own computer without incurring costs or raising privacy concerns.

Using Gemma 3 with Ollama

To use the new Gemma 3 model with IMatch AutoTagger, press Windows + X, then R to open the Run dialog, and enter this command to download and install the Gemma 3 model:

ollama run gemma3:4b

Alternatively, if your graphics card has at least 12 GB of VRAM, download and install the larger Gemma 3 version for even better results:

ollama run gemma3:12b
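
If you want to confirm that the download worked before switching to IMatch, you can ask the local Ollama server which models it has. The following is a minimal Python sketch, not part of IMatch. It assumes Ollama is running on its default address (http://localhost:11434) and uses Ollama's /api/tags endpoint; the function names are our own and purely illustrative.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address and port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_response: dict) -> list[str]:
    """Extract the model names from a parsed /api/tags response."""
    return [entry["name"] for entry in tags_response.get("models", [])]


def is_installed(tags_response: dict, wanted: str) -> bool:
    """Return True if `wanted` (e.g. "gemma3:4b") is in the response."""
    return wanted in model_names(tags_response)


def fetch_tags(url: str = OLLAMA_TAGS_URL) -> dict:
    """Ask the local Ollama server which models it has downloaded."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)
```

With Ollama running, is_installed(fetch_tags(), "gemma3:4b") should return True once the download has completed.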

Using Gemma 3 with LM Studio

To install Gemma 3 in LM Studio, go to the Discover page and search for “Gemma 3.” Install either the 4B or 12B version, depending on the amount of VRAM available on your graphics card.

Screenshot of LM Studio, searching for the Gemma 3 model.

Support for Gemma 3 in IMatch AutoTagger

IMatch 2025.2.4, released on March 19, 2025, already includes definitions for the new Gemma 3 model. You can select the 4B and 12B Gemma models for both Ollama and LM Studio in the AutoTagger settings.

What are the Benefits of Gemma 3 for IMatch AutoTagger?

Compared to previously available models like LLaVA and Llama Vision, our tests show that Gemma 3 produces significantly better descriptions and keywords, demonstrates greater awareness of landmarks and places, and adheres much better to writing-style instructions in prompts.

Gemma 3 Supports Multiple Languages

Both prompts and responses now support multiple languages. You can write AutoTagger prompts in your preferred language (instead of English), request that the responses be delivered in a specific language, or both. While not always perfect, the results are good enough for most situations.

Consider this example: we used the same prompt twice, requesting the description, headline, and keywords in English (left) and in German (right). We asked AutoTagger to include the names of the people in the image in the description, using person data available in IMatch. This is one of the many unique features IMatch AutoTagger offers.

Screenshot of an IMatch File Window. Two identical files have been processed by AutoTagger: one with the description, headline, and keywords in English, the other in German.
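
To try a language request outside of IMatch, you can send a prompt and an image straight to Ollama. The sketch below is only an illustration and says nothing about how AutoTagger talks to the model internally. It assumes Ollama's default address and its /api/generate endpoint; the prompt wording and the function names are our own assumptions.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is listening on its default local address and port.
OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_request(image_bytes: bytes, language: str, model: str = "gemma3:4b") -> dict:
    """Build the JSON body for a one-shot image description request."""
    prompt = (
        "Describe this image in two or three sentences. "
        f"Write the description in {language}."
    )
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for one complete answer instead of a token stream.
        "stream": False,
    }


def describe(image_path: str, language: str) -> str:
    """Send the image to the local model and return the generated text."""
    with open(image_path, "rb") as image_file:
        payload = build_request(image_file.read(), language)
    request = urllib.request.Request(
        OLLAMA_GENERATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)["response"]
```

Calling describe() twice on the same file, once with "English" and once with "German", gives a rough equivalent of the comparison shown in the screenshot.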

Gemma 3 Handles More Complex, Structured Prompts

Our initial experiments show that the Gemma 3 model can handle more complex prompts, enabling us to generate, for example, hierarchical keywords. Consider this prompt:

Return up to 20 keywords describing this image.
Create the following keywords:
1. identify the hair color and create a keyword in the following format: "hair color|[hair color description]"
2. identify the eye color and create a keyword in the following format: "eye color|[eye color description]"
3. identify the hair style and create a keyword in the following format: "hair style|[hair style description]"
4. Estimate the age of the person and create a keyword in the following format: "age group|[age description]"
5. identify the apparel and create a keyword in the following format: "apparel|[apparel description]"
6. identify the facial expression and create a keyword in the following format: "facial expression|[facial expression]"

This prompt is specifically designed to produce keywords related to a person’s appearance, clothing, and facial expression. It generates hierarchical keywords like hair color|brunette and facial expression|smiling, which are much easier to integrate with your existing thesaurus and controlled vocabulary.
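
To illustrate why this format is convenient, here is a small Python sketch that groups such keywords by their parent level. It is not IMatch code; the function name is hypothetical, and it assumes the "|" separator used in the prompt above. Keywords that come back without a separator are collected under an empty parent so they are easy to spot.

```python
from collections import defaultdict


def group_keywords(keywords: list[str], separator: str = "|") -> dict[str, list[str]]:
    """Group "parent|child" keywords by parent.

    Only the first separator is used for splitting, so deeper levels
    stay inside the child part. Keywords without a usable separator
    are stored under the empty string.
    """
    grouped = defaultdict(list)
    for keyword in keywords:
        parent, found, child = keyword.partition(separator)
        parent, child = parent.strip(), child.strip()
        if found and parent and child:
            grouped[parent].append(child)
        else:
            # Flat keyword, or a response that did not follow the format.
            grouped[""].append(keyword.strip())
    return dict(grouped)
```

For example, passing ["hair color|brunette", "facial expression|smiling", "portrait"] yields one entry per parent level plus "portrait" under the empty key.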

Similarly, you can craft prompts to generate hierarchical keywords for animals or objects depicted in the image, as well as details like lighting, location, and more.
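
If you maintain several prompts of this kind, it can help to generate them from a list of attributes. The following Python sketch reproduces the structure of the prompt shown earlier; the function name and the example attributes ("species", "fur color") are hypothetical, and you would still paste the resulting text into AutoTagger yourself.

```python
def build_keyword_prompt(attributes: list[str], max_keywords: int = 20) -> str:
    """Build a numbered prompt that asks for one hierarchical keyword per attribute."""
    lines = [
        f"Return up to {max_keywords} keywords describing this image.",
        "Create the following keywords:",
    ]
    for number, attribute in enumerate(attributes, start=1):
        lines.append(
            f"{number}. identify the {attribute} and create a keyword "
            f'in the following format: "{attribute}|[{attribute} description]"'
        )
    return "\n".join(lines)


# Example: a prompt for animal photos (attribute names are illustrative).
animal_prompt = build_keyword_prompt(["species", "fur color"])
```

Keeping the attribute list separate from the wording makes it easy to reuse the same phrasing for people, animals, or objects.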

Conclusion

The new Gemma 3 model represents a significant improvement over previously available models for IMatch AutoTagger. Installation is straightforward, and IMatch already includes configurations for the new model.

You can expect significantly better results, even without modifying your existing prompts. Explore the possibilities of more complex prompts now enabled and the multilingual capabilities offered by Gemma 3.