Just in: Google Gemma 3 models for AutoTagger

Started by Mario, Today at 03:19:33 PM


Mario

A few hours ago, Google publicly released their latest Gemma 3 model. It is built from the same research and technology as Gemini 2.0.
See Google's blog post for more information.

Ollama added support for it maybe a couple of hours ago. See https://ollama.com/library/gemma3 for more info.
Ollama offers Gemma in several sizes. I've added support for the 4b and 12b versions to AutoTagger.
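If you want to poke at Gemma 3 outside of AutoTagger, here is a minimal Python sketch against Ollama's standard /api/generate endpoint. It assumes Ollama is running on its default port 11434 and that gemma3:4b has already been pulled; the image path and the prompt are placeholders, not what AutoTagger actually sends.

# Minimal sketch: send one image to a locally running Ollama instance and
# ask Gemma 3 for a description. Image path and prompt are examples only.
import base64
import json
import urllib.request

IMAGE_PATH = "sample.jpg"                                  # placeholder path
OLLAMA_URL = "http://localhost:11434/api/generate"         # Ollama default port

with open(IMAGE_PATH, "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("ascii")

payload = {
    "model": "gemma3:4b",          # or "gemma3:12b" if you have the VRAM
    "prompt": "Describe this image in one sentence and list five keywords.",
    "images": [image_b64],         # Ollama expects base64-encoded image data
    "stream": False,               # return one complete JSON response
}

req = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])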

The 27b model needs at least 30 GB of RAM on the graphics card, and Ollama allocated over 100 GB (not MB!) of RAM trying to run it on my computer. This was kind of scary: my PC became so busy swapping memory to disk that it came to a standstill - even the mouse cursor took seconds to move :-\
I decided not to include a configuration for this size :)

This is the first Google model supported by AutoTagger, because it is the first one I'm aware of that is multimodal, i.e. it understands images and does not need a data center graphics card.


The 4b version needs about 5 GB RAM on the graphics card and should run fine on "normal" graphics cards.
For the 12b version, a 16 GB graphics card is required (or a lot more time).

What makes this model stand out, after my initial tests over the last hour, is that it supports 140 different languages!
Asking it to produce descriptions and keywords in German produces much better results than LLaVA or the Llama Vision model.

The Gemma model also seems to know more about places and landmarks, despite having been condensed down to 12b or 4b parameters.

I will include an updated configuration file for AutoTagger with the next IMatch release.
If you want it right now, let me know and I'll attach it and post instructions.

rienvanham

Hi Mario,

It would be fantastic if AutoTagger could produce keywords in the Dutch language. Of course I can wait until the next release. Do you also offer some kind of manual on how to configure this?
I have an NVIDIA 3090 card with 24 GB of memory.

Thanks in advance,

Rien.

mastodon

Great! I only have a very old card with only 2 GB of memory, but I am considering a new PC. This is a good hint on what to buy.

Mario

Installation works the same for all Ollama models. See Ollama in the IMatch help. There is also a video tutorial I made.
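If you want to check from outside IMatch that Ollama is running and that Gemma 3 has been pulled, a few lines against Ollama's /api/tags endpoint are enough. This is just a sketch and assumes the default address http://localhost:11434.

# List the models the local Ollama instance has pulled.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    models = json.loads(resp.read())["models"]

for m in models:
    print(m["name"])               # e.g. "gemma3:4b", "gemma3:12b"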

A 3090 with 24GB VRAM will work very nicely with the 12b version of Gemma.

I've made a quick test with the default prompts shown in the help for description and keywords, just adding the "in Dutch" phrase to the prompt. This is what I got from Gemma 12b. Any good?

[Attachment: Image3.jpg]
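Purely as an illustration (the actual default prompts are documented in the IMatch help; the text below is made up), the change boils down to appending a language instruction to the prompt:

# Placeholder prompts, not the real IMatch defaults.
description_prompt = "Describe this image in one or two sentences."
keyword_prompt = "Return ten keywords for this image."

# Appending a language instruction is the whole trick:
description_prompt += " Answer in Dutch."
keyword_prompt += " Answer in Dutch."

print(description_prompt)
print(keyword_prompt)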

Mario

Quote from: mastodon on Today at 07:24:14 PM
Great! I only have a very old card with only 2 GB of memory, but I am considering a new PC. This is a good hint on what to buy.
Get the biggest NVIDIA card you can afford. 16 GB of VRAM with a 256-bit memory bus is ideal for fast local AI.

I assume that models will get smaller and require less compute (i.e. GPU power) over time, but for now, having enough VRAM on the graphics card to hold the entire model in memory is essential for good performance.
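As a very rough back-of-the-envelope check (assuming the usual 4-bit quantization, not anything AutoTagger does): the weights alone take roughly half a byte per parameter, and the real VRAM requirement is noticeably higher because of the vision encoder, the KV cache for the context window and whatever Windows already uses.

# Back-of-the-envelope only: raw weight size of a 4-bit quantized model.
# Real VRAM needs are higher (vision encoder, KV cache, VRAM used by Windows),
# which is why the numbers quoted earlier in this thread (about 5 GB for 4b,
# a 16 GB card for 12b) are well above the raw weight size.
def weights_gb(params_billions: float, bits_per_weight: int = 4) -> float:
    return params_billions * bits_per_weight / 8

for name, size_b in [("gemma3:4b", 4), ("gemma3:12b", 12), ("gemma3:27b", 27)]:
    print(f"{name}: weights alone are roughly {weights_gb(size_b):.1f} GB")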

My 4060 Ti has 16 GB but only a 128-bit memory bus. There was a $400 difference between this card and the next higher card with a 256-bit memory bus. I wanted to have local AI, but not that badly ;D

Smaller models like the Gemma 4b or the LLaVA 7b run OK even on the 3060 mobile with 8GB in my laptop. Very usable.
Larger models are better, but also require more VRAM on the GPU - and that's costly. For $500 you can buy a lot of cloud-based AI from OpenAI or Mistral. Always a point to consider.

A new move has been started by AMD (and I'm sure Intel will follow at some point). They seem to be switching to a unified memory model (like Apple uses for its M-series chips), where the CPU, GPU (graphics card) and NPU (neural processing unit for AI) share common, very fast RAM. For example: https://www.techspot.com/news/106238-amd-ryzen-ai-max-brings-monstrous-40-core.html where the powerful CPU has 128 GB of RAM to share with the graphics unit. This is good for large AI models, if the graphics unit can compete with NVIDIA.

sybersitizen

That's good news!

Quote from: Mario on Today at 03:19:33 PM
The 4b version needs about 5 GB RAM on the graphics card and should run fine on "normal" graphics cards.

I wonder if my 4 GB card would be okay. If the answer is maybe, I'll try it when IMatch is updated.

Mario

Quote from: sybersitizen on Today at 08:02:57 PM
I wonder if my 4 GB card would be okay. If the answer is maybe, I'll try it when IMatch is updated.
Usually 1 GB or more of VRAM is already used by Windows, the Desktop Window Manager (dwm.exe) and so on. Check in the Windows Task Manager to see how much VRAM is actually available.
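If you prefer a small script over Task Manager: on an NVIDIA card the management library can report free VRAM directly. This is just a sketch and assumes the nvidia-ml-py package (pip install nvidia-ml-py); it is not something IMatch or Ollama needs.

# Query total/free VRAM on the first NVIDIA GPU via NVML.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # first GPU
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
print(f"Total VRAM: {mem.total / 1024**3:.1f} GB")
print(f"Free VRAM:  {mem.free / 1024**3:.1f} GB")
pynvml.nvmlShutdown()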

Just try it. It can't hurt.

Ollama automatically swaps to normal RAM when it cannot fit the entire model into VRAM. And some of the modern models use a "mixture of experts" architecture, where parts of the model are not needed for specific tasks; when those parts are not in VRAM, it does not matter.