Fine-Tuning Gemma 3 with Cloud Run Jobs: Serverless GPUs (NVIDIA RTX 6000 Pro) for pet breed classification
fine-tuning gemma google nvidia
Source: Dev.to
Google Cloud has rolled out serverless GPU support on Cloud Run Jobs, letting developers fine-tune large language models without provisioning dedicated instances. The first public showcase uses the new NVIDIA RTX 6000 Pro (Blackwell) cards to adapt the 27-billion-parameter Gemma 3 model for a pet-breed classification task, turning a generic LLM into a specialist image-and-text recogniser for cats and dogs.
The workflow, posted by a community engineer, spins up a Cloud Run job that automatically provisions an RTX 6000 Pro, pulls the Gemma 3 weights, and runs a QLoRA-style fine-tuning loop on a curated dataset of pet images and breed labels. Pay-per-second billing, instant scaling to zero, and a 19-second cold start for the 4-billion-parameter variant mean the entire experiment costs only a few dollars and can be reproduced on demand. No quota request is required for the L4-class GPUs that power the service, lowering the barrier for small teams and hobbyists.
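A workflow like the one described can be sketched with the `gcloud` CLI: create a Cloud Run job whose container image bundles the fine-tuning script, attach a GPU, then execute the job. The project, image name, region, and resource sizes below are illustrative assumptions, not values from the article, and the GPU type string for the RTX 6000 Pro may differ by release (`nvidia-l4` is shown as a widely available type); check `gcloud run jobs create --help` for the flags supported in your gcloud version.

```shell
# Sketch: define a Cloud Run job with a GPU attached.
# The container image (an assumption) would pull the Gemma 3 weights
# and run the QLoRA fine-tuning loop as its entrypoint.
gcloud run jobs create gemma3-finetune \
  --image=us-docker.pkg.dev/my-project/tuning/gemma3-qlora:latest \
  --region=us-central1 \
  --gpu=1 \
  --gpu-type=nvidia-l4 \
  --cpu=8 \
  --memory=32Gi \
  --task-timeout=3600

# Execute the job on demand; pay-per-second billing stops
# when the task completes, and the job scales back to zero.
gcloud run jobs execute gemma3-finetune --region=us-central1 --wait
```

Because jobs scale to zero between executions, re-running the experiment is a single `execute` call rather than a VM lifecycle to manage.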
The significance is twofold. First, it democratizes access to high-end GPU resources, a long-standing bottleneck for Nordic startups and research groups that lack on-premise clusters. Second, it signals Google's push to position Cloud Run as a viable alternative to Vertex AI for custom model work, directly competing with AWS SageMaker Serverless and Azure ML's managed compute. By coupling open-source Gemma models (first highlighted in our April 9 coverage of Gemma 4) with truly serverless hardware, Google is closing the gap between model availability and practical, low-cost deployment.
Looking ahead, the community will likely test the same pipeline on the newer Gemma 4 family and on larger GPU types as they become serverless. Watch for benchmark releases comparing cost and latency against traditional VM-based fine-tuning, and for tighter integration with tools such as Unsloth and Hugging Face's TRL, which could further accelerate niche AI applications across the Nordics.