Hackers Can Steal Your AI Model and Use It for Free


2026-06-05 | Source: Dev.to | Original article

AI models are vulnerable to inference theft, a growing threat.

As we reported on June 5, large language models have been making waves in fields including medical research and technical writing. A new security threat now puts AI endpoint operators at risk: inference theft. An attacker uses someone else's paid AI inference for free, then resells the tokens at a discount. The operator bears the cost of every AI call, while the attacker pays nothing.

This matters because an exposed AI endpoint can be the easiest piece of infrastructure to abuse. A single request that triggers a complex model or multi-step process can be costly, and attackers can exploit that cost asymmetry for profit. To defend against inference theft and denial-of-wallet attacks, in which the aim is simply to run up the operator's bill, developers can use bot detection, guardrails, cost-aware routing, and budget controls.

Looking ahead, free LLM inference resources such as HuggingFace Serverless Inference and OpenCode Zen, which offer curated models and limited free usage, are worth monitoring. As the use of AI endpoints grows, securing them against inference theft will become increasingly important for preventing financial losses and maintaining the integrity of AI services.
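One of the defenses named above, budget controls, can be illustrated with a small sketch. It is not taken from the original article: the `BudgetGuard` class, its `allow` method, and the dollar limits are hypothetical names and values chosen for illustration, and it assumes the operator can estimate the cost of a call before running it. The idea is to refuse a request once an API key's spending inside a rolling time window would exceed a cap.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class BudgetGuard:
    """Hypothetical per-key spending cap over a rolling time window."""
    limit_usd: float                 # assumed cap per API key per window
    window_seconds: float = 3600.0   # assumed window length
    # api_key -> list of (timestamp, cost_usd) for calls already allowed
    _spend: Dict[str, List[Tuple[float, float]]] = field(default_factory=dict)

    def _current_spend(self, api_key: str, now: float) -> float:
        # Drop entries that have aged out of the window, then sum the rest.
        recent = [(t, c) for t, c in self._spend.get(api_key, [])
                  if now - t < self.window_seconds]
        self._spend[api_key] = recent
        return sum(c for _, c in recent)

    def allow(self, api_key: str, estimated_cost_usd: float,
              now: Optional[float] = None) -> bool:
        """Return True and record the cost if the call fits in the budget."""
        now = time.time() if now is None else now
        spent = self._current_spend(api_key, now)
        if spent + estimated_cost_usd > self.limit_usd:
            return False  # reject before any inference cost is incurred
        self._spend[api_key].append((now, estimated_cost_usd))
        return True


# Usage: call allow() before forwarding a request to the model.
guard = BudgetGuard(limit_usd=1.0, window_seconds=60)
if guard.allow("demo-key", estimated_cost_usd=0.25):
    pass  # forward the request to the model here
```

Because the check runs before the model is invoked, a stolen or abused key can only burn through its own cap rather than the operator's whole budget. A production setup would likely keep this state in shared storage rather than in process memory, which this sketch does not attempt.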
