A vulnerability tracked as CVE-2024-37032 and dubbed “Probllama” was disclosed on May 5 in Ollama, a popular open-source project for running large language models (LLMs). According to a recent Wiz Inc. report, the flaw can lead to remote code execution (RCE). A patch was already available as of June 10, but more than 1,000 vulnerable instances remained exposed to the Internet.
Ollama is used from the command line or via a REST API to run inference with compatible models such as Meta's Llama family, Microsoft's Phi family, and Mistral models. It sees hundreds of thousands of pulls per month on Docker Hub.
The security researchers said insufficient server-side validation in the Ollama REST API lets a potential attacker exploit the flaw by sending specially crafted HTTP requests to the Ollama API server, which is publicly exposed in Docker installations.
The Ollama server exposes multiple API endpoints that perform core functions, including ‘/api/pull’, which downloads models from the official Ollama registry and from private registries.
The model download process enabled attackers to potentially compromise the environment hosting a vulnerable Ollama server by supplying a malicious manifest file with a path traversal payload in the digest field.
The payload could be used to corrupt files on the system, read arbitrary files, and ultimately achieve RCE, compromising that system. In Docker installations, the server runs with root privileges and listens on 0.0.0.0 by default, which allows the flaw to be exploited remotely.
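To illustrate the class of bug, the following is a minimal Python sketch of how a server might validate a manifest digest before using it as a file name. It is not Ollama's actual code or patch: the function name `blob_path`, the directory argument, and the assumed digest shape (`sha256:` followed by 64 lowercase hex characters) are all hypothetical and chosen for illustration only.

```python
import re
from pathlib import Path

# Assumption for this sketch: a valid digest is "sha256:" plus 64 lowercase hex characters.
DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")


def blob_path(blobs_dir: str, digest: str) -> Path:
    """Map a manifest digest to a file under blobs_dir, rejecting anything else."""
    # Strict format check: a value such as "../../etc/passwd" never matches.
    if not DIGEST_RE.fullmatch(digest):
        raise ValueError(f"invalid digest: {digest!r}")

    base = Path(blobs_dir).resolve()
    candidate = (base / digest.replace(":", "-")).resolve()

    # Defence in depth: the resolved path must still sit inside blobs_dir.
    if base not in candidate.parents:
        raise ValueError(f"digest escapes blob directory: {digest!r}")
    return candidate
```

With a check like this, a digest containing `../` sequences is rejected before any file is opened, so a crafted manifest cannot steer reads or writes outside the blob directory.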
Ollama's maintainers fixed CVE-2024-37032 in version 0.1.34, released via GitHub. Ollama users should protect their AI applications by updating to the patched version, placing their servers behind firewalls, and enabling authentication when they are exposed to the Internet.
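As a rough aid for checking whether a deployment is affected, the sketch below compares a version string against the fixed release, 0.1.34. The helper names `parse_version` and `is_patched` are hypothetical, the version string is assumed to have been obtained separately, and treating a pre-release suffix such as `-rc1` as equal to the base version is a simplifying assumption.

```python
from typing import Tuple

# First release reported to contain the fix for CVE-2024-37032.
FIXED_VERSION = (0, 1, 34)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a string such as 'v0.1.33' or '0.1.34-rc1' into a tuple of integers."""
    # Drop an optional leading "v" and any pre-release suffix after "-".
    core = version.strip().lstrip("v").split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def is_patched(version: str) -> bool:
    """Return True if the version is at or above the fixed release."""
    return parse_version(version) >= FIXED_VERSION
```

For example, `is_patched("0.1.33")` evaluates to `False` and `is_patched("0.1.34")` to `True`. A passing check only confirms the version number and says nothing about whether the server is still reachable from the Internet.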