

ShadowMQ Vulnerabilities in AI Inference Frameworks

First reported: 14.11.2025 17:20
Last updated: 03.12.2025 11:30
1 unique source, 2 articles

Summary


Researchers have discovered critical remote code execution vulnerabilities in AI inference engines from Meta, Nvidia, Microsoft, and open-source PyTorch-based projects. The issue, dubbed ShadowMQ, stems from unsafe use of ZeroMQ (ZMQ) combined with Python's pickle deserialization, allowing attackers to execute arbitrary code by sending malicious data to be deserialized. The vulnerable pattern has been found in multiple frameworks, including Meta's Llama Stack, NVIDIA TensorRT-LLM, Sarathi-Serve, Modular Max Server, vLLM, and SGLang, with some already patched and others remaining vulnerable. Separately, three critical security flaws have been disclosed in Picklescan, an open-source utility for scanning pickle files, that could allow malicious actors to execute arbitrary code by loading untrusted PyTorch models, effectively bypassing the tool's protections.
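To make the root cause concrete, here is a minimal sketch (not code from any of the affected frameworks; endpoints and function names are illustrative) contrasting a ZeroMQ worker that unpickles whatever it receives via recv_pyobj() with one that exchanges plain JSON via recv_json():

```python
# Illustrative sketch of the ShadowMQ pattern; not code from any affected framework.
import zmq

ctx = zmq.Context()

def handle(task) -> None:
    """Placeholder for whatever the inference worker does with a task."""
    print("received task:", task)

# Vulnerable pattern: recv_pyobj() unpickles whatever arrives on the socket.
# If the endpoint is reachable by an attacker, a crafted pickle (one whose
# __reduce__ returns something like os.system) executes code during
# deserialization, before the worker ever looks at the "task".
def vulnerable_worker(endpoint: str = "tcp://0.0.0.0:5555") -> None:
    sock = ctx.socket(zmq.PULL)
    sock.bind(endpoint)
    while True:
        task = sock.recv_pyobj()   # pickle.loads() on untrusted bytes
        handle(task)

# Safer variant: exchange plain data (JSON here) instead of pickled objects,
# and keep the socket off untrusted networks (or layer authentication on top).
def safer_worker(endpoint: str = "tcp://127.0.0.1:5555") -> None:
    sock = ctx.socket(zmq.PULL)
    sock.bind(endpoint)
    while True:
        task = sock.recv_json()    # parses JSON; never executes code
        handle(task)
```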

Timeline

  1. 03.12.2025 11:30 1 article · 23h ago

    Picklescan Vulnerabilities Disclosed and Patched

    Three critical security flaws in Picklescan, an open-source utility designed to scan Python pickle files for malicious content, have been disclosed. These flaws allow attackers to bypass the scanner's protections, execute arbitrary code, and potentially launch supply chain attacks by distributing malicious PyTorch models. The vulnerabilities have been addressed in Picklescan version 0.0.31, released on September 9, 2025.

  2. 14.11.2025 17:20 2 articles · 19d ago

    ShadowMQ Vulnerabilities Discovered in AI Inference Frameworks

    Researchers uncovered critical remote code execution vulnerabilities, collectively dubbed ShadowMQ, in AI inference engines from Meta, Nvidia, Microsoft, and open-source PyTorch-based projects. The root cause is unsafe use of ZeroMQ (ZMQ) combined with Python's pickle deserialization, which lets attackers execute arbitrary code by sending malicious data to exposed sockets. Affected frameworks include Meta's Llama Stack, NVIDIA TensorRT-LLM, Sarathi-Serve, Modular Max Server, vLLM, and SGLang; some have been patched while others remain vulnerable.


Information Snippets

  • The root cause is a vulnerability in Meta's Llama framework (CVE-2024-50050, CVSS score: 6.3/9.3) involving the use of ZeroMQ's recv_pyobj() method to deserialize incoming data using Python's pickle module.

    First reported: 14.11.2025 17:20
    1 source, 1 article
  • The issue has been addressed in the pyzmq Python library and some affected frameworks, but others remain unpatched or have incomplete fixes.

    First reported: 14.11.2025 17:20
    1 source, 1 article
  • The vulnerabilities have been assigned the following identifiers: CVE-2025-30165 (vLLM), CVE-2025-23254 (NVIDIA TensorRT-LLM), CVE-2025-60455 (Modular Max Server).

    First reported: 14.11.2025 17:20
    1 source, 1 article
  • A successful compromise of a single node could permit an attacker to execute arbitrary code on the cluster, escalate privileges, conduct model theft, and drop malicious payloads like cryptocurrency miners.

    First reported: 14.11.2025 17:20
    1 source, 1 article
  • Three critical security flaws have been disclosed in an open-source utility called Picklescan that could allow malicious actors to execute arbitrary code by loading untrusted PyTorch models, effectively bypassing the tool's protections.

    First reported: 03.12.2025 11:30
    1 source, 1 article
  • Picklescan, developed and maintained by Matthieu Maitre (@mmaitre314), is a security scanner designed to parse Python pickle files and detect suspicious imports or function calls before they are executed; a simplified sketch of this kind of check appears after this list.

    First reported: 03.12.2025 11:30
    1 source, 1 article
  • The issues discovered by JFrog make it possible to bypass the scanner, so that malicious model files are reported as safe and their code is executed on load, which could pave the way for a supply chain attack.

    First reported: 03.12.2025 11:30
    1 source, 1 article
  • The identified flaws are as follows:
    - CVE-2025-10155 (CVSS score: 9.3/7.8) - A file extension bypass vulnerability that can be used to undermine the scanner and load the model when a standard pickle file is supplied with a PyTorch-related extension such as .bin or .pt
    - CVE-2025-10156 (CVSS score: 9.3/7.5) - A bypass vulnerability that can be used to disable ZIP archive scanning by introducing a Cyclic Redundancy Check (CRC) error
    - CVE-2025-10157 (CVSS score: 9.3/8.3) - A bypass vulnerability that can be used to undermine Picklescan's unsafe globals check, leading to arbitrary code execution by getting around a blocklist of dangerous imports

    First reported: 03.12.2025 11:30
    1 source, 1 article
  • Following responsible disclosure on June 29, 2025, the three vulnerabilities were addressed in Picklescan version 0.0.31, released on September 9, 2025.

    First reported: 03.12.2025 11:30
    1 source, 1 article
  • The findings illustrate key systemic issues, including reliance on a single scanning tool and discrepancies in file-handling behavior between security tools and PyTorch, which leave security architectures vulnerable to attack.

    First reported: 03.12.2025 11:30
    1 source, 1 article
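As a rough illustration of the kind of static check a pickle scanner performs, the sketch below walks a pickle stream with Python's pickletools and flags imports of known-dangerous callables. This is a toy approximation, not Picklescan's actual implementation; the disclosed CVEs concern its extension handling, ZIP scanning, and blocklist gaps rather than this core idea, and the denylist and example payload here are illustrative only.

```python
# Toy approximation of a pickle import scanner; Picklescan's real implementation
# differs, and the disclosed CVEs target its file/archive handling, not this idea.
import os
import pickle
import pickletools

# Illustrative denylist of callables a model pickle should never import.
DENYLIST = {
    ("os", "system"),
    ("posix", "system"),
    ("nt", "system"),
    ("subprocess", "Popen"),
    ("builtins", "eval"),
    ("builtins", "exec"),
}

def suspicious_imports(data: bytes) -> list[tuple[str, str]]:
    """Return denylisted (module, name) pairs referenced by GLOBAL/STACK_GLOBAL opcodes."""
    found = []
    strings = []  # recent string constants, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            strings.append(arg)
        elif opcode.name in ("GLOBAL", "INST"):
            module, name = arg.split(" ", 1)  # pickletools joins the pair with a space
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            found.append((strings[-2], strings[-1]))
    return [pair for pair in found if pair in DENYLIST]

# Usage: a pickle whose __reduce__ imports os.system is flagged without loading it.
class Evil:
    def __reduce__(self):
        return (os.system, ("echo pwned",))

payload = pickle.dumps(Evil())
print(suspicious_imports(payload))  # e.g. [('posix', 'system')] on Linux
```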

Similar Happenings

Malware Delivery via Windows Native AI Stack

A security researcher has demonstrated a living-off-the-land (LotL) attack that uses Windows' native AI stack to deliver malware. The attack leverages trusted files from the Open Neural Network Exchange (ONNX) to bypass security engines. The method involves embedding malicious payloads in AI models, which are then loaded and executed using trusted Windows APIs. The attack exploits the inherent trust that Windows and security programs place in ONNX files, making it difficult for security tools to detect the malware. The researcher suggests that security tools need to be reworked to monitor AI files and their associated activities. This technique highlights a new vector for malware delivery, emphasizing the need for enhanced security measures in AI-driven systems.

Model Namespace Reuse Attack Demonstrated on Google, Microsoft AI Platforms

Researchers at Palo Alto Networks have demonstrated a new AI supply chain attack method called Model Namespace Reuse. This method exploits the reuse of model names from deleted or transferred accounts on platforms like Hugging Face. The attack can lead to arbitrary code execution and was successfully demonstrated against Google's Vertex AI and Microsoft's Azure AI Foundry platforms. The attack highlights the risks of relying on model names alone for trust and security. The attack involves registering names associated with deleted or transferred models, allowing threat actors to deploy malicious AI models. Thousands of open-source repositories are potentially vulnerable, including well-known projects. Google, Microsoft, and Hugging Face have been notified, and Google has started daily scans to mitigate the risk.
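A common hardening step implied by that finding, not trusting the model name alone, is to pin downloads to an exact commit. The sketch below uses huggingface_hub's snapshot_download with a pinned revision; the repository id and commit hash are placeholders, not values from the research.

```python
# Hedged sketch of revision pinning; the repo id and commit SHA are placeholders.
from huggingface_hub import snapshot_download

MODEL_REPO = "example-org/example-model"                        # placeholder repo id
PINNED_REVISION = "0123456789abcdef0123456789abcdef01234567"    # placeholder commit SHA

# Resolving a model by name alone trusts whoever currently controls that
# namespace; pinning an exact commit SHA does not follow a re-registered name
# to new, potentially malicious content.
local_dir = snapshot_download(repo_id=MODEL_REPO, revision=PINNED_REVISION)
print("model files cached at:", local_dir)
```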

AI-Powered Offensive Research System Generates Exploits in Minutes

An AI-powered offensive research system, named Auto Exploit, has developed exploits for 14 vulnerabilities in open-source software packages in under 15 minutes. The system uses large language models (LLMs) and CVE advisories to create proof-of-concept exploit code, significantly reducing the time required for exploit development. This advancement highlights the potential impact of full automation on enterprise defenders, who must adapt to vulnerabilities that can be quickly turned into exploits. The system, developed by Israeli cybersecurity researchers, leverages Anthropic's Claude-sonnet-4.0 model to analyze advisories and code patches, generate vulnerable test applications and exploit code, and validate the results. The researchers emphasize that while the approach requires some manual tweaking, it demonstrates the potential for LLMs to accelerate exploit development, posing new challenges for cybersecurity defenses.

Growing Threat Landscape for AI Agents and Non-Human Identities

The rapid adoption of AI agents and non-human identities (NHIs) presents significant security challenges. These entities are increasingly targeted by adversaries, with known attack vectors growing rapidly. The unique characteristics of AI agents, such as autonomy and extensive access, exacerbate these risks. Security experts warn of a closing window of opportunity to secure these tools and data. The threat landscape includes data poisoning, jailbreaking, prompt injection, and the exploitation of abandoned agents. Recent research highlights the potential for malicious proxy settings and zero-click vulnerabilities. Proactive measures are essential to mitigate these risks and build robust defenses.

Zero-click exploit targets AI enterprise agents

AI enterprise agents, integrated with various enterprise environments, are vulnerable to zero-click exploits. Attackers can take over these agents using only a user's email address, gaining access to sensitive data and manipulating users. The exploit affects major AI assistants from Microsoft, Google, OpenAI, Salesforce, and others. Organizations must adopt dedicated security programs to manage ongoing risks associated with AI agents. Current security approaches focusing on prompt injection have proven ineffective. The exploit highlights the need for defense-in-depth strategies and hard boundaries to mitigate risks. Organizations are advised to assume breaches and apply lessons learned from past security challenges.