Tracked as CVE-2026-4372, the vulnerability affects Transformers versions before 5.3.0 and centres on the handling of the _attn_implementation_internal field inside a model’s config.json file. Security researchers found that an attacker could set that field to point to a repository under their control, causing the library to download and execute arbitrary Python code when a victim loaded the model through standard APIs such as from_pretrained() or AutoModelForCausalLM.from_pretrained().
The finding is significant because the attack path bypassed the safeguard many developers rely on when handling untrusted models: trust_remote_code=False. That setting is widely used to prevent execution of custom code from model repositories. In this case, researchers said the malicious path could run without warnings and without the user explicitly enabling remote code execution, undermining a core assumption in many machine learning security policies.
Hugging Face Transformers is one of the most widely used open-source libraries in artificial intelligence development, supporting PyTorch, TensorFlow and JAX workflows across text, vision, audio and multimodal models. Its reach gives the flaw an unusually large blast radius. The package has accumulated more than 2.2 billion PyPI downloads, attracts tens of millions of downloads each month and is embedded in enterprise AI systems, research environments, cloud notebooks and automated model evaluation pipelines.
The issue was disclosed by Pluto Security researcher Yotam Perkal, who described a scenario in which a single malicious configuration entry could compromise systems loading a model from the Hugging Face Hub. Vulnerable versions were downloaded hundreds of millions of times during the period in which the flaw was present, creating exposure across organisations that routinely test third-party models, run fine-tuning jobs or integrate open-source models into production inference services.
The technical risk comes from the way modern AI workflows blur the line between data and executable software. A model repository may contain weights, tokenisers, configuration files and code required to instantiate an architecture. Developers often treat configuration files as lower-risk metadata, but CVE-2026-4372 demonstrates that configuration-driven loading paths can become execution paths when libraries resolve references dynamically during model initialisation.
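One way to treat configuration as untrusted input is to inspect a downloaded config.json before any loading API touches it. The sketch below is illustrative only: it assumes the field is spelled _attn_implementation_internal, and the helper names find_suspect_fields and scan_config are hypothetical, not part of Transformers. It flags every occurrence of the key, including inside nested sub-configs, and makes no attempt to judge whether a value is malicious.

```python
import json
from pathlib import Path

# Assumed spelling of the field described in the report; adjust if it differs.
SUSPECT_KEY = "_attn_implementation_internal"


def find_suspect_fields(node, path=""):
    """Recursively yield (json_path, value) for every occurrence of SUSPECT_KEY."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key == SUSPECT_KEY:
                yield child, value
            # Keep descending so nested sub-configs are covered too.
            yield from find_suspect_fields(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_suspect_fields(item, f"{path}[{index}]")


def scan_config(config_path):
    """Return a list of (json_path, value) hits found in one config.json file."""
    data = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return list(find_suspect_fields(data))
```

A hit is a reason to review the repository by hand, not proof of malice. The check reads the file with a plain JSON parser, so it never triggers the library's own loading logic.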
The vulnerability lands amid growing scrutiny of AI supply chains. Model hubs have become central to software development because they allow teams to reuse pre-trained systems rather than build them from scratch. That efficiency also creates a trust problem. A malicious or compromised repository can reach automated pipelines, developer laptops and GPU-backed servers if organisations pull models without isolation, pinning, scanning or review.
Academic work on model-hosting ecosystems has warned that unsafe loading practices, custom code hooks and developer confusion over remote execution controls remain widespread. Studies of model hubs have found that malicious payloads may be hidden in model files, dataset loading scripts or framework-specific APIs, with possible outcomes including credential theft, reverse shells, file access, system reconnaissance and lateral movement inside development environments.
For enterprises, the risk is not confined to experimental AI teams. Transformers is frequently used in retrieval systems, customer-service automation, document processing, code assistants, data labelling tools and internal analytics. A successful exploit could run with the privileges of the user or service account loading the model, putting cloud tokens, API keys, proprietary datasets and local files at risk. GPU servers used for model training may also have access to shared storage, build systems and internal networks.
Hugging Face has addressed the issue in Transformers 5.3.0. Security teams are being urged to upgrade immediately, audit environments for older package versions and review any model repositories loaded during the affected period. Organisations running pinned dependencies in containers, notebooks, continuous integration systems or managed machine learning platforms may need separate checks, as those environments often continue using frozen versions long after a patch is available.
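The version audit can be scripted inside each environment, including frozen containers and notebooks. The following is a minimal sketch using only the Python standard library. The 5.3.0 threshold comes from the report; the helper names release_tuple, is_vulnerable and check_installed are hypothetical. It compares only the numeric release segment, so a pre-release such as 5.3.0rc1 is treated as 5.3.0 and should be checked by hand.

```python
import re
from importlib import metadata

PATCHED = (5, 3, 0)  # first fixed release, as stated in the report


def release_tuple(version):
    """Extract the leading numeric release, e.g. '4.57.1.dev0' -> (4, 57, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # pad '5.3' to (5, 3, 0)
    return tuple(parts)


def is_vulnerable(version):
    """True when the numeric release is older than the patched release."""
    return release_tuple(version) < PATCHED


def check_installed(package="transformers"):
    """Return (installed_version, vulnerable) or None if the package is absent."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return installed, is_vulnerable(installed)


if __name__ == "__main__":
    print(check_installed())
```

Running the script in every image or virtual environment, rather than once on a developer machine, is what catches the frozen versions described above.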