
It may sound like hyperbole to say that machine learning operations (MLOps) have become the backbone of our digital future, but it’s actually true. Similar to how we view energy grids or transportation systems as part of the critical infrastructure that powers society, AI/ML software and capabilities are quickly becoming essential technology for a wide range of companies, industries, and citizen services.
As artificial intelligence (AI) and machine learning (ML) rapidly transform industries, a new form of “Shadow IT” has emerged, now referred to as “Shadow ML”: employees use AI agents and technologies without the knowledge or approval of the IT department and outside the company’s sanctioned systems. The resulting lack of oversight and control over data and access can create significant security risks.
Thus, understanding the evolving role MLOps plays in managing and securing the rapidly expanding AI/ML IT landscape is essential to safeguarding the interconnected systems that define our era.
Software is an omnipresent component of our day-to-day lives, operating quietly but indispensably behind the scenes. For that reason, failures in these systems are often hard to detect, can happen at any moment, and can spread quickly across the globe, disrupting businesses, upsetting economies, undermining governments or even endangering lives.
The stakes are even higher as AI and ML technologies increasingly take centre stage. Traditional software operations are giving way to AI-driven systems capable of decision-making, prediction, and automation at unprecedented scale. However, like any technology with immense new potential, AI and ML also introduce new complexities and risks, which makes strong MLOps security more important than ever. As reliance on AI/ML grows, the robustness of MLOps security becomes foundational to fending off evolving cyber threats.
The lifecycle of building and deploying ML models is filled with both complexity and opportunity. At its core, the process includes:
It’s a structured approach, but one with significant vulnerabilities that threaten stability and security. These vulnerabilities can be broadly categorised as inherent and implementation-related.
Inherent vulnerabilities stem from the complexity of ML environments, including cloud services and open-source tools, which can create security gaps that may be exploited. They include:
Implementation vulnerabilities include:
While AI and ML can offer enormous benefits for organisations, it’s crucial not to prioritise rapid development over security. Doing so could compromise ML models and put organisations at risk.
Recognising and addressing these vulnerabilities is crucial to ensuring MLOps platforms remain trustworthy components of our digital infrastructure. In a recent example, a flagged PyTorch model, previously uploaded by a now-deleted account, allowed attackers to inject arbitrary Python code into critical processes as soon as the model was loaded. The method used to load PyTorch models, specifically the torch.load() function, can be a vector for code execution vulnerabilities, especially when models are trained with Hugging Face’s Transformers library.
The “pickle” format, often used for serialising Python objects, poses a particular risk because it can execute arbitrary code when loaded, making it vulnerable to exploitation. This scenario underscores a broader risk in the ML ecosystem. Many widely used ML model formats support code execution on load, a feature meant to create efficient functionality, but one that also introduces significant security vulnerabilities. An attacker controlling a model registry could insert backdoors into models, enabling immediate, unauthorised code execution when the models are deployed or loaded.
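The code-execution-on-load behaviour described above can be illustrated with a minimal, harmless sketch that uses only Python's standard library. The `Payload` class and the `risky_opcodes` helper are hypothetical names introduced for illustration; a real attack would replace the `eval` call with something destructive, and a real scanner would be considerably more thorough than this opcode check.

```python
import pickle
import pickletools

# Opcodes that import or call objects while a pickle is being loaded.
RISKY_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


class Payload:
    """Hypothetical stand-in for an object hidden inside a model file."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" and replays that call on load.
        # A real attacker would substitute something like os.system.
        return (eval, ("6 * 7",))


def risky_opcodes(blob: bytes) -> set[str]:
    """List risky opcodes found in a pickle stream without executing it."""
    return {op.name for op, _arg, _pos in pickletools.genops(blob)} & RISKY_OPCODES


if __name__ == "__main__":
    blob = pickle.dumps(Payload())
    print(risky_opcodes(blob))  # static inspection: nothing is executed here
    print(pickle.loads(blob))   # loading runs eval and returns its result, 42
```

Note that legitimate model checkpoints also import classes when they are loaded, so the presence of these opcodes is only a signal to look closer. In practice, a scanner would likely compare the imported names against an allow-list rather than reject every match.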
For this reason, developers must exercise caution when loading models from public repositories, validating both the source of the model files and the risks associated with them. Robust input validation, restricted access, and continuous vulnerability assessments are critical to mitigating risks and ensuring the secure deployment of machine learning solutions.
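One way to picture “validating the source” and restricting what a model file may do is sketched below, again using only the standard library. It assumes the team pins a SHA-256 digest for each approved artefact and keeps an allow-list of importable names; `ALLOWED_GLOBALS`, `AllowListUnpickler` and `load_verified` are hypothetical names, and the allow-list contents are purely illustrative.

```python
import hashlib
import hmac
import io
import pickle

# Hypothetical allow-list: the only globals this loader agrees to resolve.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import anything outside the allow-list."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def load_verified(data: bytes, expected_sha256: str):
    """Check the artefact against its pinned digest, then load it restrictively."""
    actual = hashlib.sha256(data).hexdigest()
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError("artefact digest does not match the pinned value")
    return AllowListUnpickler(io.BytesIO(data)).load()
```

With this sketch, a file whose digest differs from the pinned value is rejected before any unpickling starts, and a file that tries to import `builtins.eval` or `os.system` raises `pickle.UnpicklingError` instead of running it. PyTorch's `torch.load()` offers a `weights_only=True` argument that restricts unpickling in a similar spirit.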
There are many other vulnerabilities across the MLOps pipeline, underscoring the importance of vigilance among teams. Many separate elements within a model serve as potential attack vectors that organisations need to manage and secure. It is therefore essential to implement standard APIs for artefact access and to integrate security tools seamlessly across the ML platforms used by data scientists, machine learning engineers, and core development teams. Key security considerations for MLOps development should include:
By adhering to these best practices, organisations can effectively safeguard MLOps pipelines and ensure that security measures enhance rather than impede the development and deployment of ML models.
As we move further into an AI-driven future, the resilience of MLOps infrastructure will become an increasingly important part of maintaining the trust, reliability, and security of the digital systems that power the world.
Shachar Menashe is VP of Security Research at JFrog
© 2025, Lyonsdown Limited. teiss® is a registered trademark of Lyonsdown Ltd. VAT registration number: 830519543