AI tools are the new avenue for vulnerability

By Frank Catucci, CTO and Head of Security Research, Invicti Security.

  • 2 months ago Posted in

Predictions around AI are getting pretty wild. It seems that many of the more spotlight-hungry personalities within the tech industry, academia and the press possess startling confidence that AI will quickly lead to technological armageddon or utopia.

The reality is that these predictions around AI are so hard to prove or disprove, that huge sweeping statements can be made with relative impunity. The most startling, of course, fill more headlines and get more air time. 

That has largely skewed our vision of the real and present risks of AI. We should be suspicious of anything that sounds too apocalyptic or utopian but there are serious and imminent risks around AI which have much more to do with the basic practical applications of these technologies.

Software Development

Perhaps the best example of exactly this kind of problem is now occurring in software development. We’re in an era of technological innovation and software developers are under unprecedented pressure to produce. That pressure rushes the otherwise-meticulous development process both creating higher likelihood for code vulnerabilities and also forcing developers to skip crucial security steps as they try to meet ever smaller release windows. 

The emergence of Large Language Models (LLMs) like ChatGPT have - in one sense - been a huge benefit. ChatGPT is widely used by developers to handle a lot of the code generation. That speeds the process, but they still produce code vulnerabilities. This problem gets worse when we consider how much scalability these LLMs add to the production process. If developers can now write code at 10x speed, it stands to reason that vulnerabilities could be produced at 10x speed, thus scaling generation of both code and the vulnerabilities within it. The effects of LLMs in software development might be auspicious at first glance. Indeed, they solve one problem but risk aggravating an already endemic security problem within application development. 

Vulnerabilities in AI applications 

There is an even more basic problem to contend with here too: The vulnerabilities within the AI applications that are now proliferating through businesses. 

Hype is a double edged sword. On one hand it often correctly identifies real problems and provides innovative solutions to those problems. On the other hand, it provides perverse incentives to rush software to market. In that din, plenty of mistakes can be made. This opens up the possibility that new AI applications and services come with baked-in vulnerabilities that can leak data or be exploited by an attacker who spots the right vulnerability. Even ChatGPT has suffered breaches. In May 2023, the popular application was attacked through a vulnerability in the Redis Open-Source library which allowed users to see the chat history of other users. 

Manipulating AI applications

OWASP judges the top vulnerability for LLMs to be malicious prompt injection - also known as “jailbreaks” - in which a series of cunning prompts can be used to make the AI or LLM behave in unexpected or malicious ways. In late 2023, a Chevy dealership found that its online chatbot - which was “powered by ChatGPT” - could be manipulated into selling cars for $1. 

In fact, it has already been demonstrated multiple times how prompt injections can help steal data, remotely execute code, manipulate personal CVs and even stock market information. One need not even have much experience with AIs to do so - and people even publish lists of these ‘jailbreaks’ that can effectively manipulate AIs. The scope for abuse is vast and unfortunately they’re hard to resist as the UK’s signals intelligence body, GCHQ, warned last year “there are no surefire mitigations.” Researchers have even devised a worm that specifically targets generative AIs in this way. “Morris II” uses self-replicating, adversarial prompts on generative AIs to poison their output responses and then spread them to other generative AIs. From there, it can spread autonomously with no further input from its controllers. 

This is echoed in the concept of the “adversarial attack” in which an AI’s input data is maliciously manipulated. This could involve installing malicious code on an AI-empowered medical device to make it spit out false results, or altering a physical object in the way of an autonomous vehicle which causes the vehicle to not see it. A recent report from North Carolina State University found these kinds of vulnerabilities deep within Deep Neural Networks (DNNs) which are widely used in AI models around the world. Tianfu Wu, one of the study’s authors, later distilled the threat of such attacks: “Attackers can take advantage of these vulnerabilities to force the AI to interpret the data to be whatever they want. This is incredibly important because if an AI system is not robust against these sorts of attacks, you don't want to put the system into practical use — particularly for applications that can affect human lives.”

The supply chain

The supply chain is also a cause for concern. AIs are trained on large sets of data from a variety of sources, made using tools and frameworks from other sources and then when they’re finally released they’re constantly learning, intaking even more data. Along this multifaceted supply chain, there are multiple potential points of failure and vulnerability.

 

Take the training data for example. These are the large tranches of data which teach an AI model their first lessons and are often of uneven quality. That data may be inaccurate, unclean, contain biases or even outright vulnerabilities which can then be adopted by the AI model, potentially causing problems for its users later down the line. OWASP has flagged training data poisoning as one of the chief risks for LLMs and AIs in which the training data used is maliciously corrupted to manipulate.

Furthermore, the components, libraries and frameworks used to build AIs can also be a cause for concern, introducing new vulnerabilities into the finished product. Late last year, “ShellTorch”, a critical vulnerability was discovered in the TorchServe machine learning framework - maintained by both Amazon and Meta - which allowed attackers to access an AI model’s  proprietary data, to insert their own malicious models into production, to alter model results and even capture entire servers. That’s just one example. Since August, bug bounty hunters have reportedly discovered over a dozen vulnerabilities in commonly used open source tools - such as MLflow or Ray - used to build AI/ML models. These, according to bug bounty program, Huntr, can allow attackers to perpetrate system takeovers and remote code execution on the AI models that those tools help build.

Opacity

AI learns and develops by itself and much of that process is opaque to users, businesses and even its own creators. There are still a huge amount of unknowns to contend with when dealing with AI. Data put in at one end can produce all amounts of hard-to-predict outcomes including faults, mistakes, bad data and of course, security risks. For example, Air Canada was recently forced by a Canadian court to issue a refund to a customer who was given false information by the company’s chatbot resulting in lost air fare. After a small claims complaint, courts decided that the customer was owed a refund because the chatbot offered him false information which led to him making a purchase he might not have otherwise made. 

AI is peculiar for lots of reasons - but part of it is that it's always learning - creating constant risks that an AI model may adversely integrate. The mere unpredictability of its operations forms a serious security risk for many and it must be treated with scepticism and caution. 

An unfortunate reality is that the massive proliferation of one technology often means the parallel proliferation of attack surfaces. We should expect no less from AI, which, in the excitement to adopt, could present multiple potential points of security risk, throughout the supply chain. 

By Andy Mills, VP of EMEA for Cequence Security.
By Frank Baalbergen, Chief Information Security Officer, Mendix.
Anders Brejner, Investment Director and Enabling Solutions Lead at Circularity Capital, discusses...
By Varun Goswami, Head of Product Management, Newgen Software.
By Karl Mattson, Field CISO at Noname Security.
By Kevin Kline, SolarWinds database technology evangelist.
By Tom Printy, Advanced Design & Development Engineer, Zebra Technologies.