With the rise of AI-powered code generation, particularly tools like ChatGPT, there’s been a lot of buzz around the idea that “AI will soon replace developers.” AI can certainly speed up development, but the reality is that AI-generated code, used without scrutiny, can introduce more risk than it removes.
A growing concern is the resurgence of supply chain attacks, in particular slopsquatting: attackers publish malicious packages under names that closely resemble legitimate, trusted libraries, or under plausible-sounding names that AI tools invent, and those packages execute harmful code once installed.
Here’s the issue:
To avoid reinventing the wheel, developers rely heavily on external libraries to reuse code across projects. A single import statement is enough to pull an external library into the code and use its functionality, with the library itself fetched from a public package registry. This system is efficient, but it’s not without its vulnerabilities.
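As a quick illustration (using the popular `requests` package purely as an example, not anything named in the report), this is all it takes to pull third-party code into a Python project:

```python
# One import statement is enough to pull an entire third-party library into
# the project. The name "requests" is resolved against a public registry
# (PyPI) when the dependency is installed, e.g. via `pip install requests`,
# and the code behind it is trusted implicitly from then on.
import requests

response = requests.get("https://api.github.com")
print(response.status_code)
```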
When generating code, AI models sometimes reference libraries that don’t exist. This opens the door for malicious actors to register packages under those hallucinated names and embed harmful code in them. What’s concerning is that the study linked below found AI models suggest non-existent packages in roughly 20% of generated code samples for languages like Python and JavaScript. While there’s no concrete evidence of this attack being widely exploited yet, it’s a clear risk that shouldn’t be ignored.
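One lightweight precaution is to check that a suggested dependency actually exists before installing it. Below is a minimal sketch using PyPI’s public JSON API; the name `some-hallucinated-package` is a made-up placeholder standing in for an AI-invented suggestion, not a real package from the report.

```python
import urllib.error
import urllib.request


def exists_on_pypi(package: str) -> bool:
    """Return True if the package name is published on PyPI."""
    url = f"https://pypi.org/pypi/{package}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False  # a 404 means the name is not registered


# "some-hallucinated-package" is a placeholder for a hallucinated name.
for name in ["requests", "some-hallucinated-package"]:
    verdict = "exists" if exists_on_pypi(name) else "NOT on PyPI - verify before installing"
    print(f"{name}: {verdict}")
```

Keep in mind that existence alone proves nothing: if an attacker has already registered the hallucinated name, this check passes, so maintainer history, download counts, and a look at the source still matter.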
This also recalls earlier incidents in which attackers took control of once-trusted, widely adopted libraries and used that access to compromise the build environments of major companies, including the likes of Microsoft.
The bottom line: AI can be a powerful tool, but manual code review remains absolutely essential. As AI-generated code grows in popularity, developers must remain vigilant, especially when dealing with external libraries and dependencies. A little caution can save you from an infrastructure compromise.
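As one concrete, admittedly minimal, example of that caution, a check like the following could run in CI or a pre-commit hook to flag any dependency that isn’t pinned to an exact version, so nothing reaches the build that a reviewer hasn’t explicitly approved (the file name `requirements.txt` is just the conventional default, not something prescribed by the report):

```python
# A minimal review aid: flag requirements that aren't pinned to an exact
# version. Pinning doesn't prove a package is safe, but it keeps what was
# reviewed and what gets installed in sync.
from pathlib import Path


def unpinned_requirements(path: str = "requirements.txt") -> list[str]:
    unpinned = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if "==" not in line:  # not pinned to an exact version
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    for requirement in unpinned_requirements():
        print(f"Review before merging: '{requirement}' is not pinned")
```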
Report: https://arxiv.org/pdf/2406.10279