Experiments show AI could help to audit smart contracts, but not yet

Artificial intelligence has proven effective at identifying security vulnerabilities, but early tests indicate it won’t be able to replace humans for a while.

While artificial intelligence (AI) has already transformed a myriad of industries, from healthcare and automotive to marketing and finance, its potential is now being put to the test in one of the blockchain industry’s most crucial areas — smart contract security.

Numerous tests have shown great potential for AI-based blockchain audits, but this nascent tech still lacks some important qualities inherent to human professionals — intuition, nuanced judgment and subject expertise.

My own organization, OpenZeppelin, recently conducted a series of experiments highlighting the value of AI in detecting vulnerabilities. This was done using OpenAI’s latest GPT-4 model to identify security issues in Solidity smart contracts. The code being tested comes from the Ethernaut smart contract hacking web game — designed to help auditors learn how to look for exploits. During the experiments, GPT-4 successfully identified vulnerabilities in 20 out of 28 challenges.

In some cases, simply providing the code and asking if the contract contained a vulnerability would produce accurate results, such as with the following naming issue with the constructor function:

ChatGPT analyzes a smart contract. Source: OpenZeppelin
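To make the constructor-naming issue concrete: in Solidity versions before 0.4.22, a constructor was simply a function with the same name as its contract, so a single mistyped character turned it into an ordinary public function that anyone could call. The following is a minimal Python sketch of a heuristic check for that pattern. The sample contract, the function name `find_misnamed_constructors` and the similarity threshold are all hypothetical, chosen for illustration; this is not the prompt or method used in the experiments.

```python
import re
from difflib import SequenceMatcher

# Hypothetical old-style contract: the intended constructor is misspelled
# ("TokenSa1e" with a digit one), so it is a callable public function.
SAMPLE_SOURCE = """
contract TokenSale {
    address public owner;

    function TokenSa1e() public {
        owner = msg.sender;
    }

    function withdraw() public {
        require(msg.sender == owner);
    }
}
"""


def find_misnamed_constructors(source, threshold=0.8):
    """Return (contract, function) pairs whose names are similar but not equal."""
    contracts = re.findall(r"\bcontract\s+(\w+)", source)
    functions = re.findall(r"\bfunction\s+(\w+)\s*\(", source)
    suspects = []
    for contract in contracts:
        for function in functions:
            if function == contract:
                continue  # exact match: a legitimate old-style constructor
            similarity = SequenceMatcher(
                None, contract.lower(), function.lower()
            ).ratio()
            if similarity >= threshold:  # assumed cutoff, tune as needed
                suspects.append((contract, function))
    return suspects


if __name__ == "__main__":
    for contract, function in find_misnamed_constructors(SAMPLE_SOURCE):
        print(f"{contract}: function '{function}' looks like a misnamed constructor")
```

A name-similarity rule like this only catches one narrow class of bug, which is part of why a general-purpose model that can read the whole contract is attractive as a first pass.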

At other times, the results were more mixed or outright poor. Sometimes the AI would need to be prompted with the correct response by providing a somewhat leading question, such as, “Can you change the library address in the previous contract?” At its worst, GPT-4 would fail to come up with a vulnerability, even when things were pretty clearly spelled out, as in, “Gate one and Gate two can be passed if you call the function from inside a constructor, how can you enter the GatekeeperTwo smart contract now?” At one point, the AI even invented a vulnerability that wasn’t actually present.

This highlights the current limitations of this technology. Still, GPT-4 has made notable strides over its predecessor, GPT-3.5, the large language model (LLM) utilized within OpenAI’s initial launch of ChatGPT. In December 2022, experiments with ChatGPT showed that the model could only successfully solve five out of 26 levels. Both GPT-4 and GPT-3.5 were trained on data up until September 2021 using reinforcement learning from human feedback, a technique that involves a human feedback loop to enhance a language model during training.

Coinbase carried out similar experiments and reported comparable results. Its experiment used ChatGPT to review token security. While the AI’s assessments matched manual reviews for a large share of smart contracts, it struggled to produce results for others. Coinbase also cited a few instances of ChatGPT labeling high-risk assets as low-risk.

It’s important to note that ChatGPT and GPT-4 are LLMs developed for natural language processing, human-like conversations and text generation rather than vulnerability detection. With enough examples of smart contract vulnerabilities, it’s possible for an LLM to acquire the knowledge and patterns necessary to recognize vulnerabilities.

If we want more targeted and reliable solutions for vulnerability detection, however, a machine learning model trained exclusively on high-quality vulnerability data sets would most likely produce superior results. Training data and models customized for specific objectives lead to faster improvements and more accurate results.

For example, the AI team at OpenZeppelin recently built a custom machine learning model to detect reentrancy attacks — a common form of exploit that can occur when smart contracts make external calls to other contracts. Early evaluation results show superior performance compared to industry-leading security tools, with a false positive rate below 1%.
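For readers unfamiliar with reentrancy, the mechanics can be sketched with a toy simulation. The Python below is not Solidity and has nothing to do with OpenZeppelin’s model, whose details the article does not describe; the classes `VulnerableVault`, `SafeVault`, `Account` and `Attacker` are hypothetical. The vulnerable vault makes its external call before zeroing the caller’s balance, so a malicious recipient can call back in and withdraw again. The safe variant applies the widely used checks-effects-interactions ordering, updating state before making the call.

```python
class Account:
    """An honest recipient: it just accepts funds."""

    def __init__(self):
        self.received = 0

    def receive(self, vault, amount):
        self.received += amount


class Attacker(Account):
    """A malicious recipient: it re-enters withdraw() while being paid."""

    def receive(self, vault, amount):
        self.received += amount
        vault.withdraw(self)  # the re-entrant call


class VulnerableVault:
    def __init__(self):
        self.balances = {}
        self.total = 0

    def deposit(self, account, amount):
        self.balances[account] = self.balances.get(account, 0) + amount
        self.total += amount

    def withdraw(self, account):
        amount = self.balances.get(account, 0)
        if amount == 0 or self.total < amount:
            return
        self.total -= amount
        account.receive(self, amount)  # external call happens first...
        self.balances[account] = 0     # ...and the balance is cleared too late


class SafeVault(VulnerableVault):
    def withdraw(self, account):
        amount = self.balances.get(account, 0)
        if amount == 0 or self.total < amount:
            return
        self.balances[account] = 0     # effects before interactions
        self.total -= amount
        account.receive(self, amount)


def run(vault_cls):
    """Return (funds the attacker ends up with, funds left in the vault)."""
    vault = vault_cls()
    honest, attacker = Account(), Attacker()
    vault.deposit(honest, 90)
    vault.deposit(attacker, 10)
    vault.withdraw(attacker)
    return attacker.received, vault.total


if __name__ == "__main__":
    print("vulnerable:", run(VulnerableVault))
    print("safe:      ", run(SafeVault))
```

In the vulnerable run the attacker, who deposited 10, drains all 100 units; in the safe run the re-entrant call finds a zero balance and the attacker leaves with only the original 10.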

Striking a balance of AI and human expertise

Experiments so far show that while current AI models can be helpful tools for identifying security vulnerabilities, they are unlikely to replace human security professionals’ nuanced judgment and subject expertise. GPT-4 mainly draws on publicly available data up until 2021 and thus cannot identify complex or unique vulnerabilities beyond the scope of its training data. Given the rapid evolution of blockchain, it’s critical for developers to continue learning about the latest advancements and potential vulnerabilities within the industry.

Looking ahead, the future of smart contract security will likely involve collaboration between human expertise and constantly improving AI tools. The most effective defense against AI-armed cybercriminals will be using AI to identify the most common and well-known vulnerabilities while human experts keep up with the latest advances and update AI solutions accordingly. Beyond the cybersecurity realm, the combined efforts of AI and blockchain will yield many more positive and groundbreaking solutions.

AI alone won’t replace humans. However, human auditors who learn to leverage AI tools will be much more effective than auditors turning a blind eye to this emerging technology.

Mariko Wakabayashi is the machine learning lead at OpenZeppelin. She is responsible for applied AI/ML and data initiatives at OpenZeppelin and the Forta Network. Mariko created Forta Network’s public API and led data-sharing and open-source projects. Her AI system at Forta has detected over $300 million in blockchain hacks in real time before they occurred.

This article is for general information purposes and is not intended to be and should not be taken as legal or investment advice. The views, thoughts and opinions expressed here are the author’s alone and do not necessarily reflect or represent the views and opinions of Cointelegraph.
