AI Agents: Why They're Not Living Up to the Hype

AI agents, powered by large language models, were supposed to be a big deal. They were meant to plan and carry out tasks on their own. So far, they haven't lived up to the hype. Many companies are struggling to figure out how to use them effectively. One big problem is that AI agents aren't very good at handling complex tasks. They also have trouble working with existing systems and with other AI agents, and there are concerns about data safety and privacy. Some companies are waiting to see how these agents perform before investing in them.

Researchers have found that even the best AI agents, like Google's Gemini 2.5 Pro, fail to complete real-world office tasks most of the time. OpenAI's GPT-4o and Meta's Llama-3.1-405b have even higher failure rates. This suggests that AI agents aren't ready for prime time yet.

AI agents have found one useful application, though: finding and exploiting vulnerabilities in crypto projects. Researchers at the University of Sydney and University College London created an AI agent named A1 that can discover and exploit bugs in blockchain smart contracts. Such bugs can be used to steal money, and the crypto industry has lost billions to hacks of this kind.

A1 is more reliable at its task than other AI agents are at theirs. It demonstrated a success rate of nearly 63% on the Verite benchmark when tested on real-world vulnerable contracts. A1 generates actual executable code, which makes its work similar to a human hacker's. However, the creators of A1 have decided not to release it as open source, fearing it could be misused by criminals.

Despite the struggles of AI agents, the AI market is still booming. Nvidia, whose GPUs power many AI ambitions, recently reached a market cap of $4 trillion. Other big tech companies are also investing heavily in AI, but not all of them are seeing concrete results in terms of profitability.

Overall, AI agents are still in their early stages. They have disappointed some with their progress so far, but there's hope that they will improve as the technology matures.
