Breaking Digital Barriers: The Promise and Challenges of Agentic AI
Can AI agents solve our digital interoperability problem?
At January’s Consumer Electronics Show, NVIDIA CEO Jensen Huang boldly proclaimed, “the Age of AI agentics is here.” Just days earlier, OpenAI’s Sam Altman had struck the same chord, blogging, “We [OpenAI] believe that, in 2025, we may see the first AI agents ‘join the workforce’ and materially change the output of companies.” AI agents are back in vogue.
For the non-technical, “agents” or “agentic AI” refers to AI equipped not only with intelligence but also with the ability to autonomously solve complex, multi-step problems. In short, very smart bots.
For years there have been attempts to create workable agents, yet hype has always outpaced reality. The release of ChatGPT set off a boom of excitement, yielding frameworks like BabyAGI that proved unreliable and were built on technology that failed often. Today, products such as Salesforce’s Agentforce and Google’s Agentspace offer modest improvements, enabling agents to field customer service queries and automate business tasks. Still, they work only in highly structured environments and are limited to preapproved tasks.
In October, it looked like the landscape might be changing. Anthropic offered an early glimpse of a more evolved agentic technology trained to complete unstructured tasks with a standard mouse and computer interface, just as a human might. They call this ability “computer use.”
When asked in a demo to plan a San Francisco day trip, Anthropic’s system deftly orchestrates a complex process. It moves the mouse between applications, searching Google for activities, using mapping features, and even timing the scheduled departure to see the sunrise. This functionality is an astounding game changer. Finally, we have AI that can, seemingly, freely do things. For the first time, capable agents seem like a real possibility.
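For the technically curious, Anthropic exposed this capability as a beta tool in its API. The following is a minimal sketch of requesting it, based on the beta as published in October; the model name, tool type, and beta flag are drawn from that release and may since have changed, and the screen dimensions are illustrative.

```python
import anthropic  # pip install anthropic; reads ANTHROPIC_API_KEY from the environment

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",   # model paired with the October beta
    max_tokens=1024,
    tools=[{
        "type": "computer_20241022",      # the beta "computer use" tool
        "name": "computer",
        "display_width_px": 1024,         # illustrative screen size
        "display_height_px": 768,
    }],
    messages=[{
        "role": "user",
        "content": "Plan a San Francisco day trip that starts with the sunrise.",
    }],
    betas=["computer-use-2024-10-22"],
)

# The response contains tool_use blocks (take a screenshot, move the mouse,
# click, type) rather than completed work; a local harness must execute each
# action and report the result back, looping until the task is done.
for block in response.content:
    print(block.type)
```

Notably, the model never clicks anything itself: it proposes actions that a local harness executes, and that harness is where the questions of access and permission discussed below actually live.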
The Interoperability Challenge
Excitement is warranted. Not just because agents are closer, but because the problems they could solve are profound. In computer use, I see a solution to perhaps the greatest unsolved digital quagmire: the interoperability problem. This is the challenge of getting disparate systems and software to seamlessly work together, talk to each other, and collaborate.
If you’ve video conferenced, you’ve felt this problem. The coordination of competing microphones, cameras, calendars, speakers, and software products is so challenging that calls still start with confirmation that things are actually working: “Can you hear me OK?”
This benign question belies a greater challenge. Today, most technologies are tightly guarded fiefdoms. While integrating multiple apps is possible on a case-by-case basis, the costs and headaches involved often don’t justify the outcome. Rather than offering seamless digital workflows, technology relies on the flexibility and situational awareness of human intelligence to knit everything together. Humans are the default interface.
The Limits of the Human Interface
Playing the interface role has never been easy, but we’ve muddled through, and our performance has been good enough. Digital systems, however, are rapidly proliferating. IoT Analytics estimates that in 2024 alone the number of connected devices grew by 13%, representing 2.2 billion new systems coming online. With each new device, interfacing demands multiply, and new workflows further strain the humans juggling it all. In the face of this digital Cambrian explosion, our human-led interfacing approach has reached its limit; even good enough is no longer attainable. Without automated help, things are starting to fail.
The primary cause of failure is the growing complexity of processes due to digital fragmentation. With more systems come more variables; with more variables come more points of failure.
In cybersecurity this is meaningfully felt. In an October World Economic Forum post, Check Point CTO Dorit Dor notes that digital “complexity is the enemy of security.” As organizations have implemented more and more systems, they have simultaneously bolted a mishmash of cybersecurity solutions with “severely limited interoperability” onto their IT stacks. The result, Dor reports: security must often be managed through “separate consoles, creating a mess of complexity for security teams to navigate to achieve complete security workflows.” She concludes, “[b]lind spots are rampant, opening the door for attackers.”
Runaway complexity has consequences. Per a recent Check Point report, Q3 2024 saw a record 75% year-over-year increase in attacks per organization, a figure that builds on consistent double-digit growth in attack volume quarter after quarter throughout the 2020s. Measured against national GDPs, the $9.5 trillion global cybercrime economy is now larger than every economy but two. This astounding economic drag owes much to human interfacing limitations.
Beyond cybersecurity, such complexity and fragmentation can also create dire healthcare consequences. In a wide-ranging exposé on the sorry state of healthcare tech policy, Fred Schulte and Erika Fry report that “Unlike, say, with the global network of ATMs, the proprietary EHR [electronic health records] systems made by more than 700 vendors routinely don’t talk to one another, meaning that doctors still resort to transferring medical data via fax and CD-ROM. Patients, meanwhile, still struggle to access their own records — and, sometimes, just plain can’t.” This mess deeply challenges both the limited medical workforce and the frustrated patients forced to pick up the slack. It can even cause harm. The exposé recounts a shocking 2017 case in which a computer’s failure to transmit a positive infection result left an untreated patient with irreversible brain damage. When limited humans are tasked with wading through a complex sea of system interfaces and alerts, discovering such failures becomes a matter of simple luck. Harm is inevitable.
Unfortunately, the cost of ‘fixing’ this fragmentation is prohibitive. In the health sector, the National Academy of Medicine states, “The cost of medical device integration, for example, integrating ventilators and physiologic monitors to the EHR, was estimated at as much as $6,500 to $10,000 per bed in one-time costs, plus as much as 15 percent in annual maintenance fees.” With such staggering fees and failures, it is no wonder healthcare costs keep rising while outcomes keep disappointing.
The Ultimate Interface?
In Anthropic’s computer use, I see the beginnings of a solution. In the demonstrations of Claude, the presenters seamlessly executed a workflow that passed data and instructions between Google Chrome, Apple Maps, Google Search, independent blogs, and Apple Calendar—a range of applications simply not designed to talk to each other. Yet suddenly they were.
Computer-use-powered agents could soon become the ultimate interface, enabling us to manage digital proliferation. Discussing how the growing complexity of systems increases the cybersecurity threat, Check Point’s Dorit Dor concedes that “the scale of the threat is too large to confront without AI as a force multiplier.” With this computer-use proof of concept, one can start to imagine exactly what that AI solution might mean: knitting together an array of cyber monitoring systems and applications to reduce complexity while increasing threat visibility.
Agents capable of computer use would also be of value in healthcare. As a former healthcare IT administrator, I can say firsthand that computer-use-enabled agents could solve many of our interoperability woes. Interfacing healthcare tech is a heavily digital task that often takes place on a desktop. A major interfacing constraint is the need for humans to update settings, catch alerts, verify billing, transmit test results, resolve failures, and correctly interpret unwieldy documentation. Capable computer-use agents could manage these administrative tasks cheaply, at scale, and in real time.
Can we have nice things?
Naturally, the more agents improve, the more glittering use cases we can consider, from automated legal services to automated financial planning. Even accessible government services might be within reach. To get there, however, agents face a battery of challenges.
Capability
Major technical challenges remain before highly capable agents become reality. Despite showing latent promise, Anthropic’s computer-use-powered agents still take dozens of costly minutes to perform simple tasks, and obstacles such as drag-and-drop elements can completely derail a workflow. On top of increasing raw intelligence, engineers must solve countless other tricky design challenges. For instance, when multiple agents operate on a single network, rules of the road will be needed to resolve messy conflicts. These engineering hurdles are major and will probably be harder to solve than some predict.
Access
Access to required systems, websites, and software is a thornier problem. Once deployed, agents will be bound by the legacy restrictions built into our digital environments, including, importantly, authentication. For agentic success, essential questions must be answered, such as, “If my multi-factor app is on my phone, how can my agent access my account?” and “How can agents securely store and use passwords?” If agents can’t log in, they can’t function.
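On the password question specifically, one plausible pattern is to keep secrets in the operating system’s keychain so they never pass through an agent’s prompt, logs, or configuration files. The sketch below illustrates with the open-source Python keyring library; the service and account names are hypothetical, and the multi-factor question remains open.

```python
import keyring  # pip install keyring; delegates storage to the OS keychain

SERVICE = "example-ehr-portal"     # hypothetical system the agent must log into
AGENT_ACCOUNT = "agent@example.org"  # hypothetical account the agent acts for

def provision_credential(password: str) -> None:
    # A human stores the secret once; it lives in the OS keychain,
    # never in the agent's prompt, logs, or config files.
    keyring.set_password(SERVICE, AGENT_ACCOUNT, password)

def fetch_credential() -> str:
    # The agent retrieves the secret only at the moment of login.
    secret = keyring.get_password(SERVICE, AGENT_ACCOUNT)
    if secret is None:
        raise RuntimeError(f"No credential provisioned for {SERVICE}")
    return secret
```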
The web’s countless layers of anti-bot technology will also cause systemic access problems. Already, roughly 50% of web traffic is non-human, and the web has responded with remedies such as CAPTCHAs, rate limits, honeypots, and other tools. Success will require that agents overcome existing roadblocks while preserving the basic protections that make the web safe for humans.
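Part of preserving those protections is agents treating them as signals rather than obstacles. As one minimal illustration (a sketch using Python’s standard library; the retry policy is mine, not any product’s actual behavior), an agent’s HTTP client could honor a site’s rate limits instead of hammering past them:

```python
import time
import urllib.request
from urllib.error import HTTPError

def polite_get(url: str, max_retries: int = 3) -> bytes:
    """Fetch a URL, backing off when the server signals a rate limit."""
    for attempt in range(max_retries):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except HTTPError as err:
            if err.code != 429:  # 429 = Too Many Requests
                raise
            # Respect the server's Retry-After header when given in seconds;
            # otherwise fall back to simple exponential backoff.
            wait = int(err.headers.get("Retry-After", 2 ** attempt))
            time.sleep(wait)
    raise RuntimeError(f"Gave up on {url} after {max_retries} rate-limited attempts")
```

Rate limits are the easy case because the signal is explicit; CAPTCHAs and honeypots offer no such cooperative channel, which is part of why the standards work discussed below matters.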
Robustness
A third challenge will be ensuring agents maintain robust security and performance. As they traverse digital ecosystems, agents must be resistant to threats including prompt injection attacks (prompts designed to cause AI to misbehave or malfunction), adversarial intelligent agents, and general malware.
Perhaps more insidious than outright hacking will be the inevitable rise of agentic dark patterns—subtle web designs geared to game agent functionality and lure agents towards certain clicks, outcomes, activities, or products. Just as design elements, such as endless scrolling, shape human engagement, similar patterns will twist and shape agentic behavior. When connected to the web, a successful agent will require considerable digital street smarts to robustly complete tasks.
Legal
Legal hurdles follow new digital capabilities. A first-order problem will be the countless user agreements and contracts that ban automated access across the web. Facebook’s agreement, for instance, states:
“You may not access or collect data from our Products using automated means (without our prior permission) or attempt to access data you do not have permission to access.”
Out of the gate, any agentic system looking to surf the web faces a legal minefield.
In addition to user agreements, a range of “AI laws” is being developed. States including Colorado have passed comprehensive AI regulations. Others, like California, are extending existing privacy regulations to curtail automated decision-makers like agents. These regulations include difficult-to-implement rights for users, such as the right to opt out of automated, agentic processes. An unwieldy patchwork of restrictions is forming.
Obviously, neither user agreements nor laws effectively block digital activity. The widespread use of torrents in the 2000s to pirate music illustrates the basic failure of text to constrain digital will. What legal terms can constrain, unfortunately, is above-board actors. To get products up and running without facing a lawsuit over improper automated access or accidentally violating defined consumer rights, developers will have to proceed with exceeding care. In the process, horizons will be limited.
Pushing Forward
Unfortunately, most of these challenges lack a silver bullet. Capability improvements and the ability to overcome messy legal patchworks are long-run challenges that will require sustained, multidisciplinary effort. Meanwhile, agentic dark patterns and cybersecurity threats are here to stay; they can only be mitigated through continuous patches, never fully solved.
As for the challenges posed by system access/authorization and user agreements, here I am more optimistic. Both standards-setting bodies and perhaps public sector investments could offer solutions.
Start with the user agreement and permissions challenges. Private corporations and standards setters must recognize, immediately, that they will soon be unable to assume the technology user base is human. Accepting this new reality, these bodies will need to craft standard definitions and boilerplate agreement language that ease restrictions on acceptable automated use.
New digital infrastructure will also be needed to enable agents to confidently use websites and systems. As a model, look to the ‘noindex’ directive, embedded in a page’s HTML or HTTP headers to tell search engines clearly whether they have permission to index and display its content. We should create similar guideposts for agents. Industry bodies should develop and implement standard ‘noagent’ web and system indicators to signal which sites and systems allow agents, ideally with options for fine-grained permissions. While such basic infrastructure wouldn’t stop misuse, it would provide much-needed clarity of permissions for those trying to follow rules and terms of use. Beyond this basic suggestion, bodies must consider further tools to foster clear and ordered digital environments so positive uses of agents can confidently flourish.
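To make the idea concrete, here is a minimal sketch of how an agent might check such a signal before acting on a page. The agent-policy meta tag and its noagent value are hypothetical stand-ins for whatever convention standards bodies settle on; only the noindex pattern it mirrors exists today.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class AgentPolicyParser(HTMLParser):
    """Scans for a hypothetical <meta name="agent-policy" content="noagent"> tag,
    mirroring how crawlers honor <meta name="robots" content="noindex">."""

    def __init__(self) -> None:
        super().__init__()
        self.agents_allowed = True

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and attr.get("name") == "agent-policy":
            if "noagent" in attr.get("content", ""):
                self.agents_allowed = False

def agents_allowed(url: str) -> bool:
    # Fetch the page and check the (hypothetical) policy tag before
    # letting an agent click, type, or extract anything.
    html = urlopen(url).read().decode("utf-8", errors="replace")
    parser = AgentPolicyParser()
    parser.feed(html)
    return parser.agents_allowed

if __name__ == "__main__":
    print(agents_allowed("https://example.com"))
```

A sibling convention carried in HTTP headers or robots.txt could extend the same signal to non-HTML systems.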
As for authentication, investments in both technology and standards are needed. To this end, the Trump Administration should consider directing already-appropriated AI R&D funding and the joint research efforts of the International Network of AI Safety Institutes toward developing methods for agents to safely store and use passwords on behalf of human users, as well as creating agent-compatible multi-factor authentication processes.
Beyond research, bodies like the National Institute of Standards and Technology should review existing user access and security standards and make any necessary updates to accommodate agentic technology. If successful, these investments would go a long way towards making AI agents viable while also ensuring cybersecurity is baked into this industry from its birth.
Conclusion
These steps may be modest, but they are actionable and fit for this nascent technological stage. Developing access technologies, agent-accommodating user agreements, and supporting infrastructure would represent clear progress toward enabling confident AI agent use and fostering much-needed trust in a bot-skeptical public. This work is essential. Rising healthcare and cybersecurity risks show that the need for better interfaces has shifted from desirable to critical. Automation is required to manage digital complexity, and agents represent a potential solution. To realize that possibility, however, raw intelligence alone is not enough. We must also create a permissive environment. Let’s get to work.