Beyond The Gang Of Four: Practical Design Patterns For Modern AI Systems
The Gang of Four's 23 object-oriented patterns shaped how an entire generation of developers designed software. In the 2010s, cloud computing introduced patterns like publish-subscribe ("pub-sub"), microservices, event-driven workflows, and serverless models that now power most cloud-based distributed systems.
Similarly, before the current AI boom, the machine learning community had already developed "ML design patterns". When you build and deploy ML models, you face specific challenges, and patterns like Checkpointing, Feature Stores, and Versioning have become standard practice.
Why should you care about these patterns? They help you solve known problems in standardized ways. Instead of reinventing solutions, you use a shared vocabulary. When you say "Singleton", "Pub-Sub", or "Feature Store", your team immediately understands your approach. This speeds up your development, reduces errors, and makes your systems easier to maintain.
Modern AI systems bring new challenges that neither classic software nor conventional ML patterns fully address.
For example, how do you guide model output and prevent misleading content? How do you build user experiences that help users understand, trust, and effectively use AI-powered applications? How do you manage agent interactions in multi-agent systems? How do you reduce compute costs to make your product sustainable?
Figure 1: An illustration of a well-architected modern AI-based system
To help develop a well-architected AI system as shown in Figure 1, many AI patterns have emerged across the industry. In this article, I won't invent new design patterns. Instead, I'll show you how existing patterns fit together. I organize key emerging patterns into five categories that build on each other as you scale your AI system.
I specifically cover best practices for building user-facing AI applications using existing models, mainly accessed through API calls. While I focus on text-based interactions, you can also apply these patterns across multimodal applications. However, I deliberately don't address model training, customization, hosting, or model optimization since these typically fall outside the workflow of developers using API-based AI models. I also don't cover agentic AI systems or patterns for multi-agent interactions, as these topics deserve their own dedicated discussions.
Prompting and Context Patterns
Unlike traditional software, where you explicitly code system behavior, in modern AI systems, behavior heavily depends on the instructions and context you provide to large language models (LLMs) or large multimodal models (LMMs). To create effective AI outputs, you must provide effective guidance: crafting precise prompts, supplying relevant context (or external knowledge) directly within your prompt, or both.
Prompting might seem trivial at first. After all, you send free-form text to a model, so what could go wrong? However, how you phrase a prompt and what context you provide can drastically change your model's behavior, and there's no compiler to catch errors or a standard library of techniques. Creating prompts that reliably and consistently produce your desired behavior becomes difficult, especially as tasks grow more complex.
If you use prompting and context patterns effectively, you can improve the model's reasoning, accuracy, consistency, and adherence to instructions. Equally important, you can create reusable prompts that generalize across models, tasks, and domains.
Let's examine four specific prompting patterns that will help you standardize and refine your approach:
Table 1: Prompting Issues and When to Apply Each Pattern
Few-Shot Prompting Pattern
Few-Shot Prompting is one of the most straightforward yet powerful prompting approaches. Without examples, your model might generate inconsistent outputs, struggle with task ambiguity, or fail to meet your specific requirements. You can solve this problem by providing the model with a handful of examples (input-output pairs) in the prompt and then providing the actual input. You are essentially providing training data on the fly. This allows the model to generalize without re-training or fine-tuning.
Let's look at a very simple example (using "GPT-4o-mini" via OpenAI's API; you can run similar prompts locally using Ollama or Hugging Face Transformers):
PROMPT:
Classify the sentiment of the following sentences as Positive, Negative, or Neutral.

Sentence: "I absolutely loved the new Batman movie!"
Sentiment: Positive

Sentence: "The food was okay, nothing special."
Sentiment: Neutral

Sentence: "I'm really disappointed by the poor customer service."
Sentiment: Negative

Sentence: "The book was thrilling and kept me engaged the whole time."
Sentiment:

RESPONSE: (GPT-4o-mini)
Positive

Do frontier models today need these few-shot examples to complete their task correctly? No - they already excel at zero-shot learning and don't need spoon-fed examples to understand basic instructions.
However, you can think of Few-Shot Prompting as a personalization tool, guiding the model to cater to your specific needs: you can enforce a particular output format, match your preferred tone, or show the model how to handle tricky edge cases.
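To make this concrete, here is a minimal sketch of the pattern using the OpenAI Python SDK (assuming the openai package is installed and an API key is configured; the helper function and example data are illustrative):

```python
# Few-shot sentiment classification via the OpenAI Chat Completions API.
# A minimal sketch; assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

# The few-shot examples act as on-the-fly "training data".
FEW_SHOT_EXAMPLES = [
    ("I absolutely loved the new Batman movie!", "Positive"),
    ("The food was okay, nothing special.", "Neutral"),
    ("I'm really disappointed by the poor customer service.", "Negative"),
]

def classify_sentiment(sentence: str) -> str:
    lines = ["Classify the sentiment of the following sentences "
             "as Positive, Negative, or Neutral.", ""]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f'Sentence: "{text}"')
        lines.append(f"Sentiment: {label}")
    lines.append(f'Sentence: "{sentence}"')
    lines.append("Sentiment:")

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "\n".join(lines)}],
        temperature=0,  # deterministic labels suit classification
    )
    return response.choices[0].message.content.strip()

print(classify_sentiment("The book was thrilling and kept me engaged the whole time."))
```

Keeping the examples in a data structure, rather than hard-coding the prompt string, makes it easy to version and swap example sets per task.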
Role Prompting Pattern
In Role Prompting, you instruct the model to take on a specific persona or role, which helps contextualize its responses. For example: "You are a biology professor. Answer the question in detail".
Without guidance, your model's style or assumptions may not match your needs. Role prompting can steer your model's output style (e.g., "Act as a JSON formatter"), tone (e.g., "Pretend you are an angry pirate for this response"), and/or content boundaries (e.g., "You are a legal assistant"). You can implement role prompting via system messages (in chat models) or by simply including a persona statement in your prompt. It's as simple as that - just tell your model who it should be, as illustrated in the example below.
PROMPT (without role prompting):
Explain photosynthesis in 1-2 sentences.

RESPONSE: (GPT-4o-mini)
Photosynthesis is the process by which plants, algae, and some bacteria convert light energy into chemical energy, producing glucose and oxygen. This occurs in the chloroplasts, using sunlight, water, and carbon dioxide.

PROMPT (with role prompting):
You are a primary school teacher. Explain photosynthesis to your students in 1-2 sentences.

RESPONSE: (GPT-4o-mini)
Photosynthesis is how plants make their own food using sunlight, water, and air. They turn these things into sugar to grow and give off oxygen, which we breathe!

Popular AI assistants use role prompts extensively.
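In code, role prompting is just a system message. A minimal sketch with the OpenAI SDK (the persona string and helper function are illustrative):

```python
# Role prompting via the system message; the user question stays the same
# while the persona changes the style and framing of the answer.
from openai import OpenAI

client = OpenAI()

def explain(question: str, persona: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": persona},   # sets the role
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(explain("Explain photosynthesis in 1-2 sentences.",
              "You are a primary school teacher."))
```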
Chain-of-Thought (CoT) Prompting Pattern
If you are a software developer trying to solve a complex algorithmic problem or a software architect trying to analyze complex system bottlenecks and vulnerabilities, you will probably brainstorm various ideas with your colleagues to understand their pros and cons, break down the problem into smaller tasks, and then solve it iteratively, rather than jumping to the solution right away. In Chain-of-Thought (CoT) prompting, you encourage the model to follow a very similar process and think aloud by breaking the problem down into a step-by-step process. Especially when prompting complex tasks, CoT can prevent the model from jumping to pre-learned and often incorrect patterns prematurely.
If you're using state-of-the-art or advanced reasoning models today (e.g., Claude 3.7, OpenAI o1), you may observe step-by-step reasoning under the hood without explicit prompting. However, explicitly instructing models to "think step by step" remains a powerful approach, especially when using older or smaller models. OpenAI and Anthropic both recommend using CoT in their prompt engineering guides.
By nudging the model to produce intermediate reasoning steps, you can improve its accuracy on complex, multi-step tasks and make its reasoning easier to inspect and debug.
Let's look at an example below:
PROMPT (with CoT):
A user recently watched and enjoyed the movies 'Inception,' 'The Social Network,' and 'Blade Runner 2049.' Think step-by-step about what these movies have in common, such as themes, genres, style, or storytelling, and then recommend a few more movies or TV shows the user might like.

RESPONSE: (o4-mini)
Here's my quick breakdown and some picks you might enjoy:

Step-by-step commonalities
1. High-concept premises – each film poses a big "what if?" (dream control, social media origin, AI-driven future).
2. Smart, twisty narratives – nonlinear or slowly-unfolding storytelling that rewards close attention.
3. Tech-oriented/social commentary – they explore how technology shapes identity, society, or reality.
4. Striking visuals & mood – strong production design, immersive cinematography, atmospheric score.

Recommendations
● Memento (2000) – Nolan's early mind-bender of memory loss told in reverse.
● Ex Machina (2014) – A tense, intimate study of AI consciousness and human manipulation.
● Arrival (2016) – Thought-provoking first-contact drama where language and time intertwine.
● Black Mirror (TV series, 2011– ) – Standalone episodes examining near-future tech's dark side.

Each shares that blend of brains-and-cinema you seem to love. Enjoy!

Retrieval-Augmented Generation (RAG) Pattern
From startups to tech giants alike, RAG has become a standard best practice across the AI industry. Advanced models learn from vast amounts of real-world data. Ask them about history, science, or popular facts, and they'll usually answer correctly. Yet these models have limitations. Their training cuts off at a specific date, their knowledge is general rather than specialized, and they don't have access to the newest, proprietary, or dynamically changing information.
This is precisely where RAG helps. RAG combines the model's reasoning abilities with real-time access to external knowledge (like databases, vector stores, or documents). So you get the best of both worlds.
Imagine building a chatbot for your law firm. With RAG, when a client asks about a specific legal issue, your chatbot can instantly retrieve relevant statutes and recent case summaries from your internal knowledge base, creating an accurate, well-supported response.
Figure 2: Retrieval-Augmented Generation
When building AI systems, you should consider using RAG when your users need up-to-date, proprietary, or domain-specific information that the model was not trained on, or when answers must be grounded in citable sources.
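The core retrieve-then-generate loop is simple. Below is a minimal sketch using OpenAI embeddings, with a tiny in-memory document list standing in for a real vector store (the documents and prompt wording are illustrative):

```python
# A minimal RAG sketch: embed documents and the query, retrieve the closest
# passages, and ground the answer in them.
import numpy as np
from openai import OpenAI

client = OpenAI()

DOCUMENTS = [
    "Statute 12.4: Tenants must receive 30 days' written notice before eviction.",
    "Case summary: Smith v. Jones (2023) upheld the 30-day notice requirement.",
    "Firm policy: All client advice must cite the governing statute.",
]

def embed(texts: list[str]) -> np.ndarray:
    result = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in result.data])

DOC_VECTORS = embed(DOCUMENTS)

def answer_with_rag(question: str, top_k: int = 2) -> str:
    query_vec = embed([question])[0]
    # OpenAI embeddings are unit-normalized, so a dot product is cosine similarity.
    scores = DOC_VECTORS @ query_vec
    context = "\n".join(DOCUMENTS[i] for i in np.argsort(scores)[-top_k:])
    prompt = (
        "Answer using ONLY the context below. If it is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(answer_with_rag("How much notice does a tenant get before eviction?"))
```

In production you would replace the in-memory list with a vector database and add chunking, metadata filtering, and citation of retrieved sources.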
Responsible AI Patterns
The Prompting and Context Patterns we discussed thus far can help reduce ambiguity, inconsistency, and hallucinations through better instructions and grounded context. However, you may soon notice that additional safeguards are needed to handle ethical, fairness, and safety issues. Even accurate responses can be biased, harmful, or inappropriate. This is where Responsible AI Patterns come in.
As part of building responsible AI systems, you must reduce hallucinations, prevent inappropriate or disallowed content, mitigate biases, and ensure transparency around AI decision-making. Otherwise, your AI outputs may mislead users, spread misinformation, or even create liability issues.
Techniques like RAG, discussed earlier, already help reduce hallucinations by grounding outputs in external context. Let us look at a few additional patterns that focus on safety, fairness, and ethical compliance, going beyond accuracy alone.
Figure 3: Sequence Diagram Illustrating Responsible Patterns in Modern AI-based Systems
Output Guardrails Pattern
Even when you do everything right, models may still produce incorrect, biased, or harmful content. You need guardrails! These are rules, checks, or interventions applied after the model generates an output. They act as your final defense to modify or block content before it reaches users. Guardrails are particularly important in sensitive domains such as legal or medical applications.
Depending on your domain and use case, you can implement guardrails in several ways: filtering outputs against blocklists or PII detectors, running them through a moderation classifier, or appending mandatory disclaimers, as the sketch below illustrates.
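As one illustration, here is a sketch of a post-generation guardrail layer; the blocked patterns and disclaimer rule are placeholders for your actual domain policy:

```python
# A simple post-generation guardrail: check the model's output and block or
# modify it before it reaches the user. Rules here are illustrative only.
import re

BLOCKED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-like strings (PII leakage)
]

def apply_guardrails(model_output: str) -> str:
    # Hard block: refuse outputs that match any forbidden pattern.
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(model_output):
            return "Sorry, I can't share that information."
    # Soft intervention: legal-sounding answers get a mandatory disclaimer.
    if "statute" in model_output.lower():
        model_output += "\n\n(This is general information, not legal advice.)"
    return model_output
```

In practice you would layer several such checks, from cheap regex filters up to a dedicated moderation model, and log every intervention for auditing.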
Many model providers also integrate fairness and ethics checks into their own guardrail pipelines. For example, Anthropic's Claude models follow a constitutional approach where outputs are revised according to predefined ethical principles. However, having your own guardrail layer will provide a consistent experience for your users, regardless of which model or provider you use.
Model Critic Pattern
Beyond basic guardrails, you can use a dedicated fact-checking or "critic" model to verify your primary model's output. This second model can be a different model, or the same one acting in a "critic" or "judge" role. It's analogous to an editor reviewing and correcting an author's draft. Even if the first pass contains hallucinations, this verification loop makes the model check its facts, reducing false information and bias in your final output.
Adding a secondary judge or critic isn't always practical, as it increases system complexity, latency, and cost. However, you should definitely consider this approach for automated QA testing. Consider a scenario where your production system uses a smaller "mini" or "nano" LLM variant for efficiency. You could use the larger model as a judge in your offline testing to validate accuracy and ensure responsible outputs are generated. GitHub Copilot, for example, uses a second LLM to evaluate its primary model.
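A minimal sketch of this judge loop, using a small model to answer and a larger one to critique (the models and rubric here are illustrative):

```python
# Model Critic sketch: a larger model grades the primary model's answer.
# Most useful in offline QA pipelines, where latency is not a concern.
from openai import OpenAI

client = OpenAI()

def generate(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # small, cheap production model
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

def critique(question: str, answer: str) -> str:
    judge_prompt = (
        "You are a strict fact-checker. Review the answer for factual errors, "
        "unsupported claims, or bias. Reply with 'PASS' or 'FAIL: <reason>'.\n\n"
        f"Question: {question}\nAnswer: {answer}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # larger model acting as the judge
        messages=[{"role": "user", "content": judge_prompt}],
    )
    return response.choices[0].message.content

question = "When did the first person walk on the Moon?"
answer = generate(question)
print(answer, "->", critique(question, answer))
```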
User Experience (UX) Patterns
After stabilizing your outputs with proper prompts and guardrails, your next big concern is the user experience (UX). AI systems don't behave like traditional software and often produce unpredictable, open-ended content that may occasionally be wrong, slow, or confusing. Users also have different expectations for these tools: they might want to ask follow-up questions, refine the AI's responses, or see disclaimers when the AI isn't sure.
That's why well-defined UX patterns are essential. They help you, the developer, handle these new types of interaction in a user-friendly way, keeping users engaged and satisfied while promoting transparency.
Let's look at a few illustrative UX patterns in detail.
Contextual Guidance Pattern
This may seem obvious, but many new AI tools launch without proper user guidance. Users often don't understand how to interact with these tools or know their capabilities and limitations. Don't assume users will immediately know how to use your tool. Lower their learning curve by providing prompt examples, contextual tips, and quick feature overviews. Show these aids at the right moment in users' journey when they need them. For instance, in Notion, pressing the spacebar in an empty page triggers writing suggestions (since users likely want to draft content), while selecting text brings up editing options like "Improve writing" or "Change tone", displayed alongside the original text for easy comparison.
Figure 4: An illustration of contextual guidance
Editable Output Pattern
With GenAI models, there is no single correct answer in many scenarios. Your best output depends on the context, application, and user preferences. Recognizing this, you should consider letting users modify or rewrite generated content. This creates a better perception of human-AI collaboration. Your tool will no longer be a black box, giving users control over their final outputs. Sometimes, this is an obvious feature (like GitHub Copilot letting users edit suggested code directly in their IDE). In other cases, it's a deliberate design choice (such as ChatGPT's canvas).
Figure 5: An illustration of the editable output pattern
Iterative Exploration Pattern
Never assume the first output will satisfy users. Include "regenerate" or "try again" buttons so users can quickly iterate. For image generation, show multiple options at once. In chatbots, allow users to refine or follow up on responses. This feedback loop helps users discover the best output without feeling stuck. Microsoft research shows that when users try many prompts, newer attempts sometimes perform worse than earlier ones – so letting them revert to previous outputs (or combine parts from different generations) significantly improves their experience.
Figure 6: An illustration of an AI video editor tool allowing iterative exploration
AI-Ops Patterns
When you start putting your AI software into production, you'll face new operational challenges that traditional software doesn't have. You'll still need versioning, monitoring, and rollbacks, but now your core "logic" lives in prompts, model configurations, and generative pipelines. Additionally, GenAI outputs can be unpredictable, requiring new testing and evaluation methods.
Think of AI-Ops as DevOps specifically for modern AI systems. You're not just deploying code; you're shipping AI "knowledge" embedded in prompt-model-config combinations that might change weekly. You must manage performance and cost, track user interactions, identify regressions, and maintain reliable, available systems.
You can adopt many familiar operational tactics from traditional software, plus an entirely new set of AI-specific methods you've never needed before. Let's look at a couple of AI-specific patterns in detail (though this is just a tiny sample of the complete playbook) to understand the nuances of AI-Ops.
Metrics-Driven AI-Ops Pattern
When your change goes to production, track everything: latency, token usage, user acceptance rate, and cost per call. Define the success metrics that matter most for your business. It could be a daily acceptance score from user feedback or a "hallucination rate" measured by an LLM-judge pipeline. Set up alerts if those metrics dip. This data-driven approach lets you quickly detect when a new model or prompt version hurts quality; you can then roll back or run an A/B test to confirm. Think of metrics as your safety net in an unpredictable environment.
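As a sketch, you can wrap every model call to capture latency, token usage, and estimated cost (the pricing constants and the emit sink are placeholders for your own monitoring stack):

```python
# Per-call metrics capture: wrap each model call and record latency, token
# usage, and estimated cost. Pricing figures here are illustrative only.
import time
from openai import OpenAI

client = OpenAI()
COST_PER_1K_INPUT, COST_PER_1K_OUTPUT = 0.00015, 0.0006  # illustrative rates

def emit(metric: str, value: float) -> None:
    print(f"metric {metric}={value}")  # replace with StatsD/CloudWatch/etc.

def tracked_completion(prompt: str, model: str = "gpt-4o-mini") -> str:
    start = time.perf_counter()
    response = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    usage = response.usage
    emit("latency_ms", (time.perf_counter() - start) * 1000)
    emit("tokens_in", usage.prompt_tokens)
    emit("tokens_out", usage.completion_tokens)
    emit("cost_usd", usage.prompt_tokens / 1000 * COST_PER_1K_INPUT
                   + usage.completion_tokens / 1000 * COST_PER_1K_OUTPUT)
    return response.choices[0].message.content
```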
Prompt-Model-Config Versioning Pattern
Your AI system can fail if there are uncontrolled prompt changes, configuration tweaks, or ad-hoc model swaps. If you treat each (prompt, model, configuration) combination as a "release", you can manage it like any other software build: tag it with a version and validate it with QA tests against a golden dataset to ensure no regressions. Automated pipelines can run these test queries whenever you update a prompt, modify config settings, or switch from one model to another. If the outputs degrade according to your metrics, you revert. This discipline prevents "stealth changes" that break your UX.
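A minimal sketch of what such a release and its regression gate might look like (the schema, golden examples, and pass threshold are illustrative; run_model stands in for your inference call):

```python
# Treat each (prompt, model, config) combination as a versioned release and
# gate deployments on a regression check against a golden dataset.
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Release:
    version: str           # e.g., "sentiment-v1.3.0"
    prompt_template: str
    model: str
    temperature: float

GOLDEN_SET = [
    # (input, expected substring in output), curated by your team
    ("I loved it!", "Positive"),
    ("Terrible experience.", "Negative"),
]

def passes_regression(release: Release,
                      run_model: Callable[[Release, str], str],
                      threshold: float = 0.95) -> bool:
    hits = 0
    for text, expected in GOLDEN_SET:
        output = run_model(release, text)
        hits += int(expected.lower() in output.lower())
    return hits / len(GOLDEN_SET) >= threshold  # deploy only if it passes
```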
In addition to these AI-specific practices, you should continue standard operational best practices from traditional software development.
Here is how your typical AI-Ops workflow would look:
Figure 7: An AI-Ops workflow illustration for managing, testing, and deploying AI System changes
Optimization Patterns
As your AI application grows, you'll face operational bottlenecks. API rate limits, increasing latency, and rapidly rising inference costs can quickly become significant challenges. That impressive prototype your leadership loved suddenly becomes unsustainable in production.
These problems are common but manageable if you follow some best practices. For example, don't automatically choose the biggest, most powerful model; ask yourself whether your task can be handled faster, cheaper, or with reused outputs. Similarly, make smart optimization choices for your system, whether that means redirecting traffic away from unnecessarily powerful models, caching predictable responses, batching queries in near real time, or developing smaller specialized models.
Let us dive into three powerful optimization patterns that you can directly implement in your AI workflows:
Prompt Caching Pattern
The fastest, cheapest LLM call is the one you don't make. Consider caching and reusing responses if your system frequently uses identical or similar prompts. This works exceptionally well for documentation assistants, customer support bots, or internal chat tools where user questions often repeat.
Even more effective is prefix caching, where you cache the expensive part of the prompt (e.g., system instructions or few-shot examples). Amazon Bedrock (and many other providers) supports this feature natively and reports up to 85% latency reduction on large prompts.
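At the application level, even a simple exact-match cache can eliminate repeated calls. A minimal sketch (swap the in-memory dict for Redis or similar in production; provider-side prefix caching happens below this layer):

```python
# Application-level prompt cache: hash the normalized prompt and reuse the
# stored response on a hit. Only covers exact (normalized) repeats.
import hashlib
from typing import Callable

_cache: dict[str, str] = {}  # swap for Redis or similar in production

def cached_call(prompt: str, call_model: Callable[[str], str]) -> str:
    key = hashlib.sha256(prompt.strip().lower().encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(prompt)  # only pay for genuine misses
    return _cache[key]
```

For "similar but not identical" prompts, teams often extend this with embedding-based semantic caching, at the cost of occasional false hits.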
Continuous Dynamic Batching Pattern
If you manage a high-volume AI system, maximizing GPU utilization and system throughput is critical for minimizing costs and scaling efficiently. If you process each query sequentially, you will underutilize your computing resources, pay more in fees, and perhaps hit API limits sooner.
Instead of processing each request as soon as it arrives, consider waiting briefly, perhaps tens to hundreds of milliseconds, depending on your application's latency tolerance, to batch your incoming requests dynamically. You can then process these batches through your inference servers and LLMs. This approach can help increase your system's throughput and ensure your GPUs operate at near-optimal utilization.
While you could implement custom queuing and batching logic in bespoke systems, production-ready tools such as vLLM, NVIDIA Triton Inference Server, and Amazon Bedrock offer robust, out-of-the-box solutions suitable for most use cases.
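For intuition, here is a simplified version of the batching loop such tools implement internally (a sketch only; run_batch stands in for your batched inference call):

```python
# Dynamic batching sketch: collect requests for a short window (or until the
# batch is full), then run them through the model in one call. The window and
# batch size are the knobs to tune against your latency budget.
import asyncio
from typing import Callable

MAX_BATCH = 16   # largest batch to send to the model at once
WINDOW_MS = 50   # how long to wait for more requests to arrive

async def batch_worker(queue: asyncio.Queue,
                       run_batch: Callable[[list[str]], list[str]]):
    loop = asyncio.get_running_loop()
    while True:
        prompt, future = await queue.get()        # block until first request
        batch = [(prompt, future)]
        deadline = loop.time() + WINDOW_MS / 1000
        while len(batch) < MAX_BATCH and (timeout := deadline - loop.time()) > 0:
            try:
                batch.append(await asyncio.wait_for(queue.get(), timeout))
            except asyncio.TimeoutError:
                break                              # window closed; process now
        results = run_batch([p for p, _ in batch]) # one batched inference call
        for (_, fut), result in zip(batch, results):
            fut.set_result(result)                 # resolve each waiting caller
```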
Intelligent Model Routing Pattern
Rather than indiscriminately sending every request to your largest, most expensive model, implement intelligent model routing. The idea is simple but powerful: introduce a lightweight, preliminary model at the entry point, similar to a reverse proxy or API gateway in traditional microservices. Like a reverse proxy, this model can help with load balancing between models, caching frequent responses, and gracefully handling fallbacks. Like an API gateway, it intelligently routes queries to the appropriate downstream models based on the complexity or context of each request.
For common or repetitive queries, the routing model can directly pull from caches or prefetches, altogether avoiding model inference. For queries requiring moderate reasoning or domain-specific knowledge, route to specialized, cost-effective models. You should only route the most complex or ambiguous queries to your largest, general-purpose models.
The Intelligent Model Routing Pattern is particularly useful if you are building general-purpose systems handling diverse queries. It balances cost-efficiency and model accuracy, ensuring each query uses precisely the computational resources it requires.
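A minimal routing sketch (the complexity heuristic and model tiers are illustrative; in practice the router is often itself a small, fast model):

```python
# Intelligent routing sketch: a cheap classifier decides which downstream
# model handles each query, so simple queries never hit the expensive tier.
def classify_complexity(query: str) -> str:
    # Crude heuristic stand-in for a lightweight classifier model.
    if len(query.split()) < 12 and "?" in query:
        return "simple"
    if any(word in query.lower() for word in ("why", "design", "analyze")):
        return "complex"
    return "moderate"

MODEL_TIERS = {
    "simple": "gpt-4o-mini",   # cheap tier, often backed by a cache
    "moderate": "gpt-4o",      # mid tier or a domain-specific model
    "complex": "o1",           # largest reasoning tier
}

def route(query: str) -> str:
    return MODEL_TIERS[classify_complexity(query)]

print(route("What are your opening hours?"))          # -> cheap tier
print(route("Analyze the trade-offs in this design.")) # -> reasoning tier
```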
Advanced Patterns
This article explored foundational patterns that can help you incorporate best practices into different stages of AI software development. There are several advanced areas we intentionally didn't cover; however, I want to briefly mention a few key topics with many emerging patterns and best practices, as they're becoming critically important in modern AI systems.
These advanced concepts are beyond our current scope, but recognizing their importance is key to keeping up with evolving trends in modern AI systems. Watch out for the ever-growing collection of innovative AI patterns, and keep adding them to your arsenal. They can help you unlock even more powerful and specialized applications!
AI Agents: From Co-pilot To Autopilot
AI is moving from "co-pilot" to "autopilot". The development of generative artificial intelligence is increasingly focused on "agentic AI": the use of AI agents that perform tasks autonomously, either within fixed parameters or to achieve goals set by the user.
Bring in the agents
AI agents are not new, but they are becoming ever more sophisticated. In their basic form they are simply tools built to carry out tasks such as answering queries from a script, as chatbots do, or fetching information from the web. These functions are limited, with no follow-up action taken without further input. Such reactive AI systems operate solely on programmed responses.
More complex AI agents, with autonomy and adaptability, have also been around for a long time. They control home thermostats and automate factory processes.
This type of technology is, however, rapidly developing capabilities beyond fetching and delivering information or performing distinct tasks.
Such adaptable AI can perform increasingly complex actions in pursuit of a goal and without taking a prescribed path. Using advanced machine learning and neural networks, it can understand context, analyse and respond to dynamic situations, learn from experience and use problem-solving and reasoning to make strategic decisions.
Predictive capabilities based on historical statistical analysis add another layer, enabling AI agents to plan, automate and execute tasks as well as to make informed decisions with specific goals in mind. They carry out their tasks after being given natural language prompts and without constant user input. They can also be designed to check each other's work in an iterative process that improves quality and reliability.
Several developments have enabled AI agents to become more complex while at the same time being easier to use. Generative AI has provided a natural language interface, broadening access to AI, especially for users who are less tech-savvy. Generative AI interprets a prompt from the user, then other AI fulfils the task. Google says: "Generative AI is just one piece of the AI puzzle. Other AI technologies, like predictive AI, vision AI, and conversational AI, are crucial for building sophisticated AI agents."
Advances in computing power and memory have enabled large language models and more sophisticated machine learning. The understanding of context and the ability to plan has improved as AI systems learn more data and improve their capacity to remember interactions.
These are the foundations for AI agents, with the ease of interaction accelerating development as more users gain access. At the same time AI itself is speeding up the innovation cycle, refining its outputs and creating iterative processes at ever higher speeds.
Hype or reality?
AI agents can speed up analysis and decisions, as well as taking over certain functions from employees, but they still fall short of full autonomy.
Cassie Kozyrkov, the founder and chief executive of Decision Intelligence and formerly chief decision scientist at Google, says AI agents' main role in an enterprise still lies in taking over repetitive tasks with "well understood and well designed processes" that do not require "creative spin".
While there is huge potential for agentic AI to perform ever more complex tasks, Pascal Bornet, an expert in automation and author of Agentic Artificial Intelligence, points to a "significant gap" between hype and reality. Even with a clear directive, systems cannot yet perform complex tasks end to end, especially in nuanced or novel situations, without some human oversight.
That said, the field "is advancing rapidly". Bornet likens development to the progression from fully manual to fully autonomous cars, which is rated from level zero to level five. Currently, autonomous cars operate at levels two to four, depending on the environment. Automation can handle many tasks but human oversight, and occasional intervention, is needed.
AI agents are at a similar stage. Most operate at levels two or three, with some "specialised systems" reaching level four in tightly defined domains. Level five, where agents fully understand, plan and execute complex missions with minimal human input across any domain or corporate boundary, remains theoretical.
Given the challenges involved in folding capabilities into a coherent system, fully integrated multimodal agents are some way off but Bornet says the building blocks are in place. He says some applications, such as that developed for veterinarians by Pets at Home, the UK FT250 company, exemplify audio processing but multimodal systems will require a sophisticated orchestration of agents with different types of expertise.
While some sectors have adopted agentic AI more than others, as covered below, it can be put to work in functions that are common to most businesses. Bornet says the opportunity is systemic. "Agentic AI isn't coming for a [single] department, it's coming for all of them. Every workflow with friction is a use case waiting to be transformed."
Currently agents are used mostly in internal roles to gain efficiency and savings rather than top-line growth. A 2025 report from UK Finance co-authored with Accenture said: "Most near-term uses involve single-agent deployments targeting productivity and efficiency gains and improvements to customer and colleague experience". The trade body found "relatively few" examples within financial services aimed at increasing sales or revenue. It also noted that most deployments were "closely monitored by an employee acting as a competent supervisor".
Across industry, AI that reduces the time spent on mundane work, "freeing up" employees for more creative or skilled tasks, has been adopted fastest.
Bornet and his team have gathered data from 167 companies in various sectors that have deployed what he classifies as level three LLM-based agents in production environments. Customer service, internal operations, and sales and marketing functions have seen the highest adoption, with benefits ranging from time savings of 12 to 30 per cent in customer service, 30 to 90 per cent in internal operations and increased revenue of nine to 21 per cent for sales and marketing teams.
It should be noted that the use of AI agents alongside humans does not always enhance performance. An analysis of a customer service software company by the US National Bureau of Economic Research found that AI both improved issue resolution and cut the time taken. However it was newer staff who benefited most, with the AI electronically transferring the knowledge of experienced people. The performance of older hands did not improve.
The reverse can be true in roles that are highly skilled. Attila Kecsmar, the co-founder and chief executive of Antavo, the AI loyalty cloud programme platform, says that in more technical areas, such as programming, those who use AI without an adequate understanding of the output will struggle, while the productivity and speed of competent workers will be supercharged.
Customer service
This has been the most visible deployment of AI from a consumer perspective, but feedback has been mixed. Industry proponents highlight how well chatbots perform, but customer surveys suggest the opposite. Preferences could change as customer service agents develop and digital natives make up more of the consumer base. Better responses and 24/7 support may improve customer perceptions.
Older agents answered queries based on set scripts that quickly ran out of road, especially with complex queries. Newer agents, given their ability to remember and respond to dynamic inputs, are more responsive. They are able to act on up-to-date client data as well as recall historical interactions with customers.
With agentic AI, customer service interfaces have developed beyond dial-up chatbots. Google Gemini is behind Volkswagen's MyVW app, a virtual assistant that answers a driver's queries about their car.
Coding
The application of AI in coding is well documented. In a report by the McKinsey consultancy, Lenovo said that its engineers' speed and quality of code production improved by 10 per cent.
Kecsmar agrees that agent-supported engineers can achieve much more but says this in turn will lead to rising expectations for human productivity and performance. Given natural language interfaces, it is increasingly feasible for laypeople to write code.
This is the real revolution in agentic AI, Kozyrkov says. "Before, you had to go and get yourself schooled in the arcane arts of some new language and now you don't — you speak your mother tongue and it works."
While this presents an opportunity, she cautions that it is also one of the greatest risks in deploying AI in an enterprise. "Unfortunately the mother tongue is vague and not everybody knows when they're being ambiguous. Now you can program a machine without thinking it through, so it's hardly a surprise that you get unintended consequences."
Marketing and campaign management
As covered in our report on personalisation and marketing, AI has hugely expanded the reach of marketing departments, enabling mass communications to be targeted at ever smaller segments.
AI agents can take this further. Antavo has developed an AI agent for its brand customers which helps them to devise and communicate loyalty programmes and campaigns. It can decide an appropriate approach for a brand in any sector and analyse data and give ideas, illustrated with charts, on how to optimise and develop a programme. It can also look inwards, finding and delivering relevant information to help customer service employees resolve consumers' queries.
Human resources
AI agents can be used in hiring, scheduling meetings, retention and management, predicting turnover and identifying where training may be required.
Virtual assistants
These are capable of executing simple tasks with minimal supervision, such as scheduling meetings with clients, sending standard emails and handling general client communications. Claude, Anthropic's AI model, can find information from many sources on a computer so that it can complete a form.
Finance
Applications include AI systems that can make trading decisions based on real-time data analysis or suggest investment strategies based on a client's profile. AI can also help identify fraud, flagging its suspicions in real time.
Healthcare
Autonomous diagnostic tools can identify problems using patient histories and images, recommend personalised healthcare treatments, monitor patient health and recommend or remind people about follow-up actions. AI agents can be deployed in robotic-assisted surgery to improve control and accuracy. Pattern recognition, deep learning and computer vision all enhance machines' ability to adjust surgical incisions in real time. Systems such as Philips' IntelliVue Guardian manage post-surgical complications by providing early warnings for the patients most at risk.
Law
In addition to simple and repetitive tasks such as contract drafting, agents can advise on cases. Based on analysis of historical data or judges' rulings, they can predict potential outcomes of a suit and suggest arguments.
Already A&O Shearman, the international law firm, is using an AI tool created in collaboration with Harvey, a start-up. This makes use of a business's financial information to assess in which jurisdictions a client needs to file in the event of a merger. It then identifies any missing data and drafts the information requests for each party.
Manufacturing and logistics
While autonomous cars have yet to reach the mainstream, autonomous lorries are about to arrive. Aurora Innovation, which works with Volvo, Uber and FedEx in the US, plans to use 10 driverless lorries between Dallas and Houston. AI agents are also used in manufacturing for monitoring and maintaining equipment and optimising processes. They can perform quality control on both inputs and outputs with greater consistency than humans.
Retail
Besides the chatbots deployed in customer service, AI agents can be used along the supply chain to monitor and manage inventory levels based on historical data and to predict trends and demand.
Drawbacks
There are various issues that enterprises need to consider when adopting AI.
Companies operating with legacy tech, or which have inadequate or inconsistent data, will find it harder to make progress. Any data quality issues experienced when training agents will be exacerbated by "slop", the colloquial name for the proliferation of LLM-created content.
EY says this could be solved in part by agents sourcing information from several inputs rather than relying on static scraped data. For instance, iterative AI could gather data from wearables, which would layer current and contextual data on top of historical information.
Connection within and between companies is hampered by data incompatibilities as well as the inadequacies of existing application programming interfaces. Bornet says the lack of a standard protocol presents a hurdle to multi-agent systems that might otherwise cross corporate boundaries.
Kecsmar believes this problem may itself be solved by agents. "In future the agents developed around data exchange skills will be able to create their own data exchange. They will be uploaded with how their host company communicates data and they will have a tool call to interface data between different sources."
Trust is a problem in several areas, for instance in sectors where the options for reversal are limited. "'Fully automate and leave it' in the financial services industry is a terrible idea," Kozyrkov says, adding that "the golden rule of AI is that it makes mistakes". Consumers might be unwilling to let agents have autonomy over their bank accounts or credit cards. There is also a lack of trust among leaders in terms of AI performance and with workers who face the risk of replacement. Once systems can link up across business boundaries, will companies trust external agents?
Use of untrammelled AI also adds to cyber security threats by increasing points of access and the risk of unexpected actions. Kozyrkov says: "One of the top suggestions is: limit its access. Don't give it any data that you wouldn't want leaked." Granting AI the same access as a human employee dramatically increases the attack surface, meaning systems are more vulnerable.
Constraint on computing capacity is a further hurdle. Despite the investment in infrastructure the competition for stretched resources is fierce. Still, no user pays what it costs to run an AI query even in energy terms, a point raised at an FT Climate Capital Council round table last year. For companies using commercial services, current pricing is based on the number of employees — but what will happen if staff levels shrink due to AI adoption?
Companies also need to consider the ethical implications of AI adoption. Research at Cambridge University notes that, if they cannot already, agents may soon be able to predict our habits and spending patterns and influence or manipulate them, although this is likely to be of greater concern to consumers.
Accountability is another imponderable. With whom does this lie when agents are carrying out end to end tasks without human intervention, or with connections to other companies?
How to adopt AI agents
As with any new technology, it is important to identify business needs first. Bornet says the most sophisticated option is not necessarily always the best.