Introduction to AI agents

This is the first couple pages of a report I worked on about AI agents.

What are “AI Agents”?

In this section, we try to define what an agentic system is. Automation software(like email to calendar booking tools) are generally oversold as agentic systems and it’s important to ensure we all have the same understanding of the lingo used in this space.

Peter Norvig in his popular book defines an agent as anything that can perceive its environment and act upon that environment. A thermostat is an agent. A motion sensor light or smoke detector is also an agent. However, these are all dumb agents. We are more interested in building intelligent agents with LLM/AI as its core controller.

Interestingly, the definition of the term “AI agents” is hotly contested. Most of the common ones involve some version of putting LLMs + tools in a loop. Rather than looking at LLM systems in a binary way, it’s more useful to think of them to be agent-like to a different degree. The degree of control given to the LLM in guiding an application’s flow allows for varying levels of autonomy, which we shall call agentic. This helps us move away from the binary classification of systems and look at them as part of a spectrum. Ultimately, you want to give it a task and the AI is agentic enough to go and accomplish it.

What are the different types of AI agents?

Since we have established that all LLM+tool powered systems are agentic on some level, we need to first classify the different types of agents. Borrowing the definition from [Anthropic](https://www.anthropic.com/research/building-effective-agents, we have two types of agentic systems.

Workflow agents are systems which are built by chaining together LLMs and tools which then are orchestrated through pre-defined code paths

Types of workflow agents:

Autonomous agents are systems where the LLM dynamically directs the execution path and tool usage and maintain full ownership on how they accomplish the task. They have no predefined paths to take. They start their flow from a simple command or conversation with the user, plan the task and use tools to accomplish the task. They might choose to ask humans for feedback or when encountering blockers and automatically terminate themselves when the task is completed.

They are particularly useful for open-ended tasks where it’s hard to predict beforehand the number of steps required and where you can’t really hard code a fixed path. Some real world applications include CX agents which, based on the user conversation, automatically decide to access customer data and perform actions like issuing refunds and updating ticket statuses. In software development, coding autonomous agents are excellent because the solutions are verifiable, agents can iterate based on feedback from automated tests and the problem space is well defined

Workflow agents in general are more predictive, consistent, cheap and fast whereas the autonomous agents generally excel in complex tasks where you need flexibility at scale.

Architecture of autonomous agents

Before we jump into the tooling and infrastructure world for AI agents, it’s imperative to understand what exactly they are made of. While the LLM functions as the core brain, its complemented with some key components:

Why did they blow up in 2024?

The LLM functions as the agent’s brain and agents typically require much powerful and bigger models compared to non-agentic use cases. This is because mistakes in multi step tasks get compounded. If an LLM is accurate, say 90% on a single step, this accuracy drops to 35% over 10 steps(0.9^10) and 12% over 20 steps. Smaller LLMs (and LLMs not trained on agentic flows) are simply unfeasible for these use cases.

Lot of core frameworks for agent planning were invented in 2022 and 2023. CoT or Tree of thoughts for example are standard prompting techniques for enhancing model performance on complex tasks. This forces the model to spend more test time compute in thinking step by step and break down big tasks into multiple smaller subtasks. Even self reflection frameworks (like ReACT) were invented in 2023 which help the LLM improve iteratively by refining past actions and correcting previous mistakes.

An action sequence of a simple multi step agent

Lets consider a scenario where you ask your AI data analytics agent this question - “Analyze customer churn rates for our app subscribers”. This might make it go and perform the following:

When and where are AI agents used?

The value add for agents is clear. Copilot was 40% of github revenue last year. Klarna AI agent handles 65% of the CX queries end-to-end. In this section, we look at ideal use cases and tasks for deploying AI agents.

While we have briefly looked at which agent architecture works for which kind of flows, we should also define what makes a task agentic vs non-agentic.

A market map from Felicis showing some early winners in the AI agents space

·