Publication date 18/06/2026
Visual representing agentic AI

Generative AI has very quickly gone from being a technology relevant only in academic circles to a tool used every day by millions of people and organizations, both private and public. Over the last year, however, the concept that has come to dominate the conversation about technological innovation is agentic AI.

Generative AI and agentic AI are often presented as rival technologies, and in particular as if agentic AI had been born to replace generative AI, when in reality the two concepts are complementary. The confusion is understandable, since in practice many software systems incorporate generative models, and some generative AI-based assistants can use tools (even if only in a limited way, for example to retrieve information from an API or database). We can simplify the distinction by saying that a generative AI model can be integrated into an agentic AI system, but not every application based on a generative model can be considered an AI agent.

What is generative AI?

Generative AI can be simply defined as the family of systems that produce new content from structures learned from large volumes of data. The U.S. government's National Institute of Standards and Technology (NIST) describes it as the class of AI models that emulate the structure and characteristics of input data to generate derived synthetic content, including text, image, audio, and video.

The Organisation for Economic Co-operation and Development (OECD) explains it in perhaps a more accessible way, defining generative AI as the branch of artificial intelligence capable of creating new content (text, images, video, music, code, etc.) from patterns learned in large volumes of data. This technology is already being used in tasks as diverse as programming, information searching, education or medical diagnosis.

Its main characteristic is therefore generation. A request is made to a generative AI system, the system interprets the context, and it returns a probable output that also tries to be consistent with the request. Even when connected to a specialized document base to improve the context of the request, for example through retrieval-augmented generation (RAG) approaches, its primary mission remains to produce a response or a useful piece of content for the requester. For this reason, the usefulness of a generative AI model is measured, above all, through indicators such as the quality, relevance and coherence of the response, or the time saved in performing the task.

From a functional point of view, the main generative AI models have already achieved performance equal or superior to that of humans in specific tasks of language comprehension, generation and manipulation, although these results should always be interpreted within the limits of each benchmark.

However, these results do not imply a general understanding equivalent to that of humans; they should be interpreted as strong performance in specific tasks. As a result, every use of generative AI still requires rigorous validation and proper monitoring before we can be confident in its results.

The risks of generative AI are also well known: hallucination, or the generation of erroneous content presented with confidence; problems of information integrity; biases; potentially dangerous content; risks to people's privacy; problems of alignment with expected behavior; and conflicts over intellectual property rights. In addition, as NIST notes in its framework for generative AI risk management, generative AI not only amplifies risks that already exist in other AI systems, but also introduces risks of its own, or risks heightened by the ability to produce content on a large scale.

On the other hand, when the output of the model begins to decide the workflow, select tools or modify the course of execution, we enter a different level of autonomy, which corresponds to agentic AI.

What do we mean by agentic AI?

Because the concept is so new, there is currently much less consensus on the definition of agentic AI than on that of generative AI. For example, the Government of the United Kingdom (gov.uk) defines it as the set of systems composed of software agents that can behave and interact autonomously to achieve a goal. In the same vein, the Singapore government's technical portal explains that these are systems in which AI agents pursue objectives and execute tasks autonomously, especially in domains where they must make decisions and act within a delimited framework.

The European Commission does not yet define agentic AI as a separate regulatory category, so it can be placed within the evolution of advanced AI systems. In the Apply AI Strategy, it refers to AI agents as systems capable of making decisions and executing actions independently, understanding language, reasoning about tasks, acting autonomously to achieve predefined objectives and relating to their environment, including the orchestration of interactions with people. This approach fits with the broad definition in the European Artificial Intelligence Regulation, which understands artificial intelligence systems as "machine-based systems designed to operate with varying levels of autonomy [...] and generate outputs, such as predictions, content, recommendations, or decisions, that can influence physical or virtual environments".

We can therefore summarize the central idea as follows: the objective is not only to "speak" or "write" well, but to act with operational judgment within a set of previously authorized capabilities. Instead of returning a single response to an instruction, an agent can receive a high-level objective, break it down into subtasks, choose which tools or functions it needs, query APIs, retrieve data, evaluate intermediate results, and continue iterating until the task is complete. Gov.uk explains it quite clearly: in agentic systems, the execution path is not completely fixed in advance, but is decided intelligently during the process, using generative AI to plan and to use the necessary tools through open standards such as the Model Context Protocol (MCP).
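The loop described above (receive an objective, choose a tool, evaluate the intermediate result, iterate until done) can be sketched in a few lines. This is only an illustration under stated assumptions: the tools `search_flights` and `pick_cheapest`, the flight data, and the rule-based `choose_next_step` function are all hypothetical, and the latter stands in for the planning that a generative model would perform in a real agent.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical tools the agent is authorized to use (stand-ins for real APIs).
def search_flights(destination: str) -> list[dict]:
    return [
        {"flight": "XY123", "to": destination, "price": 120},
        {"flight": "XY456", "to": destination, "price": 95},
    ]

def pick_cheapest(options: list[dict]) -> dict:
    return min(options, key=lambda option: option["price"])

TOOLS: dict[str, Callable] = {
    "search_flights": search_flights,
    "pick_cheapest": pick_cheapest,
}

def choose_next_step(goal: str, state: dict):
    """Stand-in for model planning: pick the next tool from the current state.

    Returns (tool_name, argument, state_key), or None when the task is done.
    """
    if "options" not in state:
        return ("search_flights", goal, "options")
    if "choice" not in state:
        return ("pick_cheapest", state["options"], "choice")
    return None

def run_agent(goal: str, max_steps: int = 5) -> dict:
    state: dict = {}
    trace: list[str] = []            # record of the tools that were called
    for _ in range(max_steps):       # hard limit so the loop always stops
        step = choose_next_step(goal, state)
        if step is None:             # the planner considers the task complete
            break
        tool_name, argument, state_key = step
        state[state_key] = TOOLS[tool_name](argument)
        trace.append(tool_name)
    state["trace"] = trace
    return state

result = run_agent("Lisbon")
print(result["choice"], result["trace"])
```

The execution path is not written down anywhere as a fixed sequence: it emerges from the decisions taken at each iteration, which is the point the gov.uk description makes.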

How system architectures change

The typical architecture of a generative AI system follows a relatively linear flow: a person makes a request through instructions (a prompt), the AI model interprets the context and produces a response. Only when greater accuracy is needed in specific domains are techniques such as RAG added to retrieve more precise information in real time from document bases or vector databases.
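As a rough illustration of this linear flow, the sketch below chains an optional retrieval step, prompt construction and a single model call. Everything in it is an assumption made for the example: `retrieve` uses naive word overlap instead of a vector database, and `fake_model` is a placeholder for a call to a real generative model.

```python
from __future__ import annotations

def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Toy retrieval: rank documents by the number of words shared with the question."""
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question: str, context: list[str]) -> str:
    """Put the retrieved passages (if any) in front of the question."""
    if not context:
        return f"Question: {question}"
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"

def fake_model(prompt: str) -> str:
    """Placeholder for a generative model call (hypothetical)."""
    return f"[answer generated from a prompt of {len(prompt)} characters]"

def answer(question: str, documents: list[str] | None = None) -> str:
    # Linear flow: (optional retrieval) -> prompt -> one model call -> response.
    context = retrieve(question, documents) if documents else []
    return fake_model(build_prompt(question, context))

docs = [
    "Open data portals publish datasets under open licenses.",
    "Agents call tools to complete tasks.",
]
print(answer("Which licenses do open data portals use?", docs))
```

Note that nothing in this flow decides what to do next: each step always follows the previous one, with or without retrieval.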

In agentic AI, however, the pattern is much more complex, since the system combines the base model with several modules that provide mechanisms for planning, maintaining memory, calling tools, recording decisions and coordinating actions between components or even with other agents. In addition, it is necessary to monitor the operation and, in many cases, include a layer of human supervision. The gov.uk documentation even distinguishes between orchestration (a central controller that commands steps) and choreography (more distributed, event-oriented behaviors), a distinction especially useful for understanding why agents are more flexible, but also more difficult to assess and audit.
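The difference between orchestration and choreography can be made concrete with a small sketch. It is an illustration only: the three steps, the event names and the `EventBus` class are hypothetical and are not taken from the gov.uk documentation.

```python
from __future__ import annotations

from collections import defaultdict, deque

# --- Orchestration: a central controller commands each step in order. ---
def orchestrate() -> list[str]:
    log: list[str] = []
    for step in ("fetch", "summarize", "publish"):
        log.append(step)             # the controller knows the whole sequence
    return log

# --- Choreography: components react to events; no one holds the full plan. ---
class EventBus:
    def __init__(self) -> None:
        self.handlers = defaultdict(list)
        self.log: list[str] = []

    def subscribe(self, event: str, handler) -> None:
        self.handlers[event].append(handler)

    def run(self, first_event: str) -> list[str]:
        queue = deque([first_event])
        while queue:
            event = queue.popleft()
            for handler in self.handlers[event]:
                next_event = handler(self.log)
                if next_event is not None:   # a handler may emit a new event
                    queue.append(next_event)
        return self.log

def on_request(log: list[str]) -> str:
    log.append("fetch")
    return "data_fetched"

def on_data(log: list[str]) -> str:
    log.append("summarize")
    return "summary_ready"

def on_summary(log: list[str]) -> None:
    log.append("publish")
    return None

def choreograph() -> list[str]:
    bus = EventBus()
    bus.subscribe("request_received", on_request)
    bus.subscribe("data_fetched", on_data)
    bus.subscribe("summary_ready", on_summary)
    return bus.run("request_received")

print(orchestrate())
print(choreograph())
```

Both versions produce the same sequence here, but in the second one the order is only visible by following the events, which hints at why distributed designs are harder to assess and audit.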

The development of agentic AI also requires data and services that machines can consume with guarantees. This is where the availability of rich metadata, catalogs, APIs, version control, quality information, licenses or service level agreements, among others, can make a difference in the robust and reliable deployment of this technology.
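One way to picture what "ready to be consumed by machines" means is a simple check that an agent could run before using a dataset. This is a minimal sketch: the field names in `REQUIRED_FIELDS` and the sample record are hypothetical and do not follow any particular catalog standard.

```python
from __future__ import annotations

# Hypothetical metadata fields an agent might require before using a dataset.
REQUIRED_FIELDS = ("title", "license", "version", "api_url", "quality_notes")

def missing_metadata(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]

record = {
    "title": "Air quality measurements",
    "license": "CC-BY-4.0",
    "version": "",                        # present but empty
    "api_url": "https://example.org/api", # placeholder URL
}

gaps = missing_metadata(record)
if gaps:
    print("Dataset not ready for automated use, missing:", gaps)
```

A person can often work around a missing license or version number by asking someone; an agent acting autonomously cannot, which is why complete metadata matters more in this setting.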

How risk assessment and mitigation are changing

We start from the premise that evaluating generative AI is already a very demanding activity in itself, and that being able to explain the decisions of an artificial intelligence system is essential when we talk about delegating tasks that were previously performed by a human.

If generative AI is evaluated above all by the quality of its responses, agentic AI forces us to evaluate the entire behavior of the system. That is, it is necessary to ask whether the chosen plan was adequate to solve the task, whether the selected tool was the correct one, whether the data query respected the established rules, whether the system knew how to stop when it should, whether it left enough traces to audit what happened, and whether it asked a human when needed, respecting the limits within which its autonomy was designed.
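Some of these questions can be turned into automated checks over the record of what an agent did. The sketch below is a simplified illustration, not an established evaluation method: the `Step` structure, the tool names, the step limit and the four checks are all assumptions chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass

@dataclass
class Step:
    tool: str                     # tool the agent called
    reason: str                   # why it chose this step (audit trail)
    human_approved: bool = False  # whether a person confirmed the step

# Hypothetical rules within which the agent's autonomy was designed.
ALLOWED_TOOLS = {"search_catalog", "download_dataset", "send_email"}
NEEDS_APPROVAL = {"send_email"}
MAX_STEPS = 5

def evaluate_trace(trace: list[Step]) -> dict[str, bool]:
    """Check the behavior recorded in a trace, not just the final answer."""
    return {
        "only_allowed_tools": all(s.tool in ALLOWED_TOOLS for s in trace),
        "stopped_in_time": len(trace) <= MAX_STEPS,
        "auditable": all(bool(s.reason.strip()) for s in trace),
        "asked_human_when_required": all(
            s.human_approved for s in trace if s.tool in NEEDS_APPROVAL
        ),
    }

trace = [
    Step("search_catalog", "find datasets about air quality"),
    Step("download_dataset", "best match from the catalog"),
    Step("send_email", "share the result with the requester"),  # not approved
]
print(evaluate_trace(trace))
```

In this example the final result might look perfectly acceptable, yet the trace shows that an email was sent without the required confirmation, something that evaluating only the response would never reveal.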

The most recent benchmarks for agentic AI go precisely in this direction: GAIA evaluates general assistants in web browsing and tool use, StableToolBench works on resolution rates and stability in the use of APIs, MultiAgentBench introduces indicators of collaboration and competition between multiple agents, and Agent Security Bench focuses on attacks and defenses, showing very relevant vulnerabilities in prompt injection, memory and tool use.

The AI Regulation, in particular, reinforces the importance of human oversight in its Article 14, establishing that the objective of such oversight is to prevent or minimise risks that may arise during the use of AI systems. For agentic AI, this is especially important because the greater the operational autonomy of the system, the more necessary it is to define when it should ask for confirmation, when it should stop, what actions it can perform without authorization and what decisions should always remain under human control. In Spain, the Spanish Agency for the Supervision of Artificial Intelligence (AESIA) is helping to ground the European AI Regulation in more operational guidance through its practical guides for compliance with the AI Regulation (RIA, by its Spanish acronym), prepared by the Secretary of State for Digitalisation and Artificial Intelligence, through the Directorate-General for Artificial Intelligence, and with the collaboration of AESIA itself.
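The questions of when an agent should ask for confirmation and which decisions stay under human control can be expressed as an explicit policy. The following is a minimal sketch under stated assumptions: the three oversight levels, the action names and the `decide` function are hypothetical, and it is not an implementation of Article 14 or of any official guidance.

```python
from __future__ import annotations

from enum import Enum
from typing import Callable

class Oversight(Enum):
    AUTONOMOUS = "autonomous"   # the agent may act without authorization
    CONFIRM = "confirm"         # the agent must ask a person first
    HUMAN_ONLY = "human_only"   # the decision always stays with a person

# Hypothetical policy mapping actions to oversight levels.
POLICY = {
    "read_dataset": Oversight.AUTONOMOUS,
    "send_notification": Oversight.CONFIRM,
    "delete_record": Oversight.HUMAN_ONLY,
}

def decide(action: str, ask_human: Callable[[str], bool]) -> str:
    """Return what the agent does with an action: execute, stop or escalate."""
    # Actions missing from the policy default to the strictest level.
    level = POLICY.get(action, Oversight.HUMAN_ONLY)
    if level is Oversight.AUTONOMOUS:
        return "executed"
    if level is Oversight.CONFIRM:
        return "executed" if ask_human(action) else "stopped"
    return "escalated"

def always_refuse(action: str) -> bool:
    """Stand-in for a real confirmation prompt shown to a person."""
    return False

for action in ("read_dataset", "send_notification", "delete_record", "unknown"):
    print(action, "->", decide(action, always_refuse))
```

Defaulting unknown actions to the strictest level is a deliberate design choice in this sketch: an agent that encounters something outside its policy hands the decision to a person instead of acting.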

Visual title: Generative AI vs agentic AI

Generative AI
What it is: systems that create new content (text, image, and code) from learned data.
Objective: to generate useful content.
Example: writing an email, summarizing a report, generating the source code of an application, creating a banner or a video, etc.
Evaluation: the response (quality, coherence and relevance of the content created) is evaluated.

Agentic AI
What it is: systems that act to achieve objectives with autonomy.
Objective: to decide and execute tasks.
Example: travel organization assistant that searches for flights, compares prices, books hotels, and adjusts the plan using different tools without continuous user intervention.
Evaluation: the entire behavior (planning, use of tools, traceability, audit, and result) is evaluated.

Source: own elaboration - datos.gob.es.

Figure 1. The difference between generative AI and agentic AI. Source: own elaboration - datos.gob.es

Ultimately, the difference between generative AI and agentic AI is mostly a matter of purpose. The first specializes in producing useful content from instructions; the second in coordinating actions to achieve objectives. In both cases, open datasets and good data governance will continue to be fundamental pillars for them to achieve their objectives. In addition, not all use cases need to make the leap to agentic AI, as AI agents are particularly suitable when the workflow is non-deterministic, i.e. when the next step needs to be decided based on context. As Singapore's own GovTech argues, AI agents can even be overkill where a simpler system already does the job well.

Content prepared by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalization. The content and views expressed in this publication are the sole responsibility of the author.
