Publication date 07/05/2026

Artificial intelligence is rapidly changing the way we make personal and professional decisions, the way companies manage the services they provide and the criteria with which we process information in our day-to-day lives. There are already systems that help prioritise waiting lists or analyse diagnostic tests in healthcare, detect the risk of school drop-out or personalise learning itineraries in education, and flag suspicious banking transactions. They also summarise files, classify documents, recommend actions, generate drafts and even interact with us in natural language in customer service. In many cases this happens without us really being aware of it, since these processes run inside the companies and public administrations we interact with.

Perhaps for this reason, as AI systems become more autonomous and we move from language models that answer our questions to autonomous agents capable of completing tasks on our behalf, doubts arise about the role humans should play in these new processes and systems and, more importantly, about who is responsible for the decisions that are made.

That leap means the conversation no longer revolves only around whether it is appropriate to use AI in a given process, but around how responsibility is shared between "machines" and people. It is no coincidence that the first requirement of the European Commission's Ethics Guidelines for Trustworthy AI is human agency and oversight. In other words, AI must be at the service of people, strengthening their decision-making capacity and always subject to effective oversight mechanisms.

Between the security of total control and the efficiency of autonomy

In this context, approaches such as human-in-the-loop (HITL), human-on-the-loop (HOTL) and human-in-command (HIC) come into play, articulating human intervention in different ways depending on the context and the level of risk of each system. They are not equivalent labels: each describes a different way in which people can take part in the supervision and decision-making associated with artificial intelligence systems.

  • In Human in the Loop (HITL), a human intervenes directly in the system's decision cycle: the AI suggests, but a human agent validates or corrects before the action is executed. Take, for example, a system that analyses applications for social benefits. The AI would extract data from documents, cross-reference it with information from other systems to validate it and prepare a draft resolution, but in the final step an employee would review the proposal, decide and sign. It is the most conservative model and also the most widely used when decisions can have significant consequences for people or their rights.
  • In Human on the Loop (HOTL), the system can act autonomously, but a human agent monitors the process in real time and can intervene if a problem is detected. Unlike the HITL approach, a person is not necessarily involved in each individual decision, but maintains a monitoring role over the operation as a whole. This is the model used, for example, in fraud detection systems or automated content filtering, where AI analyses large volumes of information and continuously generates alerts or executes preliminary actions. It seeks a balance between efficiency and control and is especially well suited to high-volume environments where human intervention in every individual case would not be feasible.
  • In Human in Command (HIC), human intervention sits not at the level of each individual decision or of continuous monitoring, but at a higher level of direction or governance. People define what the AI can be used for and set its limits, quality criteria, acceptable level of risk and the circumstances under which its operation must be reviewed. This would be the case, for example, of an administration that uses AI to prioritise inspections or classify incidents: the system can operate quite autonomously, but those responsible determine its purpose and validate its operating rules, audit the results, manage any incidents and can even suspend the system if unwanted effects are detected. Rather than intervening case by case, the human role here is to ensure that the AI remains aligned with service objectives and the regulatory framework. A minimal sketch after this list illustrates how the three models differ in practice.
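Purely as an illustration, the following Python sketch contrasts the three models as decision flows. Everything in it is hypothetical: the function names (ai_propose, human_review, execute) and the benefit-application scenario are invented stand-ins, not any real system's API.

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    case_id: str
    action: str
    confidence: float  # model's self-reported confidence, 0..1

def ai_propose(case_id: str) -> Proposal:
    # Stand-in for the AI system: in reality this would call a model.
    return Proposal(case_id, "approve_benefit", confidence=0.97)

def human_review(p: Proposal) -> bool:
    # Stand-in for a human decision; here we simply simulate approval.
    print(f"[human] reviewing {p.case_id}: {p.action}")
    return True

def execute(p: Proposal) -> None:
    print(f"[system] executing {p.action} for {p.case_id}")

# HITL: every decision waits for explicit human validation.
def run_hitl(case_id: str) -> None:
    p = ai_propose(case_id)
    if human_review(p):        # nothing happens without a person's sign-off
        execute(p)

# HOTL: the system acts on its own, but flags less certain cases
# to a monitoring queue where a person can intervene or reverse.
def run_hotl(case_id: str, alert_queue: list[Proposal]) -> None:
    p = ai_propose(case_id)
    execute(p)                 # autonomous execution
    if p.confidence < 0.99:
        alert_queue.append(p)  # a human monitors this queue in real time

# HIC: humans do not sit in the decision path at all; they set the
# policy the system must satisfy, and can suspend it entirely.
ALLOWED_ACTIONS = {"approve_benefit", "request_more_info"}  # governance rule
SYSTEM_ENABLED = True                                       # kill switch

def run_hic(case_id: str) -> None:
    if not SYSTEM_ENABLED:
        raise RuntimeError("system suspended by its governing body")
    p = ai_propose(case_id)
    if p.action in ALLOWED_ACTIONS:   # policy defined by people in command
        execute(p)

if __name__ == "__main__":
    queue: list[Proposal] = []
    run_hitl("case-001")
    run_hotl("case-002", queue)
    run_hic("case-003")
    print(f"[monitor] {len(queue)} case(s) pending human review")
```

Note how the human moves outward from one model to the next: from the decision path itself (HITL), to a parallel monitoring channel (HOTL), to the rules and the kill switch that frame every decision (HIC).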

In contrast to these models of governance and supervision, the Human out of the Loop (HOOTL) approach is also often mentioned: one in which the system operates without direct human intervention. This is the highest degree of automation and therefore the scenario that requires the greatest precautions, since the margin for human correction during execution is very small or even non-existent, as in some intelligent infrastructure management systems. For example, an AI system can automatically regulate the air conditioning of buildings based on sensors for temperature, occupancy or energy consumption, without a person validating each decision. This model can be reasonable for very limited tasks with a low risk of error, such as automation processes with no relevant effects on people's rights or interests. Its application is much more problematic, however, when AI influences sensitive decisions. Rather than a generalisable option, the HOOTL model should therefore be understood as a possibility limited to very specific contexts in which robust safeguards also exist.
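Those safeguards typically have to live in the system itself, since no person validates individual actions. Continuing the same hypothetical style, this sketch shows a fully automated climate controller constrained by hard operating limits and a circuit breaker that suspends autonomy when a sensor reading looks implausible; the policy function is a deliberately naive stand-in:

```python
# Hypothetical HOOTL controller: no human validates individual actions,
# so the safeguards must be built into the code itself.
MIN_SETPOINT_C = 18.0   # hard limits the controller may never exceed
MAX_SETPOINT_C = 26.0
PLAUSIBLE_RANGE_C = (-10.0, 50.0)  # readings outside this halt the system

def choose_setpoint(temperature_c: float, occupancy: int) -> float:
    """Naive placeholder policy: cool a busy, warm room; relax when empty."""
    target = 21.0 if occupancy > 0 else 24.0
    if temperature_c > target + 2:
        target -= 1.0
    return target

def control_step(temperature_c: float, occupancy: int) -> float:
    # Circuit breaker: an implausible reading suggests a faulty sensor,
    # so the controller stops rather than acting on bad data.
    low, high = PLAUSIBLE_RANGE_C
    if not (low <= temperature_c <= high):
        raise RuntimeError("anomalous reading: autonomy suspended, human notified")
    # Clamp the decision inside the limits set by those in command.
    setpoint = choose_setpoint(temperature_c, occupancy)
    return max(MIN_SETPOINT_C, min(MAX_SETPOINT_C, setpoint))

print(control_step(temperature_c=27.5, occupancy=12))  # -> 20.0
```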

The following visual summarises these four approaches:

Infographic titled "Models for AI Control and Oversight": a circular diagram with four numbered sections around an AI brain icon, each describing a level of human involvement. Human in the Loop (HITL): humans directly validate or correct every AI decision (e.g. AI detects possible tumours in X-rays, but a doctor confirms the diagnosis). Human on the Loop (HOTL): AI acts autonomously, but humans can intervene if needed (e.g. a banking system blocks suspicious transactions and an analyst can review and reverse the action). Human in Command (HIC): humans define the strategic rules, limits and criteria under which AI operates (e.g. a company uses AI for customer service emails but decides which questions it may answer). Human out of the Loop (HOOTL): AI operates without direct human intervention (e.g. an automated building climate control system adjusts temperature based on sensors).

Figure 1. Four approaches to human intervention in AI. Source: authors' elaboration, datos.gob.es.

Why the choice of human oversight approach matters

The central paradox of human oversight is that as AI systems improve, partly thanks to that very oversight, the pressure to reduce human intervention increases. If a model is 99% accurate, does it still make sense to supervise it, or is it better to simply accept the errors? The question is legitimate from an efficiency standpoint, but it carries a risk that responsible-AI specialists call automation bias: the human tendency to rely excessively on systems that generally work well but can hide hard-to-detect errors or, in the worst case, self-serving manipulations. Moreover, 1% may sound minimal, yet it can conceal a very large absolute number of errors, or a few errors at an unacceptable cost: a system that processes one million cases a year with 99% accuracy still produces some 10,000 mistakes.

Therefore, the choice between supervision models is not purely technical, nor driven only by the search for efficiency: it has ethical and legal implications as well as operational ones. In particular, the European Artificial Intelligence Regulation (AI Act) dedicates Article 14 to human oversight requirements for high-risk AI systems, stating that these must be designed in such a way that they can be "effectively monitored by natural persons during the period they are in use". This makes the deployment of an oversight model a regulatory obligation for many applications.

The Spanish Guide to Human Supervision, prepared within the framework of the AI regulatory sandbox, explains this in detail, developing the five main requirements of Article 14 of the AI Regulation: effective surveillance during use, mechanisms to minimise risks, assignment of responsibilities, transparency and traceability, and additional guarantees for certain systems. The Guide also stresses that "the final responsibility for the actions carried out by an AI system is the responsibility of the people of the provider and user entity responsible for it", so that for "surveillance to be effective, people must have control over the system and be able to manage the risks that may arise from its use". In other words, putting a person "in the loop" is not enough on its own: if that person does not understand the system or lacks the authority to intervene, the supervision will be merely formal, not effective.
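Two of those requirements, traceability and effective control, translate naturally into code. As a minimal sketch (the record format and names are illustrative assumptions, not taken from the Guide), a thin wrapper around any automated decision can record what the system proposed, who reviewed it and what was finally decided, so supervision leaves an auditable trail rather than a rubber stamp:

```python
import json
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []  # in production: append-only, tamper-evident storage

def supervised_decision(case_id: str, ai_output: str,
                        reviewer: str, final_decision: str) -> str:
    """Record the AI proposal alongside the human decision that confirms
    or overrides it, timestamped for later audit."""
    entry = {
        "case_id": case_id,
        "ai_output": ai_output,
        "reviewer": reviewer,
        "final_decision": final_decision,
        "overridden": ai_output != final_decision,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    AUDIT_LOG.append(entry)
    return final_decision

# Hypothetical usage: the reviewer overrides the system's proposal,
# and the disagreement itself becomes part of the audit trail.
supervised_decision("exp-4411", "deny", reviewer="j.perez", final_decision="approve")
print(json.dumps(AUDIT_LOG, indent=2))
```

A high override rate in such a log can signal a badly calibrated model; a rate of exactly zero can signal the opposite problem, a reviewer who merely rubber-stamps.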

Human supervision in the training of AI systems

There is another dimension of human supervision that is relevant to artificial intelligence systems: the role it plays in their training processes. Before a system goes into production, humans have already intervened in multiple decisions that condition its subsequent behaviour:

  • On the one hand, there is human intervention throughout the lifecycle of training data. Human annotation integrates human intelligence directly into the AI development cycle: people decide what data is collected, what is discarded, how it is structured and which variables are considered relevant. They are also involved in cleansing, anonymising, quality control and bias review of datasets. All of this has great influence, because a model does not simply learn from large volumes of information but from the specific way those datasets have been prepared.
  • And, on the other hand, there is human intervention in validating the behaviour of AI models, with techniques such as reinforcement learning from human feedback (RLHF). In this technique, human annotators evaluate pairs of responses generated by the model, indicating which one is better. With these evaluations a reward model is trained, which in turn guides the fine-tuning of the main model (see the sketch after this list). It is, in essence, Human in the Loop oversight of the training process: humans not only validate the final results, but actively shape the values of the system. The diversity of the annotators, in cultural or social terms, therefore has a direct impact on the behaviour of the resulting model.
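To make the mechanics concrete, here is a minimal sketch of the pairwise objective commonly used to train such reward models, a Bradley-Terry-style loss; the scoring function is a toy stand-in for what would be a neural network in practice:

```python
import math

def pairwise_loss(r_chosen: float, r_rejected: float) -> float:
    # The reward model is penalised whenever the response the human
    # annotator preferred does not score higher than the rejected one:
    # loss = -log(sigmoid(r_chosen - r_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

def reward(response: str) -> float:
    # Toy stand-in for a reward model; in practice this is a neural
    # network trained on many thousands of human preference pairs.
    return float(len(response))  # hypothetical: longer scores higher

chosen, rejected = "a careful, sourced answer", "an answer"
loss = pairwise_loss(reward(chosen), reward(rejected))
print(f"loss = {loss:.6f}")  # small loss: this preference is already respected
```

Minimising this loss over many annotated pairs pushes the reward model to reproduce the annotators' judgements, which is precisely why their diversity matters.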

In both processes, human intervention is essential, both to ensure the quality of the training data and to keep the models aligned with the desired values.

Use cases in the public sector

One of the first documented cases in the public sector is that of the Australian Bureau of Statistics (ABS), presented as early as 2020 in the article Human-in-the-Loop AI in Government: A Case Study. The paper explains how a public agency can apply a Human in the Loop approach to automate part of the production of official statistics, using the Household Budget Survey as an example. The aim was not to eliminate human intervention but to save time and resources on labour-intensive manual tasks, so that professionals could concentrate on higher value-added activities. Therein lies the interest of the case: it demonstrates that, thanks to human validation, AI can be incorporated into processes with very demanding quality requirements.

More recent is the initiative promoted by the London Office of Technology and Innovation (LOTI), which in March 2025 launched a research project specifically aimed at analysing the role civil servants should play as supervisors of AI systems in local public services. The starting point is the observation that many municipalities already appoint someone to review or approve the outputs of automated systems, but that this does not in itself guarantee truly effective supervision. The project therefore seeks to generate practical recommendations so that municipalities and other public organisations can design these supervisory roles properly. Its value is that it shifts the debate from the mere presence of a human to the definition of the conditions that make that intervention genuinely effective.

In short, to talk about Human in the Loop, Human on the Loop or Human in Command is to address one of the central elements of the responsible adoption of artificial intelligence. Rather than a choice between automation and human control, the real challenge is to find the right balance between the two. AI can bring efficiency, but it only generates real value when it is integrated into processes with effective oversight mechanisms. In that sense, the future does not point to systems without people, but to organisations capable of combining the potential of AI with the judgement and responsibility that, for now, only humans can provide.

Content prepared by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalization. The content and views expressed in this publication are the sole responsibility of the author.