Practical Exercise on Multiple Linear Regression: Predict Air Quality in Castilla and León
In the field of data science, the ability to build robust predictive models is fundamental. However, a model is not just a set of algorithms; it is a tool that must be understood, validated, and ultimately useful for decision-making.
Thanks to the transparency and accessibility of open data, we…
- Data exercises
Unity Catalog: Empowering Collaboration in the Data and AI Ecosystem through Open Source
Data sharing has become a critical pillar for the advancement of analytics and knowledge exchange, both in the private and public sectors. Organizations of all sizes and industries—companies, public administrations, research institutions, developer communities, and individuals—find strong value…
- Data exercises
Learn to Generate Reports with LangGraph and AI
In the current landscape of data analysis and artificial intelligence, the automatic generation of comprehensive and coherent reports represents a significant challenge. While traditional tools allow for data visualization or generating isolated statistics, there is a need for systems that can…
- Data exercises
Guide to generating synthetic data: an indispensable tool for innovation and data protection
The Spanish Data Protection Agency has recently published the Spanish translation of the Guide on Synthetic Data Generation, originally produced by the Data Protection Authority of Singapore. This document provides technical and practical guidance for data protection officers, managers…
- Reports and studies
Chatting with Public Data: A Practical Application of Artificial Intelligence
Open data portals are an invaluable source of public information. However, extracting meaningful insights from this data can be challenging for users without advanced technical knowledge.
In this practical exercise, we will explore the development of a web application that democratizes access to…
- Data exercises
From theory to practice: creating a RAG-based conversational agent.
Introduction
In previous content, we have explored in depth the exciting world of Large Language Models (LLMs) and, in particular, the Retrieval-Augmented Generation (RAG) techniques that are revolutionising the way we interact with conversational agents. This exercise marks a milestone in our…
- Data exercises
Introduction to data anonymisation: Techniques and case studies
Data anonymization comprises the methodology, best practices, and techniques that reduce the risk of identifying individuals, ensure the irreversibility of the anonymization process, and support auditing of how anonymized data is exploited by monitoring who uses it, when, and for what purpose…
- Guides
Practical guide for the publication of linked data
It is important to publish open data following a series of guidelines that facilitate its reuse, including the use of common schemas, such as standard formats, ontologies and vocabularies. In this way, datasets published by different organizations will be more homogeneous and users will be able to…
- Guides
Practical guide for the publication of Spatial Data
Spatial or geographical data is data that has a geographical reference associated with it, either directly, through coordinates, or indirectly, such as through a postal code. These geographical references make it possible to pinpoint its exact location on a map. The European Union includes…
- Guides
A practical guide to publishing Open Data using APIs
An application programming interface or API is a mechanism that allows communication and information exchange between systems. Open data platforms, such as datos.gob.es, have this type of tool to interact with the information system and consult the data without the need for knowledge of the…
- Guides