7 posts found
GeoParquet 1.0.0: new format for more efficient access to spatial data
Cloud data storage is currently one of the fastest-growing segments of enterprise software, which is bringing a large number of new users into the field of analytics.
As we introduced in a previous post, a new format, Parquet, has among its…
A common language to enable interoperability between open dataset catalogs
Open data plays an important role in technological development for many reasons. For example, it is a fundamental component in informed decision making, in process evaluation or even in driving technological innovation. Provided they are of the highest quality, up-to-date and ethically sound, data can…
MAMD Methodology: The Alarcos Model of Data Improvement
There is such a close relationship between data management, data quality management and data governance that the terms are often used interchangeably or confused. However, there are important nuances.
The overall objective of data management is to ensure that data meets the business requirements tha…
Why should you use Parquet files if you process a lot of data?
It's been a long time since we first heard about the Apache Hadoop ecosystem for distributed data processing. Things have changed a lot since then, and we now use higher-level tools to build solutions based on big data payloads. However, it is important to highlight some best practices related to ou…
Kaggle and other alternative platforms for learning data science
The data scientist profession is booming. According to the 2020 LinkedIn Emerging Jobs Report, demand for data science specialists grew 46.8% compared to the previous year, with particularly strong demand in sectors such as banking, telecommunications or research. The report also indicates…
Examples of uncommon open data repositories
Beyond the data held by public administrations, libraries, museums and cultural foundations, interest in open data knows no borders. We invite you to discover it in this post.
Normally, the concept of open data is associated with those repositories managed by public administrations, foundations and cultural…
Play to be the best with data
Publicly competing with your colleagues to solve a complex problem based on data is an irresistible motivation for some people. Almost as tempting as gaining relevance in a field of expertise as exciting and lucrative as data science.
Public competitions to solve complex problems, whose raw material…