10 repositories of public data related to health and wellness

News date: 21-10-2021

One of the European Commission's lines of action for the coming months is the creation of a European health data space. Following in Europe's footsteps, the Spanish Government also plans to launch a health data lake, with a large amount of raw data available to researchers and public administrations, among other groups.

Governments' interest in promoting open data in this sector is no accident. Health and wellness data are essential for improving health care, research and health policy development. In addition, access to this type of data enables solutions based on innovative technologies, such as artificial intelligence, that transform health systems and improve the health and quality of life of all citizens.

Although this type of data is commonly found in general repositories (for example, datos.gob.es currently hosts more than 16,000 datasets in the health and well-being categories), more and more initiatives, both private and public, specialize in publishing research data, medical results or health statistics. This type of data is shared in anonymized form, guaranteeing patient privacy. Below, we list 10 international examples.

 

CDC Wonder

  • Publisher: US Centers for Disease Control and Prevention.

Users can access statistical research data published or hosted by the Centers for Disease Control and Prevention (CDC) through an ad hoc query system. It also offers reference materials, reports and guidelines on topics related to health and epidemiological research.

Public-use datasets are available on mortality, cancer incidence, HIV and AIDS, tuberculosis, vaccines, birth rates and census data, among other topics. The requested data is summarized and displayed with dynamically calculated statistics, graphs and maps, and is available for download. CDC Wonder also has an API for automated data queries in XML format.

World Health Organization

  • Publisher: World Health Organization

One of the objectives of the World Health Organization (WHO) is to encourage states to collect, manage, analyze and use health data, both population-based (household surveys, civil registration systems for vital events, etc.) and institutional (administrative and operational activities of institutions such as health centers). To this end, it offers a series of data collection and analysis tools, such as SCORE, a package of tools, resources, methodologies and harmonized interventions to strengthen each country's health data.

On its website, WHO offers centralized access to various data collections on diseases such as tuberculosis, or on related topics such as food safety, which can be downloaded in CSV format. It also offers visualizations and a series of dashboards that make data easily accessible to citizens, covering coronavirus, the monitoring of WHO's work or differences in mortality between countries.

HealthData.gov

  • Publisher: US Government

The US Government's health data portal hosts datasets on a wide range of topics, including environmental health, medical devices, healthcare, social services, mental health and substance abuse.

The data is collected and supplied by agencies of the United States Department of Health and Human Services, as well as by specialized centers and agencies. Datasets can be downloaded in CSV format (some are also available in RDF) or retrieved using SoQL queries.
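As a rough illustration of what a SoQL query looks like, the sketch below builds a query URL with Python's standard library. It assumes the portal exposes a Socrata-style `/resource/<dataset-id>.json` endpoint; the domain, dataset identifier and column names are hypothetical placeholders, not real HealthData.gov values.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_soql_url(domain, dataset_id, select, where=None, limit=100):
    """Build a Socrata-style resource URL carrying a SoQL query.

    `domain` and `dataset_id` are placeholders: substitute the values
    shown on the page of the dataset you want to query.
    """
    params = {"$select": select, "$limit": str(limit)}
    if where:
        params["$where"] = where
    return f"https://{domain}/resource/{dataset_id}.json?{urlencode(params)}"


# Hypothetical domain, dataset id and column names, for illustration only.
url = build_soql_url(
    "data.example.org", "abcd-1234",
    select="state, year", where="year = 2020", limit=10,
)
print(url)
# Decode the query string again to check what will be sent.
print(parse_qs(urlsplit(url).query))
```

The resulting URL can then be fetched with any HTTP client. Building it separately makes the query easy to inspect before any request is made.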

Broad Institute

  • Publisher: Broad Institute

Researchers at the Broad Institute generate on the order of 20 terabytes of data streams every day. On their website they offer results of scientific and health research related to human biology, health and diseases. They also offer open source tools to work with the data.

Browsing its various programs, we can find and download data related to cancer or the epigenome, among others.

GDC Data Portal

  • Publisher: National Cancer Institute

This website enables targeted searches of a wide variety of publicly available cancer-related datasets. It includes more than 600,000 files relating to 85,000 cases, with information on genes and mutations.

On the website you can explore the data, view visualizations and analyze the information using various tools. Users can download the information in JSON and TSV formats or access it through an API.
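As a minimal sketch of how such an API query might be assembled, the example below builds a request URL for the public GDC `cases` endpoint using a JSON filter. The chosen filter field, returned fields and page size are illustrative assumptions; the GDC API documentation lists the fields actually available.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

GDC_CASES_ENDPOINT = "https://api.gdc.cancer.gov/cases"


def build_cases_url(primary_site, fields, size=5, fmt="JSON"):
    """Build a GDC cases query URL filtered by primary site.

    The filter is a JSON document with an operator ("in") and the
    field/value pair it applies to; `fmt` may be "JSON" or "TSV".
    """
    filters = {
        "op": "in",
        "content": {"field": "primary_site", "value": [primary_site]},
    }
    params = {
        "filters": json.dumps(filters),
        "fields": ",".join(fields),
        "format": fmt,
        "size": str(size),
    }
    return f"{GDC_CASES_ENDPOINT}?{urlencode(params)}"


# Illustrative values: the fields requested here are assumptions.
url = build_cases_url("Lung", ["submitter_id", "disease_type"], size=3)
print(url)
```

The URL can be opened in a browser or passed to an HTTP client. Switching `fmt` to "TSV" requests the tabular format mentioned above.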

PhysioNet

  • Publisher: PhysioNet

PhysioBank contains over 36,000 annotated, digitized recordings of physiological signals and time series. Much of the data is free and can be downloaded in CSV, but some of it is subject to restricted use.

A distinguishing feature of PhysioNet is that it helps organize and publicize challenges in which participants must address unresolved questions of clinical interest using the data.

It also offers a collection of software for the visualization, analysis and modeling of physiological signals and time series. These are open source programs that can be studied, verified and modified to adapt them to the specific needs of each user. The website includes various tutorials on how to work with PhysioNet and its associated tools.

NHS Digital

  • Publisher: UK National Health Service.

NHS Digital houses the UK's health and wellness datasets, as well as some global ones. It includes data on expenditure, waiting times, diseases or lifestyle habits (such as alcohol and drug use, or obesity). You need to register to access the information.

It also offers interactive dashboards on topics of interest such as general practice or mental health in England. Its website has a developer area with information about its API.

Global Health Data Exchange (GHDx)

  • Publisher: Institute for Health Metrics and Evaluation (IHME)

The Institute for Health Metrics and Evaluation (IHME), an independent center for global health research at the University of Washington, provides comparable measures of the world's most important health problems and assesses the strategies used to address them. This information is shared openly through the GHDx portal, where users can find data sets from surveys, censuses, vital statistics, etc.

The data can be used, shared, modified or built upon by users for non-commercial purposes under the Open Data Commons Attribution License.

OpenNeuro

  • Publisher: any researcher who wants to open up their research data.

OpenNeuro is a platform designed for sharing data from magnetic resonance imaging (MRI), magnetoencephalography (MEG), electroencephalography (EEG), etc. New material is added as researchers open up their own data.

There are currently more than 600 datasets. They are publicly available in Brain Imaging Data Structure (BIDS) format under a Creative Commons CC0 license, to support further research and better diagnostics.

It should be noted that OpenNeuro has integrated the data from OpenfMRI.
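BIDS organizes files with names made of `key-value` entities separated by underscores, followed by a suffix and an extension. The sketch below shows one simple way to split such a name into its parts. The file name is a made-up example, and the parser covers only this basic pattern, not the full specification.

```python
def parse_bids_name(filename):
    """Split a BIDS-style file name into (entities, suffix, extension).

    Simplified: assumes every underscore-separated part except the
    last is a `key-value` entity and that labels contain no dots.
    """
    stem, _, extension = filename.partition(".")
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, ("." + extension) if extension else ""


# Made-up file name following the BIDS naming pattern.
entities, suffix, extension = parse_bids_name("sub-01_ses-02_task-rest_bold.nii.gz")
print(entities)   # subject, session and task labels
print(suffix, extension)
```

For real projects, a dedicated BIDS library is a safer choice than hand-written parsing, since the specification defines many more entities and rules.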

CMS.gov

  • Publisher: U.S. Centers for Medicare & Medicaid Services.

CMS.gov is a search engine that offers access to datasets related to the services provided by institutions that accept Medicare. Medicare is a social security coverage program administered by the US government, which provides medical care to people over 65 years of age and to people of any age with a disability or serious illness.

This repository shares data on doctors, hospitals, facilities that offer certain services such as dialysis or rehabilitation, home care, etc. The data can be downloaded in CSV format or accessed through its API.

 

Thanks to the data from all these repositories, analysis and research can be carried out to predict and detect diseases, as well as to improve the care provided to patients.

Do you know of other international repositories with health data? Leave us a comment or send us an email at dinamizacion@datos.gob.es.