CLARA-MeD simplified sentences

Name: CLARA-MeD simplified sentences
Creator: Agencia Estatal Consejo Superior de Investigaciones Científicas
License: https://creativecommons.org/licenses/by-nc-sa/4.0/
Keywords: None

Publisher Agencia Estatal Consejo Superior de Investigaciones Científicas

Administration level State Administration

Entity

Public

License

https://creativecommons.org/licenses/by-nc-sa/4.0/

Description

This dataset contains 1200 manually simplified sentences (144 019 tokens) from clinical trials in Spanish. A total of 1040 announcements from the European Clinical Trials Register (EudraCT) were analyzed to select sentences that contain ambiguities or exceed 25 words. The simplification criteria are set out in an annotation guideline, which is released publicly alongside the dataset. This resource was collected in the CLARA-MeD project, whose goal is to simplify medical texts in Spanish and to reduce the language barrier to patients' informed decision-making. In particular, the project aims to develop linguistic resources for automatic medical term simplification in Spanish and to conduct experiments in automatic text simplification.
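The length criterion mentioned in the description (sentences exceeding 25 words) can be illustrated with a minimal Python sketch. It assumes that words are counted by splitting on whitespace, which may differ from the tokenization actually used in the project; the function names `exceeds_word_limit` and `select_long_sentences` are hypothetical and not part of the dataset or its tooling. The ambiguity criterion required manual analysis and is not covered here.

```python
def exceeds_word_limit(sentence: str, limit: int = 25) -> bool:
    """Return True when the sentence has more than `limit` words.

    Assumption: a "word" is any whitespace-separated chunk of text.
    """
    return len(sentence.split()) > limit


def select_long_sentences(sentences, limit: int = 25):
    """Keep only the sentences that exceed the word limit."""
    return [s for s in sentences if exceeds_word_limit(s, limit)]
```

A sentence of exactly 25 words is not selected under this reading, since the description says "exceeding 25 words".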


Technical data sheet

Distributions (3)

Identification Interoperability

Identifier	https://digital.csic.es/bitstream/10261/346579/1/claramed_synt_simp_aligned.tsv
Access point URL	https://digital.csic.es/bitstream/10261/346579/1/claramed_synt_simp_aligned.tsv

Format	TSV
Size	981.11 KB

Identification Interoperability

Access point URL	https://digital.csic.es/bitstream/10261/346579/2/CLARA-MeD_simplif_guideline.pdf

Format	PDF
Size	757.92 KB

Identification Interoperability

Access point URL	https://digital.csic.es/bitstream/10261/346579/6/README_CLARAMED_sentences.txt

Format	plain
Size	7.32 KB
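
The TSV distribution listed above can be read with Python's standard `csv` module once it has been downloaded. The sketch below is illustrative only: it assumes a local copy of the file encoded as UTF-8, and it makes no assumption about column names or the presence of a header row, since neither is stated in this record (the README distribution is the place to check). The helper name `read_tsv_rows` is hypothetical.

```python
import csv
from pathlib import Path


def read_tsv_rows(path):
    """Read a tab-separated file and return its non-empty rows as lists of strings.

    Assumption: the file is UTF-8 encoded.
    """
    with Path(path).open(encoding="utf-8", newline="") as handle:
        # QUOTE_NONE treats quotation marks inside sentences as ordinary characters.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader if row]
```

For example, `rows = read_tsv_rows("claramed_synt_simp_aligned.tsv")` would return one list per line of the downloaded file, after which the first row can be inspected to see whether it is a header.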

Keywords
Tags	Parallel sentences Procesamiento del Le... Simplificación text...
Categories
Categories	Science and technology Health
Coverage
Geographic coverage	Spain
Temporal coverage	From 8/02/2024 23:00 (UTC) to 8/02/2024 23:00 (UTC)
Language
Languages	Spanish English

Identification
Identifier	http://hdl.handle.net/10261/346579
Creation date	8/02/2024 23:00 (UTC)
References
Other resources	https://github.com/lcampillos/CLARA-MeD/ http://hdl.handle.net/10261/269887 http://hdl.handle.net/10261/359759 http://hdl.handle.net/10261/359770
