CLARA-MeD corpus

Subject: Biomedical natural language processing
Keywords: Parallel sentences; Comparable corpus; Medical text simplification
Language: es
Identifier: http://hdl.handle.net/10261/269887
Institution: Agencia Estatal Consejo Superior de Investigaciones Científicas (EA0020951)
Dates: 2022-05-19T00:00:00+02:00; 2022-05-15T00:00:00+02:00

Description:
A collection of 24 298 pairs of professional and simplified texts (>96 million tokens):
1) Drug leaflets and summaries of product characteristics (10 211 pairs of texts, >82M words);
2) Cancer-related information summaries (201 pairs of texts, >3M tokens); and
3) Clinical trials announcements (5748 pairs of texts, 451 690 tokens).
The dataset also contains a parallel corpus with a subset of 3800 sentence pairs of professional and lay variants (149 862 tokens). It is intended as a benchmark for medical text simplification. The files were last downloaded in February 2022.

Files:
CLARA-MeD-corpus.zip (ZIP, application/x-zip-compressed, 205657210 bytes)
https://digital.csic.es/bitstream/10261/269887/1/CLARA-MeD-corpus.zip
README.txt (plain text, text/plain, 8294 bytes)
https://digital.csic.es/bitstream/10261/269887/4/README.txt