Get to Know the Faculty of Informatics of the UPV/EHU

The centre of reference for technical and scientific training and knowledge in computer science and artificial intelligence.

28-03-2023 10:00: Doctoral Thesis Defence, Naiara Pérez Miguel

Naiara Pérez Miguel: "Contributions to Information Extraction for Spanish Written Biomedical Text".

Supervisors: German Rigau Claramunt / Montserrat Cuadros Oller.

2023-03-28, 10:00, Ada Lovelace Room.

Abstract:

"Healthcare practice and clinical research produce vast amounts of digitised, unstructured data in multiple languages that are currently underexploited, despite their potential applications in improving healthcare experiences, supporting trainee education, or enabling biomedical research, for example. To automatically transform those contents into relevant, structured information, advanced Natural Language Processing (NLP) mechanisms are required. In NLP, this task is known as Information Extraction. Our work takes place within this growing field of clinical NLP for the Spanish language, as we tackle three distinct problems. First, we compare several supervised machine learning approaches to the problem of sensitive data detection and classification. Specifically, we study the different approaches and their transferability in two corpora, one synthetic and the other authentic. Second, we present and evaluate UMLSmapper, a knowledge-intensive system for biomedical term identification based on the UMLS Metathesaurus. This system recognises and codifies terms without relying on annotated data nor external Named Entity Recognition tools. Although technically naive, it performs on par with more evolved systems, and does not exhibit a considerable deviation from other approaches that rely on oracle terms. Finally, we present and exploit a new corpus of real health records manually annotated with negation and uncertainty information: NUBes. This corpus is the basis for two sets of experiments, one on cue and scope detection, and the other on assertion classification. Throughout the thesis, we apply and compare techniques of varying levels of sophistication and novelty, which reflects the rapid advancement of the field".

