Assessing Conventional and Deep Learning-Based Approaches for Named Entity Recognition in Unstructured Hungarian Medical Reports

In digital healthcare, much patient data is stored as free text. Structuring this data according to standards is not yet common practice, including in Hungary, where records remain in unstructured form. To make these patient records easy to filter and search, they must be processed and structured. Modern natural language processing and deep learning techniques have produced effective systems for such workflows. However, selecting appropriate algorithms for a specific text-processing task remains challenging because benchmarks are scarce and many architectures are available. This article evaluates models for named entity recognition (NER) in digital medical reports written in Hungarian. We compare traditional, recurrent neural network, and transformer-based approaches for NER using a dataset comprising 801 positron emission tomography scans and annotated medical reports. The reports were annotated with six entity classes and reviewed by clinical experts to ensure accuracy. We present a comprehensive assessment of these methods and offer insight into addressing NER problems in low-resource languages such as Hungarian.