Empirical evaluation of three machine learning methods for automatic classification of neoplastic diagnoses [Evaluación empírica de tres métodos de aprendizaje automático para clasificar automáticamente diagnósticos de neoplasias]

January 2011

Abstract

Diagnoses are a valuable source of information for evaluating a health system. However, information systems make little use of them because diagnoses are normally written in natural language. This work empirically evaluates three machine learning methods for automatically assigning codes from the International Classification of Diseases (10th Revision) to 3,335 distinct neoplasm diagnoses obtained from UMLS®. The evaluation is conducted under three different types of preprocessing. The results are encouraging: a well-known rule induction method and maximum entropy models achieve 90% accuracy in a balanced cross-validation experiment.
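
The abstract does not say how the "balanced" cross-validation folds were built. One common reading is stratified splitting, where each fold keeps roughly the same share of every code. The sketch below illustrates that idea only, under that assumption; the function name `stratified_folds`, the toy labels, and the fold count are hypothetical and are not taken from the article.

```python
from collections import defaultdict
import random


def stratified_folds(labels, k, seed=0):
    """Assign item indices to k folds so each fold gets a similar share of every class."""
    # Group item indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the indices round-robin so each fold receives a near-equal part of this class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Hypothetical toy labels standing in for ICD-10 codes; not data from the study.
labels = ["C50"] * 6 + ["C34"] * 4 + ["D12"] * 2
folds = stratified_folds(labels, k=2)

for number, fold in enumerate(folds):
    # Each fold serves once as the test set; the remaining folds form the training set.
    print(number, sorted(labels[i] for i in fold))
```

In an experiment of this kind, each classifier would be trained on k-1 folds and scored on the held-out fold, and the reported accuracy would be averaged over the k runs.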

Type

Journal article

Publication

Ingeniare

José Luis Jara

Associate Professor

Max Chacón

Full Professor