Abstract
Automatic International Classification of Diseases (ICD) coding plays a crucial role in assigning ICD codes to electronic medical records. The task is a challenging multi-label text classification problem because of the vast number of ICD codes and the imbalanced label distribution, and accurately predicting all labels simultaneously is extremely difficult in such a large label space. In this paper, we propose a novel model called Deep Iterative Learning Model (DILM-ICD), which uses an iterative learning framework to perform the automatic ICD coding task. The iterative learning framework refines the prediction results by repeating the iteration modules, simulating the way a human coder works. In addition, we propose a multi-head text-label matching mechanism, which incorporates embedded ICD description information to better match text to labels. Combining the iterative learning framework with the multi-head text-label matching mechanism enables the model to pay attention to low-frequency ICD codes. DILM-ICD is evaluated on the MIMIC-III-full and MIMIC-III-50 datasets. The experimental results show that DILM-ICD achieves state-of-the-art results across multiple evaluation metrics, which demonstrates the effectiveness of our proposed model.
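
To make the two ideas above concrete, the following is a minimal NumPy sketch and not the DILM-ICD architecture itself, which the abstract does not specify. It assumes that the text is given as a matrix of token vectors, that each ICD code has one description embedding of the same dimension, that matching is done with scaled dot-product attention split across heads, and that each iteration refines the label queries with a residual update. The names `multi_head_match` and `iterative_predict` and all parameters are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_match(tokens, label_queries, num_heads):
    """Attend from each label query to the text tokens, one head at a time.

    tokens:        (T, d) token representations of one document
    label_queries: (C, d) one query vector per ICD code
    returns:       (C, d) label-specific text representations
    """
    T, d = tokens.shape
    C, _ = label_queries.shape
    assert d % num_heads == 0, "d must be divisible by num_heads"
    d_h = d // num_heads
    # Split the feature dimension into heads: (h, T, d_h) and (h, C, d_h).
    tok = tokens.reshape(T, num_heads, d_h).transpose(1, 0, 2)
    qry = label_queries.reshape(C, num_heads, d_h).transpose(1, 0, 2)
    # Scaled dot-product scores between every label and every token.
    scores = qry @ tok.transpose(0, 2, 1) / np.sqrt(d_h)  # (h, C, T)
    attn = softmax(scores, axis=-1)
    ctx = attn @ tok  # (h, C, d_h)
    # Concatenate the heads back into a single vector per label.
    return ctx.transpose(1, 0, 2).reshape(C, d)


def iterative_predict(tokens, label_emb, num_heads=4, num_iters=3):
    """Repeat the matching step, refining the label queries each time.

    Assumption: the refinement is a simple residual update of the queries;
    the actual iteration module may differ.
    """
    queries = label_emb
    history = []
    for _ in range(num_iters):
        ctx = multi_head_match(tokens, queries, num_heads)
        # Score each label by comparing its text context to its description.
        logits = (ctx * label_emb).sum(axis=1)
        history.append(1.0 / (1.0 + np.exp(-logits)))  # per-label probability
        queries = label_emb + ctx  # refined queries for the next pass
    return history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.normal(size=(10, 8))    # 10 tokens, dimension 8
    label_emb = rng.normal(size=(5, 8))  # 5 hypothetical ICD codes
    for step, probs in enumerate(iterative_predict(tokens, label_emb, 2, 3)):
        print(step, np.round(probs, 3))
```

In this sketch every pass yields an independent probability for each code, which matches the multi-label setting, and later passes use queries informed by what the earlier passes found in the text.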