Abstract
There is currently strong demand for automating big data analysis. In data analysis, data abstraction, or summarization, plays an important role in extracting generalized information from large-scale data. We have developed an artificial intelligence system aimed at automating big data analysis, along with a method for abstracting numerical data (age, height, time, etc.). However, that method could not abstract or summarize label data (customer ID, product code, name, etc.). In the present work, we have developed a label abstraction method based on information entropy. Experiments using open real-world data showed that the proposed method achieved an extraction accuracy of 80%, as evaluated by the F-measure. We intend to apply the proposed method to our artificial intelligence system and perform further evaluations.