Thesis Document Classification Using Text Mining (Case Study: Faculty of Information Technology)

Authors

  • Feri Irfanto, Informatics Engineering, Faculty of Information Technology, Universitas Hasyim Asy’ari
  • Aries Dwi Indriyanti, Informatics Engineering, Faculty of Information Technology, Universitas Hasyim Asy’ari
  • Dharma Bagus Pratama Putra, Informatics Engineering, Faculty of Information Technology, Universitas Hasyim Asy’ari

Abstract

Thesis document classification is a data mining task that assigns categories to thesis abstracts whose categories are unknown. Its purpose is to help students find thesis documents relevant to their research by identifying the specific category of each abstract. This research discusses the application of text mining to the classification of thesis documents, with a case study at the Faculty of Information Technology. Text mining extracts information from text data in a collection of documents. This study uses the Naïve Bayes Classifier, a classification method that computes probabilities by summing the frequencies and combinations of values in the data set. The method classifies the testing data according to the attributes of the training data. The abstracts processed are those of students who have graduated from the Faculty of Information Technology. Five categories are used: SPK (decision support systems), RPL (software engineering), Data Mining, Image Processing, and System and Network Security. Classifying thesis documents with the Naïve Bayes Classifier begins with inputting the training data, followed by preprocessing, calculating term frequencies (word occurrences), and calculating word probabilities from the training data; the final step is selecting the category with the maximum probability value. The study used 49 documents, 34 for training and the remaining 15 for testing. Of the 15 testing documents, 14 were classified correctly and 1 was not, giving the thesis document classification system an accuracy of 93%.

 Keywords: Thesis Document Classification, Text Mining, Naïve Bayes Classifier
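
The pipeline summarized in the abstract (preprocessing, term frequency, word probability, maximum category probability) can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the Laplace (add-one) smoothing, the choice to skip words unseen in training, and the toy abstracts are all assumptions made for illustration. Only the five category names and the 14-of-15 result come from the abstract.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a stand-in for the paper's preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def train(docs):
    """docs: list of (text, category). Returns priors, per-category term counts, vocabulary."""
    doc_counts = Counter()
    term_counts = defaultdict(Counter)  # term frequency per category
    vocab = set()
    for text, category in docs:
        tokens = tokenize(text)
        doc_counts[category] += 1
        term_counts[category].update(tokens)
        vocab.update(tokens)
    total_docs = sum(doc_counts.values())
    priors = {c: n / total_docs for c, n in doc_counts.items()}
    return priors, term_counts, vocab


def classify(text, priors, term_counts, vocab):
    """Return the category with the maximum (log) posterior score."""
    tokens = [t for t in tokenize(text) if t in vocab]  # assumption: ignore unseen words
    scores = {}
    for category, prior in priors.items():
        total_terms = sum(term_counts[category].values())
        score = math.log(prior)
        for tok in tokens:
            # Assumption: Laplace smoothing, so a zero count does not zero the product.
            likelihood = (term_counts[category][tok] + 1) / (total_terms + len(vocab))
            score += math.log(likelihood)
        scores[category] = score
    return max(scores, key=scores.get)


def accuracy(correct, total):
    """Fraction of testing documents classified correctly."""
    return correct / total


# Hypothetical toy abstracts, one per category named in the abstract.
training = [
    ("decision support system for scholarship selection using weighting", "SPK"),
    ("software engineering of a web based library information system", "RPL"),
    ("clustering sales data with k-means to find purchase patterns", "Data Mining"),
    ("image segmentation and edge detection for leaf recognition", "Image Processing"),
    ("network intrusion detection and firewall security analysis", "System and Network Security"),
]

priors, term_counts, vocab = train(training)
predicted = classify("edge detection on fingerprint image", priors, term_counts, vocab)
reported_accuracy = accuracy(14, 15)  # 14 of 15 testing documents, as reported
```

Scores are summed in log space to avoid numerical underflow when many word probabilities are multiplied; this does not change which category attains the maximum. With the counts reported in the abstract, `accuracy(14, 15)` is about 0.933, which rounds to the stated 93%.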

Published

2021-03-17