CLASSIFICATION OF TURKISH TWEETS BY DOCUMENT VECTORS AND INVESTIGATION OF THE EFFECTS OF PARAMETER CHANGES ON CLASSIFICATION SUCCESS


BİLGİN M.

SIGMA JOURNAL OF ENGINEERING AND NATURAL SCIENCES-SIGMA MUHENDISLIK VE FEN BILIMLERI DERGISI, vol.38, no.3, pp.1581-1592, 2020 (ESCI)

Abstract

Natural language processing is a field of artificial intelligence that has been gaining popularity in recent years. Inferring sentiment from texts on a given topic, or classifying documents, is of great importance given the growing volume of data in today's world. Understanding and interpreting written text is an ability characteristic of humans. However, it is possible to draw inferences from texts or to classify them using natural language processing, a sub-branch of machine learning and artificial intelligence. In this study, text classification was performed on Turkish tweets, and the effect of parameter changes on classification success was investigated using two different methods of the algorithm known in the literature as document vectors. The study found that the DBoW (Distributed Bag of Words) method yielded higher accuracy than the DM (Distributed Memory) method, and that the DBoW-NS (Negative Sampling) architecture yielded higher accuracy than the other architectures.
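
As a rough illustration of the kind of comparison described in the abstract, the sketch below enumerates document-vector configurations by architecture (DBoW or DM) and output layer. The keyword names `dm`, `hs`, `negative`, `vector_size` and `epochs` follow the constructor of gensim's `Doc2Vec`; the abstract does not say which library or settings were used, so the choice of gensim, the numeric values, the helper name `build_configs`, and the use of hierarchical softmax (HS) as the alternative to negative sampling are all assumptions made for illustration.

```python
from itertools import product

# Architecture switch, using gensim's Doc2Vec keyword `dm`:
# dm=0 selects DBoW, dm=1 selects DM.
ARCHITECTURES = {"DBoW": 0, "DM": 1}

# Output-layer options. NS is named in the abstract; HS (hierarchical
# softmax) is an assumed alternative, and the values are illustrative.
OUTPUT_LAYERS = {
    "NS": {"hs": 0, "negative": 5},
    "HS": {"hs": 1, "negative": 0},
}


def build_configs(vector_size=100, epochs=20):
    """Return a dict mapping a label such as 'DBoW-NS' to Doc2Vec keyword arguments.

    Hypothetical helper: vector_size and epochs are placeholder values,
    not the settings reported in the study.
    """
    configs = {}
    for (arch, dm), (layer, layer_kwargs) in product(
        ARCHITECTURES.items(), OUTPUT_LAYERS.items()
    ):
        configs[f"{arch}-{layer}"] = {
            "dm": dm,
            "vector_size": vector_size,
            "epochs": epochs,
            **layer_kwargs,
        }
    return configs


if __name__ == "__main__":
    for label, kwargs in build_configs().items():
        print(label, kwargs)
```

With gensim installed, each dictionary could be passed as `Doc2Vec(tagged_tweets, **kwargs)`, where `tagged_tweets` is a hypothetical list of `TaggedDocument` objects built from tokenized tweets; the resulting document vectors would then be fed to a classifier so that accuracy can be compared across configurations.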