An effective dimensionality reduction method for text classification based on TFP-tree.
- Author(s): Liu, Lu
- Source: Journal of Intelligent & Fuzzy Systems. 2018, Vol. 34 Issue 3, p1893-1905. 13p.
- Abstract:
Obtaining interesting, topic-relevant information is an important task in Web mining. Text classification that uses a small proportion of labeled data together with a large proportion of unlabeled data, also called semi-supervised learning, is a well-known problem. Despite plenty of research on text classification, however, how to effectively and efficiently exploit valuable frequent patterns and handle high-dimensional data in text classification remains an open issue. With increasing data volumes and the prevalence of high-dimensional data, both distance measures and time complexity can be adversely affected by noisy data. This paper targets this problem and presents a novel method for text classification called CTFP (Classification based on TFP-tree), which uses a TFP-tree (Text-Frequent-Pattern-tree) to generate frequent patterns from large amounts of text and conducts text classification in a relatively low-dimensional data space, effectively reducing the data dimensionality while constructing the classifier. Substantial experiments on three datasets (RCV1, SRAA and Reuters-21578) show that our proposed method achieves better performance than many existing state-of-the-art methods in precision, efficiency and many other evaluation metrics. [ABSTRACT FROM AUTHOR]
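The abstract describes a general recipe: mine frequent patterns from the training text and use them, rather than the full vocabulary, as the feature space for the classifier. The sketch below illustrates that recipe in spirit only. It is not the paper's CTFP/TFP-tree algorithm: it substitutes a simple Apriori-style miner for frequent terms and term pairs, and every name and threshold in it (mine_frequent_patterns, min_support, the toy corpus) is an assumption made for illustration.

```python
# Illustrative sketch only: frequent-pattern-based dimensionality reduction
# for text classification. NOT the paper's CTFP/TFP-tree method; a simple
# Apriori-style stand-in mines frequent terms and term pairs, which then
# replace the full vocabulary as a low-dimensional binary feature space.
from collections import Counter
from itertools import combinations

from sklearn.linear_model import LogisticRegression


def mine_frequent_patterns(docs, min_support=0.3):
    """Return frequent patterns of length 1 and 2 as frozensets of terms."""
    n = len(docs)
    doc_sets = [set(d.lower().split()) for d in docs]
    # Frequent single terms.
    term_counts = Counter(t for s in doc_sets for t in s)
    frequent_terms = {t for t, c in term_counts.items() if c / n >= min_support}
    # Frequent pairs, built only from frequent terms (Apriori pruning).
    pair_counts = Counter()
    for s in doc_sets:
        for pair in combinations(sorted(s & frequent_terms), 2):
            pair_counts[pair] += 1
    frequent_pairs = {p for p, c in pair_counts.items() if c / n >= min_support}
    return ([frozenset([t]) for t in frequent_terms]
            + [frozenset(p) for p in frequent_pairs])


def to_features(docs, patterns):
    """Project documents onto the (much smaller) frequent-pattern space."""
    doc_sets = [set(d.lower().split()) for d in docs]
    return [[1 if p <= s else 0 for p in patterns] for s in doc_sets]


# Toy usage: the classifier never sees the raw vocabulary, only the mined
# patterns, which is the dimensionality-reduction idea the abstract names.
train_docs = ["stock market rises", "stock market falls",
              "rain and wind today", "sunny weather today"]
train_labels = ["finance", "finance", "weather", "weather"]
patterns = mine_frequent_patterns(train_docs, min_support=0.4)
clf = LogisticRegression().fit(to_features(train_docs, patterns), train_labels)
print(clf.predict(to_features(["stock market report"], patterns)))
```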
- Copyright:
Copyright of Journal of Intelligent & Fuzzy Systems is the property of IOS Press and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)