Ulster Institutional Repository

Classification decision combination for text categorization: An experimental study

Bi, YX, Bell, D, Wang, H, Guo, GD and Dubitzky, Werner (2004) Classification decision combination for text categorization: An experimental study. In: Database and Expert Systems Applications, Proceedings, Zaragoza, Spain. UNSPECIFIED. 10 pp. [Conference contribution]

Full text not available from this repository.

Abstract

This study investigates the combination of four classification methods for text categorization through experimental comparisons. The methods are the Support Vector Machine, kNN (nearest neighbours), the kNN model-based approach (kNNM), and Rocchio. We first review these learning methods and the method for combining the classifiers, and then present experimental results on the 20-newsgroup benchmark data collection, with an emphasis on average group performance, that is, the effectiveness of combining multiple classifiers on each category. To understand why the combination of the best and the second-best classifiers can achieve better performance, we propose an empirical measure called closeness as a basis for our experiments. Based on our empirical study, we verify the hypothesis that when a classifier has high closeness to the best classifier, their combination can achieve better performance.

Item Type: Conference contribution (Paper)
Faculties and Schools: Faculty of Life and Health Sciences
    Faculty of Computing & Engineering > School of Computing and Mathematics
    Faculty of Life and Health Sciences > School of Biomedical Sciences
Research Institutes and Groups: Biomedical Sciences Research Institute
    Computer Science Research Institute
    Biomedical Sciences Research Institute > Molecular Medicine
    Computer Science Research Institute > Artificial Intelligence and Applications
    Biomedical Sciences Research Institute > Molecular Medicine > Nano Systems Biology
ID Code: 4489
Deposited By: Professor Werner Dubitzky
Deposited On: 05 Jan 2010 14:56
Last Modified: 04 Apr 2011 15:27
