Ulster Institutional Repository

Text classification using word sequence kernel methods.

Trindade, Luis, Wang, H., Blackburn, William and Rooney, Niall (2011) Text classification using word sequence kernel methods. In: International Conference on Machine Learning and Cybernetics, ICMLC 2011, Guilin, China. IEEE. 6 pp. [Conference contribution]

Full text not available from this repository.

URL: http://doi.ieeecomputersociety.org/10.1109/ICMLC.2011.6016983

DOI: 10.1109/ICMLC.2011.6016983

Abstract

This paper presents a comparison study of two sequence kernels for text classification: the all common subsequences kernel and the sequence kernel. We consider several variations of the two kernels (kernels based on individual features, linear combinations of individual kernels, and kernels with a factored representation of features) and evaluate them in text classification by employing them as similarity functions in a support vector machine. A sentence is represented as a sequence of words along with their lemmas and part-of-speech tags. Experiments show that the sequence kernel has a clear advantage over all common subsequences. Since the main difference between the two kernels is that the sequence kernel takes the frequency of words (objects) into account whereas all common subsequences does not, we conclude that the frequency of words is an important factor in the successful application of kernels to text classification.
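
To illustrate the frequency distinction drawn in the abstract, here is a minimal Python sketch. It is not the paper's implementation. The function names are hypothetical, the brute-force enumeration is only practical for short sequences, and both similarity definitions are simplifying assumptions: plain word tokens, no lemma or part-of-speech features, and no gap weighting. The first function counts each shared subsequence once; the second weights each shared subsequence by how often it occurs in both inputs.

```python
from collections import Counter
from itertools import combinations


def subsequence_counts(words):
    """Count occurrences of every (possibly non-contiguous) subsequence.

    Brute force over all index subsets, so cost grows as 2**len(words);
    intended for illustration on short sequences only.
    """
    counts = Counter()
    n = len(words)
    for k in range(n + 1):
        for idx in combinations(range(n), k):
            counts[tuple(words[i] for i in idx)] += 1
    return counts


def distinct_common_subsequences(s, t):
    """Assumed frequency-blind similarity: number of distinct
    subsequences shared by s and t (the empty subsequence included)."""
    return len(subsequence_counts(s).keys() & subsequence_counts(t).keys())


def frequency_weighted_similarity(s, t):
    """Assumed frequency-aware similarity: for each shared subsequence,
    multiply its occurrence counts in s and t, then sum the products."""
    cs, ct = subsequence_counts(s), subsequence_counts(t)
    return sum(cs[u] * ct[u] for u in cs.keys() & ct.keys())


s = "the cat saw the cat".split()
t = "the cat".split()
print(distinct_common_subsequences(s, t))   # 4
print(frequency_weighted_similarity(s, t))  # 8
```

The repeated words in `s` leave the first score unchanged but raise the second, which is the kind of difference the abstract attributes to word frequency. In a classification setting, pairwise values like these would populate a Gram matrix supplied to a support vector machine.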

Item Type: Conference contribution (Paper)
Faculties and Schools: Faculty of Computing & Engineering
Faculty of Computing & Engineering > School of Computing and Mathematics
Research Institutes and Groups: Computer Science Research Institute
Computer Science Research Institute > Artificial Intelligence and Applications
ID Code: 20144
Deposited By: Dr Niall Rooney
Deposited On: 29 Sep 2011 09:22
Last Modified: 29 Sep 2011 09:22
