Exploiting morphological, grammatical, and semantic correlates for improved text difficulty assessment
June 26, 2014
Conference Paper
Published in:
Proc. 9th Workshop on Innovative Use of NLP for Building Educational Applications, 26 June 2014, pp. 155-162.
Summary
We present a low-resource, language-independent system for text difficulty assessment. We replicate and improve upon the baseline of Shen et al. (2013) on the Interagency Language Roundtable (ILR) scale. We show that adding morphological, information-theoretic, and language-modeling features to a traditional readability baseline substantially improves performance. We run experiments on Arabic, Dari, English, and Pashto using the Margin-Infused Relaxed Algorithm (MIRA) and Support Vector Machines (SVMs), and provide a detailed analysis of our results.
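To make the kind of features mentioned in the summary concrete, here is a minimal sketch of language-independent surface features: two traditional readability measures and one simple information-theoretic measure. The summary does not specify the paper's feature set, so the function name `surface_features`, the whitespace tokenization, and the choice of unigram entropy are all assumptions for illustration only.

```python
import math
import re
from collections import Counter


def surface_features(text: str) -> dict:
    """Compute a few illustrative surface features (hypothetical, not the paper's set)."""
    # Naive sentence split on terminal punctuation (an assumption of this sketch).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Whitespace tokenization keeps the sketch script-agnostic;
    # punctuation stays attached to tokens.
    tokens = text.split()
    if not tokens or not sentences:
        return {"avg_sentence_len": 0.0, "avg_word_len": 0.0, "unigram_entropy": 0.0}

    counts = Counter(tokens)
    total = len(tokens)
    # Shannon entropy (in bits) of the token unigram distribution.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

    return {
        "avg_sentence_len": total / len(sentences),          # tokens per sentence
        "avg_word_len": sum(len(t) for t in tokens) / total,  # characters per token
        "unigram_entropy": entropy,
    }
```

A feature dictionary like this could be turned into a vector and passed to a classifier such as an SVM, together with whatever morphological and language-modeling features are available for the language.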