Software and Corpus Downloads

Xiaofei Lu

D-Level Analyzer

D-Level Analyzer is an automatic syntactic complexity analyzer based on the revised Developmental Level scale (Rosenberg & Abbeduto 1987; Covington et al. 2006). This analyzer assigns each sentence in an input text to one of eight developmental levels, depending on the structure of the sentence.

L2 Syntactic Complexity Analyzer

L2 Syntactic Complexity Analyzer is designed to automate syntactic complexity analysis of written English language samples produced by advanced learners of English using 14 different measures proposed in the second language development literature. The analyzer takes a written English language sample in plain text format as input and generates 14 indices of syntactic complexity of the sample.

Lexical Complexity Analyzer

Lexical Complexity Analyzer is designed to automate lexical complexity analysis of English language samples using 25 different measures of lexical density, lexical sophistication, and lexical variation proposed in the first and second language acquisition literature. The analyzer takes an English text in plain text format as input and generates 25 indices of lexical complexity of the sample.
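To give a sense of what a lexical variation index looks like, the sketch below computes a plain type-token ratio (distinct word forms divided by total word tokens). It is only an illustration and not the analyzer's own code: the function name `type_token_ratio` and the simple regular-expression tokenizer are assumptions made for this example, and the analyzer's preprocessing may differ.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return distinct word forms divided by total word tokens.

    Hypothetical helper for illustration only; tokenization here is a
    simple lowercase regex match, which is an assumption of this sketch.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0  # avoid division by zero on empty input
    return len(set(tokens)) / len(tokens)


sample = "The cat sat on the mat and the dog sat too"
# 11 tokens, 8 distinct word forms
print(round(type_token_ratio(sample), 3))
```

Because the raw ratio falls as texts get longer, it is best compared only across samples of similar length.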

Eng-Editor and Chi-Editor

Eng-Editor (英语阅读分级指难针) (see Jin & Lu, 2018) is an online text evaluation and adaptation system that matches English reading texts to specific proficiency levels specified in the Chinese EFL curriculum standards and that annotates texts with a number of lexical and syntactic features to inform text adaptation. The system references vocabulary lists from the Chinese EFL curriculum standards and employs the L2 Syntactic Complexity Analyzer for syntactic complexity analysis. A corpus of around 7,000 text samples representing a range of Chinese EFL learners’ proficiency levels is used to provide benchmarks for text complexity evaluation and lexical and syntactic annotation.

Chi-Editor (汉语阅读分级指难针) (see Bo, Chen, Guo & Jin, 2019 in Lu & Chen, 2019) is an online text evaluation and adaptation system that matches Chinese reading texts to specific proficiency levels specified in the International Curriculum for Chinese Language Education (Confucius Institute Headquarters, 2015) and that annotates texts with a number of lexical and syntactic features to inform text adaptation. The system references vocabulary lists from national Chinese as a second language (CSL) curriculum standards. A corpus of approximately 550 widely used CSL textbooks is used to provide benchmarks for text complexity evaluation and lexical and syntactic annotation.

PSU Chinese Metaphor Corpus

The PSU Chinese Metaphor Corpus consists of a subset of text samples from the Lancaster Corpus of Mandarin Chinese annotated for metaphor-related words following MIPVU.

A Phrase-Frame List for Social Science Research Article Introductions

This is a list of phrase-frames derived from the introduction sections of the research articles included in the Corpus of Social Science Research Articles (COSSRA).
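For readers unfamiliar with the term, a phrase-frame is commonly understood as a recurring word sequence with one variable slot, such as "the * of this". The sketch below shows one way such frames might be collected from a token list. It is an illustration under stated assumptions, not the procedure used to build this list: the function name `phrase_frames`, the default frame length of four words, and the restriction to internal slots are all choices made for this example.

```python
from collections import defaultdict


def phrase_frames(tokens, n=4):
    """Map each n-word frame (one internal slot shown as '*') to its fillers.

    Hypothetical helper for illustration only; frame length and the
    internal-slot restriction are assumptions of this sketch.
    """
    frames = defaultdict(set)
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        # Open each internal position in turn as the variable slot.
        for slot in range(1, n - 1):
            frame = tuple(gram[:slot] + ["*"] + gram[slot + 1:])
            frames[frame].add(gram[slot])
    return frames


tokens = "the aim of this study the purpose of this study".split()
frames = phrase_frames(tokens)
# The frame "the * of this" is filled by both "aim" and "purpose".
print(sorted(frames[("the", "*", "of", "this")]))
```

In practice, frames would also be filtered by a minimum frequency and by the number of distinct fillers before being listed.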