Our Crow team reached a new milestone at the 14th American Association for Corpus Linguistics (AACL) conference this past September: our first presentations of inter-institutional projects!
The two presentations, “Annotating learner data for lexico-grammatical patterns: A comparison of software tools” and “Lexico-grammatical Patterns in First Year Writing across L1 Backgrounds”, were given by Crowbirds from the University of Arizona, Purdue University, and Northern Arizona University.
The first project, “Annotating learner data for lexico-grammatical patterns: A comparison of software tools,” was led by Adriana Picoral. The team, consisting of Adriana Picoral, Dr. Randi Reppen, Dr. Shelley Staples, Ge Lan, and Aleksey Novikov, compared three tools: 1) the Biber tagger, a POS and syntactic tagger that integrates rule-based and probabilistic components; 2) the MALT parser, an open-source statistical dependency parser; and 3) the Stanford parser, another open-source statistical parser widely used in natural language processing applications. The corpus for this study was sampled from our larger inter-institutional corpus of first-year writing (FYW) texts. It consisted of 16 documents (27,930 tokens in total) from 3 institutions (Purdue University, University of Arizona, and Northern Arizona University) and 4 first language backgrounds (Arabic, Chinese, English, and Korean).
All documents were annotated using all three tools. Gold standard labels were also created by up to four human coders for each document. Predicted labels from the three tools were then compared with the human-created gold standard labels. Precision (of the words a tool labeled with a feature, the proportion labeled correctly) and recall (of the words carrying a feature in the gold standard, the proportion the tool identified) were calculated for each of our target features (noun-noun sequences, attributive adjectives, relative clauses, and complement clauses) across the different tools. The team presented methods, including descriptions of the web-based interfaces built for human tag-checking, and the evaluation measures from all three tools. While the Stanford parser performed better when labeling our target clausal features, the Biber tagger performed better for the targeted phrasal features.
Post-processing scripts will be used to improve the accuracy of both the Stanford parser and the Biber tagger, and the team may combine their output in the future to achieve higher performance on automated annotation of our learner data.
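For readers less familiar with these evaluation measures, the per-feature precision and recall comparison described above can be sketched roughly as follows. This is only an illustrative sketch, not the team's actual evaluation code: the function name `precision_recall`, the label names `ATTR_ADJ` and `O`, and the toy label sequences are all hypothetical, and the sketch assumes that gold and predicted labels are aligned token by token.

```python
def precision_recall(gold, predicted, feature):
    """Per-feature precision and recall over token-aligned label sequences.

    Hypothetical helper: assumes one label per token in both sequences.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must be token-aligned")

    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        if p == feature and g == feature:
            tp += 1  # tool labeled the feature and the coders agree
        elif p == feature:
            fp += 1  # tool labeled the feature, the coders did not
        elif g == feature:
            fn += 1  # coders labeled the feature, the tool missed it

    # Return None when a measure is undefined (no predictions or no gold cases)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


# Made-up toy labels: "ATTR_ADJ" marks an attributive adjective, "O" any other token
gold = ["O", "ATTR_ADJ", "O", "ATTR_ADJ", "O", "ATTR_ADJ"]
predicted = ["O", "ATTR_ADJ", "ATTR_ADJ", "O", "O", "ATTR_ADJ"]

p, r = precision_recall(gold, predicted, "ATTR_ADJ")
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case the tool finds two of the three gold attributive adjectives and adds one spurious label, so both measures come out at two thirds. Running the same function once per feature and per tool would give the kind of comparison table the presentation reported.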
The second project presentation, “Lexico-grammatical Patterns in First Year Writing across L1 Backgrounds,” was led by Dr. Shelley Staples with help from fellow Crowbirds Dr. Randi Reppen, Aleksey Novikov, and Ge Lan, along with collaborators Dr. Qiandi Liu and Dr. Chris Holcomb from the University of South Carolina. The group compiled a balanced corpus (612,100 words) of argumentative essays across four L1s (English, Chinese, Arabic, and Korean), which was then tagged with the Biber tagger and improved for accuracy with post-tagging scripts. The researchers investigated, both quantitatively and qualitatively, the use of six features: attributive adjectives, pre-modifying nouns, that- relative clauses, wh- relative clauses, that- verb complement clauses, and that- noun complement clauses. ANOVA was applied to test the differences among the four L1 groups and between the two institutions (Northern Arizona University and University of South Carolina).
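To give a sense of what an ANOVA comparison across groups involves, here is a minimal sketch of a one-way F statistic computed from per-text feature rates. It rests on several assumptions: the post does not say which ANOVA design or software the researchers used, the function name `one_way_anova_f` and the group names are hypothetical, and the numbers are made up purely for illustration.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual texts around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up feature rates (e.g. occurrences per 1,000 words) for four hypothetical groups
rates = {
    "group_1": [1.0, 2.0, 3.0],
    "group_2": [2.0, 3.0, 4.0],
    "group_3": [5.0, 6.0, 7.0],
    "group_4": [4.0, 5.0, 6.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(rates.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value means the group means differ more than the spread within groups would suggest. In practice the p-value is then read from the F distribution with these degrees of freedom, typically using a statistics package rather than hand-written code.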
The results showed significant differences in the way the features were used across the four L1 groups (p < 0.05), particularly for attributive adjectives, pre-modifying nouns, that- noun complement clauses, and that- and wh- relative clauses. Compared to L1 English writers, L2 writers tended to rely more on the repetition of phrasal features. They also used more wh- relative clauses than that- relative clauses, which could be explained by more prescriptive instruction on wh- relative clauses for L2 writers, as opposed to the influence of oral language and a lack of register awareness for L1 English writers.
Finally, attributive adjectives and that- relative clauses showed significant differences between the two institutions for Chinese L2 writers (p < 0.05), whereas no significant difference was found for any feature between the two institutions for L1 English writers. A possible reason for this difference is that the students from the University of South Carolina (USC), who used more of the two features, may have had higher proficiency, whereas the Northern Arizona University (NAU) students were in a bridge program working on improving their proficiency. An alternative explanation is that relative clauses were included in the USC syllabi, while it is unclear whether this instruction was received at NAU.
Both conference presentations were very well received at AACL. We plan to submit a paper based on the first presentation to NAACL or ACL in the near future.