This is the first post highlighting information from our Crow Symposium. For the second post, please navigate here!
On January 31, the annual 2026 Crow Symposium showcased two excellent ongoing research projects from Crowbirds Dr. Hadi Banat and the team of Dr. Ge Lan and Hailey Jie Yang. Alongside them, nine other Crowbirds roosted together in a Zoom nest to talk, reconnect, and, of course, share the awesome work they’ve been doing.
In this article, we discuss Dr. Ge Lan’s and Hailey Jie Yang’s research and their approach to corpus building. We also explain the multidimensional analysis approach to their research, their development of pedagogical activities, and what they will do next with their research.
Research Overview & Corpus Building
Dr. Ge Lan (principal investigator) and Hailey Jie Yang (a PhD student and Ge’s research assistant) from City University of Hong Kong shared their ongoing research project, titled Grammatical Complexity in Discipline-specific Student Writing: Writing Development, Writing Quality, and Functional Description. Funded by the Research Grants Council, a public research funder in Hong Kong, the project focuses on corpus building and on clarifying vague descriptions of appropriate vocabulary and grammar usage in pedagogical materials and assessment rubrics. Hailey helps Ge with corpus building and with developing teaching materials.
Ge also collaborated with Prof. Shelley Staples (a fellow Crowbird) on the multidimensional (MD) analysis approach they are using in their research, which he describes as “a quantitative way to systematically analyze co-occurring linguistic patterns and their associated grammatical functions.” In other words, they need to analyze student texts from multiple perspectives to understand lexico-grammatical characteristics of certain writing tasks. This analysis allows them to develop targeted pedagogical materials for discipline-specific English courses and teach students the appropriate lexico-grammar usage for different writing tasks.
When it comes to corpus building, they are using the same approach as Crow, which is outlined in our CIABATTA toolkit. Specifically, they are collecting student texts from four discipline-specific English courses (English for Science, English for Engineering, English for Business, and English for Humanities and Social Studies) and adding them to their learner corpus. The corpus includes four main writing tasks drawn from these courses: scientific reports, technical progress reports, business emails, and narrative essays. It has grown from 900 files in January 2023 to 1,612 today, bringing it to over 1 million words. Their main goal is to add this corpus to the Crow corpus by the end of this year.
Understanding the MD Analysis Approach
The MD analysis approach to their research was originally proposed to Ge by Prof. Douglas Biber at Northern Arizona University. Using this analysis, they identified four dimensions of writing tasks (each with its own communicative functions) and their associated language features. Describing how these features were identified in his MD analysis with Prof. Shelley Staples, Ge explained:
We have 222 lexical or grammatical features that we applied for our MD analysis. Through using a statistical method called factor analysis, we explored what features are often used together in these four writing tasks.
That is, Ge and Shelley looked at student texts (e.g., scientific reports and technical progress reports) in Ge’s corpus to identify four co-occurring patterns and their communicative functions (also called dimensions). These four dimensions can be seen in the image below.
The multidimensional nature of “appropriateness”
| Dimensions | Sample Features |
| 1) Personal narrative vs. Informationally dense summary | Positive: past tense verbs, adverb (emphatics), pronoun (it), first-person pronouns, subordinate conjunctions (causative) Negative: definite articles, attributive adjectives, prep phrase with of, prep phrase as noun modifier (other), proper nouns, premodifying nouns, “To” adverbial clauses, AWL words |
| 2) Context-based interactive persuasion vs. Compressed procedural discourse | Positive: Moving average type token ratio (50-word windows), second person pronouns/possessives, cognitive nouns, suasive verbs, present progressive verbs, place nouns Negative: concrete nouns, Of genitives as noun modifiers, quantity nouns, agentless passive verbs, nominalizations, core academic formula |
| 3) Human-focused concrete action vs. Abstract conceptual description | Positive: aspectual verbs, animate nouns, activity verbs, mental/attitudinal verb in other contexts, infinitive verb Negative: cognitive nouns, attributive adjectives, nominalizations |
| 4) Proposed future action (in relation to current situation) vs. Completed events | Positive: verbs (uninflected present, imperative & third person), conditional subordinating conjunctions, modals of prediction, verb ‘be’, verb ‘have’ Negative: GSL_K2, past tense verbs |
Slide 5 of Ge’s and Hailey’s presentation from the Crow Symposium showing the four dimensions and their features.
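One way to read the positive and negative features in the table is as two poles of a single score. The sketch below follows a common convention in Biber-style MD analysis: standardize each feature across texts, add the positive features, and subtract the negative ones. The presentation did not spell out how scores were computed, so treat this scoring rule as an assumption. The feature rates, text names, and function names are hypothetical, and only four dimension-one features are used.

```python
from statistics import mean, stdev

# A few dimension-one features from the table above (abbreviated).
POSITIVE = ["past_tense_verbs", "first_person_pronouns"]
NEGATIVE = ["definite_articles", "attributive_adjectives"]

# Hypothetical rates per 1,000 words for three made-up texts.
texts = {
    "narrative_essay": {
        "past_tense_verbs": 45.0, "first_person_pronouns": 32.0,
        "definite_articles": 40.0, "attributive_adjectives": 20.0,
    },
    "business_email": {
        "past_tense_verbs": 25.0, "first_person_pronouns": 20.0,
        "definite_articles": 55.0, "attributive_adjectives": 35.0,
    },
    "scientific_report": {
        "past_tense_verbs": 10.0, "first_person_pronouns": 2.0,
        "definite_articles": 80.0, "attributive_adjectives": 60.0,
    },
}


def z_scores(texts, feature):
    """Standardize one feature's rate across all texts."""
    values = [rates[feature] for rates in texts.values()]
    mu, sigma = mean(values), stdev(values)
    return {name: (rates[feature] - mu) / sigma for name, rates in texts.items()}


def dimension_scores(texts, positive, negative):
    """Sum z-scores of positive features and subtract those of negative ones."""
    scores = {name: 0.0 for name in texts}
    for feature in positive:
        for name, z in z_scores(texts, feature).items():
            scores[name] += z
    for feature in negative:
        for name, z in z_scores(texts, feature).items():
            scores[name] -= z
    return scores


scores = dimension_scores(texts, POSITIVE, NEGATIVE)
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:<18} {score:+.2f}")
```

A text with a high positive score leans toward the "Personal narrative" pole, and a text with a strongly negative score leans toward the "Informationally dense summary" pole.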
Looking at the image above, the “sample features” connected to each dimension are split into positive and negative groups. In dimension one, for example, a student could use the positive features to achieve the “Personal narrative” function and the negative features to achieve the “Informationally dense summary” function. Ge and Hailey can also use these dimensions to develop more targeted pedagogical materials for the English courses they are collecting their data from. By looking at the syllabi for these courses, they can use their research findings to help students achieve the courses’ intended learning outcomes. Take a look at the intended learning outcomes from the English for Science course that their research can help students achieve:
English for Science Course Intended Learning Outcomes (CILOs)
| CILOs | DEC- A1 | DEC- A2 | DEC- A3 |
| 1) Critically evaluate scientific texts in terms of content, writer stance, reliability, and trustworthiness, and apply the knowledge generated to their own reading and writing. | X | X | |
| 2) Create, share, and discuss a multimedia scientific documentary on an authentic scientific issue, which is organized in a logical way, follows acceptable scientific conventions, and makes effective and creative use of verbal and non-verbal delivery techniques. | X | X | X |
| 3) Write a scientific report on an authentic scientific issue, making creative and effective use of appropriate scientific language, organization, and academic referencing conventions (i.e. avoiding plagiarism). | X | X | X |
| 4) Use corpus tools to explore language in use, identify common language patterns in scientific texts, and apply their observations in their own use of English for scientific purposes. | X | X | |
| 5) Use writing as a tool for lifelong learning, by monitoring and evaluating their own learning processes and the impact of their learning on their development as a member of professional scientific communities. | X | | |
Slide 6 of Ge’s and Hailey’s presentation from the Crow symposium showing the Course Intended Learning Outcomes for the English for Science course.
Developing Pedagogical Activities
As noted above, the MD analysis allows Ge and Hailey to create pedagogical activities for students that help them understand the linguistic and functional characteristics of the different writing tasks established in their four dimensions. In other words, these activities are focused on improving students’ writing and critical evaluation skills by raising their awareness of these linguistic and functional characteristics in different writing tasks. They shared two of these pedagogical activities with attendees.
Activity One: Color-Coding
The first activity is a color-coding activity that raises students’ awareness of the linguistic and functional characteristics of scientific reports. It is a 20-minute group activity that focuses on dimension one from the MD analysis. The first step is to introduce students to dimension one, including its language features and communicative functions, and the second step is to assign students sample texts of scientific reports. From there, students are asked to:
- color-code the sample text with features seen in dimension one
- compare the features associated with the two communicative functions (personal narrative vs informationally dense summary)
- discuss prominent features and how a communicative function is served by their co-occurring language features
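As a rough illustration of the color-coding step, the sketch below marks a few dimension-one features in the opening sentence of the sample text, using `[+word]` and `[-word]` in place of colors. The regular expressions are crude stand-ins chosen for this example. A real MD analysis relies on a grammatical tagger, and in the classroom activity the students do this by hand. The pattern list and the `mark_features` name are hypothetical.

```python
import re

# (pole, label, pattern) triples. These regexes only approximate three of the
# dimension-one features; they are simplifications, not the study's tagger.
PATTERNS = [
    ("+", "first-person pronoun", r"\b(?:I|we|my|our|me|us)\b"),
    ("-", "definite article", r"\bthe\b"),
    ("-", "prep phrase with of", r"\bof\b"),
]


def mark_features(text):
    """Wrap each matched word as [+word] or [-word] and count hits per feature."""
    counts = {label: 0 for _, label, _ in PATTERNS}
    for pole, label, pattern in PATTERNS:
        def tag(match, pole=pole, label=label):
            counts[label] += 1
            return f"[{pole}{match.group(0)}]"
        text = re.sub(pattern, tag, text, flags=re.IGNORECASE)
    return text, counts


sentence = (
    "The rapid pace of urbanization and industrial growth in Hong Kong has "
    "contributed largely to the issue of proper waste disposal and management."
)
marked, counts = mark_features(sentence)
print(marked)
print(counts)
```

Even with these simple patterns, the sentence shows only negative-pole markers and no first-person pronouns, which is the kind of contrast students are asked to notice and discuss.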
The non-color-coded and color-coded sample texts below show what this first activity looks like for students:
English for Science: Scientific Research Report [Introduction]
Dim 1: Personal narratives [+] vs Informationally Dense Summary [-]
The rapid pace of urbanization and industrial growth in Hong Kong has contributed largely to the issue of proper waste disposal and management. The government’s prioritization of economic development, along with a lack of social awareness about waste issues among the community, has exacerbated this environmental concern. According to Household Waste Collection (2023), more than 6000 tons of household waste are collected every day in Hong Kong. Moreover, the amount of municipal waste, which includes commercial, domestic, and industrial waste, has shown a steady increase since 1986.
Slide 8 of Ge’s and Hailey’s presentation from the Crow Symposium showing the non-color-coded sample text from activity one.
English for Science: Scientific Research Report [Introduction]
Dim 1: Informationally Dense Summary [-]
The rapid pace of urbanization and industrial growth in Hong Kong has contributed largely to the issue of proper waste disposal and management. The government’s prioritization of economic development, along with a lack of social awareness about waste issues among the community, has exacerbated this environmental concern. According to Household Waste Collection (2023), more than 6000 tons of household waste are collected every day in Hong Kong. Moreover, the amount of municipal waste, which includes commercial, domestic, and industrial waste, has shown a steady increase since 1986.
Slide 9 of Ge’s and Hailey’s presentation from the Crow Symposium showing the color-coded sample text from activity one.
The idea is that teachers can do this color-coding activity with students. Walking through her example, Hailey explained:
The introduction section of the scientific report can be characterized by the “informationally dense summary” function. Teachers can select a sample text from the introduction part of that writing task, ask students to color-code the dimension’s features, and ask students to compare the features that are associated with the two functions (“personal narrative” versus “informationally dense summary”). Finally, teachers guide students to discuss the salient function and explain how a particular function is served by relevant co-occurring language features.
This activity benefits both students and teachers by allowing teachers to address both students’ writing and critical thinking skills.
Activity Two: Improving AI Literacy & Critical Thinking Skills
Not surprisingly, Ge and Hailey’s research can be attuned to the rise of generative artificial intelligence. At City University of Hong Kong, there has been a strict “No AI” policy in course syllabi for discipline-specific English courses since the Spring of 2025, but that is changing in the Fall of 2026. AI will be allowed, but its use will be restricted and guided by English teachers. Acknowledging this change, their second activity aims to improve students’ AI literacy and critical thinking skills when it comes to language use in scientific texts. This second group activity is a little longer (30 minutes), engaging all four dimensions from their MD analysis instead of one.
For this activity, teachers first select poorly written student texts (texts that would receive a C grade) from their corpus. From there, they ask students to:
- get into groups of four so each student can meticulously analyze the sample text using one of the four dimensions
- discuss the communication limitations of the text (with guidance from teachers)
- create an AI prompt to focus on these communication limitations in the text
- create an effective AI prompt that considers the four dimensions to address these communication limitations in the text (with guidance from teachers)
Put differently, teachers can closely guide students through this activity as they ask students to analyze these poor student texts, consider their discourse limitations, and improve these texts by making effective AI prompts using the four dimensions from their MD analysis.
Ge and Hailey developed the example AI prompt below, which students can use to revise their report and get a better grade:
A prompt template:
- You are a second-year student <put a science major here> at City University of Hong Kong. Your classmate Jason LAU got a C on his scientific report. He is feeling frustrated and asks you to help him revise <put a specific section here>.
- You are very familiar with all the attached course files below, which are course syllabus, assignment instructions, and the rubric.
- Your goal is to improve <put a communicative function here> in the text by using a group of co-occurring language features <put sample features here>.
- Output the revised text.
- Justify the revision for Jason LAU with bullet points.
Slide 11 of Ge’s and Hailey’s presentation from the Crow Symposium showing a prompt template that students can use.
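The template's four slots can be filled programmatically. The sketch below is only an illustration of that step: the `{placeholders}` stand in for the slide's "<put ... here>" slots, the `build_prompt` function is hypothetical, and the example values (including the chemistry major) are invented. The communicative function and features are taken from dimension two in the table above.

```python
# Wording adapted from the slide; {major}, {section}, {function}, and
# {features} replace the "<put ... here>" slots. Names here are hypothetical.
PROMPT_TEMPLATE = (
    "You are a second-year student {major} at City University of Hong Kong. "
    "Your classmate Jason LAU got a C on his scientific report. He is feeling "
    "frustrated and asks you to help him revise {section}.\n"
    "You are very familiar with all the attached course files below, which are "
    "course syllabus, assignment instructions, and the rubric.\n"
    "Your goal is to improve {function} in the text by using a group of "
    "co-occurring language features ({features}).\n"
    "Output the revised text.\n"
    "Justify the revision for Jason LAU with bullet points."
)


def build_prompt(major, section, function, features):
    """Fill the template; `features` is a list of feature names."""
    if not features:
        raise ValueError("name at least one language feature")
    return PROMPT_TEMPLATE.format(
        major=major,
        section=section,
        function=function,
        features=", ".join(features),
    )


prompt = build_prompt(
    major="majoring in chemistry",  # invented example value
    section="the Methods section",
    function="compressed procedural discourse",
    features=["agentless passive verbs", "nominalizations"],
)
print(prompt)
```

Keeping the communicative function and its features as separate arguments mirrors the activity itself: each student in a group of four can supply the values for the dimension they analyzed.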
Below is an example of a poorly written student text and a version improved with an AI prompt like the one above. The prompt asked the AI to revise the text using the appropriate sample features (identified in the MD analysis above) to achieve the communicative function of the writing task:
English for Science: Scientific Research Report [Methods]
Compressed procedural discourse
| Student-written text | AI-polished text |
| Before the experiment, the research team selected healthy volunteers randomly in a university. After introduction of the topic and procedure, each volunteer can decide to join or quit. In the end 10 volunteers asked to join the experiment. Before this, each of them received a Concerns Form to sign. We also informed them that they may get recored and allowed them to choose to blur their face or not record. The procedure below is in time order and heart rate measuring is based on instructions by Dianne Pickering. | Healthy volunteers were randomly selected from the university prior to the experiment. The research team then provided an overview of the study and procedures. Volunteers were informed of their right to participate or withdraw at any point. Finally, ten volunteers consented to join the experiment. Before participation, each volunteer was required to review and sign a Consent Form. Information regarding potential video or audio recording was disclosed, and volunteers were given the option to request facial blurring or cancellation of recording. The procedure was conducted as follows, in chronological order. Heart rate measurement was performed according to instructions provided by Dianne Pickering. |
Slide 12 of Ge’s and Hailey’s presentation from the Crow Symposium showing a poor sample text and an AI-polished version of this sample text.
The next slide shows the color-coded version of the sample text above. As in activity one, the text was color-coded by students (with guidance from teachers):
English for Science: Scientific Research Report [Methods]
Compressed procedural discourse
| Student-written text | AI-polished text |
| Before the experiment, the research team selected healthy volunteers randomly in a university. After introduction of the topic and procedure, each volunteer can decide to join or quit. In the end 10 volunteers asked to join the experiment. Before this, each of them received a Concerns Form to sign. We also informed them that they may get recored and allowed them to choose to blur their face or not record. The procedure below is in time order and heart rate measuring is based on instructions by Dianne Pickering. | Healthy volunteers were randomly selected from the university prior to the experiment. The research team then provided an overview of the study and procedures. Volunteers were informed of their right to participate or withdraw at any point. Finally, ten volunteers consented to join the experiment. Before participation, each volunteer was required to review and sign a Consent Form. Information regarding potential video or audio recording was disclosed, and volunteers were given the option to request facial blurring or cancellation of recording. The procedure was conducted as follows, in chronological order. Heart rate measurement was performed according to instructions provided by Dianne Pickering. |
Slide 13 of Ge’s and Hailey’s presentation from the Crow Symposium showing the color-coded version of a poor sample text and an AI-polished version of this sample text.
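To see how the two versions differ on features tied to "compressed procedural discourse", a quick count can help. The sketch below compares excerpts from the student-written and AI-polished texts on two rough measures: first-person pronouns and be-passives. Both regular expressions are heuristics chosen for this example (the passive pattern will over-match phrases such as "is seven"), and the `profile` function is hypothetical. It is not how the team tagged their corpus.

```python
import re

FIRST_PERSON = re.compile(r"\b(?:I|we|my|our|me|us)\b", re.IGNORECASE)

# Heuristic: a form of "be", an optional -ly adverb, then a word ending in
# -ed or -en. This misses irregular participles and can over-match.
BE_PASSIVE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+(?:\w+ly\s+)?\w+(?:ed|en)\b",
    re.IGNORECASE,
)

# Two sentences from each column of the sample text above.
student = (
    "Before the experiment, the research team selected healthy volunteers "
    "randomly in a university. We also informed them that they may get "
    "recored and allowed them to choose to blur their face or not record."
)
polished = (
    "Healthy volunteers were randomly selected from the university prior to "
    "the experiment. Heart rate measurement was performed according to "
    "instructions provided by Dianne Pickering."
)


def profile(text):
    """Count first-person pronouns and be-passives in a text."""
    return {
        "first_person": len(FIRST_PERSON.findall(text)),
        "be_passive": len(BE_PASSIVE.findall(text)),
    }


print("student: ", profile(student))
print("polished:", profile(polished))
```

In these excerpts the student version uses a first-person pronoun and no be-passives, while the polished version drops the pronoun and relies on agentless passives, one of the negative features of dimension two.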
What’s next? More research!
Ge and Hailey will continue their research, and they plan to keep integrating AI into the development of their pedagogical materials. They will interview stakeholders in the discipline-specific English courses at City University of Hong Kong (English teachers, assessment coordinators, and undergraduate students) to get feedback on the four dimensions and the sample pedagogical materials. They will then polish the materials according to that feedback and use them in the discipline-specific English courses at City University of Hong Kong. Finally, they hope to add these materials to our Crow platform once they have made them as effective as possible.
We are excited to continue to hear about their research and we look forward to adding their corpus to our corpus in the near future! Stay tuned!