Corpus and Repository of Writing

We’re closing out the spring semester with another APPLAWS post, a celebration of the team’s Awards, Publications, Plans, Leadership, Achievements, Wooots, and Surprises over the past academic year. We have lots of exciting updates to share!

Hadi Banat became a PhD candidate, won a Purdue Research Foundation dissertation fellowship for 2019-2020, and with the Transculturation team won a $5,000 CILMAR grant. His chapter “Floating on Quicksand: Negotiating Academe as Muslim” in Harry Denny et al.’s Out in the Center: Public Controversies and Private Struggles published by the University Press of Colorado came out hot off the press. He has also finalized coding and analyzing the transculturation project pilot data set and mentored undergraduate researchers who joined the team. In Crow, he has been working with Shelley, Emily, Hannah and Mark on repository development and helped the grants team with writing the ACLS grant.

Bradley Dilger worked extensively with Crow undergraduate researchers to continue spotlighting Crowbirds on our website, to build our inventory of Crow swag (STICKERS!!!!!), and to help Crow develop its outreach strategy. With Michelle McMullin, he is continuing our “Constructive distributed work” project, and is also helping our team update its environmental scans of other corpora and repositories. Bradley also taught Empirical Research in Writing Studies in Spring 2019, and helped the Transculturation team (including Crowbird Hadi Banat) win a third CILMAR mini-grant.

Mark Fullmer helped launch a new release of the Crow web interface which included a substantial redesign of the search engine, changes which lay the groundwork for more advanced functionality like wildcard searches. He submitted a patent application for software that allows readers to dynamically assign the gender of personae in prepared texts, as used on . In April, he attended DrupalCon, an annual event of the open-source content management system, and his contributions to Drupal’s layout interface were referenced during multiple sessions. He is currently collaborating with developers at the University of Nebraska-Lincoln on further enhancements.

Jie Gao is a fourth year PhD candidate in Purdue SLS. She led the team that submitted a research article on citation, and also worked on a book chapter titled “L2 Speaking: Theory and Research” during the past 10 months. She is now analyzing data for her dissertation. She hopes to finish a few chapters by the end of July.

Hannah Gill is finishing up her sophomore year at the University of Arizona. This was her first semester working with Crow and she has loved it. She has spent most of her time in the lab processing student texts from the University of Arizona writing courses. In addition, she collaborated with other members of the Crow team on collecting instructional materials for the repository. She also helped in a workshop on CROW/MACAWS which focused on designing DDL activities with the help of the two interfaces. She was also admitted into her major (PPEL: philosophy, politics, economics, and law), which she will begin in the fall semester.

Jhonatan Henao-Muñoz completed his 2nd year as a Ph.D. student this spring and will be taking his last courses in the fall. This past semester he co-coordinated the 29th edition of #SPGS, worked as an intern in Crow, and volunteered at NACIL2. At the 18th SLAT Roundtable, he presented his work-in-progress on L2 peer-editing and online translator self-editing and collaborated in a Crow/MACAWS workshop for designing DDL materials. Finally, he was admitted to the M.A. in French Linguistics and Second Language Learning, and he was awarded an internship for NHC. Next year he will continue working in Crow and start collecting data from intermediate Spanish and French courses.

Emily Jones is wrapping up her junior year at Purdue, and it was her busiest one yet. In addition to her position with Crow, she interned with Sycamore Review, worked as Editorial Assistant for a journal under Purdue Press, and tutored in Purdue’s Writing Lab. This spring she also presented her research on gendered violence in Victorian literature, for which she received Purdue’s OUR Scholarship. Over the past year, she has done content strategy, information architecture, and branding development for Crow. Next semester she will be fulfilling her history minor while studying at Scotland’s oldest university, the University of St Andrews.

Ge Lan worked on his dissertation this past year, including completing the first draft of his literature review and part of his methodology, writing Python programs for grammatical analysis, and exploring how to use the Stanford Parser from the command line. He has also been working on processing Crow data that was collected in fall 2018 at Purdue, and modifying a header script developed by the UA team.

Lindsey Macdonald worked on her dissertation, “The Right to Health: A Rhetorical Ecology of Mental Health Advocacy and Legislation,” and has so far completed the literature review chapter and part of the methods. She received a Graduate Summer Research Grant, so she will be spending the summer completing her data analysis and hopefully writing a chapter or two.

Michelle McMullin successfully defended her dissertation, “Crafting new materialist research frameworks for collaborative response,” in April. She is ecstatic to be joining the amazing faculty at North Carolina State University as assistant professor of technical communication in the fall. She will be presenting with our Crow team and a team from MSU on Humanities Without Walls projects at Computers & Writing at Michigan State University this summer. She will also be reprising her role, this year as Dr. Hawk Girl, as director of iDTech camp at the University of Michigan this summer.

Sarah Merryman worked as an undergraduate tutor in the Purdue Writing Lab, weblog and social media intern for the Purdue English Department, and assistant JTRP editor for the Purdue University Press. She won the English Department’s Outstanding Senior Award and the Albert Viton Scholarship for her work at the Press. In addition to blogging for Crow, she helped write IRB contracts and create web content strategies, and learned the basics of Python coding. This spring, she presented her research on writing lab data usability at the Purdue Undergraduate Symposium.

Sarah proudly displays her certificate of completion for Ge Lan’s Python Coding crash course

Aleksey Novikov passed his comprehensive exams this semester, and is at the stage of making connections between data and ideas for his dissertation proposal. This semester he has mostly worked with the other Macaws birds to create pedagogical webinars on using Data-driven Learning (DDL) with learner data. He also co-presented two pedagogically-oriented workshops: Crow/MACAWS workshop for designing DDL materials, and Teaching Russian with Real World Language with existing native speaker and learner corpora.

Emily Palese passed her comprehensive exams this semester and will soon begin her dissertation proposal. This past semester she taught English 107, worked on processing UA student texts for Crow, and collaborated on collecting instructional materials for the repository. She co-presented two workshops on pedagogical approaches for supporting multilingual writers, as well as a Crow/MACAWS workshop for designing DDL materials. Next year she will continue working on Crow’s repository as a Graduate Assistant Director in the Writing Program.

Ji-young Shin defended her prospectus and finished the first draft of the literature review for her dissertation. She received two external research awards for graduate students, the 2019 AAAL Graduate Student Award and the 2019 British Council Assessment Research Award. During the fall semester, she successfully conducted two Crow workshops with other Crowbirds at the 2018 TaLC conference and the Crow Symposium. She also contributed to building the teaching material repository for Crow and participated in organizing the Crow Symposium.

Shelley Staples published two peer reviewed articles in English for Specific Purposes Journal, one a single-authored paper on using corpus-based discourse analysis to inform instruction and one with Purdue grads and a soon-to-be grad on complexity in oral language assessment. She also published a chapter on Corpus Linguistics for the Handbook of SLA and Pragmatics and a chapter on conducting Multi-dimensional Analysis in an edited volume. She submitted five additional papers and two grants (results pending). She was an invited speaker at Lancaster University, Universidad de Sonora, Vanderbilt University, and Purdue, where she gave talks on corpus linguistics and also introduced students and faculty to the Crow interface. She took over the editorship of Brief Reports with TESOL Quarterly. With Crow, Dr. Staples led our “citation project” team to their article submission, the UA team in growing our corpus (processing texts from Spring 2018-Fall 2018), and the Repository team on exciting new developments including our new intake form. She also co-led a workshop at the SLAT Roundtable and worked with Adriana, Randi, Ge, and Aleks on writing up research from their AACL presentation. With MACAWS, Crow’s cousin, she led the team in their production of a series of webinars. Finally, she helped 5 PhD students reach the final lap in their careers as students, including Crowbirds Ashley J Velázquez and Aleksandra Swatek, and two Crowbirds (Emily Palese and Aleks Novikov) reach their exciting next stage in their PhD process.

David Stucker, a junior in Purdue University’s Professional Writing program, joined the Crow undergraduate researcher team in early February. He spent the semester developing corpus backend bug report documentation and environmental scan criteria, proposed corpus user-agreement considerations, and performed environmental scans of similar corpora. He intends to continue his work with Crow over the summer and the upcoming fall semester.

Aleksandra Swatek defended her PhD dissertation, “The language of engagement in math instructional video tutorials: A corpus-based study.” She also taught face-to-face courses (OEPP) and online courses (ICaP) at Purdue. She presented initial results of her dissertation research at the Purdue Linguistics, Literature, and Second Language Studies Conference. She is currently on the job market in Poland.

Ashley Velázquez successfully defended her dissertation, “What’s the ‘problem’ statement? An investigation of problem-based writing in a First Year Engineering program” in April. She is thrilled to be joining the faculty at the University of Washington-Bothell as an assistant professor in the School of Interdisciplinary Arts & Sciences in Fall 2019. Dr. Velázquez was also selected to serve on TESOL’s Standards Professional Council this past fall for the next two years. This summer, before leaving for Washington State, she’ll be leading a workshop or two on how to use Crow and develop DDL materials for teaching second language writing at Wright State University.

Participants work with Crow researcher Novikov at the SLAT Roundtable

This semester, Arizona Crowbirds along with representatives from MACAWS, our new Multilingual Academic Corpus of Assignments: Writing and Speech, received the opportunity to present at the SLAT Roundtable. Our presenters were Aleksey Novikov, Emily Palese, Jhonatan Henao-Muñoz, Dr. Shelley Staples, and Hannah Gill. At the presentation, we introduced the two corpora (Crow and MACAWS) and the basic premise of Data-Driven Learning (DDL). With DDL, students and instructors use a hands-on approach to examine authentic corpus data to discover language patterns that can then be used to create lessons, activities, and instructional materials.

Since one of our main goals was to give participants concrete ideas about incorporating material from the corpus into their classroom settings, we gave examples of how Crow and MACAWS could be used in the foundations writing classroom (Crow) and in Russian language classes (MACAWS). Participants were then given the opportunity to split into groups and focus on creating activities tailored to the two corpora. For Crow, we used our online interface, released in October 2018. For MACAWS, we used a sample of off-line texts with the freeware program AntConc. The participants, most of whom were instructors in either the Russian department (MACAWS) or in the Writing Program (Crow), were given the chance to ask questions, voice concerns, and work closely with various features of the two corpora to explore how the corpora could be used to design their own activities, lessons, and instructional materials.

Crow researchers Hannah Gill, Aleksey Novikov, Jhonatan Henao-Muñoz, Shelley Staples, and Emily Palese (left to right).

We ended by sharing the next steps for both Crow and MACAWS development. For Crow, this includes an expansion of the repository and improved capabilities for intake of pedagogical materials from instructors, which we plan to launch in Fall 2019. For MACAWS, this includes a planned beta release of its interface (built using the same front-end as Crow) for August 2019. We were also able to get feedback on the Crow interface about what was useful and possibilities for improvement. Since the presentation, we have discussed ways in which we can translate the advice and participant input into changes to the Crow interface.

Here are slides and the materials that we used in the presentation.

Follow us for updates on Twitter! @writecroworg

On Friday April 5th, Arizona Crowbirds hosted a “Launch Lunch” as a way to announce changes and developments to the Crow website, as well as a way to thank instructors and administrators for their support and feedback. With the new changes to the Crow interface, instructors will now be able to request full text access. Furthermore, there have been improvements based on past workshops and feedback such as the ability to get dynamic frequency data from filters (e.g., assignment or student’s country of origin) rather than from the entire corpus.

At the “Launch Lunch,” Dr. Staples gave a demonstration of the interface and then instructors were able to explore on their own and offer suggestions as they came up. We also announced our plans for the addition of repository materials from the University of Arizona, and our new intake form that will streamline the collection process. We’ll pilot the form this summer and release an updated version this fall. Dr. Staples also briefly displayed the mock-up for the online version of MACAWS, which will be launched in Fall 2019. Since the lunch, our developer Mark Fullmer has already made changes to the site, including highlighted search terms in the full text, to make it as user-friendly as possible.

Sarah Merryman is a senior at Purdue University majoring in Professional Writing and minoring in Communications. At the invitation of Crow PI Bradley Dilger, Sarah started working with Crow as a project intern and wrote a series of blogs for its 2018 spring methodology workshop, her first venture into blogging. After becoming a full-time undergraduate researcher in the fall of 2018, her role expanded into social media promotion, IRB drafting, and creating content strategies.

These tasks challenged her to learn a new set of communication and writing skills. Because Crow is a multi-institutional team, she often conducted meetings and blog interviews through digital mediums like Google Hangouts. Navigating Crow’s organization platform, Basecamp, and learning how to pair-write articles with fellow Crowbirds helped her better understand the importance of sustainable collaboration in the workforce. Likewise, helping draft IRB proposals and contracts gave her a glimpse at the steps researchers take to launch their projects. On the flip side of the research equation, Sarah had the privilege of listening to linguistic scholars from various post-secondary institutions present their research findings at Crow’s 2018 Writing Research Without Walls symposium. Witnessing the internal process and public-facing product of linguistic research inspired her to consider a research-oriented career sometime in the future.

Sarah proudly displays her certificate for completing the introductory Python coding course with teacher and fellow Crowbird Ge Lan.

However, collaboration and scholarly research were not the only areas of Crow she found both challenging and rewarding. Sarah completed a beginner’s course in Python coding taught by fellow Crow member Ge Lan. After years of considering the difficulty of computer coding on par with learning ancient Sanskrit backwards, Sarah was surprised to discover she enjoyed coding, and hopes to continue learning it in her spare time after graduation.

Her favorite part of being a Crowbird is the freedom to try new experiences. Unlike the repetitive, coffee-fetching experience she envisioned to be the rite-of-passage for interns everywhere, working with Crow allowed her to integrate her personal goals with Crow objectives. At the start of each semester, she met with PI Bradley Dilger and together they brainstormed a list of skills she wanted to develop. They then created a workflow that would allow her to work toward these professional goals. Sarah credits Crow with giving her the knowledge and experience to thrive in today’s workforce, where content strategy and the ability to collaborate with peers from different backgrounds and across geographic distances are key.

Outside of Crow, Sarah has held a variety of positions at Purdue. Ever drawn to the publishing world, she has been a reporter for The Purdue Exponent and a member of the Journal of Purdue Undergraduate Research Student Editorial Board. She has worked at the Purdue University Press since 2017, first as the Administrative and Marketing Intern and then as the Assistant Editor for the Joint Transportation Research Program. As Assistant Editor, she edits and facilitates the publication of JTRP reports, which are downloaded and used worldwide. Always interested in trying out things that have never been done before, Sarah also served as the first undergraduate blog coordinator and social media intern for the Purdue English Department. She is finishing her time at Purdue as an undergraduate tutor in the Purdue Writing Lab.

Passionate about usability and UX design, Sarah conducted two research projects: one on the usability of writing center usage data, and another on a redesign of the PASE Mock Career Fair. However, her most memorable research experience was investigating the experiential design of the Purdue Farmers’ Market. What started as an in-class assignment somehow turned into a friendship with one of the farmers and a part-time job flipping burgers at his market booth. Who says research is all done in a lab?

Following her graduation in May, Sarah hopes to pursue a position in scholarly publishing. However, she also plans to spend some time enjoying the freedom of not having homework and to continue her education informally through hobbies. She wants to sharpen her social media skills, learn professional photography, and travel. If she is feeling particularly ambitious, Sarah might even pursue a more health-conscious lifestyle. After her surprisingly pleasant experience learning Python, nothing seems too unusual to try – not even exercise.

Crowbird Adriana Picoral is a prime example of taking an interdisciplinary approach to academic research. Passionate about computer coding since the age of nine, Adriana always knew she wanted to be a computer scientist. Unfortunately, with female Computer Science students outnumbered by a ratio of 1 to 15 at her university (Federal University of Rio Grande do Sul, in Brazil), Adriana’s presence in a STEM-focused major was constantly called into question. Jokingly, she credits her eventual interest in linguistics research to “running away from computer science because they were mean.” In reality, Adriana’s undergraduate thesis on developing a computer game to teach Portuguese to non-native adults is what sparked her interest in language learning.

Adriana’s research process has come a long way since her undergraduate thesis, but one key element has remained the same: a focus on interdisciplinary methods and tools to understand language acquisition. Her research analyzes the intersection of corpus linguistics, computational linguistics, and foreign language acquisition. For her dissertation, Adriana is researching how different factors affect third-language acquisition in adult learners. Specifically, she is looking at Spanish-English bilingual adults, and investigating how their native language affects their ability to learn Portuguese. She uses mixed methods, creating corpora of Portuguese, English, and Spanish texts and then applying computational linguistics methods to analyze the language behavior.

Graphic used in Adriana's dissertation on copula verbs to adverbs
Preference of ESTAR copula use with intensifiers across different corpora for Adriana’s dissertation

But as much as she enjoys research, Adriana isn’t ruling out the possibility of working in industry instead of academia. In her internship with the Educational Testing Service (ETS), Adriana discovered how valuable an interdisciplinary researcher is in an industry already saturated with specialized employees. This became further evident in her 2018 internship with Google, where there was an abundance of linguists and software engineers, but not many employees who could do both, like Adriana.

After taking a corpus-linguistics class taught by Crow co-founder Shelley Staples, Adriana became a Crowbird in the fall of 2016. Since then, she has put her computer skills to work by standardizing the text format of Crow’s collected materials. Using her experience in coding, she worked with Shelley to create a system that converts all documents to normalized UTF-8 text files. The system also labels the words in the texts with part-of-speech tags, such as verb or noun, for future language analyses. Adriana also created Python scripts to ensure repository materials are tagged and encoded accurately, and JavaScript web interfaces to assist in manually coding students’ texts for features such as citation practices.

Aside from her work doing text normalization, Adriana has also participated in multiple Crow workshops. In July 2018, she helped lead the debut of the Crow web interface in a 3-hour workshop at the Teaching and Language Corpora (TaLC) conference in Cambridge, England. That same year, she presented a comparative analysis of various linguistic tagging tools at the 14th American Association for Corpus Linguistics conference and a workshop on the citation practices of L2 writers at the American Association for Applied Linguistics (AAAL) conference.

Adriana (back, second from right) with colleagues at AAAL 2019

Moving forward, Adriana is interested in taking Crow’s research on citation a step further by incorporating the computational methods she used in her dissertation into Crow. She intends to create machine learning models to classify new data. She is excited to work on a project that unites Crow work with her dissertation research. The ability to incorporate different interdisciplinary approaches into her work is Adriana’s favorite part about Crow.

We look forward to seeing how Adriana will continue to improve our interface and promote interdisciplinary research methods.

Aleksandra Swatek

“That’s the beauty of doing research: You do one small thing…and it grows to be something bigger,” says 5th year PhD candidate Aleksandra Swatek. This is certainly true, although one could hardly describe Aleksandra’s research as “small.” Her dissertation seeks to analyze the language of engagement in online instructional videos, specifically math lectures from both Khan Academy and MIT. To do this, she has created a corpus of lecture transcripts from each source, both of which total about 1.5 million words.

Aleksandra’s research is uniquely positioned at the intersection of Second Language Studies and Corpus Linguistics, and she draws on methodologies from the latter in a variety of ways. For example, after assembling her data set, she used Sketch Engine to analyze and compare the language used in the two corpora. She has already noted differences in the type and frequency of personal pronouns (we, I, you), stance markers (specifically modal verbs), and hypothetical reported speech (imagining how a student might respond). She hopes that the results of her research will help instructors better use language to engage online students, especially as traditional classroom settings transition into online spaces.

Chart showing the personal pronoun frequency within math lectures of Khan Academy and MIT. Khan used significantly more "we" pronouns and MIT used significantly more "I" pronouns. The use of "you" was roughly the same for both.

Aleksandra’s interest in corpus linguistics made her a perfect fit for Crow even before it existed. Initially, she worked with former Purdue professor Dr. Shelley Staples on the Purdue Second Language Writing Corpus, from which Crow eventually emerged. To date, she has been involved in a variety of Crow projects and conferences, including our recent presentation at the Teaching and Language Corpora (TaLC) conference where she helped to debut the Crow platform and collect feedback on our online interface. In collaboration with other Crow members, Aleksandra used our platform to research reporting verbs in student writing. She isn’t slowing down any time soon, either; a new project on formulaic language is currently in the works.

Aleksandra’s familiarity with corpora allows her to see and appreciate just what makes Crow unique: an eagerness to share and make data accessible. These attributes make Crow the only active, open-access corpus and repository of academic materials in the world. Aleksandra is excited to be part of a project that will benefit the greater community, particularly those conducting research on student writing. Going forward, she plans to continue doing research and finding ways to bridge the gap between science and humanities. In fact, this is something that occupies Aleksandra’s mind even in her rare free moments. What started as a hobby has turned into a project on the relationships between writing studies communities encompassing rhetoric and composition; second language writing; and technical communication and EAP. She is also an avid Starcraft 2 e-sports fan and a gamer herself.

Whether she’s working on her dissertation, analyzing the Crow corpus, or mulling over the role of humanities in the world, there is one thing we know for sure: Crow is lucky to have someone as dedicated to and passionate about accessible data on our team.

Photo credit: Zhaozhe Wang

Crow researcher Shelley Staples has been invited to speak as part of a workshop hosted by The Centre for Corpus Approaches to Social Science at Lancaster University. Staples’s talk, “Using multidimensional analysis for language assessment,” will be at 13.00 UK time (8:00am US/Eastern). We’ve reprinted the event announcement below.

Update 3/07: Video of the event is available. 

Corpus-based approaches to language testing

The ESRC Centre for Corpus Approaches to Social Science (CASS), Lancaster University is organising a free half-day workshop on corpus-based approaches to language testing. The event offers a combination of two lectures and a practical session. The practical session focuses on major corpus techniques used in language assessment research and practice. The workshop is suitable for students, researchers and practitioners interested in language assessment, applied linguistics and corpus methods. No prior knowledge of corpus linguistics is required. We are delighted that Dr Shelley Staples from the University of Arizona accepted the invitation to give a guest lecture at the event.

Follow the event on Twitter from 12.00 UK time.
Event link:

Research, teaching, and podcasts, oh my! If three words could summarize Crowbird Michelle McMullin’s time at Purdue, those would be it. Michelle is studying Rhetoric and Composition, with a focus on technical communication. She is interested in the ways public policy and public rhetoric inform each other. Her dissertation explores the HIV outbreak in southeast Indiana and how this public health crisis motivated the state legislature to implement needle exchange programs.

Graphic detailing Michelle's dissertation on public policy after HIV outbreak in Indiana.
Michelle McMullin’s dissertation explores the intersection of public rhetoric and policy.

A lover of infrastructure, Michelle tries her best to reserve Thursdays and Fridays for writing her dissertation, and dedicates the rest of the week to other work. Last semester, she taught Business Writing and a Data Science Learning Community section of first-year writing. She has also served as a mentor for other PW instructors, and said, “My teaching gets better when I talk to others about their teaching.” This kind of collaborative conversation is Michelle’s favorite part of her job, and is why she is working toward a tenure track position at a university that values community engagement. As she said, “I think it’s important that research doesn’t just live in journals, but does real work in our communities.”

When asked about what she does in her free time, Michelle laughed and said she likes “to sleep and sometimes do laundry.” With her busy schedule, most of Michelle’s “free” time is taken up with research and listening to podcasts, two activities that have more in common than one might think. Several of Michelle’s favorite podcasts are created by the McElroy family. Fans of their shows, such as My Brother, My Brother and Me and The Adventure Zone (TAZ), recently managed to raise more than $48,000 in 72 hours in support of the Boys and Girls Club of West Virginia. The organization lost funding because they were serving LGBTQ+ kids.

Ever the academic, Michelle couldn’t help but analyze the experience through the lens of her research, saying, “as somebody interested in community building and public problem solving, this was fascinating.” She also noticed that audience members of the actual play podcast TAZ engage with the creators to inform the writing of the show. Based on audience interaction, the McElroy family continues to challenge hetero-normative gender representations and include queer characters in their storytelling. Michelle’s observations of this collaborative storytelling led her to a collaboration with fellow academic and TAZ fan Lee Hibbard on an article about queer representation in an actual play podcast community.

In December of 2015, Michelle became one of the first members of Crow, specializing in creating and sustaining Crow’s infrastructure. She recently finished a best practices article with Crow team leader Bradley Dilger. This isn’t the first project they’ve collaborated on; in fact, their overlapping interest in technical writing is what originally prompted Michelle to join the Crow team. In this ongoing study of our best practices, we are using data about our distributed work to evaluate and improve our collaborative methods. Michelle is leading this effort, developing a research design based on data we can collect from Basecamp, Google Drive and GitHub. From this data, supplemented with team interviews and discussions, we are identifying trends in participation, leadership, and efficiency. This mixed methods approach is allowing us to identify the best practices needed to make Crow sustainable.

Crow co-founder Shelley Staples said it best: “I have seen firsthand the application of Michelle’s research methods and the framework she uses in the work she is doing for Crow. She has been instrumental in mapping the networks of collaboration and communication that we use as part of our complex, interdisciplinary and inter-institutional team, and has identified specific aspects of our own technical communication that are effective and ineffective, leading to concrete changes within our team’s practices.”

Michelle has returned as a full-time Crowbird this spring, on a research assistantship to help us wrap up our Humanities Without Walls grant and kickstart outreach now that we have the Crow platform up and running. She is eager to continue her infrastructural work and begin mentoring undergraduate researchers from a new position.

Purdue Crowbird Ashley Velázquez is a fifth-year PhD candidate researching L1 and L2 engineering students’ writing practices. She is currently finishing her dissertation, “What’s the ‘Problem’ Statement? An Investigation of Problem-based Writing in First-Year Engineering (FYE),” on fellowship from the American Association of University Women. The project analyzes how linguistically diverse students in Purdue’s FYE program complete written tasks, particularly problem statements. Her research has involved building a corpus of student texts, analyzing pedagogical materials, and conducting interviews with FYE faculty.

Ashley’s involvement with Crow started as a complete (but happy) accident. Her mentor, Shelley Staples, one of the founding members of the Crow team, invited her to a Crow meeting, as the topic was relevant to a reporting verb project they were collaborating on. Three years later, Ashley still hasn’t left the “meeting,” having assumed multiple leadership roles in our Crow team. As a Crow Graduate Lab Practicum RA, she helped with corpus building, participant recruitment, mentoring, and more. This past year kept her especially busy: Ashley was writing grants, scheduling our 2018 Writing Research Without Walls symposium, and co-authoring a study on reporting verbs for L2 Journal.

In October 2017, Ashley was one of several Purdue Crowbirds able to travel to Tucson, Arizona for a project summit funded by our Humanities Without Walls grant. On the way to the airport, Bradley Dilger was emailed a grant opportunity. “We can do that,” Ashley said. And before the plane left, the team had an outline, a draft abstract, and a Basecamp buildout underway. “It’s amazing how much Ashley has grown as a grant writer,” said Dilger. “Her work on the narrative of our recent ACLS application was stellar, and we’re glad we can count on her for help with our next grants.”

Crow isn’t the only commitment on Ashley’s plate. Throughout her time at Purdue, she has taught first-year writing courses, developed her dissertation and other publications, worked for the OEPP and OWL, and served as a Mechanical Engineering Writing Enhancement Coordinator. In that last role, she developed rubrics, hosted writing workshops for students, adapted pedagogical materials, and assessed student writing skills. She also helped international teaching assistants effectively evaluate their students’ writing.

Following the completion of her dissertation, Ashley hopes to obtain a tenure-track position in either applied linguistics or rhetoric and composition. After five years in the arctic tundra that is Indiana, she’s ready to move somewhere warm with her two Hobbit cats, Merry and Pippin. Her job search extends “coast to coast and off the coast,” ranging from the East Coast to Hawaii.

Now that her reporting verbs article is published, Ashley is brainstorming her next Crow project: an empirical research study on the use of collocations (words that commonly occur together, such as “potential solutions”) by L1 and L2 students. She is also busy working on a translingual writing piece and an engineering research project with the University of Arizona. Luckily for us, she plans to continue working with Crow at her next institution. We can’t wait to see where that will be!

Our Crow team reached a new milestone at the 14th American Association for Corpus Linguistics (AACL) this past September: our first presentations of inter-institutional projects! The two presentations, “Annotating learner data for lexico-grammatical patterns: A comparison of software tools” and “Lexico-grammatical Patterns in First Year Writing across L1 Backgrounds,” were given by Crowbirds from the University of Arizona, Purdue University, and Northern Arizona University.

Adriana Picoral leading the first presentation, “Annotating learner data for lexico-grammatical patterns: A comparison of software tools.”

The first project, “Annotating learner data for lexico-grammatical patterns: A comparison of software tools,” was led by Adriana Picoral. The team, consisting of Adriana Picoral, Dr. Randi Reppen, Dr. Shelley Staples, Ge Lan, and Aleksey Novikov, compared three tools: 1) Biber tagger, a POS and syntactic tagger that integrates rule-based and probabilistic components; 2) MALT parser, an open source statistical dependency parser; and 3) Stanford parser, another open source statistical parser widely used in natural language processing applications. The corpus for this study was sampled from our larger inter-institutional corpus of first year writing (FYW) texts. It consisted of 16 documents (27,930 tokens in total) from 3 institutions (Purdue University, University of Arizona, and Northern Arizona University) and 4 first language backgrounds (Arabic, Chinese, English, and Korean).

All documents were annotated using all three tools. Gold standard labels were also created by up to four human coders for each document. Predicted labels from the three tools were then compared with the human-created gold standard labels. Precision (of the words a tool annotated with a feature, the proportion annotated correctly) and recall (of the words carrying a feature in the gold standard, the proportion the tool identified) were calculated for each of our target features (noun-noun sequences, attributive adjectives, relative clauses, and complement clauses) across the different tools. The team presented methods, including descriptions of the web-based interfaces built for human tag-checking, and the evaluation measures from all three tools. While the Stanford parser performed better when labeling our target clausal features, the Biber tagger performed better for the targeted phrasal features.
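For readers who want to see how these two measures work, here is a minimal sketch in Python. It assumes each token has exactly one gold standard label and one predicted label, aligned by position. The function name, the label names, and the toy data are made up for illustration; they are not the team's actual scripts or results.

```python
def precision_recall(gold, predicted, feature):
    """Return (precision, recall) for one feature label.

    gold and predicted are equal-length lists of per-token labels.
    This one-label-per-token format is an assumption for illustration.
    """
    pairs = list(zip(gold, predicted))
    # True positives: the tool and the human coders agree on the feature.
    tp = sum(1 for g, p in pairs if g == feature and p == feature)
    # False positives: the tool assigned the feature, the coders did not.
    fp = sum(1 for g, p in pairs if g != feature and p == feature)
    # False negatives: the coders assigned the feature, the tool missed it.
    fn = sum(1 for g, p in pairs if g == feature and p != feature)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels, not data from the study.
gold = ["attr_adj", "noun_noun", "noun_noun", "other", "rel_clause", "other"]
predicted = ["attr_adj", "noun_noun", "other", "noun_noun", "rel_clause", "other"]

for feature in ["attr_adj", "noun_noun", "rel_clause"]:
    p, r = precision_recall(gold, predicted, feature)
    print(f"{feature}: precision={p:.2f}, recall={r:.2f}")
```

In the toy data, the tool finds one of the two gold noun-noun tokens and also labels one extra token by mistake, so both precision and recall for that feature come out to 0.5.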

Post-processing scripts will be used to improve the accuracy of both the Stanford parser and the Biber tagger, and the team may combine their output to achieve higher performance rates on automated annotation of our learner data in the future.

The second presentation, “Lexico-grammatical Patterns in First Year Writing across L1 Backgrounds,” was led by Dr. Shelley Staples with help from fellow Crowbirds Dr. Randi Reppen, Aleksey Novikov, and Ge Lan, along with collaborators Dr. Qiandi Liu and Dr. Chris Holcomb from the University of South Carolina. The group compiled a balanced corpus (612,100 words) of argumentative essays across four L1s (English, Chinese, Arabic, and Korean), which was then tagged with the Biber tagger and improved for accuracy with post-tagging scripts. The researchers investigated the use of six features, both quantitatively and qualitatively: attributive adjectives, premodifying nouns, that- and wh- relative clauses, that- verb complement clauses, and that- noun complement clauses. ANOVA was applied to test the differences among the four L1 groups and across two institutions (Northern Arizona University and University of South Carolina).
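As a rough illustration of what a one-way ANOVA computes, the sketch below calculates the F statistic by hand for several groups of scores. The function name and the numbers are hypothetical stand-ins (for example, per-essay rates of one feature in each group), not the study's data, and the team's actual analysis may have been set up differently.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups is a list of lists of numeric scores, one list per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy scores for three groups, not data from the study.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A larger F means the group means differ more than the spread within groups would suggest. Turning F into a p-value requires the F distribution; in practice a statistics package does both steps at once, for example `scipy.stats.f_oneway`.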

The results showed significant differences in how the features were used across the four L1 groups (p < 0.05), particularly for attributive adjectives, premodifying nouns, that- noun complement clauses, and that- and wh- relative clauses. Compared to L1 English writers, L2 writers tended to rely more on the repetition of phrasal features. They also used more wh- relative clauses than that- relative clauses, which could be explained by more prescriptive instruction on wh- relative clauses for L2 writers, as opposed to the influence of oral language and a lack of register awareness for L1 English writers.

Finally, the use of attributive adjectives and that- relative clauses differed significantly between the two institutions for Chinese L2 writers (p < 0.05), whereas no significant difference was found for any feature between the two institutions for L1 English writers. A possible reason for this difference is that the USC students, who used more of the two features, may have had higher proficiency, while the NAU students were in a bridge program working on improving their proficiency. An alternative explanation is that relative clauses were included in the USC syllabi, while it is unclear whether NAU students received this instruction.

Both conference presentations were very well received at AACL. We plan to submit the first paper for publication at NAACL or ACL in the near future.
