Corpus and Repository of Writing

Emily Palese has been a leader of the Crow repository since 2019 and earned her PhD in Second Language Acquisition and Teaching from the University of Arizona in May of 2021. Emily previously studied Spanish and Anthropology at the University of Wisconsin-Madison as an undergraduate and earned her MA in Teaching English as a Second Language from the University of Arizona. As a member of the Peace Corps for two years, Dr. Palese taught English in the Philippines at a rural high school, and she also facilitated training workshops for elementary and high school teachers from other parts of the country. Here at Crow, Emily has contributed to a diverse array of projects. Collaborative work has been important to Emily for a long time, and she gains team experience from her work on the Crow repository, AZTESOL, the Second Language Writing collaboratives, and WriPACA.

Emily’s dissertation, “Prompting Students to Write: Designing and Using Second Language Writing Assignment Prompts,” investigates how assignment prompts function in first-year writing courses at the University of Arizona. Dr. Palese’s motivation for this project came from her own experience: “When I was new to the university, I struggled to understand as a young professional, what are the expectations?” Emily immediately realized that “having a framework for analyzing prompts and [being able to] compare what you’re doing is really helpful when you’re designing new materials,” and she began her research on prompt interaction by collecting and reviewing prompts for Crow’s repository.

When Dr. Palese brought her research to her 18 student participants, she studied “how students are interacting with the materials, what they’re skipping when they’re reading, [and] what they think is important.” Emily conducted “think aloud” interviews and described her process: “As the students interacted with the prompts for the first time, I screen recorded with audio to see how they navigated [the prompts and] what their thoughts and reactions were as they looked at them. Immediately after, I had a semi-structured interview with each of the students to follow up on what they valued and how they used the prompts.” Additionally, Dr. Palese studied the rhetorical moves that occur in assignment prompts to understand how instructors give directions. Her analysis of writing is complemented by interviews with six instructors and observations of their courses.

After earning her PhD in 2021, Emily became the Assistant Director of Global Foundations Writing at the University of Arizona, where she “provides instructional support for global micro-campuses, including onboarding and supporting instructors, developing materials, and assessing and adapting curricula.” Here at Crow, Dr. Palese finalized her work on the repository team and began preparing to transition to her new leadership position. Currently, Emily is enjoying exploring her new role and reflects, “I’m happy that the repository has new leadership and members so our original ideas and protocol can get refined with new perspectives.” 

We wish Dr. Palese well with all of her 2022 endeavors!

Crow receives significant interest from students and faculty in building their own corpora. Many people interested in corpus building are unsure where to start. How can data be organized effectively? How can participants be contacted and treated ethically? The Crow team hopes to answer these questions by providing a Corpus in a Box: Automated Tools, Tutorials and Advising, or CIABATTA. The December 6th, 2021 CIABATTA launch introduced the “Corpus in a Box” to an international audience with participants from Lebanon, Colombia, Hong Kong, Italy, Greece, Saudi Arabia, the United Kingdom, Canada, Ghana, Brazil, the United States, and Poland.

The launch event began with Dr. Shelley Staples describing the content included in CIABATTA and the motivation behind the development of the corpus building process. While we created CIABATTA to help scholars begin their own corpora, Staples pointed out that anyone who needs to build a corpus should recognize that “It’s a lot of work!” If you decide that building a corpus would help your research, CIABATTA offers a start-up process for anyone looking to build their own corpus.

Building CIABATTA has allowed the Crow team to pool our experiences and contribute programming, automated tools, and user experience guidelines. However, neither coding experience nor research experience is necessary to use CIABATTA. As Staples described it, CIABATTA is designed for students and faculty around the world, or, as the CIABATTA web page notes, “from novice users looking to begin conducting data analysis through their corpus to experienced programmers ready to streamline their own processes for corpus building.”

CIABATTA content is organized into nine sections:

  1. best practices for corpus building
  2. CIABATTA overview
  3. ethical issues in corpus building
  4. checking consents and collecting data
  5. organizing your data
  6. converting, encoding, and standardizing your data
  7. organizing, preparing & processing metadata
  8. adding headers and changing filenames
  9. deidentifying your data
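
For readers who program, a few of these steps can be illustrated with a short Python sketch. This is not CIABATTA's own code: the function names, the header format, the filename pattern, and the metadata fields below are all hypothetical, chosen only to suggest what standardizing encoding, adding a header, changing a filename, and deidentifying a text might look like.

```python
import re
import unicodedata


def standardize_text(raw_bytes):
    """Decode bytes as UTF-8 (falling back to Latin-1), unify line endings, normalize to NFC."""
    try:
        text = raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        text = raw_bytes.decode("latin-1")
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return unicodedata.normalize("NFC", text)


def build_header(metadata):
    """Render metadata as <key: value> lines (a hypothetical header format)."""
    return "\n".join(f"<{key}: {value}>" for key, value in metadata.items())


def build_filename(metadata):
    """Compose a filename from metadata fields (a hypothetical naming pattern)."""
    return f"{metadata['course']}_{metadata['assignment']}_{metadata['student_id']}.txt"


def deidentify(text, names):
    """Replace each listed name with a placeholder, ignoring case."""
    for name in names:
        text = re.sub(rf"\b{re.escape(name)}\b", "<name>", text, flags=re.IGNORECASE)
    return text


def prepare_document(raw_bytes, metadata, names):
    """Return (new filename, header plus standardized and deidentified text)."""
    body = deidentify(standardize_text(raw_bytes), names)
    return build_filename(metadata), build_header(metadata) + "\n\n" + body
```

A real project would draw the list of names from consent and enrollment records rather than a hand-typed list, and simple pattern replacement will miss nicknames and misspellings, so manual review is still needed.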

Attendees of the launch described a variety of motivations for using CIABATTA, and several asked about using CIABATTA in academic courses and piloting it in different languages. We encourage these uses and addressed both in the Q&A section of the launch.

Screenshot of CIABATTA launch, showing "CIABATTA content" with list of the nine sections of content: (1) best practices for corpus building; (2) CIABATTA overview; (3) ethical issues in corpus building; (4) checking consents and collecting data; (5) organizing your data; (6) converting, encoding, and standardizing your data; (7) organizing, preparing & processing metadata; (8) adding headers and changing filenames; and (9) deidentifying your data.
The nine sections of CIABATTA content (also in the list above)

In building CIABATTA, we chose GitHub as the presentation platform because of its ability to integrate code and text from the GitHub wiki. Through GitHub, users are linked directly to the most recent code and automated tools. In response to one participant’s question, “Could you convert CIABATTA into a textbook?” Staples and Dr. Adriana Picoral encouraged attendees to share CIABATTA or other Crow materials with a class.

One attendee asked if CIABATTA could help build corpora in languages other than English. The answer is yes! The Crow team has successfully piloted the Corpus in a Box in Portuguese and Russian through the Multilingual Academic Corpus of Assignments: Writing & Speech (MACAWS), and we encouraged attendees interested in piloting other languages to work with Crow and offer feedback.

Another important question in the Q&A section asked how CIABATTA compares to other programs, such as Lancsbox. Crow team member Dr. Aleksandra Swatek answered by noting, “Lancsbox is more to analyze the corpus … CIABATTA helps to compile the corpus and all the other steps you need to prepare your files.”

In the CIABATTA Open House on December 7, 2021, ACLS program officer Dr. John Paul Christy asked about ethical concerns in corpus building, pointing out that the public turn in ACLS work has highlighted issues about the co-creation of knowledge. We shared some experiences across Crow. Dr. Bradley Dilger described his decision to defer recruiting corpus participants while he was an administrator at Purdue. Dr. Staples described our original plans for building the repository, which included posting identified materials as a way to recognize and potentially reward instructors for their participation. However, we realized doing so could result in identification of students through triangulation. This is one reason we sponsored the Crow Writing Contest at Arizona — to recognize our students’ good work without identifying their contributions to the corpus. 

Our next steps for CIABATTA include user experience testing with targeted groups such as the Crow Fellows, users of Crow, and developers and researchers using the Crow code on GitHub. If you use CIABATTA, we’d love to hear from you! Join our mailing list to stay up to date and offer your feedback, if you wish.

If you are interested in CIABATTA and were unable to attend the Launch or Open House, additional CIABATTA information can be found on CIABATTA’s GitHub and the Crow YouTube channel. We welcome your questions about CIABATTA. Just send us a note to

The fall leaves are almost done changing colors and the Crow team is hard at work! This semester, Crow was awarded the Covid-19 Research Disruption Grant. This funding was provided by the Offices of the Executive Vice President for Research and Partnerships (EVPRP) and the Provost in light of the Covid-19 pandemic’s impacts on the Crow lab. We are grateful to receive this funding, which will go a long way towards supporting our project outcomes. To get back to work and back on track, the Crow lab is hitting the ground running and welcoming three new undergraduate researchers!

Crow’s undergraduate team: from left, Professor Bradley Dilger, Ryan Day, Anna Shura, Abby Elkin, and Hannah Brostrom

Hannah Brostrom is a sophomore at Purdue University studying Professional Writing and Computer Science. She is interested in consumer technology and user experience research, and hopes to be a technology journalist in the future. She is involved with several creative writing publications, and in her spare time likes to play guitar and cook with her roommates.

Abby Elkin is a senior at Purdue University studying Professional Writing with a minor in Women’s, Gender and Sexuality Studies, and she has an Associate of Science degree from Ivy Tech. In the future she would love to write books while traveling. In her spare time, she enjoys playing with her dogs, playing video games, and crocheting.

Anna Shura is a sophomore at Purdue University studying Professional Writing and Creative Writing with a minor in Global Liberal Arts Studies. In the future, she hopes to study abroad in the United Kingdom and work in the publishing industry or creative marketing. On campus, she is involved in several English organizations including serving as the President of the Professional Writing Association. In her free time, she enjoys playing violin in Purdue’s Symphony Orchestra, cooking and baking, and crafting.

My name is Ryan Day, and I am a senior in Civil Engineering and Political Science with a minor in Spanish. I have served in cross-disciplinary research groups while at Purdue, including the Building Water Systems group and the Transculturation group. I am also currently a writing tutor with the Purdue Writing Lab and President of Purdue Science Olympiad. In the future, I plan to attend law school and continue combining my humanities and hard science experiences in a legal profession. I spend my spare time skiing, scuba diving, and traveling with friends and family.

As a three-year veteran of the Crow team, I will serve as the mentoring undergraduate researcher, supervising and training the new team members. This is a big step forward for me as a member of the team, and I’m excited to take on this new challenge. I hope to translate my previous extracurricular and co-curricular leadership experience into a new research setting. Having been through the onboarding process before, 

Together, the new undergraduate researchers and I aim to pursue several of the original project’s outcomes:

  • First, we intend to recruit instructor participants from high schools and community colleges by increasing outreach activities and extending the Crow platform. As undergraduate members of the team, we have a unique and particularly valuable role to play in outreach by increasing Crow’s visibility. 
  • We also intend to gather user experience data and direct feedback from Crow platform users. By analyzing this feedback, we will be able to better shape user interfaces, documentation, and supporting content. 
  • As our team continues to grow, we also plan to hold team meetings across all Crow institutions, to shape project direction, especially outreach and community engagement. Our role will be helping to organize and facilitate these “Crow Summits.”

We are also well underway with updating the Crow web site to document our work and attract a larger community of Crow users who use the platform for both research and professional development. Articles like this one are a big part of advertising and detailing the activities of the Crow team and developing this community relationship. Soon, we’ll start publishing some work from Hannah, Abby, and Anna, too.

We are grateful to EVPRP for their support of the Crow team and look forward to sharing a summary of our work with them in April 2022.

The Crow team is thrilled to announce the release of Corpus in a Box: Automated Tools, Tutorials, & Advising (CIABATTA). We invite you to attend our launch event and open house!

At these events, we will introduce our CIABATTA toolkit for building corpora, which has been developed collaboratively by Crow researchers across our network of partner institutions. We will briefly explain how and why we built this toolkit, and then take audience questions about corpus building.

All are welcome, both individuals who can program using Python, and others who have little or no programming knowledge but are interested in building and working with corpora.

We invite you to join us at our release event and/or open house. Please note that registration is required. Use the links below to register. If you have any problems, please contact us.

Launch: December 6, 2021, 10am to 11am (Arizona Time/MST)

Find the launch time in your time zone: when-is-ciabatta-launch
Register to attend the launch

Open House: December 7, 2021, 8am to 9am (Arizona Time/MST)

Find the open house in your time zone: when-is-ciabatta-open-house
Register to attend the open house

The development of CIABATTA has been supported by an ACLS Digital Extension Grant from the American Council of Learned Societies (ACLS). We are grateful to ACLS, Humanities Without Walls, and our other funders for their support.

We hope you can attend. Questions? Contact

Downloadable flyer for event

On October 23rd, 2021, we were excited to host an online workshop at the Arizona Teachers of English to Speakers of Other Languages (AZTESOL) 2021 conference. The goal of our workshop, “Exploring tense-agreement issues in L2 writing using a learner corpus,” was to introduce the Crow platform and show how to use concordance lines to help students identify and understand tense-agreement patterns. Our team consisted of Ph.D. student Anh Dang, Ph.D. student Hui Wang, and Ph.D. candidate Ali Yaylali. 

Screenshot of slide containing the text: "Exploring tense-agreement issues in L2 writing using a learner corpus. Crow, the Corpus & Repository of Writing. AZTESOL State Conference 2021. October 22–23, 2021."
First slide from AZTESOL 2021 workshop 

What we shared at AZTESOL 2021

During the workshop, attendees were introduced to Crow learner corpora and Data-Driven Learning (DDL) by reviewing authentic sentence samples and grammatical forms from students’ texts. After the introduction, we guided attendees through an interactive corpus-based activity that contained three parts: 

1) Noticing verb tenses in learner writing.

In this section, participants read a list of sentences selected from the Crow corpus and identified the tense-agreement patterns by answering the guiding questions.

2) Searching the concordance lines. 

In this activity, participants looked at some concordance lines from the Crow corpus and answered questions about the patterns and different tenses.

3) Independent practice.

We provided two options in the last part. Participants could either revise tense-agreement issues in an excerpt from the Crow corpus or revise the issues in their own paper, choosing based on their own teaching context.

During the activity, attendees were invited to use the embedded scrollable concordance lines to observe keywords and tense pattern variations. We then guided workshop participants to try the independent practice: finding and revising the tense agreement issues in the authentic excerpt. 

Screen shot of concordance lines showing a query for "what," with about 20 lines of text showing that key word in context.
Example of concordance lines used for the corpus-based activity
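
For readers curious how concordance lines like these are produced, here is a minimal Python sketch of a keyword-in-context search. It is an illustration only, not the Crow platform's code; the function name `concordance` and the `width` parameter are our own assumptions.

```python
import re


def concordance(text, keyword, width=30):
    """Return keyword-in-context lines: left context, [keyword], right context."""
    # Collapse line breaks and repeated spaces so context windows are uniform.
    flat = " ".join(text.split())
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", flags=re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so the keyword lines up in a column.
        lines.append(f"{left:>{width}}[{match.group()}]{right}")
    return lines
```

Calling `concordance(student_text, "what")` on a student text returns one line per occurrence, with the keyword aligned in the center so that the verbs around it can be compared at a glance.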

After sharing the activity demo, we gave participants some questions for discussing how they could adapt and implement this activity to fit their own instructional contexts and student needs. Some participants mentioned that they would need more scaffolding activities in K-12 contexts. We were excited to hear their feedback on the activity design and their ideas for applying the activity.


After our workshop, participants were invited to:

  1. Download and print out our activity handout to implement this activity in their future teaching;
  2. Use the Crow platform to explore the linguistic features of students’ writing;
  3. Develop effective activities based on available data and information from our platform;
  4. Guide students to raise awareness and accuracy by using authentic language samples. 

Workshop materials  

We’ve included the materials we presented here. 

Thank you for your interest! We also thank all participants and organizers for their support. We look forward to attending AZTESOL next year.

This week Dr. Michelle McMullin and I were invited to speak in Dr. Beth Towle’s professional writing class at Salisbury University. One of the good things about more video-conferencing: easy to be a guest speaker in a class!

Our talk shared the Crow model for grant writing, described how we use it for professional development, and proposed three practices for lo-fi team building. Here’s the slide deck we shared.

This talk was based in part on the materials we shared with the Arizona Women’s Hackathon, especially the “consecutive agenda” Crow uses for agenda-setting and note-taking in meetings.

This post will be updated.

Thanks to funding from a Purdue Covid-19 Disruption Grant, we are able to hire four undergraduate researchers in the 2021-22 academic year! We are grateful for Purdue’s support of the Crow project. Funding will also allow us to continue working with our developer Mark Fullmer to improve the Crow platform and develop new trajectories for research.

After a training period, these researchers will complete work that helps our project make up for time lost due to Covid-19 restrictions and complications. Undergraduate research assistants will be mentored by experienced undergraduate and graduate researchers and given clear deliverable guidelines, deadlines, and expectations. They will perform the work specified by the grant:

  1. Process and de-identify recently gathered writing research data; 
  2. Interview users from the Crow community and compile information to guide Crow developers; 
  3. Review the Crow Fellows initiative in collaboration with Crow researchers guiding that project;
  4. Create and update a series of web pages documenting recent Crow work, supporting future grant writing, participant recruitment, and outreach efforts. 

Senior Crow researchers will collaborate with the undergraduate researchers to identify which of these tasks offer the best professional growth, and will seek to identify others that develop new skills and experience, such as grant writing, research design, or data collection. 

Research assistants can expect to average 10 hours of work per week with an hourly pay rate of $11. There is also a possibility for earning credit hours for internship or research related coursework. Six weeks of Fall 2021 work will focus on onboarding and training; fourteen weeks are budgeted in Spring 2022, for a total of $2,200. Pending funding, positions may be extended.

Crow is a distributed team, so quite a bit of the work will be remote. A typical week will include a few hours of regular meetings online or in our shared Heavilon 201 lab, then independent work and short online meetings as needed with other Crow researchers. Work schedules will be developed to accommodate academic and personal obligations.

Required qualifications

  • Experience with and/or interest in technical communication, applied linguistics, and diverse areas of writing studies.
  • Experience with and/or interest in conducting user experience research and collaborating with software developers.
  • Experience with and/or interest in using Google Drive, Basecamp, Slack, and similar software to facilitate distributed work.
  • Ability to work both collaboratively and independently.
  • Eligibility to work as an undergraduate student at Purdue University.

To be considered, applicants must send the following materials to Bradley Dilger for screening and an invitation for a face-to-face interview.

  • A cover letter explaining interest in this position and describing relevant experience; 
  • An up-to-date resume; 
  • Two references who can speak to the candidate’s experience and/or potential. 

Candidates are welcome to include a PDF or web-based portfolio including samples of writing from courses, internships, and other contexts, but it is not required. 

Screening begins immediately and will conclude when positions are filled. Prospective applicants are welcome to contact Dilger with questions.

Download a printable version of this position announcement.

Building on our work from SIGDOC 2020, we’re presenting on Crow’s approaches to mentoring at SIGDOC 2021.

“Using iterative persona development to support inclusive research and assessment”
Michelle McMullin, Hadi Banat, Shelton Weech, & Bradley Dilger

As we build writing research tools, Crow researchers have always sought more ethical, sustainable approaches to collaboration. How we work is as important as what we make. In this research paper, we highlight the importance of descriptive methods such that our reflexive processes for assessment are transparent, and stay open for negotiation as we learn more, gather feedback, and apply what we learn. We share an in-depth look at Crow methods for persona development and their role in our ongoing research and assessment of Crow practices.

Here’s our video summary:

We also have a PowerPoint version including audio and speaker notes.

Along with this presentation, we have some resources for teams interested in learning more about CDW methods.

We look forward to your feedback, both for this paper, and for the resources we are building. Share your questions or ideas via this form.

We are thrilled to introduce our first cohort of Crow Fellows:

  • Olayemi Awotayo, Graduate Instructor, Virginia Polytechnic Institute (Virginia Tech)
  • Dr. Madelyn Pawlowski, Assistant Professor of English, Northern Michigan University
  • Margaret Poncin Reeves, Senior Lecturer, DePaul University
  • Modupe Yusuf, Doctoral Candidate, Michigan Technological University (Michigan Tech)

Later this summer, we will share more about this outstanding group of teacher-scholars. We are grateful to everyone who applied, and especially to the American Council of Learned Societies (ACLS) for the support that makes our Fellows program possible. 

On behalf of the Crow team, I would like to take this opportunity to congratulate all of our graduates, who accomplished so much in this year that was so challenging. Seven Crowbirds earned degrees from the University of Arizona:

Anh Dang graduated with an MA in Teaching English as a Second Language (TESL) in May 2021. She will continue at Arizona as a PhD student in Second Language Acquisition & Teaching (SLAT), with an assistantship in the UA Foundations Writing Program.

Hannah Gill graduated with a double major in Philosophy, Politics, Economics and Law (PPEL) and English in May 2021. She completed an honors thesis, “English Language Learners within the Classroom: Improving K-12 Policy and Enhancing Curriculum Through Corpus Based Instruction.” In Fall 2021, she will begin a Master of Social Work (MSW) in the Mandel School of Applied Social Sciences at Case Western Reserve University. 

Jhonatan Henao-Muñoz earned an M.A. in Hispanic Linguistics (Winter 2020), an M.A. in French Linguistics and Second Language Teaching & Learning (Spring 2021), and a Graduate Certificate in Technology in Second Language Teaching (Spring 2021). He has accepted a position as Instructor of French at the University of Arizona.

Alantis Houpt graduated with a degree in English, and a Teaching English as a Foreign Language (TEFL) certification from the Center for English as a Second Language (CESL), in December 2020.

Dr. Aleksey Novikov earned a PhD in Second Language Acquisition and Teaching in May 2021, defending his dissertation “Syntactic and Morphological Complexity Measures as Markers of L2 Development in Russian” on May 7. 

Dr. Emily Palese also earned a PhD in Second Language Acquisition and Teaching in May 2021, defending her dissertation “Prompting Students to Write: Designing and Using Second Language Writing Assignment Prompts” on May 19.  

Kevin Sanchez graduated with a double major in English and Creative Writing in May 2021. He’s finishing up his TEFL certificate and getting ready to teach English abroad. 

Our best wishes to Anh, Hannah, Jhonatan, Alantis, Aleksey, Emily, and Kevin! We look forward to seeing your next moves. Next week, we will have more to say about the individual accomplishments of everyone on the Crow team.

Dr. Aleksey Novikov, Dr. Shelley Staples, and Dr. Emily Palese (left to right) at University of Arizona Commencement, May 2021