Crow, the Corpus & Repository of Writing, is a web-based archive which supports research and professional development in applied linguistics and rhetoric & composition. Our project began in Fall 2015 at Purdue University and has expanded to many more institutions, including Arizona, Washington-Bothell, NC State, Northern Arizona, and UMass-Boston.

Crow is supported by an ACLS Digital Extension Grant from the American Council of Learned Societies (ACLS).

Our project has also been supported by the Humanities Without Walls consortium, based at the Illinois Program for Research in the Humanities at the University of Illinois at Urbana-Champaign. The Humanities Without Walls consortium is funded by a grant from the Andrew W. Mellon Foundation.

About the Crow corpus

Our corpus holds over 10,000 texts produced by undergraduate students in first-year writing. Students represented in the corpus come from over 50 different countries and are majoring in over 100 programs.

Screenshot of Crow web interface:

Information on the context in which each text was produced includes:

  • year and semester the text was written
  • course for which the text was written
  • assignment of each text (e.g., argumentative paper, genre analysis, literature review)
  • draft of each text
  • relevant demographic information of the students (gender, country of origin, program and major, TOEFL score, etc.)

In addition to our corpus, our repository holds 380 instructional materials linked to the texts represented in the corpus, including:

  • In-class activity handouts, homework assignments, and peer review activities
  • Assignment sheets/prompts for major assignments
  • Lesson plans and presentation slides
  • Grading rubrics (scales) for major assignments
  • Sample papers used during instruction
  • Course syllabi, schedules and policies

Demographic data included in Crow’s corpus come from records kept by the institutions where participants are enrolled and/or employed. We recognize that these institutions’ current decisions to categorize gender as binary and country of origin as singular do not fully reflect the complexity of our participants’ identities and backgrounds, and we support advocacy for more inclusive and nuanced institutional data.

To request access to the Crow corpus & repository, review our terms and conditions, then complete the access request form. Please allow five business days for review.

Download a subset of the texts available in the online corpus. This option is intended for users who would like to use the corpus for their own research. Note: additional training and verification are required.

For researchers who would like to annotate the corpus, use concordancing software (e.g., AntConc or LancsBox), or create their own programs to analyze the data, we have prepared an offline version. This subset of the Crow corpus has been curated by the Crow team to ensure a representative sample from the first-year writing context. Additional training is required for offline use.

Citing the Crow corpus

Using our corpus? Cite us:

Staples, S., & Dilger, B. (2018-). Corpus and repository of writing [Learner corpus articulated with repository]. Available at

Our goals

Crow is an interdisciplinary and inter-institutional team dedicated to studying and teaching writing. Our project seeks to reach four primary audiences: instructors, researchers, participants, and collaborators.

  1. Maintain and continually improve our corpus and repository, prioritizing accessibility and usability by listening carefully to our community of users.
  2. Develop outreach programs for a variety of domestic and international educational institutions, meaningfully including historically marginalized students, teachers, and scholars.
  3. Advocate for an inclusive view of language that moves beyond deficit models, welcomes vernacular varieties, values learner corpora, acts upon professional standards, and respects the work of multilingual writers and learners.
  4. Develop and share more sustainable, ethical and effective approaches to building interdisciplinary research teams.
  5. Build well-documented approaches, methods, and open source tools, supporting multiple approaches to research, teaching, and professional development.
  6. Conduct writing research using our approaches, methods, and tools, and share it broadly.
  7. Seek internal and external grant funding to support research, development, and outreach.
  8. Evaluate our work systematically, including the needs of historically marginalized students, teachers, and scholars.


May 2019: Diversifying Digital Writing Archive to Include Spanish Heritage Speakers, Arizona SBS press release about our 2019 ACLS Digital Extension Grant.

June 2019: CUES Fellows to Explore Ways to Transform Teaching and Learning, Arizona press release describing Shelley Staples’s selection as a CUES fellow.


We hope you find everything on this site works well for you. If not, please let us know and we’ll fix the problem and/or provide the information you need in an alternative format.


Unless otherwise noted, site contents are copyright © 2016, 2017, 2018, 2019, 2020, and 2021 by Shelley Staples and Bradley Dilger. All content published on our weblog is copyrighted by the author or authors as credited in the byline.

All content is published with the Creative Commons Attribution 4.0 International license, meaning that you are free to reuse it as long as you provide credit to Crow and any individual authors named. We ask that you also link to the original page where you found the content.