The Corpus and Repository of Writing (Crow) project is housed in the English Department at Purdue University and at the University of Arizona. It was initiated by Dr. Shelley Staples and Dr. Bradley Dilger in Fall 2015 and brought together 11 graduate and 2 undergraduate students from two programs, Second Language Studies and Rhetoric & Composition. The Crow team is building a web-based archive that combines a corpus of student writing from first-year composition courses at Purdue University with a repository of pedagogical artifacts such as syllabi and assignments. This unique combination will allow us to support research and professional development for the vibrant writing studies communities at Purdue, Arizona, and other partner universities.


The goal of the Crow project is to build a resource for writing research and professional development, serving both our institutions and the larger audience of writing researchers, instructors, and undergraduate students. This resource will include two main components:

  1. a corpus of first-year writing and
  2. a repository of pedagogical materials used in our first-year writing courses.

Over time, we will make the online platform we develop available for use inter-institutionally. Read more about our goals.


The Crow project began when Staples and Dilger realized they were independently seeking to develop online platforms to support their work at Purdue. They began discussing the possibility of a single tool that would meet the needs of writing teachers and scholars in both pedagogy and research. Similar projects have been established in the past at Purdue, most notably COIN, the Collaborative Online Instructor Network, which was developed by Alexis Ramsey-Tobienne and Kristen Seas Trader and focused on mentoring and community building. COIN was designed to “encourage the sharing of teaching ideas and materials among instructors of IC@P – Introductory Writing at Purdue” (COIN). Unfortunately, development of COIN was halted after a few years.


The second project, the Purdue Second Language Writing Corpus (PSLW), was envisioned by graduate students R. Scott Partridge and Heejung Kwon. Staples began building the corpus with Partridge and Kwon and helped secure resources to ensure its long-term success. Data collection for PSLW has been ongoing since Fall 2014. The corpus includes texts written for ENGL 106i (international) sections of our first-year composition courses. Students are typically asked to write three drafts of five assignments, each representing a different genre: writer’s autobiography, academic proposal, synthesis paper, interview report, and research paper. After we collect the texts, they go through a complex process of de-identification. Because the students’ papers often contain personal details about their own lives or about the people they informally interview, we have to make sure that none of that information becomes public. We use standard procedures to remove that data and insert tags such as <name>, <place>, and <position>, which allow readers to understand the gist of the sentence. We also make sure that none of the students’ identifying data (age, course number, email, etc.) are kept. The de-identified texts are then stored on a local drive as txt files. For now, only graduate students and faculty from Purdue can access the corpus.
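
To give a rough sense of what the tagging step in de-identification involves, here is a minimal Python sketch. It is an illustration only, not the procedure the PSLW team actually uses: the function name `deidentify`, the hand-built `identifiers` lookup, and the sample sentence are all hypothetical, and it assumes a reviewer has already listed the identifying strings found in a paper.

```python
import re


def deidentify(text, identifiers):
    """Replace each listed identifying string with a tag such as <name>.

    `identifiers` is a hypothetical, hand-built mapping from an identifying
    string to its tag category (e.g. "name", "place", "position").
    """
    if not identifiers:
        return text
    # Case-insensitive lookup from the matched string to its tag category.
    lookup = {phrase.lower(): tag for phrase, tag in identifiers.items()}
    # Try longer strings first so "West Lafayette" wins over "Lafayette".
    alternatives = sorted(identifiers, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in alternatives) + r")\b",
        flags=re.IGNORECASE,
    )
    # A single pass over the text, so inserted tags are never re-scanned.
    return pattern.sub(lambda m: f"<{lookup[m.group(1).lower()]}>", text)


# Invented sample sentence and identifiers, for illustration only.
sample = "I interviewed Maria Lopez, a manager in West Lafayette."
identifiers = {
    "Maria Lopez": "name",
    "manager": "position",
    "West Lafayette": "place",
    "Lafayette": "place",
}
cleaned = deidentify(sample, identifiers)
print(cleaned)
```

A lookup like this only catches strings that someone has already identified, so in practice it would complement, not replace, careful human review of each text.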


In Fall 2015, Staples and Dilger posted a call for participation, and our team began meeting regularly to discuss the initial concept for the project and shape its development over time. We are now working to re-envision these two projects as one: an online repository of authentic pedagogical artifacts such as syllabi, assignment sheets, activity sheets, and other materials used in teaching first-year writing, linked to de-identified student texts. The combination of these two types of data will support innovative research that not only engages student writing but also seeks to understand the influence of instructional materials.


Analyzing instructors’ and students’ texts will help us understand how first-year writing is taught at Purdue University and allow us to further research initiated by other inter-institutional teams, such as the Citation Project (led by Rebecca Moore Howard and Sandra Jamieson) or the Teaching for Transfer network (coordinated by Kara Taczak, Liane Robertson, and Kathleen Blake Yancey). Indeed, the tentative name for our project at the beginning was Understanding University Writing (UUW), because such archival and research work can provide insight into what kinds of writing students do and how instructors support writing with prompts, activity sheets, and other materials. Though there are several existing repositories of pedagogical materials for first-year writing courses, such as Iowa State’s Digital Repository for Academic Writing (DRAW) and Virginia Tech’s Outcome-Centered Electronic Library of Teaching Resources (OCELOT), these and other repositories, like COIN, have struggled to stay viable in the long term. Even fewer corpora (e.g., BAWE Plus) actually add relevant pedagogical materials (prompts) to contextualize the corpus texts. That is why in our project we are striving to build sustainable infrastructure, which can continue to be developed even after the initial cohort of project developers graduates.

Interdisciplinary Collaboration, Research, and Sustainability

Crow is also designed to support research and professional development projects which tap into the potential that interdisciplinary work promises. Our team includes scholars who have a variety of research interests: Writing Program Administration, writing assessment, professional writing, user experience design, sociolinguistics, and multicultural studies. We use both quantitative and qualitative research methods. The creation of Crow is especially informed by the interests of the graduate students involved.


What does interdisciplinary collaboration look like in practice? We often break up into smaller ad hoc, interdisciplinary groups to tackle particular tasks: developing personas and environmental scans, writing IRB forms, drafting conference proposals, etc. These more focused sessions also help us get to know each other better, as the two programs we draw from (SLS and R&C) do not usually share common classes or events on campus.


In the conversations and work we have been doing on Crow, one important term has kept coming up: sustainability. While looking at similar projects that are available online, we realized that some of them depended on the work of just one person with limited resources. Sometimes such projects were initiated by graduate students and ended when the students moved to other institutions.


Thus, it became clear to us how important it is to plan for sustained growth in the future. Additionally, given the response to our presentation at C&W 2016, we are currently developing an essay which explains the value of bringing corpus linguistics to the vibrant work in computational rhetoric we observed at that conference.

Future Plans: Intra- and Interinstitutional Collaborations

With the transition of Dr. Staples to the University of Arizona in Fall 2016, the Crow project will gain an interinstitutional character. We are also exploring ways we can collaborate with researchers in WIDE and MATRIX at Michigan State University, given their strengths in user experience design and experience architecture, as well as rhetoric and composition. In the next year, we will be building working prototypes and sharing our research at conferences and on this weblog. We hope to secure grants and other support as wel