From Corpus to Repository: Making the leap (and documenting it, too!)

This blog post was written by Emily Palese.

Planning: Spring 2019

In between processing students’ texts for the corpus, Hadi Banat, Hannah Gill, Dr. Shelley Staples, and Emily Palese met regularly during Spring 2019 to strategize about expanding Crow’s repository. At the time, the repository had 68 pedagogical materials from Purdue University, but none from the University of Arizona and no direct connections between students’ corpus texts and the repository materials.

Hadi led our team’s exploration of how repository materials had been processed previously, including challenges the earlier team faced, solutions they found, and questions that remained unresolved. With this context, we used Padlet to brainstorm how we might classify new materials and what information we’d like to collect from instructors when they share their pedagogical materials.

*A section of our collaborative Padlet mindmap*

Once we had a solid outline, we met with instructors and administrators from the University of Arizona’s Writing Program to pitch our ideas. Finally, with their feedback, we were able to design an online intake form with categories that would be helpful for Crow as well as instructors, administrators, and researchers.

Pilot & Processing Materials: Summer 2019

To pilot the online intake survey, we asked 8 UA instructors and administrators to let us observe and record their experiences as they uploaded their materials. This feedback helped us make some important immediate fixes and also helped us consider new approaches and modifications to the form. Another benefit of piloting the intake form is that we received additional materials that we could begin processing and adding to the repository.

Before processing any UA repository materials, Hannah and Emily first reflected on their experiences processing corpus texts and discussed the documents that had helped them navigate and manage those processing tasks. With those experiences in mind, they decided to begin two documents for their repository work: a processing guide and a corresponding task tracker.

Processing Guide: “How to Prepare Files for the Repository”

To create the processing guide, Hannah and Emily first added steps from Crow’s existing corpus guide (“How to Prepare Files for ASLW”) that would apply to repository processing. Using those steps as a backbone, they began processing a set of materials from one instructor together, taking careful notes of new steps they took and key decisions they made.

At the end of each week, they revisited these notes and discussed any lingering questions with Dr. Staples. They then added explanations, details, and examples so that other Crowbirds could follow the process easily and consistently in the future.

The result was a 9-page processing guide with 12 discrete processing steps.

Task Tracker: “Repository Processing Checklist”

When they worked as a team processing corpus texts in Spring 2019, Hannah, Jhonatan, and Emily used a spreadsheet to track their progress and record notes about steps if needed. This was particularly helpful on days when they worked independently on tasks; the tracker helped keep all of the team members up-to-date on the team’s progress and aware of any issues that came up.

With this in mind, Hannah and Emily created a similar task tracker for their repository processing work. The tracking checklist was developed alongside the processing guide so that it would have exactly the same steps. With identical steps, someone using the tracking checklist could refer to the processing guide if they had questions about how to complete a particular step. Once a step is completed, the Crowbird who finished the step initials the box, leaves a comment if needed, and then moves to the next step.
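For readers who think in code, here is a minimal sketch of the idea behind the checklist: one shared list of steps drives both the guide and the tracker, so the two cannot drift apart. This is only an illustration, not how Crow built its tracker (ours is a spreadsheet). The step names, the `MaterialChecklist` class, and the `syllabus_001` identifier are all hypothetical, since the actual 12 steps are not listed in this post.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical placeholder names; the real guide has 12 steps not listed here.
GUIDE_STEPS = ["Step 1", "Step 2", "Step 3"]


@dataclass
class StepRecord:
    """One checklist box: who finished the step, plus an optional note."""
    initials: str = ""
    comment: str = ""


@dataclass
class MaterialChecklist:
    """Checklist for one repository material, with the same steps as the guide."""
    material_id: str
    records: Dict[str, StepRecord] = field(
        default_factory=lambda: {step: StepRecord() for step in GUIDE_STEPS}
    )

    def complete(self, step: str, initials: str, comment: str = "") -> None:
        """Initial a step's box and leave a comment if needed."""
        if step not in self.records:
            raise KeyError(f"{step!r} is not a step in the processing guide")
        self.records[step] = StepRecord(initials=initials, comment=comment)

    def next_step(self) -> Optional[str]:
        """Return the first step nobody has initialed yet, or None if all are done."""
        for step in GUIDE_STEPS:
            if not self.records[step].initials:
                return step
        return None


checklist = MaterialChecklist("syllabus_001")
checklist.complete("Step 1", "EP", comment="file renamed")
print(checklist.next_step())
```

Because every checklist is generated from `GUIDE_STEPS`, adding or renaming a step in one place updates both documents, which is the same consistency we maintained by hand while developing the guide and checklist side by side.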

Below is a screenshot of part of the tracking checklist.

Developing the processing guide and the corresponding checklist was an iterative process that often involved revisiting previously completed tasks to refine our approach. Eventually, though, the guide and checklist became clear, consistent, and sufficiently detailed to support processing a variety of pedagogical materials.

Success!

By the end of the summer, Hannah, Emily, and Dr. Staples had processed 236 new pedagogical materials from the University of Arizona and added them to Crow’s online interface.

For the first time, Crow now has direct links between students’ texts in the corpus and pedagogical materials from their classes in the repository. This linkage presents exciting new opportunities for instructors, researchers, and administrators to begin large-scale investigations of the connections between students’ drafts and corresponding instructional materials!

